Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms

Yichen Li, Chicheng Zhang

Research output: Contribution to journal › Conference article › peer-review

Abstract

We study interactive imitation learning, where a learner interactively queries a demonstrating expert for action annotations, aiming to learn a policy that has performance competitive with the expert, using as few annotations as possible. We focus on the general agnostic setting where the expert demonstration policy may not be contained in the policy class used by the learner. We propose a new oracle-efficient algorithm MFTPL-P (abbreviation for Mixed Follow the Perturbed Leader with Poisson perturbations) with provable finite-sample guarantees, under the assumption that the learner is given access to samples from some “explorative” distribution over states. Our guarantees hold for any policy class, which is considerably broader than prior state of the art. We further propose BOOTSTRAP-DAGGER, a more practical variant that does not require additional sample access. Empirically, MFTPL-P and BOOTSTRAP-DAGGER notably surpass online and offline imitation learning baselines in continuous control tasks.

Original language: English (US)
Pages (from-to): 29278-29323
Number of pages: 46
Journal: Proceedings of Machine Learning Research
Volume: 235
State: Published - 2024
Externally published: Yes
Event: 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: Jul 21 2024 → Jul 27 2024

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
