Efficient Reductions for Inverse Reinforcement Learning

Abstract: Interactive approaches to imitation learning like inverse reinforcement learning (IRL) have become the method of choice for problems ranging from autonomous driving to mapping. Despite its impressive empirical performance, robustness to compounding errors and causal confounders, and sample efficiency, IRL comes with a steep computational burden: the requirement to repeatedly solve a reinforcement learning (RL) problem in its inner loop. If we pause and take a step back, this is rather odd: we've reduced the easier problem of imitation to the harder problem of RL. In this talk, we will discuss a new paradigm for IRL that leverages a more informed reduction to expert-competitive RL (rather than to globally optimal RL), allowing us to provide strong guarantees at a lower computational cost. Specifically, we will present a trifecta of efficient algorithms for IRL that use information from the expert demonstrations during RL to curtail unnecessary exploration, dramatically speeding up the overall procedure in both theory and practice.
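
To make the abstract's structure concrete, the sketch below shows the classic two-player IRL loop: an outer player updates a reward to separate expert from learner behavior, while an inner RL player best-responds to the current reward. The chain MDP, features, step sizes, and the expert-reset inner loop are all illustrative assumptions (an apprenticeship-learning-style stand-in), not the speaker's actual algorithms; the point is only where "expert-competitive" RL replaces globally optimal RL.

```python
# Minimal, hypothetical sketch of IRL as reward updates wrapped around an RL inner loop.
# Everything here (chain MDP, one-hot features, greedy backups from expert states) is an
# assumption for illustration, not the method presented in the talk.
import numpy as np

# Tiny tabular MDP: a 5-state chain where action 1 moves right, action 0 stays put.
S, A, H = 5, 2, 10
def step(s, a):
    return min(s + 1, S - 1) if a == 1 else s

# Expert demonstrations: the expert always moves right from the start state.
def expert_rollout():
    traj, s = [], 0
    for _ in range(H):
        traj.append((s, 1))
        s = step(s, 1)
    return traj

expert_trajs = [expert_rollout() for _ in range(20)]

def feature(s, a):
    # One-hot state-action features, so the learned reward is linear in them.
    phi = np.zeros(S * A)
    phi[s * A + a] = 1.0
    return phi

def feature_expectation(trajs):
    return np.mean([sum(feature(s, a) for s, a in t) for t in trajs], axis=0)

def rollout(policy, s0):
    traj, s = [], s0
    for _ in range(H):
        a = policy[s]
        traj.append((s, a))
        s = step(s, a)
    return traj

def rl_inner_loop(w, reset_states):
    # "Expert-competitive" inner loop (assumption): rather than solving the MDP to global
    # optimality, run value backups only from states the expert actually visits, which
    # curtails exploration elsewhere.
    Q = np.zeros((S, A))
    for _ in range(H):
        V = Q.max(axis=1)
        for s in reset_states:
            for a in range(A):
                Q[s, a] = w @ feature(s, a) + V[step(s, a)]
    return Q.argmax(axis=1)

# Outer loop: gradient-style reward updates pushing the reward toward expert behavior.
mu_E = feature_expectation(expert_trajs)
expert_states = sorted({s for t in expert_trajs for s, _ in t})
w = np.zeros(S * A)
for _ in range(50):
    policy = rl_inner_loop(w, expert_states)
    mu_pi = feature_expectation([rollout(policy, 0) for _ in range(20)])
    w += 0.1 * (mu_E - mu_pi)

print("learned policy:", rl_inner_loop(w, expert_states))  # expect all 1s (move right)
```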

Bio: Gokul Swamy is a 4th-year PhD student in the Robotics Institute at Carnegie Mellon University, where he works with Drew Bagnell and Steven Wu. His research centers on efficient algorithms for interactive learning from implicit human feedback (e.g., imitation learning, reinforcement learning from human feedback, learning without access to full state information) and builds on techniques from RL, game theory, and causal inference. He has spent summers at Google Research, MSR, NVIDIA, Aurora, and SpaceX, and holds M.S. and B.S. degrees from UC Berkeley.