
A Game-Theoretic Approach to Offline Reinforcement Learning

APA

Cheng, C.-A. (2022). A Game-Theoretic Approach to Offline Reinforcement Learning. The Simons Institute for the Theory of Computing. https://old.simons.berkeley.edu/talks/game-theoretic-approach-offline-reinforcement-learning

MLA

Cheng, Ching-An. "A Game-Theoretic Approach to Offline Reinforcement Learning." The Simons Institute for the Theory of Computing, 11 Oct. 2022, https://old.simons.berkeley.edu/talks/game-theoretic-approach-offline-reinforcement-learning.

BibTeX

          @misc{scivideos_22743,
            author    = {Cheng, Ching-An},
            title     = {A Game-Theoretic Approach to Offline Reinforcement Learning},
            publisher = {The Simons Institute for the Theory of Computing},
            year      = {2022},
            month     = {oct},
            language  = {en},
            url       = {https://old.simons.berkeley.edu/talks/game-theoretic-approach-offline-reinforcement-learning},
            note      = {Talk 22743; see \url{https://scivideos.org/simons-institute/22743}}
          }
Ching-An Cheng (Microsoft Research)
Talk number: 22743
Source Repository: Simons Institute

Abstract

Offline reinforcement learning (RL) is a paradigm for designing agents that can learn from existing datasets. Because offline RL can learn policies without collecting new data or expensive expert demonstrations, it offers great potential for solving real-world problems. However, offline RL faces a fundamental challenge: oftentimes data in the real world can only be collected by policies meeting certain criteria (e.g., on performance, safety, or ethics). As a result, existing datasets, though large, can lack diversity and have limited usefulness. In this talk, I will introduce a generic game-theoretic approach to offline RL. It frames offline RL as a two-player game in which a learning agent competes with an adversary that simulates the uncertain decision outcomes caused by missing data coverage. Using this game analogy, I will present a systematic and provably correct framework for designing offline RL algorithms that learn good policies with state-of-the-art empirical performance. In addition, I will show that this framework reveals a natural connection between offline RL and imitation learning, which ensures that the learned policies are never worse than the data-collection policies, regardless of hyperparameter choices.
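
To make the game concrete, one way such a two-player objective can be written (a sketch in the spirit of relative-pessimism actor-critic methods; the symbols below are assumed for illustration and are not taken from the talk) is:

          % Notation (assumed): \pi is the learner's policy, f a critic chosen by the
          % adversary from a function class \mathcal{F}, \mathcal{D} the offline dataset,
          % \mathcal{E}_{\mathcal{D}}(f,\pi) an empirical Bellman-consistency error,
          % and \beta \ge 0 a pessimism/consistency trade-off parameter.
          \[
            \max_{\pi}\;\min_{f \in \mathcal{F}}\;
            \mathbb{E}_{(s,a)\sim\mathcal{D}}\!\big[\, f(s,\pi(s)) - f(s,a) \,\big]
            \;+\; \beta\,\mathcal{E}_{\mathcal{D}}(f,\pi)
          \]

Under this reading, the adversary's critic must remain consistent with the data, so the learner is evaluated pessimistically wherever the dataset provides no coverage; and because the first term measures improvement relative to the actions in the dataset, the learner competes against the data-collection behavior itself, which is one way to see the connection to imitation learning mentioned in the abstract.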