APA
(2020). Zap Q-learning with Nonlinear Function Approximation. The Simons Institute for the Theory of Computing. https://simons.berkeley.edu/talks/tbd-244
MLA
Zap Q-learning with Nonlinear Function Approximation. The Simons Institute for the Theory of Computing, 2 Dec. 2020, https://simons.berkeley.edu/talks/tbd-244.
BibTeX
@misc{scivideos_16824,
url = {https://simons.berkeley.edu/talks/tbd-244},
language = {en},
title = {Zap Q-learning with Nonlinear Function Approximation},
publisher = {The Simons Institute for the Theory of Computing},
year = {2020},
month = {dec},
note = {See \url{https://scivideos.org/Simons-Institute/16824}}
}
Zap Q-learning is a recent class of reinforcement learning algorithms, motivated primarily as a means to accelerate convergence. Stability theory has been absent outside of two restrictive classes: the tabular setting and optimal stopping. This paper introduces a new framework for the analysis of a more general class of recursive algorithms known as stochastic approximation. Based on this general theory, it is shown that Zap Q-learning is consistent under a non-degeneracy assumption, even when the function approximation architecture is nonlinear. Zap Q-learning with neural network function approximation emerges as a special case, and is tested on examples from OpenAI Gym. Based on multiple experiments with a range of neural network sizes, it is found that the new algorithms converge quickly and are robust to the choice of function approximation architecture.
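At its core, the Zap approach is a two-timescale stochastic approximation recursion with a matrix gain: a Jacobian estimate is tracked on a faster timescale, and its inverse multiplies the update, giving a stochastic Newton-Raphson step. The Python sketch below illustrates this generic recursion on a synthetic linear root-finding problem. It is not the authors' implementation; the problem data (A_true, b), the noise model, and the step-size exponents are assumptions chosen purely for illustration.

import numpy as np

# Minimal sketch of the Zap stochastic approximation (SA) recursion that
# underlies Zap Q-learning: solve f_bar(theta) = 0 given only noisy samples
# f(theta, W) of the function and A(theta, W) of its Jacobian.
# Problem data, noise model, and step sizes are illustrative assumptions.

rng = np.random.default_rng(0)
d = 3

# Synthetic linear root-finding problem: f_bar(theta) = A_true @ theta - b,
# with root theta_star = A_true^{-1} b.
A_true = np.array([[-2.0,  0.3,  0.0],
                   [ 0.1, -1.5,  0.2],
                   [ 0.0,  0.4, -1.0]])
b = np.array([1.0, -0.5, 2.0])
theta_star = np.linalg.solve(A_true, b)

theta = np.zeros(d)
A_hat = -np.eye(d)  # running estimate of the mean Jacobian

for n in range(1, 20001):
    alpha = 1.0 / n        # slow step size for the parameter
    gamma = 1.0 / n**0.85  # faster step size: gamma / alpha -> infinity

    f_n = A_true @ theta - b + 0.1 * rng.standard_normal(d)  # noisy sample
    A_n = A_true + 0.1 * rng.standard_normal((d, d))         # noisy Jacobian

    A_hat += gamma * (A_n - A_hat)                # fast timescale: track Jacobian
    theta -= alpha * np.linalg.pinv(A_hat) @ f_n  # slow timescale: Newton-like step

print("Zap SA estimate:", theta)
print("true root:      ", theta_star)

Roughly speaking, in Zap Q-learning f_n corresponds to the temporal-difference update term for the Q-function approximation and A_n to its Jacobian in the parameters; the non-degeneracy assumption mentioned above is what keeps this matrix gain well defined.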