
Learning Across Bandits in High Dimension via Robust Statistics

APA

Bastani, H. (2022). Learning Across Bandits in High Dimension via Robust Statistics. The Simons Institute for the Theory of Computing. https://old.simons.berkeley.edu/talks/learning-across-bandits-high-dimension-robust-statistics

MLA

Bastani, Hamsa. Learning Across Bandits in High Dimension via Robust Statistics. The Simons Institute for the Theory of Computing, 11 Oct. 2022, https://old.simons.berkeley.edu/talks/learning-across-bandits-high-dimension-robust-statistics

BibTeX

@misc{scivideos_22745,
  url = {https://old.simons.berkeley.edu/talks/learning-across-bandits-high-dimension-robust-statistics},
  author = {Hamsa Bastani},
  language = {en},
  title = {Learning Across Bandits in High Dimension via Robust Statistics},
  publisher = {The Simons Institute for the Theory of Computing},
  year = {2022},
  month = {oct},
  note = {Talk 22745, see \url{https://scivideos.org/index.php/simons-institute/22745}}
}
Hamsa Bastani (University of Pennsylvania)
Talk number: 22745
Source repository: Simons Institute

Abstract

Decision-makers often face the "many bandits" problem, where one must simultaneously learn across related but heterogeneous contextual bandit instances. For instance, a large retailer may wish to dynamically learn product demand across many stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to dynamically learn patient risk across many providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. Motivated by real datasets, we decompose the unknown parameter in each bandit instance into a global parameter plus a sparse instance-specific term. Then, we propose a novel two-stage estimator that exploits this structure in a sample-efficient way by using a combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). We embed this estimator within a bandit algorithm, and prove that it improves asymptotic regret bounds in the context dimension; this improvement is exponential for data-poor instances. We further demonstrate how our results depend on the underlying network structure of bandit instances. Finally, we illustrate the value of our approach on synthetic and real datasets. Joint work with Kan Xu. Paper: https://arxiv.org/abs/2112.14233
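The two-stage idea above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's exact estimator: stage one aggregates per-instance least-squares estimates with a robust statistic (a coordinate-wise median is used here as a stand-in for the robust aggregation step), and stage two fits a LASSO (via plain proximal gradient descent) on each instance's residuals to recover the sparse instance-specific correction. All function names, the median aggregator, and the solver choice are assumptions for illustration.

```python
import numpy as np

def per_instance_ols(X, y):
    # Ridge-stabilized least squares for a single bandit instance.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + 1e-6 * np.eye(d), X.T @ y)

def soft_threshold(v, lam):
    # Proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def lasso_pgd(X, y, lam, n_iter=500):
    # Plain proximal gradient descent for LASSO (illustrative, not tuned).
    n, d = X.shape
    beta = np.zeros(d)
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

def two_stage_estimate(datasets, lam=0.1):
    # Stage 1: robust cross-instance aggregation -- the coordinate-wise
    # median of per-instance OLS estimates approximates the shared
    # global parameter, tolerating a minority of deviating instances.
    ols = np.stack([per_instance_ols(X, y) for X, y in datasets])
    theta_global = np.median(ols, axis=0)
    # Stage 2: LASSO on each instance's residuals recovers the sparse
    # instance-specific correction, debiasing the pooled estimate.
    corrections = [lasso_pgd(X, y - X @ theta_global, lam)
                   for X, y in datasets]
    return theta_global, corrections
```

In this sketch, the robust aggregation is what lets data-poor instances borrow strength from similar ones, while the sparse correction term keeps heterogeneous instances from being forced onto the shared parameter.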