
Chasing the Long Tail: What Neural Networks Memorize and Why

APA

Feldman, V. (2022). Chasing the Long Tail: What Neural Networks Memorize and Why. The Simons Institute for the Theory of Computing. https://old.simons.berkeley.edu/node/22921

MLA

Feldman, Vitaly. "Chasing the Long Tail: What Neural Networks Memorize and Why." The Simons Institute for the Theory of Computing, 7 Nov. 2022, https://old.simons.berkeley.edu/node/22921.

BibTex

@misc{scivideos_22921,
  author    = {Feldman, Vitaly},
  title     = {Chasing the Long Tail: What Neural Networks Memorize and Why},
  publisher = {The Simons Institute for the Theory of Computing},
  year      = {2022},
  month     = {nov},
  url       = {https://old.simons.berkeley.edu/node/22921},
  language  = {en},
  note      = {Talk 22921, see \url{https://scivideos.org/index.php/simons-institute/22921}}
}
          
Vitaly Feldman (Apple ML Research)
Talk number: 22921
Source Repository: Simons Institute

Abstract

Deep learning algorithms that achieve state-of-the-art results on image and text recognition tasks tend to fit the entire training dataset (nearly) perfectly, including mislabeled examples and outliers. This propensity to memorize seemingly useless data, and the resulting large generalization gap, have puzzled many practitioners and are not explained by existing theories of machine learning. We provide a simple conceptual explanation and a theoretical model demonstrating that memorization of outliers and mislabeled examples is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. Image and text data are known to follow such distributions, and our results therefore establish a formal link between these empirical phenomena. We then demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating the memorization and influence of individual training data points. Our results also allow us to quantify the cost of limiting memorization in learning and explain the disparate effects that privacy and model compression have on different subgroups.
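
The estimation technique mentioned in the abstract is based on comparing model behavior on an example when it is and is not part of the training set, approximated by retraining on many random subsets rather than by per-example leave-one-out retraining. The following is a minimal illustrative sketch of such a subsampled memorization estimate, not the authors' released code; the train_model callable, the subset fraction, and the sklearn-style predict interface are assumptions introduced here for illustration.

import numpy as np

def estimate_memorization(X, y, train_model, n_models=100, subset_frac=0.7, seed=None):
    """Monte Carlo estimate of per-example memorization scores.

    For each example i, memorization is approximated as the accuracy on
    (x_i, y_i) of models whose random training subset included i, minus the
    accuracy of models whose subset excluded i.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    correct = np.zeros((n_models, n))            # correct[k, i] = 1 if model k predicts y_i on x_i
    included = np.zeros((n_models, n), dtype=bool)  # included[k, i] = True if i was in model k's subset

    for k in range(n_models):
        idx = rng.choice(n, size=int(subset_frac * n), replace=False)
        included[k, idx] = True
        model = train_model(X[idx], y[idx])      # assumed: returns a fitted model
        preds = model.predict(X)                 # assumed: sklearn-style predict on all examples
        correct[k] = (preds == y).astype(float)

    mem = np.zeros(n)
    for i in range(n):
        in_mask = included[:, i]
        # accuracy on example i for models trained with vs. without it
        acc_in = correct[in_mask, i].mean() if in_mask.any() else np.nan
        acc_out = correct[~in_mask, i].mean() if (~in_mask).any() else np.nan
        mem[i] = acc_in - acc_out
    return mem

Under this sketch, outliers and mislabeled examples would show scores close to 1 (they are fit only when present in training), while typical examples would score near 0 because they are predicted correctly even when held out.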