Video URL
https://pirsa.org/18040050
The Information Theory of Deep Neural Networks: The statistical physics aspects
APA
Tishby, N. (2018). The Information Theory of Deep Neural Networks: The statistical physics aspects. Perimeter Institute for Theoretical Physics. https://pirsa.org/18040050
MLA
Tishby, Naftali. The Information Theory of Deep Neural Networks: The statistical physics aspects. Perimeter Institute for Theoretical Physics, Apr. 25, 2018, https://pirsa.org/18040050.
BibTeX
@misc{scivideos_PIRSA:18040050,
  doi = {10.48660/18040050},
  url = {https://pirsa.org/18040050},
  author = {Tishby, Naftali},
  keywords = {Other Physics},
  language = {en},
  title = {The Information Theory of Deep Neural Networks: The statistical physics aspects},
  publisher = {Perimeter Institute for Theoretical Physics},
  year = {2018},
  month = {apr},
  note = {PIRSA:18040050 see, \url{https://scivideos.org/index.php/pirsa/18040050}}
}
Naftali Tishby, Hebrew University of Jerusalem
Abstract
The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well, and understanding what this success tells us about the nature of intelligence and the biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training-sample size and accuracy, and that this optimality is reached through the noise in the learning process.
In this talk, I will focus on the statistical physics aspects of our theory and on the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between these phase transitions and the final location and representation of the hidden layers, as well as their role in determining the weights of the network.
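For context, a minimal sketch of the standard Information Bottleneck formulation referred to above (the general setup, not the specific variant used in the talk): given an input variable X and a target variable Y, one seeks a compressed representation T of X minimizing the Lagrangian

    \min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)

where I(.;.) denotes mutual information and the parameter beta sets the tradeoff between compressing the input (small I(X;T)) and preserving predictive information about the target (large I(T;Y)). In the deep-learning reading, T plays the role of a hidden layer, and the phase transitions arise at critical values of beta where the structure of the optimal representation changes.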
Based partly on joint works with Ravid Shwartz-Ziv, Noga Zaslavsky, and Shlomi Agmon.