Explainable AI in (Astro)physics
Luisa Lucie-Smith Universität Hamburg
PIRSA:25040098
Machine learning has significantly improved the way scientists model and interpret large datasets across a broad range of the physical sciences; yet, its "black box" nature often limits our ability to trust and understand its results. Interpretable and explainable AI is ultimately required to realize the potential of machine-assisted scientific discovery. I will review efforts toward explainable AI, focusing in particular on applications within the field of astrophysics. I will present an explainable deep learning framework which combines model compression and information theory to achieve explainability. I will demonstrate its relevance to cosmological large-scale structures, such as dark matter halos and galaxies, as well as the cosmic microwave background, revealing new physical insights derived from these explainable AI models.
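As a concrete illustration of the information-theoretic ingredient mentioned above, the sketch below ranks compressed latent variables by their mutual information with a physical target, using a simple histogram estimator. The estimator, variable names, and toy data are my own illustrative assumptions, not the framework presented in the talk.

import numpy as np

def mutual_information(z, y, bins=30):
    # Histogram estimate of I(Z; Y), in nats, for scalar Z and Y.
    joint, _, _ = np.histogram2d(z, y, bins=bins)
    pzy = joint / joint.sum()
    pz = pzy.sum(axis=1, keepdims=True)
    py = pzy.sum(axis=0, keepdims=True)
    mask = pzy > 0
    return float(np.sum(pzy[mask] * np.log(pzy[mask] / (pz @ py)[mask])))

rng = np.random.default_rng(0)
y = rng.normal(size=10_000)                        # toy stand-in for, e.g., a halo property
z_informative = y + 0.3 * rng.normal(size=y.size)  # a latent that tracks the target
z_noise = rng.normal(size=y.size)                  # a latent carrying no information
print(mutual_information(z_informative, y))        # large
print(mutual_information(z_noise, y))              # close to zero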
NN/QFT correspondence
Ro Jefferson Utrecht University
PIRSA:25040128
As we've seen at this workshop, exciting progress has recently been made in the study of neural networks by applying ideas and techniques from theoretical physics. In this talk, I will discuss a precise relation between quantum field theory and deep neural networks, the NN/QFT correspondence. In particular, I will go beyond the level of analogy by explicitly constructing the QFT corresponding to a class of networks encompassing both vanilla feedforward and recurrent architectures. The resulting theory closely resembles the well-studied O(N) vector model, in which the variance of the weight initializations plays the role of the 't Hooft coupling. In this framework, the Gaussian process approximation used in machine learning corresponds to a free field theory, and finite-width effects can be computed perturbatively in the ratio of depth to width, T/N. These provide corrections to the correlation length that controls the depth to which information can propagate through the network, and thereby sets the scale at which such networks are trainable by gradient descent. This analysis provides a non-perturbative description of networks at initialization, and opens several interesting avenues to the study of criticality in these models.
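The free-field (Gaussian process) limit and its finite-width corrections can be probed numerically. The toy check below, which is my own illustration rather than the construction in the talk, measures the connected four-point function of a random feedforward network's output at initialization; it vanishes for a Gaussian and shrinks as the width grows.

import numpy as np

def mlp_output(x, width, depth, rng):
    # Random tanh network at initialization, weights ~ N(0, 1/fan_in).
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, 1.0 / np.sqrt(h.shape[0]), size=(width, h.shape[0]))
        h = np.tanh(W @ h)
    v = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)
    return float(v @ h)

def connected_4pt(f):
    # <f^4> - 3<f^2>^2: zero for a centred Gaussian, nonzero at finite width.
    return np.mean(f**4) - 3 * np.mean(f**2) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=16)
for width in (4, 32, 256):
    f = np.array([mlp_output(x, width, depth=2, rng=rng) for _ in range(5_000)])
    print(width, connected_4pt(f))   # magnitude decreases roughly like 1/width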
Aspects of RG flows and Bayesian Updating
David Berman Queen Mary - University of London (QMUL)
PIRSA:25040108
We will examine the idea of Bayesian updating as an inverse diffusion-like process and its relation to the exact renormalisation group. In particular, we will look at the role of Fisher information, its metric and possible physical interpretations.
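For reference, the standard object being invoked here (a definition only, not the talk's derivation): the Fisher information metric is the quadratic form generated by the Kullback-Leibler divergence between infinitesimally close distributions,

D_{\mathrm{KL}}\!\left(p_\theta \,\|\, p_{\theta + \mathrm{d}\theta}\right) \;=\; \tfrac{1}{2}\, g_{ij}(\theta)\, \mathrm{d}\theta^i \mathrm{d}\theta^j + O(\mathrm{d}\theta^3),
\qquad
g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p_\theta}\!\left[\partial_i \log p_\theta(x)\, \partial_j \log p_\theta(x)\right],

which is the metric referred to in the abstract.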
Renormalization Group Flows: from Optimal Transport to Diffusion Models
Jordan Cotler Harvard University
PIRSA:25040095
We show that Polchinski's equation for exact renormalization group flow is equivalent to the optimal transport gradient flow of a field-theoretic relative entropy. This gives a surprising information-theoretic formulation of the exact renormalization group, expressed in the language of optimal transport. We will review both the exact renormalization group and the theory of optimal transport. Our techniques generalize to other RG flow equations beyond Polchinski's. Moreover, we establish a connection between this more general class of RG flows and stochastic Langevin PDEs, enabling us to construct ML-based adaptive bridge samplers for lattice field theories. Finally, we will discuss forthcoming work on related methods to variationally approximate ground states of quantum field theories.
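Schematically, and with conventions that vary between references (this is a reminder of the objects involved, not the talk's precise statement), Polchinski's equation evolves the interaction functional S_t[\phi] along a scale-dependent cutoff propagator C_t,

\partial_t S_t[\phi] \;=\; \frac{1}{2}\int_p \dot{C}_t(p)\left[\frac{\delta^2 S_t}{\delta\phi(p)\,\delta\phi(-p)} \;-\; \frac{\delta S_t}{\delta\phi(p)}\,\frac{\delta S_t}{\delta\phi(-p)}\right],

while an optimal transport (Wasserstein) gradient flow of a functional F over densities takes the form

\partial_t P_t \;=\; \nabla\cdot\!\left(P_t\,\nabla\,\frac{\delta F[P_t]}{\delta P}\right).

The equivalence stated above identifies the flow of P_t[\phi] \propto e^{-S_t[\phi]} under the first equation with a flow of the second kind, with F a field-theoretic relative entropy.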
Statistical physics of learning with two-layer neural networks
Bruno Loureiro École Normale Supérieure - PSL
PIRSA:25040093
Feature learning - or the capacity of neural networks to adapt to the data during training - is often quoted as one of the fundamental reasons behind their unreasonable effectiveness. Yet, making mathematical sense of this seemingly clear intuition is still a largely open question. In this talk, I will discuss a simple setting where we can precisely characterise how features are learned by a two-layer neural network during the very first few steps of training, and how these features are essential for the network to efficiently generalise under limited availability of data.
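A minimal numerical cartoon of this kind of setting, with all details (data model, step size, network size) my own assumptions rather than the talk's: a two-layer network trained on single-index data, where a single large gradient step on the first layer already moves the hidden weights toward the teacher direction.

import numpy as np

rng = np.random.default_rng(0)
d, width, n = 200, 64, 4000
w_star = rng.normal(size=d); w_star /= np.linalg.norm(w_star)   # teacher direction
X = rng.normal(size=(n, d))
y = np.tanh(X @ w_star)                        # single-index target

W = rng.normal(size=(width, d)) / np.sqrt(d)   # first layer
a = rng.normal(size=width) / np.sqrt(width)    # second layer, kept frozen here

def alignment(W):
    # Mean |cosine| between hidden-unit weights and the teacher direction.
    return float(np.mean(np.abs(W @ w_star) / np.linalg.norm(W, axis=1)))

print("alignment before:", alignment(W))
pre = X @ W.T                                  # pre-activations, shape (n, width)
out = np.tanh(pre) @ a
grad_W = ((out - y)[:, None] * (1 - np.tanh(pre) ** 2) * a).T @ X / n
W -= 5.0 * grad_W                              # one large full-batch step, squared loss
print("alignment after: ", alignment(W))       # typically increases markedly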
Architectural bias in a transport-based generative model: an asymptotic perspective
Hugo Cui Harvard University
PIRSA:25040092
We consider the problem of learning a generative model parametrized by a two-layer auto-encoder, and trained with online stochastic gradient descent, to sample from a high-dimensional data distribution with an underlying low-dimensional structure. We provide a tight asymptotic characterization of low-dimensional projections of the resulting generated density, and evidence how mode(l) collapse can arise. On the other hand, we discuss how in a case where the architectural bias is suited to the target density, these simple models can efficiently learn to sample from a binary Gaussian mixture target distribution.
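A toy version of this setting, with the architecture, noise level, and training budget chosen purely for illustration (they are my assumptions, not the talk's): a narrow two-layer auto-encoder with a skip connection, trained online to denoise samples from a binary Gaussian mixture; its bottleneck weight tends to align with the cluster direction, which is the architectural bias at work.

import numpy as np

rng = np.random.default_rng(0)
d = 100
mu = rng.normal(size=d); mu /= np.linalg.norm(mu)      # cluster direction
def sample_data(n):
    s = rng.choice([-1.0, 1.0], size=n)
    return s[:, None] * 2.0 * mu + rng.normal(size=(n, d))

# Auto-encoder with a skip connection: x_hat = b*y + W.T @ tanh(W @ y).
W = rng.normal(size=(1, d)) / np.sqrt(d)                # bottleneck of width 1
b, lr, sigma = 0.5, 0.01, 1.0
for _ in range(30_000):                                 # online SGD, one fresh sample per step
    x = sample_data(1)[0]
    y = x + sigma * rng.normal(size=d)                  # corrupted input
    h = np.tanh(W @ y)
    err = (b * y + W.T @ h) - x                         # reconstruction error
    grad_W = np.outer(h, err) + np.outer((1 - h**2) * (W @ err), y)
    W -= lr * grad_W
    b -= lr * float(err @ y)

# Overlap between the learned weight and the cluster direction (should move toward 1).
print(abs(float((W @ mu)[0])) / float(np.linalg.norm(W)))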
Solvable models of scaling and emergence in deep learning
Cengiz Pehlevan Harvard University
PIRSA:25040091
Towards a “Theoretical Minimum” for Physicists in AI
Yonatan Kahn Princeton University
PIRSA:25040089
As progress in AI hurtles forward at a speed seldom seen in the history of science, theorists who wish to gain a first-principles understanding of AI can be overwhelmed by the enormous number of papers, notational choices, and assumptions in the literature. I will make a pitch for developing a "Theoretical Minimum" for theoretical physicists aiming to study AI, with the goal of getting members of our community up to speed as quickly as possible with a suite of standard results whose validity can be checked by numerical experiments requiring only modest compute. In particular, this will require close collaboration between statistical physics, condensed matter physics, and high-energy physics, three communities that all have important perspectives to bring to the table but whose notation must be harmonized in order to be accessible to new researchers. I will focus my discussion on (a) the various approaches to the infinite-width limit, which seems like the best entry point for theoretical physicists who first encounter neural networks, and (b) the need for benchmark datasets from physics that are complex enough to capture aspects of natural-language data but are nonetheless "calculable" from first principles using tools of theoretical physics.
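As one example of the kind of standard result checkable with modest compute being advocated here, the sketch below compares the analytic infinite-width (NNGP) kernel recursion for a ReLU network at initialization against a direct Monte Carlo estimate; the specific setup is my own illustration, not material from the talk.

import numpy as np

def relu_kernel_step(Kxx, Kxy, Kyy):
    # One layer of the infinite-width ReLU recursion (weights ~ N(0, 2/fan_in), no biases).
    c = np.clip(Kxy / np.sqrt(Kxx * Kyy), -1.0, 1.0)
    theta = np.arccos(c)
    Kxy_new = np.sqrt(Kxx * Kyy) * (np.sin(theta) + (np.pi - theta) * c) / np.pi
    return Kxx, Kxy_new, Kyy          # the diagonal is preserved by this normalization

def mc_kernel(x, y, width, depth, n_nets, rng):
    # Monte Carlo estimate of the layer-`depth` pre-activation covariance at finite width.
    cov = 0.0
    for _ in range(n_nets):
        hx, hy = x, y
        for _ in range(depth):
            fan_in = hx.size
            W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(width, fan_in))
            zx, zy = W @ hx, W @ hy
            hx, hy = np.maximum(zx, 0.0), np.maximum(zy, 0.0)
        cov += np.mean(zx * zy)       # average over output units of the last layer
    return cov / n_nets

rng = np.random.default_rng(0)
x, y, depth = rng.normal(size=10), rng.normal(size=10), 3
Kxx, Kxy, Kyy = 2 * x @ x / x.size, 2 * x @ y / x.size, 2 * y @ y / y.size
for _ in range(depth - 1):
    Kxx, Kxy, Kyy = relu_kernel_step(Kxx, Kxy, Kyy)
print("infinite-width recursion:", Kxy)
print("finite-width Monte Carlo:", mc_kernel(x, y, width=512, depth=depth, n_nets=500, rng=rng))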
Creativity by Compositionality in Generative Diffusion Models
Alessandro Favero École Polytechnique Fédérale de Lausanne
PIRSA:25040088
Diffusion models have shown remarkable success in generating high-dimensional data such as images and language – a feat only possible if data has strong underlying structure. Understanding deep generative models thus requires understanding the structure of the data they learn from. In particular, natural data is often composed of features organized hierarchically. In this talk, we will model this structure using probabilistic context-free grammars – tree-like generative models from linguistics. I will present a theory of denoising diffusion on this data, predicting a phase transition that governs the reconstruction of features at various hierarchical levels. I will show empirical evidence for it in both image and language diffusion models. I will then discuss how diffusion models learn these grammars, revealing a quantitative relationship between data correlations and the training set size needed to learn how to hierarchically compose new data. In particular, we predict a polynomial scaling of sample complexity with data dimension, providing a mechanism by which diffusion models avoid the curse of dimensionality. Additionally, this theory predicts that models trained on limited data generate outputs that are locally coherent but lack global consistency, an effect empirically confirmed across modalities. These results offer a new perspective on how generative models learn to become creative and compose novel data by progressively uncovering the latent hierarchical structure.
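To make the data model concrete, here is a toy probabilistic context-free grammar sampler in the spirit described above; the symbols, rules, and probabilities are invented for illustration and are not those used in the talk.

import random

# Each nonterminal expands into a pair of children with the listed probabilities;
# strings are generated by expanding the root "S" down to terminal tokens.
RULES = {
    "S": [(("A", "B"), 0.6), (("B", "A"), 0.4)],
    "A": [(("a1", "a2"), 0.5), (("a2", "a1"), 0.5)],
    "B": [(("b1", "b2"), 0.7), (("b2", "b1"), 0.3)],
}

def expand(symbol, rng):
    # Recursively expand a symbol into a list of terminal tokens.
    if symbol not in RULES:
        return [symbol]
    productions = [children for children, _ in RULES[symbol]]
    weights = [p for _, p in RULES[symbol]]
    children = rng.choices(productions, weights=weights, k=1)[0]
    return [tok for child in children for tok in expand(child, rng)]

rng = random.Random(0)
for _ in range(3):
    print(" ".join(expand("S", rng)))
# Tokens that are far apart in the string are related only through higher levels of
# the tree, so their correlations decay with tree distance - the kind of correlation
# structure the abstract relates to the training set size needed to learn the grammar.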
Causal Inference Meets Quantum Physics
Robert Spekkens Perimeter Institute for Theoretical Physics
PIRSA:25040086
Can the effectiveness of a medical treatment be determined without the expense of a randomized controlled trial? Can the impact of a new policy be disentangled from other factors that happen to vary at the same time? Questions such as these are the purview of the field of causal inference, a general-purpose science of cause and effect, applicable in domains ranging from epidemiology to economics. Researchers in this field seek in particular to find techniques for extracting causal conclusions from statistical data. Meanwhile, one of the most significant results in the foundations of quantum theory—Bell's theorem—can also be understood as an attempt to disentangle correlation and causation. Recently, it has been recognized that Bell's result is an early foray into the field of causal inference and that the insights derived from 60 years of research on his theorem can supplement and improve upon state-of-the-art causal inference techniques. In the other direction, the conceptual framework developed by causal inference researchers provides a fruitful new perspective on what could possibly count as a satisfactory causal explanation of the quantum correlations observed in Bell experiments. Efforts to elaborate upon these connections have led to an exciting flow of techniques and insights across the disciplinary divide. This talk will highlight some of what is happening at the intersection of these two fields.
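For concreteness, the statistical constraint at issue in Bell's theorem can be summarized by the CHSH inequality (a standard statement, not drawn from the talk): any model in which the correlations between the two wings of the experiment are explained by a common cause \lambda obeys

\left|\langle A_0 B_0\rangle + \langle A_0 B_1\rangle + \langle A_1 B_0\rangle - \langle A_1 B_1\rangle\right| \;\le\; 2,

whereas quantum theory predicts, and experiments confirm, values up to 2\sqrt{2}; the causal inference question is which causal structures can account for this violation.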
Scaling Limits for Learning: Dynamics and Statics
Blake Bordelon Harvard University
PIRSA:25040085
In this talk, I will discuss how physics can help improve our understanding of deep learning systems and guide improvements to their scaling strategies. I will first discuss mathematical results based on mean-field techniques from statistical physics to analyze the feature learning dynamics of neural networks as well as posteriors of large Bayesian neural networks. This theory will provide insights to develop initialization and optimization schemes for neural networks that admit well-defined infinite width and depth limits and behave consistently across model scales, providing practical advantages. These limits also enable a theoretical characterization of the types of learned solutions reached by deep networks, and provide a starting point to characterize generalization and neural scaling laws (see Cengiz Pehlevan's talk).
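As a schematic illustration of what "behaving consistently across model scales" means here (the two-layer setup, numbers, and naming are my own assumptions, not the talk's), the sketch below compares the size of the change in a hidden feature after one SGD step under a standard 1/sqrt(width) output scaling versus a mean-field 1/width scaling with a width-scaled learning rate.

import numpy as np

def feature_update(width, param, rng, lr=1.0):
    # Typical change in a hidden pre-activation after one SGD step on one example.
    d = 50
    x = rng.normal(size=d); x /= np.linalg.norm(x)
    y = 1.0
    w = rng.normal(size=(width, d))              # O(1) entries
    a = rng.normal(size=width)                   # O(1) entries
    h = w @ x
    if param == "standard":                      # f = a . tanh(h) / sqrt(width), lr = O(1)
        scale, eta = 1.0 / np.sqrt(width), lr
    else:                                        # mean-field: f = a . tanh(h) / width, lr = O(width)
        scale, eta = 1.0 / width, lr * width
    resid = scale * (a @ np.tanh(h)) - y         # output error under squared loss
    grad_w = resid * scale * (a * (1 - np.tanh(h) ** 2))[:, None] * x[None, :]
    dh = (w - eta * grad_w) @ x - h              # feature change from the step
    return float(np.mean(np.abs(dh)))

rng = np.random.default_rng(0)
for width in (64, 1024, 16384):
    print(width,
          "standard: %.4f" % feature_update(width, "standard", rng),
          "mean-field: %.4f" % feature_update(width, "mean-field", rng))
# Under the standard scaling the per-neuron feature change shrinks with width, while
# under the mean-field scaling it stays O(1), so feature learning survives the
# infinite-width limit.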