Search results from ICTS-TIFR
Mean-Field Theory Insights into Neural Feature Dynamics, Infinite-Scale Limits, and Scaling Laws
Cengiz Pehlevan (ICTS:32497)
When a neural network becomes extremely wide or deep, its learning dynamics simplify and can be described by the same “mean-field” ideas that explain magnetism and fluids. I will walk through these ideas step by step, showing how they suggest practical recipes for initialization and optimization that scale smoothly from small models to cutting-edge transformers. I will also discuss neural scaling laws (empirical power-law rules that relate model size, data, and compute) and illustrate them with solvable toy models.
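As a concrete feel for the kind of power-law rule referred to above, here is a minimal Python sketch of a Chinchilla-style scaling law; the functional form and all constants are illustrative assumptions, not results from the lecture.

```python
import numpy as np

# Hypothetical Chinchilla-style loss surface L(N, D) = E + A / N^alpha + B / D^beta.
# All constants below are illustrative assumptions, not values from the lecture.
E, A, B, alpha, beta = 1.7, 400.0, 1800.0, 0.34, 0.28

def loss(n_params, n_tokens):
    """Predicted loss for a model with n_params parameters trained on n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Sweep model size at a fixed compute budget C ~ 6 * N * D and pick the best trade-off.
compute = 1e20                      # FLOPs budget (illustrative)
sizes = np.logspace(7, 11, 50)      # 10M to 100B parameters
tokens = compute / (6.0 * sizes)    # tokens affordable at each model size
losses = loss(sizes, tokens)
best = np.argmin(losses)
print(f"best size ~ {sizes[best]:.2e} params, tokens ~ {tokens[best]:.2e}, loss ~ {losses[best]:.3f}")
```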
-
Turing Lecture: Overparametrized models: linear theory and its limits
Andrea Montanari (ICTS:32491)
The success of modern AI models defies classical theoretical wisdom. Classical theory recommended the use of convex optimization, and yet AI models learn by optimizing highly non-convex functions. Classical theory prescribed controlling model complexity, and yet AI models are very complex, so complex that they often memorize the training data. Classical wisdom recommended a careful and interpretable choice of model architecture, and yet modern architectures rarely offer a parsimonious representation of a target distribution class.
The discovery that learning can take place in completely unexpected scenarios poses beautiful conceptual challenges. I will try to survey recent work towards addressing them.
-
Sandbox for the Blackbox: How LLMs learn Structured Data
Ashok Makkuva (ICTS:32490)
In recent years, large language models (LLMs) have achieved unprecedented success across various disciplines, including natural language processing, computer vision, and reinforcement learning. This success has spurred a flourishing body of research aimed at understanding these models, from both theoretical perspectives such as representation and optimization, and scientific approaches such as interpretability.
To understand LLMs, an important research theme in the machine learning community is to model the input as mathematically structured data (e.g. Markov chains), where we have complete knowledge and control of the data properties. The goal is to use this controlled input to gain valuable insights into what solutions LLMs learn and how they learn them (e.g. induction head). This understanding is crucial, given the increasing ubiquity of the models, especially in safety-critical applications, and our limited understanding of them.
While the aforementioned works using this structured approach provide valuable insights into the inner workings of LLMs, the breadth and diversity of the field make it increasingly challenging for both experts and non-experts to stay abreast. To address this, our tutorial aims to provide a unifying perspective on recent advances in the analysis of LLMs, from a representational-cum-learning viewpoint. To this end, we focus on the two predominant classes of language models that have driven the AI revolution: transformers and recurrent models such as state-space models (SSMs). For these models, we discuss several concrete results, including their representational capacities, optimization landscape, and mechanistic interpretability. Building upon these perspectives, we outline several important future directions in this field, aiming to foster a clearer understanding of language models and to aid in the creation of more efficient architectures.
References and a detailed explanation of our tutorial are here: https://capricious-comb-7a3tbssph.notion.site/NeurIPS-2024-Tutorial-San…
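To make the "structured data" idea above concrete, the following minimal Python sketch generates sequences from a fully known first-order Markov chain, the kind of controlled input against which a trained model's next-token predictions can be checked exactly. The specific chain and binary vocabulary are assumptions made for illustration, not the tutorial's setup.

```python
import numpy as np

# A tiny first-order Markov chain over the binary vocabulary {0, 1}; the transition
# matrix is fully known, so every statistic of the training data is under our control.
P = np.array([[0.9, 0.1],    # P(next | current = 0)
              [0.2, 0.8]])   # P(next | current = 1)

def sample_sequence(length, rng):
    """Draw one sequence from the chain, starting from a uniform initial state."""
    seq = np.empty(length, dtype=np.int64)
    seq[0] = rng.integers(2)
    for t in range(1, length):
        seq[t] = rng.choice(2, p=P[seq[t - 1]])
    return seq

rng = np.random.default_rng(0)
data = np.stack([sample_sequence(64, rng) for _ in range(1000)])

# The Bayes-optimal next-token predictor is a lookup into P, so a trained model's
# predictions can be compared against this exact baseline.
est = [float(np.mean(data[:, 1:][data[:, :-1] == s])) for s in (0, 1)]
print("estimated P(next = 1 | current = s) for s = 0, 1:", est)
```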
-
Computationally efficient reductions between some statistical models (Online)
Ashwin Pananjady (ICTS:32494)
Can a sample from one parametric statistical model (the source) be transformed into a sample from a different (target) model? Versions of this question were asked as far back as 1950, and a beautiful asymptotic theory of equivalence of experiments emerged in the latter half of the 20th century. Motivated by problems spanning information-computation gaps and differentially private data analysis, we ask the analogous non-asymptotic question in high-dimensional problems and with algorithmic considerations. We show how a single observation from some source models can be approximately transformed to a single observation from a large class of target models by a computationally efficient algorithm. I will present several such reductions and discuss their applications to the aforementioned problems.
This is joint work with Mengqi Lou and Guy Bresler.
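For intuition about what such a reduction can look like in the simplest case, here is a classical textbook example sketched in Python (an illustration only, not one of the talk's constructions): a draw from N(theta, sigma_source^2) is mapped to a draw from N(theta, sigma_target^2), for any sigma_target >= sigma_source, by adding independent noise, without the algorithm ever knowing theta.

```python
import numpy as np

def reduce_gaussian(x, sigma_source, sigma_target, rng):
    """Map draws from N(theta, sigma_source^2) to draws from N(theta, sigma_target^2)
    (sigma_target >= sigma_source) by adding independent noise; theta is never used."""
    assert sigma_target >= sigma_source
    extra_std = np.sqrt(sigma_target**2 - sigma_source**2)
    return x + rng.normal(0.0, extra_std, size=np.shape(x))

rng = np.random.default_rng(1)
theta = 3.0                                       # unknown to the reduction
source = rng.normal(theta, 1.0, size=100_000)     # observations from the source model
target = reduce_gaussian(source, 1.0, 2.0, rng)   # observations from the target model
print("target mean ~", target.mean(), " target std ~", target.std())
# Both are close to theta = 3.0 and sigma_target = 2.0, as required.
```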
-
Strongly correlated particle systems: a toolbox for machine intelligence
Subhro Ghosh (ICTS:32493)
The classical paradigm of randomness in the sciences is that of i.i.d. random variables, and going beyond i.i.d. is often considered a difficulty and a challenge to be overcome. In this talk, we will explore a new perspective, wherein strongly constrained random systems in fact help to understand fundamental problems in machine learning. In particular, we will discuss strongly correlated particle systems that are well-motivated from statistical and quantum physics, including in particular determinantal probability measures. These will be used to shed important light on questions of fundamental interest in learning theory, focussing on applications to novel sampling techniques and advances in stochastic gradient descent.
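For readers unfamiliar with determinantal measures, the following Python sketch samples a subset from a discrete L-ensemble determinantal point process using the standard spectral algorithm of Hough et al.; the toy kernel over points on a line is an illustrative assumption, and the sampling techniques discussed in the talk may differ.

```python
import numpy as np

def sample_dpp(L, rng):
    """Sample a subset from the L-ensemble DPP with symmetric PSD kernel L,
    via the spectral algorithm of Hough et al. (2006)."""
    eigvals, eigvecs = np.linalg.eigh(L)
    # Phase 1: keep eigenvector i independently with probability lambda_i / (1 + lambda_i).
    keep = rng.random(len(eigvals)) < eigvals / (1.0 + eigvals)
    V = eigvecs[:, keep]
    items = []
    while V.shape[1] > 0:
        # Pick an item with probability proportional to the squared row norms of V.
        probs = np.sum(V**2, axis=1)
        probs = probs / probs.sum()
        i = int(rng.choice(len(probs), p=probs))
        items.append(i)
        # Restrict V to the subspace of vectors whose i-th coordinate is zero.
        col = int(np.argmax(np.abs(V[i])))   # a column with a nonzero entry in row i
        v = V[:, col] / V[i, col]
        V = V - np.outer(v, V[i])            # zero out row i in every column
        V = np.delete(V, col, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)           # re-orthonormalize the remaining columns
    return sorted(items)

# Toy kernel: Gaussian similarity between 10 points on a line; nearby points repel each other.
x = np.linspace(0.0, 1.0, 10)
L = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.05)
rng = np.random.default_rng(2)
print("DPP sample:", sample_dpp(L, rng))
```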
-
Posterior Sampling for Image Personalization and Editing
Sanjay Shakkottai (ICTS:32492)
This talk will consist of two parts. In the first part, we will present an overview of posterior sampling with diffusion models and motivate the connection to inverse problems. Specific topics that we will cover include Gibbs sampling, importance sampling, and approximations for test-time optimization (also known as training-free approaches, such as DPS) with diffusion models. In the second part, we will discuss algorithms for image editing, stylization, etc., that are in production in large-scale settings. Specifically, we will discuss both diffusion- and flow-based algorithms (PSLD, STSL, RB Modulation, RF Inversion) that operate in the latent space of SOTA foundation models (such as Stable Diffusion or Flux).
Diffusions class videos are posted on YouTube (and lecture notes link is also posted in the video caption). Link: https://www.youtube.com/@ifml9883/playlists
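As a rough illustration of the training-free posterior-sampling idea behind approaches such as DPS, here is a self-contained Python toy for a linear inverse problem. The Gaussian prior, closed-form denoiser, noise schedule, and step sizes are all assumptions chosen so the example runs without a learned diffusion model; it sketches the mechanism rather than the production algorithms covered in the talk.

```python
import numpy as np

# Toy DPS-style loop for a linear inverse problem y = A @ x0 + noise. A standard normal
# N(0, I) prior is assumed so that the ideal denoiser E[x0 | x_t] = sqrt(abar_t) * x_t has
# a closed form; a real system would plug in a learned diffusion model instead.
rng = np.random.default_rng(3)
d, m = 8, 4
A = rng.normal(size=(m, d))
x_true = rng.normal(size=d)
y = A @ x_true + 0.05 * rng.normal(size=m)

T = 200
betas = np.linspace(1e-4, 0.02, T)      # DDPM-style noise schedule (assumed)
alphas = 1.0 - betas
abars = np.cumprod(alphas)

zeta = 0.5                              # measurement-guidance step size (assumed)
x = rng.normal(size=d)                  # start from pure noise x_T
for t in range(T - 1, -1, -1):
    abar = abars[t]
    x0_hat = np.sqrt(abar) * x          # closed-form denoiser for the N(0, I) prior
    # Ancestral DDPM step using the denoiser's implied noise estimate.
    eps_hat = (x - np.sqrt(abar) * x0_hat) / np.sqrt(1.0 - abar)
    mean = (x - betas[t] / np.sqrt(1.0 - abar) * eps_hat) / np.sqrt(alphas[t])
    x = mean + (np.sqrt(betas[t]) * rng.normal(size=d) if t > 0 else 0.0)
    # DPS-style guidance: step against the gradient of ||y - A @ x0_hat||^2 w.r.t. x_t
    # (here d x0_hat / d x_t = sqrt(abar) * I), normalized by the residual as in DPS.
    residual = y - A @ x0_hat
    grad = -np.sqrt(abar) * (A.T @ residual)
    x = x - zeta * grad / (np.linalg.norm(residual) + 1e-8)

print("||y - A @ x|| after guided sampling:", np.linalg.norm(y - A @ x))
print("||y - A @ x_true|| (noise floor)   :", np.linalg.norm(y - A @ x_true))
```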
-
Posterior Sampling for Image Personalization and Editing
Sanjay Shakkottai (ICTS:32483)
This talk will consist of two parts. In the first part, we will present an overview of posterior sampling with diffusion models and motivate the connection to inverse problems. Specific topics that we will cover include Gibbs sampling, importance sampling, and approximations for test-time optimization (also known as training-free approaches, such as DPS) with diffusion models. In the second part, we will discuss algorithms for image editing, stylization, etc., that are in production in large-scale settings. Specifically, we will discuss both diffusion- and flow-based algorithms (PSLD, STSL, RB Modulation, RF Inversion) that operate in the latent space of SOTA foundation models (such as Stable Diffusion or Flux).
Diffusions class videos are posted on YouTube (and lecture notes link is also posted in the video caption). Link: https://www.youtube.com/@ifml9883/playlists
-
Sandbox for the Blackbox: How LLMs learn Structured Data
Ashok Makkuva (ICTS:32482)
In recent years, large language models (LLMs) have achieved unprecedented success across various disciplines, including natural language processing, computer vision, and reinforcement learning. This success has spurred a flourishing body of research aimed at understanding these models, from both theoretical perspectives such as representation and optimization, and scientific approaches such as interpretability.
To understand LLMs, an important research theme in the machine learning community is to model the input as mathematically structured data (e.g. Markov chains), where we have complete knowledge and control of the data properties. The goal is to use this controlled input to gain valuable insights into what solutions LLMs learn and how they learn them (e.g. induction head). This understanding is crucial, given the increasing ubiquity of the models, especially in safety-critical applications, and our limited understanding of them.
While the aforementioned works using this structured approach provide valuable insights into the inner workings of LLMs, the breadth and diversity of the field make it increasingly challenging for both experts and non-experts to stay abreast. To address this, our tutorial aims to provide a unifying perspective on recent advances in the analysis of LLMs, from a representational-cum-learning viewpoint. To this end, we focus on the two predominant classes of language models that have driven the AI revolution: transformers and recurrent models such as state-space models (SSMs). For these models, we discuss several concrete results, including their representational capacities, optimization landscape, and mechanistic interpretability. Building upon these perspectives, we outline several important future directions in this field, aiming to foster a clearer understanding of language models and to aid in the creation of more efficient architectures.
References and a detailed explanation of our tutorial are here: https://capricious-comb-7a3tbssph.notion.site/NeurIPS-2024-Tutorial-San…
-
Turing lecture: The mathematics of large machine learning models
Andrea Montanari (ICTS:32487)
The success of modern AI models defies classical theoretical wisdom. Classical theory recommended the use of convex optimization, and yet AI models learn by optimizing highly non-convex functions. Classical theory prescribed controlling model complexity, and yet AI models are very complex, so complex that they often memorize the training data. Classical wisdom recommended a careful and interpretable choice of model architecture, and yet modern architectures rarely offer a parsimonious representation of a target distribution class.
The discovery that learning can take place in completely unexpected scenarios poses beautiful conceptual challenges. I will try to survey recent work towards addressing them.
-
Collaborative Prediction via Tractable Agreement Protocols
Surbhi Goel (ICTS:32485)
Designing effective collaboration between humans and AI systems is crucial for leveraging their complementary abilities in complex decision tasks. But how should agents possessing unique, private knowledge, like a human expert and an AI model, interact to reach decisions better than either could alone? If they were perfect Bayesians with a shared prior, Aumann's classical agreement theorem suggests that conversation leads, via agreement, to an accuracy-improving prediction. However, this relies on implausible assumptions about their knowledge and computational power.
We show how to recover and generalize these guarantees using only computationally and statistically tractable assumptions. We develop efficient "collaboration protocols" in which parties iteratively exchange only low-dimensional information (their current predictions or best-response actions) without needing to share underlying features. These protocols are grounded in conditions like conversation calibration and swap regret, which relax full Bayesian rationality and can be enforced in a computationally efficient manner. First, we prove that this simple interaction leads to fast convergence to agreement, generalizing quantitative bounds even to high-dimensional and action-based settings. Second, we introduce a weak learning condition under which this agreement process inherently aggregates the parties' distinct information; that is, agents following our protocols arrive at final predictions that are provably competitive with an optimal predictor having access to their joint features. Together, these results offer a new, practical foundation for building systems that achieve the power of pooled knowledge through tractable interaction alone.
This talk is based on joint work with the amazing Natalie Collina, Varun Gupta, Ira Globus-Harris, Aaron Roth, Mirah Shi.
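To see the shape of such a protocol, here is a toy Python simulation of the classical Aumann/Geanakoplos-Polemarchakis agreement dialogue, in which two agents with a common prior repeatedly exchange only their current posterior predictions of an event until the announcements coincide. The finite state space, partitions, and event are illustrative assumptions, and the toy assumes the full Bayesian rationality that the protocols in the talk deliberately relax.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 24
states = np.arange(n)
event = rng.random(n) < 0.4      # the event A both parties are predicting

alice0 = states % 4              # Alice's initial partition labels (her private feature)
bob0 = (states // 6) % 3         # Bob's initial partition labels (his private feature)

def posterior(cells):
    """P(A | agent's cell), evaluated at every state, under the common uniform prior."""
    return np.array([event[cells == cells[w]].mean() for w in states])

def refine(cells, announcement):
    """Intersect an agent's partition with the level sets of the other's announcement."""
    pairs = list(zip(cells.tolist(), np.round(announcement, 12).tolist()))
    labels = {p: i for i, p in enumerate(dict.fromkeys(pairs))}
    return np.array([labels[p] for p in pairs])

alice_cells, bob_cells = alice0.copy(), bob0.copy()
for rounds in range(1, 50):
    alice_post = posterior(alice_cells)
    bob_cells = refine(bob_cells, alice_post)      # Bob conditions on Alice's announcement
    bob_post = posterior(bob_cells)
    alice_cells = refine(alice_cells, bob_post)    # Alice conditions on Bob's announcement
    if np.allclose(posterior(alice_cells), bob_post):
        break                                      # announcements coincide: agreement

w = 7                                              # the realized state of the world
pooled = posterior(alice0 * 10 + bob0)             # predictor that sees both private features
print(f"agreement reached after {rounds} round(s)")
print(f"agreed prediction at the realized state: {bob_post[w]:.3f}")
print(f"pooled-information prediction there    : {pooled[w]:.3f}")
```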