Search results from ICTS-TIFR
-
Strongly correlated particle systems: a toolbox for machine intelligence
Subhro Ghosh (ICTS:32495)
The classical paradigm of randomness in the sciences is that of i.i.d. random variables, and going beyond i.i.d. is often considered a difficulty and a challenge to be overcome. In this talk, we will explore a new perspective, wherein strongly constrained random systems in fact help to understand fundamental problems in machine learning. In particular, we will discuss strongly correlated particle systems that are well-motivated from statistical and quantum physics, including in particular determinantal probability measures. These will be used to shed important light on questions of fundamental interest in learning theory, focussing on applications to novel sampling techniques and advances in stochastic gradient descent.
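A minimal numerical illustration of the determinantal structure mentioned above (a sketch with assumptions of our own, not a result from the talk): for a determinantal point process with marginal kernel K, inclusion probabilities are principal minors of K, which forces negative correlation between points and hence "diversity".

    import numpy as np

    # Sketch: inclusion probabilities of a DPP with marginal kernel K satisfy
    #   P(S subset of Y) = det(K_S),
    # so any pair of points is negatively correlated.
    rng = np.random.default_rng(0)

    # Build a valid marginal kernel: symmetric with eigenvalues in (0, 1).
    A = rng.standard_normal((5, 5))
    L = A @ A.T                                  # PSD "likelihood" kernel
    K = L @ np.linalg.inv(L + np.eye(5))         # marginal kernel

    i, j = 0, 1
    p_i = K[i, i]                                # P(i in Y)
    p_j = K[j, j]                                # P(j in Y)
    p_ij = np.linalg.det(K[np.ix_([i, j], [i, j])])  # P(i and j in Y)

    # Determinantal structure gives p_ij = p_i * p_j - K_ij**2 <= p_i * p_j.
    print(p_ij, p_i * p_j, p_ij <= p_i * p_j)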
-
What does guidance do? (Online)
Sitan Chen (ICTS:32499)
When sampling from a base measure tilted by a reward model, a popular trick is to approximate the score of the tilted measure with the sum of the base score and the gradient of the reward. It is well-known that this does not sample from the base distribution but nevertheless seems to do something interesting and useful, e.g., classifier-free guidance (CFG) and diffusion posterior sampling (DPS). In this talk, I provide some theoretical perspectives on what this method actually samples from, focusing on a simple mixture model setting. In the first part, I will rigorously characterize the dynamics of CFG, proving that it generates archetypal and low-diversity samples in a certain precise sense. In the second part, I will show that for linear inverse problems, DPS with a careful choice of initialization simultaneously boosts reward and likelihood under the prior. I will then describe some experiments demonstrating that DPS with this initialization scheme achieves strong performance on hard image restoration tasks like large box inpainting. Based on https://arxiv.org/abs/2409.13074 and https://arxiv.org/abs/2506.10955
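A toy sketch of the basic combination rule described above (assumptions of our own, not the talk's CFG or DPS analysis): for a standard Gaussian base and a linear reward r(x) = a*x, the tilted density is N(a, 1) and its score is exactly the base score plus the reward gradient, so Langevin dynamics with the combined score recovers the tilted measure. The subtleties studied in the talk arise when the same trick is applied at intermediate noise levels of a diffusion sampler.

    import numpy as np

    rng = np.random.default_rng(0)
    a, step, n_steps, n_chains = 2.0, 1e-2, 5000, 2000

    x = rng.standard_normal(n_chains)            # initialize at the base measure
    for _ in range(n_steps):
        score = -x + a                           # base score + gradient of reward
        x = x + step * score + np.sqrt(2 * step) * rng.standard_normal(n_chains)

    print(x.mean(), x.std())                     # approx 2.0 and 1.0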
-
New research directions in vector search
Kiran Shiragur (ICTS:32498)
Vector search is a fundamental problem with numerous applications in machine learning, computer vision, recommendation systems, and more. While vector search has been extensively studied, modern applications have introduced new requirements such as diversity, multivector queries, and multifilter constraints, among others. In this talk, we explore these emerging research directions, with a focus on diversity and multivector embeddings in vector search.
For both problems, we propose the first provable graph-based algorithms that efficiently return approximate solutions. Our algorithms leverage popular graph-based methods, enabling us to build on existing, efficient implementations. Experimental results show that our algorithms outperform other approaches.
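A rough sketch of the two standard ingredients the abstract builds on (our own illustration, not the talk's algorithms): greedy best-first search on a precomputed k-NN proximity graph to gather candidates, followed by a simple maximal-marginal-relevance style re-ranking that trades off relevance against diversity.

    import numpy as np

    def knn_graph(X, k=8):
        # Brute-force k-NN graph; real systems build this approximately.
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.argsort(d, axis=1)[:, :k]

    def greedy_search(X, graph, q, start=0, steps=50):
        # Walk to the neighbor closest to the query until no improvement.
        cur = start
        for _ in range(steps):
            cand = graph[cur]
            best = cand[np.argmin(np.linalg.norm(X[cand] - q, axis=1))]
            if np.linalg.norm(X[best] - q) >= np.linalg.norm(X[cur] - q):
                break
            cur = best
        return graph[cur]                        # candidate pool near q

    def diversify(X, q, candidates, m=3, lam=0.5):
        # Greedily pick items balancing closeness to q and spread among picks.
        chosen, pool = [], list(candidates)
        while pool and len(chosen) < m:
            def score(i):
                rel = -np.linalg.norm(X[i] - q)
                div = min((np.linalg.norm(X[i] - X[j]) for j in chosen), default=0.0)
                return lam * rel + (1 - lam) * div
            best = max(pool, key=score)
            chosen.append(best)
            pool.remove(best)
        return chosen

    X = np.random.default_rng(0).standard_normal((200, 16))
    q = np.zeros(16)
    print(diversify(X, q, greedy_search(X, knn_graph(X), q)))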
-
Mean-Field Theory Insights into Neural Feature Dynamics, Infinite-Scale Limits, and Scaling Laws
Cengiz Pehlevan (ICTS:32497)
When a neural network becomes extremely wide or deep, its learning dynamics simplify and can be described by the same “mean-field” ideas that explain magnetism and fluids. I will walk through these ideas step-by-step, showing how they suggest practical recipes for initialization and optimization that scale smoothly from small models to cutting-edge transformers. I will also discuss neural scaling laws—empirical power-law rules that relate model size, data, and compute—and illustrate them with solvable toy models.
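A minimal sketch of the power-law rules mentioned above (synthetic data, assumptions of our own): scaling laws are usually summarized as L(N) ~ a * N**(-b) (plus an irreducible term, ignored here), so the exponent b can be read off a straight-line fit in log-log coordinates.

    import numpy as np

    rng = np.random.default_rng(0)
    N = np.logspace(6, 9, 10)                    # model sizes (parameters)
    loss = 50.0 * N**(-0.3) * np.exp(0.02 * rng.standard_normal(N.size))

    slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
    print("fitted exponent b:", -slope, "prefactor a:", np.exp(intercept))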
-
Turing Lecture: Overparametrized models: linear theory and its limits
Andrea Montanari (ICTS:32491)
The success of modern AI models defies classical theoretical wisdom. Classical theory recommended the use of convex optimization, and yet AI models learn by optimizing highly non-convex functions. Classical theory prescribed controlling model complexity, and yet AI models are very complex, so complex that they often memorize the training data. Classical wisdom recommended a careful and interpretable choice of model architecture, and yet modern architectures rarely offer a parsimonious representation of a target distribution class.
The discovery that learning can take place in completely unexpected scenarios poses beautiful conceptual challenges. I will try to survey recent work towards addressing them.
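A small numerical illustration of the memorization point in the linear setting (our own example, not from the lecture): with more parameters than samples, the minimum-norm least-squares fit interpolates the training data exactly, even when the labels are pure noise.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 50, 500                                  # overparametrized: p >> n
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)                      # pure-noise labels

    theta, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimum-norm solution
    print("training error:", np.linalg.norm(X @ theta - y))   # ~ 0: interpolation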
-
Sandbox for the Blackbox: How LLMs learn Structured Data
Ashok Makkuva (ICTS:32490)
In recent years, large language models (LLMs) have achieved unprecedented success across various disciplines, including natural language processing, computer vision, and reinforcement learning. This success has spurred a flourishing body of research aimed at understanding these models, from both theoretical perspectives such as representation and optimization, and scientific approaches such as interpretability.
To understand LLMs, an important research theme in the machine learning community is to model the input as mathematically structured data (e.g. Markov chains), where we have complete knowledge and control of the data properties. The goal is to use this controlled input to gain valuable insights into what solutions LLMs learn and how they learn them (e.g. induction head). This understanding is crucial, given the increasing ubiquity of the models, especially in safety-critical applications, and our limited understanding of them.
While the aforementioned works using this structured approach provide valuable insights into the inner workings of LLMs, the breadth and diversity of the field make it increasingly challenging for both experts and non-experts to stay abreast. To address this, our tutorial aims to provide a unifying perspective on recent advances in the analysis of LLMs, from a representational-cum-learning viewpoint. To this end, we focus on the two predominant classes of language models that have driven the AI revolution: transformers and recurrent models such as state-space models (SSMs). For these models, we discuss several concrete results, including their representational capacities, optimization landscape, and mechanistic interpretability. Building upon these perspectives, we outline several important future directions in this field, aiming to foster a clearer understanding of language models and to aid in the creation of more efficient architectures.
References and a detailed explanation of our tutorial are available here: https://capricious-comb-7a3tbssph.notion.site/NeurIPS-2024-Tutorial-San…
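A minimal sketch of the "structured data" sandbox mentioned above (details our own): sample token sequences from a known first-order Markov chain, so the Bayes-optimal next-token predictor is simply the transition matrix P and a trained sequence model can be compared against it directly.

    import numpy as np

    rng = np.random.default_rng(0)
    V = 4                                            # vocabulary size
    P = rng.dirichlet(alpha=np.ones(V), size=V)      # row-stochastic transitions

    def sample_sequence(length=128):
        seq = [rng.integers(V)]
        for _ in range(length - 1):
            seq.append(rng.choice(V, p=P[seq[-1]]))  # next token from current row
        return np.array(seq)

    batch = np.stack([sample_sequence() for _ in range(32)])
    print(batch.shape)          # (32, 128): training data with known ground truth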
-
Computationally efficient reductions between some statistical models (Online)
Ashwin Pananjady (ICTS:32494)
Can a sample from one parametric statistical model (the source) be transformed into a sample from a different (target) model? Versions of this question were asked as far back as 1950, and a beautiful asymptotic theory of equivalence of experiments emerged in the latter half of the 20th century. Motivated by problems spanning information-computation gaps and differentially private data analysis, we ask the analogous non-asymptotic question in high-dimensional problems and with algorithmic considerations. We show how a single observation from some source models can be approximately transformed to a single observation from a large class of target models by a computationally efficient algorithm. I will present several such reductions and discuss their applications to the aforementioned problems.
This is joint work with Mengqi Lou and Guy Bresler.
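A toy instance of the general question above (our own example, far simpler than the talk's reductions): a single observation X ~ N(theta, 1) from the source model can be turned into a single observation from the target model N(theta, sigma2), sigma2 > 1, by adding independent noise. The map never needs to know theta, which is what makes it a valid reduction between statistical models.

    import numpy as np

    rng = np.random.default_rng(0)
    theta, sigma2 = 3.0, 4.0

    def reduce_gaussian(x, sigma2):
        # Add independent noise to inflate the variance from 1 to sigma2.
        return x + np.sqrt(sigma2 - 1.0) * rng.standard_normal()

    samples = np.array([reduce_gaussian(theta + rng.standard_normal(), sigma2)
                        for _ in range(100000)])
    print(samples.mean(), samples.var())            # approx theta and sigma2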
-
Strongly correlated particle systems: a toolbox for machine intelligence
Subhro Ghosh (ICTS:32493)
The classical paradigm of randomness in the sciences is that of i.i.d. random variables, and going beyond i.i.d. is often considered a difficulty and a challenge to be overcome. In this talk, we will explore a new perspective, wherein strongly constrained random systems in fact help to understand fundamental problems in machine learning. In particular, we will discuss strongly correlated particle systems that are well-motivated from statistical and quantum physics, including in particular determinantal probability measures. These will be used to shed important light on questions of fundamental interest in learning theory, focussing on applications to novel sampling techniques and advances in stochastic gradient descent.
-
Posterior Sampling for Image Personalization and Editing
Sanjay Shakkottai (ICTS:32492)
This talk will consist of two parts. In the first part, we will present an overview of posterior sampling with diffusion models and motivate the connection to inverse problems. Specific topics that we will cover include Gibbs sampling, importance sampling, and approximations for test-time optimization (aka training-free approaches such as DPS) with diffusion models. In the second part, we will discuss algorithms for image editing, stylization, and related tasks that are in production in large-scale settings. Specifically, we will discuss both diffusion- and flow-based algorithms (PSLD, STSL, RB Modulation, RF Inversion) that operate in the latent space of SOTA foundation models (such as Stable Diffusion or Flux).
Videos from the diffusions class are posted on YouTube (a link to the lecture notes is also provided in the video captions). Link: https://www.youtube.com/@ifml9883/playlists
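A minimal sketch of the posterior-sampling-for-inverse-problems connection in the first part (assumptions of our own, not the production algorithms listed above): for a linear inverse problem y = A x + noise with a Gaussian prior on x, unadjusted Langevin dynamics driven by the posterior score, i.e. the prior score plus the likelihood score, produces approximate posterior samples.

    import numpy as np

    rng = np.random.default_rng(0)
    d, m, s2 = 20, 10, 0.1
    A = rng.standard_normal((m, d)) / np.sqrt(d)
    x_true = rng.standard_normal(d)
    y = A @ x_true + np.sqrt(s2) * rng.standard_normal(m)

    x, step = np.zeros(d), 1e-3
    for _ in range(20000):
        score = -x + A.T @ (y - A @ x) / s2      # prior score + likelihood score
        x = x + step * score + np.sqrt(2 * step) * rng.standard_normal(d)

    print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))   # relative error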
-
Posterior Sampling for Image Personalization and Editing
Sanjay Shakkottai (ICTS:32483)
This talk will consist of two parts. In the first part, we will present an overview of posterior sampling with diffusion models and motivate the connection to inverse problems. Specific topics that we will cover include Gibbs sampling, importance sampling, and approximations for test-time optimization (aka training-free approaches such as DPS) with diffusion models. In the second part, we will discuss algorithms for image editing, stylization, and related tasks that are in production in large-scale settings. Specifically, we will discuss both diffusion- and flow-based algorithms (PSLD, STSL, RB Modulation, RF Inversion) that operate in the latent space of SOTA foundation models (such as Stable Diffusion or Flux).
Videos from the diffusions class are posted on YouTube (a link to the lecture notes is also provided in the video captions). Link: https://www.youtube.com/@ifml9883/playlists
-
Sandbox for the Blackbox: How LLMs learn Structured Data
Ashok Makkuva (ICTS:32482)
In recent years, large language models (LLMs) have achieved unprecedented success across various disciplines, including natural language processing, computer vision, and reinforcement learning. This success has spurred a flourishing body of research aimed at understanding these models, from both theoretical perspectives such as representation and optimization, and scientific approaches such as interpretability.
To understand LLMs, an important research theme in the machine learning community is to model the input as mathematically structured data (e.g. Markov chains), where we have complete knowledge and control of the data properties. The goal is to use this controlled input to gain valuable insights into what solutions LLMs learn and how they learn them (e.g. induction head). This understanding is crucial, given the increasing ubiquity of the models, especially in safety-critical applications, and our limited understanding of them.
While the aforementioned works using this structured approach provide valuable insights into the inner workings of LLMs, the breadth and diversity of the field make it increasingly challenging for both experts and non-experts to stay abreast. To address this, our tutorial aims to provide a unifying perspective on recent advances in the analysis of LLMs, from a representational-cum-learning viewpoint. To this end, we focus on the two predominant classes of language models that have driven the AI revolution: transformers and recurrent models such as state-space models (SSMs). For these models, we discuss several concrete results, including their representational capacities, optimization landscape, and mechanistic interpretability. Building upon these perspectives, we outline several important future directions in this field, aiming to foster a clearer understanding of language models and to aid in the creation of more efficient architectures.
References and a detailed explanation of our tutorial are available here: https://capricious-comb-7a3tbssph.notion.site/NeurIPS-2024-Tutorial-San…
-
Turing Lecture: The mathematics of large machine learning models
Andrea Montanari (ICTS:32487)
The success of modern AI models defies classical theoretical wisdom. Classical theory recommended the use of convex optimization, and yet AI models learn by optimizing highly non-convex functions. Classical theory prescribed controlling model complexity, and yet AI models are very complex, so complex that they often memorize the training data. Classical wisdom recommended a careful and interpretable choice of model architecture, and yet modern architectures rarely offer a parsimonious representation of a target distribution class.
The discovery that learning can take place in completely unexpected scenarios poses beautiful conceptual challenges. I will try to survey recent work towards addressing them.