Posterior Sampling for Image Personalization and Editing
Sanjay Shakkottai (ICTS:32492)
This talk will consist of two parts. In the first part, we will present an overview of posterior sampling with diffusion models and motivate the connection to inverse problems. Specific topics include Gibbs sampling, importance sampling, and approximations for test-time optimization (i.e., training-free approaches such as DPS) with diffusion models. In the second part, we will discuss algorithms for image editing, stylization, etc., that are in production in large-scale settings. Specifically, we will discuss both diffusion- and flow-based algorithms (PSLD, STSL, RB Modulation, RF Inversion) that operate in the latent space of SOTA foundation models (such as Stable Diffusion or Flux).
Diffusions class videos are posted on YouTube (and lecture notes link is also posted in the video caption). Link: https://www.youtube.com/@ifml9883/playlists
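As a concrete, heavily simplified illustration of the training-free guidance idea described above, here is a NumPy sketch of a DPS-style sampler for a linear inverse problem. The Gaussian prior is chosen so that the score and the denoiser have closed forms; the measurement model, noise schedule, step sizes, and guidance weight are all illustrative assumptions, not the settings of any production algorithm named in the abstract.

```python
import numpy as np

# Toy diffusion-posterior-sampling sketch (in the spirit of DPS) for a
# linear inverse problem y = A x + noise, with prior x ~ N(0, I) so the
# score and denoiser E[x0 | x_t] are analytic.  All constants below are
# illustrative choices for this toy.

rng = np.random.default_rng(0)
d, m = 4, 2
A = rng.standard_normal((m, d))
A /= np.linalg.norm(A, 2)               # normalize for step-size stability
x_true = rng.standard_normal(d)
y = A @ x_true + 0.01 * rng.standard_normal(m)

sigmas = np.geomspace(5.0, 0.01, 200)   # noise levels, high -> low
x = sigmas[0] * rng.standard_normal(d)  # start from (roughly) the prior

for sigma in sigmas:                    # annealed Langevin with guidance
    var = 1.0 + sigma ** 2
    score = -x / var                    # score of N(0, (1 + sigma^2) I)
    x0_hat = x / var                    # posterior-mean denoiser E[x0 | x_t]
    # DPS-style guidance: gradient of 0.5 * ||y - A x0_hat||^2 w.r.t. x
    guide = A.T @ (A @ x0_hat - y) / var
    step = 0.5 * sigma ** 2
    x = x + step * (score - guide) + 0.1 * np.sqrt(step) * rng.standard_normal(d)

print("residual:", np.linalg.norm(y - A @ x))
```

The guidance term is what distinguishes posterior sampling from unconditional sampling: each denoising step is nudged toward measurement consistency without any retraining.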
Sandbox for the Blackbox: How LLMs learn Structured Data
Ashok Makkuva (ICTS:32482)
In recent years, large language models (LLMs) have achieved unprecedented success across various disciplines, including natural language processing, computer vision, and reinforcement learning. This success has spurred a flourishing body of research aimed at understanding these models, both from theoretical perspectives such as representation and optimization, and from scientific approaches such as interpretability.
To understand LLMs, an important research theme in the machine learning community is to model the input as mathematically structured data (e.g. Markov chains), where we have complete knowledge and control of the data properties. The goal is to use this controlled input to gain valuable insights into what solutions LLMs learn and how they learn them (e.g. induction head). This understanding is crucial, given the increasing ubiquity of the models, especially in safety-critical applications, and our limited understanding of them.
While the aforementioned works using this structured approach provide valuable insights into the inner workings of LLMs, the breadth and diversity of the field make it increasingly challenging for both experts and non-experts to stay abreast. To address this, our tutorial aims to provide a unifying perspective on recent advances in the analysis of LLMs, from a representational-cum-learning viewpoint. To this end, we focus on the two predominant classes of language models that have driven the AI revolution: transformers and recurrent models such as state-space models (SSMs). For these models, we discuss several concrete results, including their representational capacities, optimization landscape, and mechanistic interpretability. Building upon these perspectives, we outline several important future directions in this field, aiming to foster a clearer understanding of language models and to aid in the creation of more efficient architectures.
References and detailed explanation of our tutorial is here: https://capricious-comb-7a3tbssph.notion.site/NeurIPS-2024-Tutorial-San…
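The "structured data" setup described above can be made concrete in a few lines: draw token sequences from a known first-order Markov chain, so the statistics the model must learn are fully controlled. A minimal sketch, with an arbitrary illustrative transition matrix:

```python
import numpy as np

# Inputs are sequences from a known first-order Markov chain, so we
# have complete control of the data properties.  The transition matrix
# P is an arbitrary illustrative choice.

rng = np.random.default_rng(0)
V = 3                                   # vocabulary size
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])         # row-stochastic transitions

def sample_sequence(length: int) -> np.ndarray:
    """Draw one token sequence from the Markov chain."""
    seq = np.empty(length, dtype=int)
    seq[0] = rng.integers(V)
    for t in range(1, length):
        seq[t] = rng.choice(V, p=P[seq[t - 1]])
    return seq

# Empirical bigram counts recover P; this conditional structure is the
# kind of statistic an induction-head-like mechanism can exploit.
seqs = [sample_sequence(500) for _ in range(200)]
counts = np.zeros((V, V))
for s in seqs:
    np.add.at(counts, (s[:-1], s[1:]), 1)
P_hat = counts / counts.sum(axis=1, keepdims=True)
print(np.round(P_hat, 2))
```

Because the ground-truth conditional distribution is known exactly, any solution a trained transformer or SSM finds can be compared against it directly.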
Turing lecture: The mathematics of large machine learning models
Andrea Montanari (ICTS:32487)
The success of modern AI models defies classical theoretical wisdom. Classical theory recommended the use of convex optimization, and yet AI models learn by optimizing highly non-convex functions. Classical theory prescribed controlling model complexity, and yet AI models are very complex, so complex that they often memorize the training data. Classical wisdom recommended a careful and interpretable choice of model architecture, and yet modern architectures rarely offer a parsimonious representation of a target distribution class.
The discovery that learning can take place in completely unexpected scenarios poses beautiful conceptual challenges. I will try to survey recent work towards addressing them.
Collaborative Prediction via Tractable Agreement Protocols
Surbhi Goel (ICTS:32485)
Designing effective collaboration between humans and AI systems is crucial for leveraging their complementary abilities in complex decision tasks. But how should agents possessing unique, private knowledge—like a human expert and an AI model—interact to reach decisions better than either could alone? If they were perfect Bayesians with a shared prior, Aumann's classical agreement theorem suggests conversation leads to a prediction via agreement which is accuracy-improving. However, this relies on implausible assumptions about their knowledge and computational power.
We show how to recover and generalize these guarantees using only computationally and statistically tractable assumptions. We develop efficient "collaboration protocols" where parties iteratively exchange only low-dimensional information – their current predictions or best-response actions – without needing to share underlying features. These protocols are grounded in conditions like conversation calibration/swap regret, which relax full Bayesian rationality, and are computationally efficiently enforceable. First, we prove this simple interaction leads to fast convergence to agreement, generalizing quantitative bounds even to high-dimensional and action-based settings. Second, we introduce a weak learning condition under which this agreement process inherently aggregates the parties' distinct information, that is, agents via our protocols arrive at final predictions that are provably competitive with an optimal predictor having access to their joint features. Together, these results offer a new, practical foundation for building systems that achieve the power of pooled knowledge through tractable interaction alone.
This talk is based on joint work with the amazing Natalie Collina, Varun Gupta, Ira Globus-Harris, Aaron Roth, Mirah Shi.
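The agreement-then-aggregation phenomenon can be seen in a tiny Gaussian toy, where each party holds one private noisy observation of an unknown quantity and exchanges only scalar predictions. In this toy, one exchange already reveals each signal, so agreement is immediate and the agreed prediction matches the pooled-information optimum; the protocols in the talk obtain such guarantees in far more general settings. All names and noise levels here are illustrative.

```python
import numpy as np

# Two-party agreement toy: each party observes theta plus independent
# unit-variance Gaussian noise (flat prior) and announces only its
# current scalar prediction.

rng = np.random.default_rng(0)
theta = 2.0
a_obs = theta + rng.normal()            # party A's private feature
b_obs = theta + rng.normal()            # party B's private feature

# Round 0: each party announces its current posterior mean.
msg_a, msg_b = a_obs, b_obs
# Round 1: with equal noise variances, the Bayesian update after seeing
# the other party's announcement is the equal-weight average.
pred_a = 0.5 * (a_obs + msg_b)
pred_b = 0.5 * (b_obs + msg_a)

pooled = 0.5 * (a_obs + b_obs)          # optimal given both features
print(pred_a, pred_b, pooled)
```

Only predictions cross the channel, never the underlying features; the pooled-optimal accuracy is reached anyway, which is the flavor of the "weak learning" aggregation result described above.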
An Introduction to Diffusion and Flow Models
Dheeraj Nagaraj (ICTS:32477)
In this series of talks, I will introduce basic elements of generative modeling with diffusion and flow models from first principles. This includes a short introduction to stochastic calculus, ordinary differential equations, the evolution of probability measures, the Fokker-Planck equation, and the continuity equation. We will then apply these ideas to describe training and inference algorithms for diffusion models.
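For reference, the two evolution equations named above take the following standard forms, for the law $p_t$ of an SDE (diffusion) and of an ODE (flow) respectively:

```latex
% Fokker-Planck equation for the law p_t of the SDE
%   dX_t = b(X_t, t)\,dt + \sigma(t)\,dW_t :
\partial_t p_t(x) = -\nabla \cdot \big( b(x,t)\, p_t(x) \big)
  + \frac{\sigma(t)^2}{2}\, \Delta p_t(x)

% Continuity equation for the law p_t of the ODE
%   \dot{x}_t = v(x_t, t) :
\partial_t p_t(x) + \nabla \cdot \big( v(x,t)\, p_t(x) \big) = 0
```

The continuity equation is the $\sigma \to 0$ special case of the Fokker-Planck equation, which is one way the diffusion and flow viewpoints connect.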
Statistical Optimal Transport (Online)
Sivaraman Balakrishnan (ICTS:32481)
Optimal transport studies the problem of rearranging one distribution into another while minimizing an associated cost. The past decade has witnessed tremendous progress in our understanding of the computational, methodological and statistical aspects of optimal transport (OT). Recent interest in OT has blossomed due to its close connections with diffusion models.
I will introduce the mathematical framework of OT, and then quickly transition to studying how well various objects in the OT framework (OT distances, and OT maps) can be estimated from samples of the underlying distributions.
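A common starting point for estimating OT quantities from samples is the plug-in estimator on empirical measures, made computationally tractable with entropic regularization (Sinkhorn iterations). The sketch below is illustrative: the regularization strength and iteration count are arbitrary choices, not tuned recommendations.

```python
import numpy as np

# Plug-in estimate of an OT cost from samples, via Sinkhorn iterations
# on the empirical measures.

rng = np.random.default_rng(0)
n = 200
x = rng.normal(loc=0.0, size=(n, 1))    # samples from the source
y = rng.normal(loc=1.0, size=(n, 1))    # samples from the target

C = (x - y.T) ** 2                      # squared-distance cost matrix
eps = 0.1                               # entropic regularization
K = np.exp(-C / eps)
a = b = np.full(n, 1.0 / n)             # uniform empirical weights

u = np.ones(n)
for _ in range(500):                    # Sinkhorn fixed-point updates
    v = b / (K.T @ u)
    u = a / (K @ v)

plan = u[:, None] * K * v[None, :]      # entropic transport plan
cost = float((plan * C).sum())          # estimated transport cost
print(cost)
```

For these two Gaussians the population squared Wasserstein-2 distance is $(1-0)^2 = 1$, so the estimate should land in that vicinity up to sampling error and entropic bias; quantifying exactly such errors is the statistical question the talk addresses.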
Data assimilation: theory and practice
Amit Apte (ICTS:32480)
Data assimilation is a set of methods for incorporating sparse observations of a complex dynamical system, either deterministic or stochastic, into incomplete models of these systems. Mathematically, this is the problem of nonlinear filtering; computationally, these methods are based on a variety of techniques including Markov chain Monte Carlo, optimization, and importance sampling. This tutorial will begin with a quick introduction to the Bayesian underpinnings of data assimilation, followed by applications to chaotic dynamical systems.
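The nonlinear-filtering core of data assimilation can be sketched with a bootstrap particle filter (sequential importance sampling with resampling) on a 1D toy system. The dynamics, noise levels, and particle count below are illustrative assumptions, not a recommended configuration.

```python
import numpy as np

# Bootstrap particle filter for a 1D nonlinear system observed
# through additive Gaussian noise.

rng = np.random.default_rng(0)
T, N = 50, 1000
q, r = 0.1, 0.5                         # process / observation noise std

# Simulate a "truth" trajectory and noisy observations of it.
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = np.sin(x_true[t - 1]) + 0.9 * x_true[t - 1] \
        + q * rng.standard_normal()
y_obs = x_true + r * rng.standard_normal(T)

particles = rng.standard_normal(N)      # prior ensemble at t = 0
est = np.zeros(T)
for t in range(T):
    if t > 0:                           # propagate through the dynamics
        particles = np.sin(particles) + 0.9 * particles \
            + q * rng.standard_normal(N)
    # Importance weights from the Gaussian observation likelihood.
    w = np.exp(-0.5 * ((y_obs[t] - particles) / r) ** 2)
    w /= w.sum()
    est[t] = w @ particles              # posterior-mean estimate
    # Multinomial resampling to avoid weight degeneracy.
    particles = particles[rng.choice(N, size=N, p=w)]

print("filter RMSE:", np.sqrt(np.mean((est - x_true) ** 2)))
print("obs    RMSE:", np.sqrt(np.mean((y_obs - x_true) ** 2)))
```

The filter estimate should track the truth more closely than the raw observations do, because each update fuses the model dynamics with the data in a Bayesian way.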
Title | Speaker(s) | Date
---|---|---
Posterior Sampling for Image Personalization and Editing | Sanjay Shakkottai | 2025-08-11
Posterior Sampling for Image Personalization and Editing | Sanjay Shakkottai | 2025-08-11
Sandbox for the Blackbox: How LLMs learn Structured Data | Ashok Makkuva | 2025-08-11
Turing lecture: The mathematics of large machine learning models | Andrea Montanari | 2025-08-10
Collaborative Prediction via Tractable Agreement Protocols | Surbhi Goel | 2025-08-10
TBA | Damek Davis | 2025-08-10
Basic learning theory | Karthik Sridharan | 2025-08-08
An Introduction to Diffusion and Flow Models | Dheeraj Nagaraj | 2025-08-08
Statistical Optimal Transport (Online) | Sivaraman Balakrishnan | 2025-08-07
Data assimilation: theory and practice | Amit Apte | 2025-08-07
Poster Session | - | 2025-08-07
Data assimilation: theory and practice | Amit Apte | 2025-08-07