Fast Algorithms for Online Stochastic Convex Programming
Nikhil Devanur (Amazon)
We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many well-studied problems, such as the Adwords problem, online stochastic packing and covering, and online stochastic matching with concave returns, are special cases of online stochastic CP. We present fast algorithms for these problems, which achieve near-optimal regret guarantees for both the i.i.d. and the random permutation models of stochastic inputs. When applied to the special case of online packing, our ideas yield a simpler and faster primal-dual algorithm for this well-studied problem, which achieves the optimal competitive ratio. Our techniques make explicit the connection of the primal-dual paradigm and online learning to online stochastic CP.
Entanglement distillation in tensor networks
Takato Mori Perimeter Institute for Theoretical Physics
Tensor networks provide a geometric representation of quantum many-body wave functions. Inspired by holography, we discuss a geometric realization of (one-shot) entanglement distillation for tensor networks, including the multi-scale entanglement renormalization ansatz and matrix product states. We evaluate the trace distances between the ‘distilled’ states and EPR states step by step and see a trend of distillation. If time permits, I will mention a possible field-theoretic generalization of this geometric distillation.
Zoom link: https://pitp.zoom.us/j/98545776462?pwd=b1Z3ZENNRWVITlNOZG1GdzJaMmN1Zz09
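As a small illustration of the quantity mentioned in the abstract above, the following sketch computes the trace distance between a two-qubit pure state and an EPR pair. It is not the speaker's distillation procedure: the test state cos(theta)|00> + sin(theta)|11> and the function names are assumptions chosen for this example, and it relies only on the standard identity T = sqrt(1 - |<psi|phi>|^2) for pure states.

```python
import math

def trace_distance_pure(psi, phi):
    """Trace distance between two normalized pure states given as amplitude lists."""
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    # For pure states, T = sqrt(1 - |<psi|phi>|^2); clamp against rounding error.
    return math.sqrt(max(0.0, 1.0 - abs(overlap) ** 2))

# EPR pair (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
EPR = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

def partially_entangled(theta):
    """Hypothetical test state cos(theta)|00> + sin(theta)|11> (an assumption)."""
    return [math.cos(theta), 0.0, 0.0, math.sin(theta)]

if __name__ == "__main__":
    # The distance shrinks as theta approaches pi/4, where the state is an EPR pair.
    for theta in (0.0, math.pi / 8, math.pi / 4):
        print(theta, trace_distance_pure(partially_entangled(theta), EPR))
```

A distance approaching zero along a sequence of states is one way to read the "trend of distillation" described in the abstract.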
Quantum Field Theory I - Lecture 221012
Gang Xu Perimeter Institute for Theoretical Physics
PIRSA:22100049
Relativity - Lecture 221012
PIRSA:22100076
Learning Across Bandits in High Dimension via Robust Statistics
Hamsa Bastani (University of Pennsylvania)
Decision-makers often face the "many bandits" problem, where one must simultaneously learn across related but heterogeneous contextual bandit instances. For instance, a large retailer may wish to dynamically learn product demand across many stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to dynamically learn patient risk across many providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. Motivated by real datasets, we decompose the unknown parameter in each bandit instance into a global parameter plus a sparse instance-specific term. Then, we propose a novel two-stage estimator that exploits this structure in a sample-efficient way by using a combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). We embed this estimator within a bandit algorithm, and prove that it improves asymptotic regret bounds in the context dimension; this improvement is exponential for data-poor instances. We further demonstrate how our results depend on the underlying network structure of bandit instances. Finally, we illustrate the value of our approach on synthetic and real datasets. Joint work with Kan Xu.
Paper: https://arxiv.org/abs/2112.14233
Are Multicriteria MDPs Harder to Solve Than Single-Criteria MDPs?
Csaba Szepesvári (University of Alberta, Google DeepMind)
Oftentimes, decisions involve multiple, possibly conflicting rewards or costs. For example, solving a problem faster may incur extra cost or sacrifice safety. In cases like this, one possibility is to aim for decisions that maximize the value obtained from one of the reward functions, while keeping the value obtained from the other reward functions above some prespecified target values. Up to logarithmic factors, we resolve the optimal worst-case sample complexity of finding solutions to such problems in the discounted MDP setting when a generative model of the MDP is available. While this is clearly an oversimplified problem, our analysis reveals an interesting gap between the sample complexity of this problem and the sample complexity of solving MDPs when the solver needs to return a solution which, with a prescribed probability, cannot violate the constraints. In the talk, I will explain the background of the problem, the origin of the gap, and the algorithm that we know to achieve the near-optimal sample complexity, closing with some open questions. This is joint work with Sharan Vaswani and Lin F. Yang.
Modular commutators in conformal field theory, topological order, and holography
Yijian Zou Perimeter Institute for Theoretical Physics
The modular commutator is a recently discovered multipartite entanglement measure that quantifies the chirality of the underlying many-body quantum state. In this Letter, we derive a universal expression for the modular commutator in conformal field theories in 1+1 dimensions and discuss its salient features. We show that the modular commutator depends only on the chiral central charge and the conformal cross ratio. We test this formula for a gapped (2+1)-dimensional system with a chiral edge, i.e., the quantum Hall state, and observe excellent agreement with numerical simulations. Furthermore, we propose a geometric dual for the modular commutator in certain preferred states of the AdS/CFT correspondence. For these states, we argue that the modular commutator can be obtained from a set of crossing angles between intersecting Ryu-Takayanagi surfaces.
Zoom link: https://pitp.zoom.us/j/94069836709?pwd=RlA2ZUsxdXlPTlh2TStObHFDNUY0Zz09
Entropy-Area Law from Interior Semi-classical Degrees of Freedom
Yuki Yokokura RIKEN
Can degrees of freedom in the interior of black holes be responsible for the entropy-area law? If yes, what spacetime appears? In this talk, I answer these questions at the semi-classical level. Specifically, a black hole is considered as a bound state consisting of many semi-classical degrees of freedom which exist uniformly inside and have maximum gravity. The distribution of their information determines the interior metric through the semi-classical Einstein equation. Then, the interior is a continuous stacking of AdS_2 times S^2 without horizon or singularity and behaves like a local thermal state. Evaluating the entropy density from thermodynamic relations and integrating it over the interior volume, the area law is obtained with the factor 1/4 for any interior degrees of freedom. Here, the dynamics of gravity plays an essential role in changing the entropy from the volume law to the area law. This should help us clarify the holographic property of black-hole entropy. [arXiv: 2207.14274]
Zoom link: https://pitp.zoom.us/j/99386433635?pwd=VzlLV2U4T1ZOYmRVbG9YVlFIemVVZz09
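For reference, the area law with the factor 1/4 mentioned in the abstract above is conventionally written as follows. The notation is an assumption made here, not taken from the talk: s is the entropy density, dV the proper volume element of the interior, A the surface area, and l_P the Planck length, in units with k_B = 1.

```latex
S \;=\; \int_{\text{interior}} s \,\mathrm{d}V \;=\; \frac{A}{4\, l_P^{2}},
\qquad l_P^{2} = \frac{G\hbar}{c^{3}} .
```

The first equality expresses the abstract's statement that the entropy is obtained by integrating an entropy density over the interior volume; the second is the standard Bekenstein-Hawking form of the area law.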
Towards Explicit Discrete Holography: Aperiodic Spin Chains from Hyperbolic Tilings
Giuseppe Di Giulio University of Würzburg
The AdS/CFT correspondence is one of the most important breakthroughs of the last decades in theoretical physics. A recently proposed way to gain insight into various features of this duality is to discretize the Anti-de Sitter spacetime. Within this program, we consider the Poincaré disk and discretize it by introducing a regular hyperbolic tiling on it. The features of this discretization are expected to be identified in the quantum theory living on the boundary of the hyperbolic tiling. In this talk, we discuss how a class of boundary Hamiltonians can be naturally obtained in this discrete geometry via an inflation rule that allows constructing the tiling using concentric layers of tiles. The models in this class are aperiodic spin chains. Using strong-disorder renormalization group techniques, we study the entanglement entropy of these boundary theories, identifying a logarithmic growth in the subsystem size, with a coefficient depending on the bulk discretization parameters.
Zoom link: https://pitp.zoom.us/j/95849965965?pwd=eEx5Q0gxR2orR0dzS2pQbG8rR09oUT09
A Game-Theoretic Approach to Offline Reinforcement Learning
Ching-An Cheng (Microsoft Research)
Offline reinforcement learning (RL) is a paradigm for designing agents that can learn from existing datasets. Because offline RL can learn policies without collecting new data or expensive expert demonstrations, it offers great potential for solving real-world problems. However, offline RL faces a fundamental challenge: oftentimes data in the real world can only be collected by policies meeting certain criteria (e.g., on performance, safety, or ethics). As a result, existing data, though large, could lack diversity and have limited usefulness. In this talk, I will introduce a generic game-theoretic approach to offline RL. It frames offline RL as a two-player game where a learning agent competes with an adversary that simulates the uncertain decision outcomes due to missing data coverage. By this game analogy, I will present a systematic and provably correct framework to design offline RL algorithms that can learn good policies with state-of-the-art empirical performance. In addition, I will show that this framework reveals a natural connection between offline RL and imitation learning, which ensures that the learned policies are never worse than the data collection policies, regardless of hyperparameter choices.
Where are Milky Way’s Hadronic PeVatrons?
Takahiro Sudo Ohio State University
Observations indicate the existence of natural particle accelerators in the Milky Way, capable of producing PeV cosmic rays (“PeVatrons”). Observations also indicate the existence of extreme sources in the Milky Way, capable of producing gamma-ray radiation above 100 TeV. If these gamma-ray sources are hadronic cosmic-ray accelerators, then they must also be neutrino sources. However, no neutrino sources have been detected. How can we consistently understand the observations of cosmic rays, gamma rays, and neutrinos? We point out that two extreme scenarios are allowed: (1) the hadronic cosmic-ray accelerators and the gamma-ray sources are the same objects, so that neutrino sources exist and improved telescopes can detect them, versus (2) the hadronic cosmic-ray accelerators and the gamma-ray sources are distinct, so that there are no detectable neutrino sources. We discuss the nature of the Milky Way’s highest-energy gamma-ray sources and outline future prospects toward understanding the origin of hadronic cosmic rays.
Zoom link: https://pitp.zoom.us/j/91390039665?pwd=dGJ2b3VCbVFhUVpSelpjYzJHdk1Gdz09
The Statistical Complexity of Interactive Decision Making
Dylan Foster (Microsoft Research)
A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is analogous to the classical problem of optimal (supervised) statistical learning, where there are well-known complexity measures (e.g., VC dimension and Rademacher complexity) that govern the statistical complexity of learning. However, characterizing the statistical complexity of interactive learning is substantially more challenging due to the adaptive nature of the problem. In this talk, we will introduce a new complexity measure, the Decision-Estimation Coefficient, which is necessary and sufficient for sample-efficient interactive learning. In particular, we will provide:
1. a lower bound on the optimal regret for any interactive decision making problem, establishing the Decision-Estimation Coefficient as a fundamental limit;
2. a unified algorithm design principle, Estimation-to-Decisions, which attains a regret bound matching our lower bound, thereby achieving optimal sample-efficient learning as characterized by the Decision-Estimation Coefficient.
Taken together, these results give a theory of learnability for interactive decision making. When applied to reinforcement learning settings, the Decision-Estimation Coefficient recovers essentially all existing hardness results and lower bounds.
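For readers unfamiliar with the Decision-Estimation Coefficient, its commonly cited form is sketched below. The notation does not appear in the abstract and is assumed here: \mathcal{M} is the model class, \bar{M} a reference model, \Pi the decision space, f^{M}(\pi) the expected reward of decision \pi under model M, \pi_M an optimal decision for M, D_H the Hellinger distance, and \gamma > 0 a scale parameter.

```latex
\operatorname{dec}_{\gamma}(\mathcal{M}, \bar{M})
  \;=\; \inf_{p \in \Delta(\Pi)} \; \sup_{M \in \mathcal{M}} \;
  \mathbb{E}_{\pi \sim p}\!\left[
      f^{M}(\pi_M) - f^{M}(\pi)
      - \gamma \, D_{\mathrm{H}}^{2}\!\big(M(\pi), \bar{M}(\pi)\big)
  \right]
```

Read informally, the coefficient measures the best achievable trade-off between the regret of a randomized decision and the information it reveals about how the true model differs from the reference model.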