Elias Bareinboim (Columbia University), Frederick Eberhardt (Caltech), Kun Zhang (Carnegie Mellon University), and Uri Shalit (Technion - Israel Institute of Technology)
Euclidean wormholes are an exotic type of gravitational solution that we still do not fully understand. In the first part of the talk, I will analyze asymptotically AdS wormhole solutions from a gravitational point of view. By studying correlation functions of local and non-local operators, the universal properties that any putative holographic dual should exhibit become manifest. In the second part, I will describe some concrete field-theoretic models (both effective and microscopic) that share these properties.
Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein that binds to a target more tightly than previously observed. To that end, costly experimental measurements are being replaced with calls to a high-capacity regression model trained on labeled data, which can be leveraged in an in silico search for promising design candidates. The aim then is to discover designs that are better than the best design in the observed data. This goal puts machine-learning-based design in a much more difficult spot than traditional applications of predictive modelling, since successful design requires, by definition, some degree of extrapolation---a pushing of the predictive model to its unknown limits, in parts of the design space that are a priori unknown. In this talk, I will discuss our methodological approaches to this problem, as well as report on some recent success in designing gene therapy delivery (AAV) libraries, useful for general downstream directed evolution selections.
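The train-then-search loop described above can be sketched in a few lines. This is a minimal toy illustration, not the speaker's method: the data, the random-feature regressor, and the random screening search are all stand-ins for the high-capacity model and design space of a real application.

```python
# Minimal sketch of model-based design: fit a regressor on labeled
# designs, then search in silico for candidates predicted to beat the
# best observed design. All data and model choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy "observed" data: designs are 5-dim feature vectors; the property
# is a noisy quadratic (stands in for costly lab measurements).
X_obs = rng.uniform(-1, 1, size=(200, 5))
y_obs = -np.sum((X_obs - 0.3) ** 2, axis=1) + 0.05 * rng.normal(size=200)

# High-capacity model stand-in: ridge regression on random tanh features.
W = rng.normal(size=(5, 300))
phi = lambda X: np.tanh(X @ W)
A = phi(X_obs)
coef = np.linalg.solve(A.T @ A + 1e-3 * np.eye(300), A.T @ y_obs)
predict = lambda X: phi(X) @ coef

# In silico search: screen many random candidates and keep those whose
# predicted property exceeds the best value seen in the data --- i.e.,
# deliberately push the model into extrapolation territory.
best_observed = y_obs.max()
X_cand = rng.uniform(-1, 1, size=(5000, 5))
scores = predict(X_cand)
proposals = X_cand[scores > best_observed]
print(len(proposals), "candidates predicted to beat the best observed design")
```

The fragility the abstract points to is visible even here: the proposals are exactly the points where the model's predictions are least trustworthy, which is why naive screening must be tempered by the methodological safeguards the talk discusses.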
Superradiant instabilities may create clouds of ultralight bosons around black holes, forming so-called “gravitational atoms.” It was recently shown that the presence of a binary companion can induce resonant transitions between a cloud's bound states. When these transitions backreact on the binary's orbit, they lead to qualitatively distinct signatures in the gravitational waveform that can dominate the overall behavior of the inspiral. In this talk, I will show that the interaction with the companion can also trigger transitions from bound to unbound states of the cloud---a process which I will refer to as “ionization,” in analogy with the photoelectric effect in atomic physics. Here, too, there is a type of resonance with a similarly distinct signature, which may ultimately be used to detect any dark ultralight bosons that exist in our universe.
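The "gravitational atom" analogy can be made quantitative. As a sketch of the standard leading-order result (in natural units), the cloud's bound states have a hydrogen-like spectrum governed by a gravitational analogue of the fine-structure constant:

```latex
% Leading-order bound-state spectrum of a gravitational atom:
% \mu = boson mass, M = black-hole mass, n = principal quantum number,
% \alpha = G M \mu the "gravitational fine-structure constant".
\omega_n \simeq \mu \left( 1 - \frac{\alpha^2}{2 n^2} \right),
\qquad \alpha \equiv G M \mu .
```

The resonant transitions and the ionization threshold discussed in the talk are set by differences of these energy levels, in direct analogy with atomic spectroscopy and the photoelectric effect.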
Identifying which genetic variants influence medically relevant phenotypes is an important task both for therapeutic development and for risk prediction. In the last decade, genome wide association studies have been the most widely-used instrument to tackle this question. One challenge that they encounter is in the interplay between genetic variability and the structure of human populations. In this talk, we will focus on some opportunities that arise when one collects data from diverse populations and present statistical methods that allow us to leverage them. The presentation will be based on joint work with M. Sesia, S. Li, Z. Ren, Y. Romano and E. Candes.
Complete randomization allows for consistent estimation of the average treatment effect from the difference in means of the outcomes, without strong modeling assumptions on the outcome-generating process. Appropriate use of pretreatment covariates can further improve estimation efficiency. However, missingness in covariates is common in experiments and raises an important question: should we adjust for covariates subject to missingness, and if so, how? The unadjusted difference in means is always unbiased. The complete-covariate analysis adjusts for all completely observed covariates and improves on the efficiency of the difference in means if at least one completely observed covariate is predictive of the outcome. What, then, is the additional gain from adjusting for covariates subject to missingness? A key insight is that the missingness indicators act as fully observed pretreatment covariates as long as missingness is not affected by the treatment, and can thus be used in covariate adjustment to bring additional estimation efficiency. This motivates adding the missingness indicators to the regression adjustment, yielding the missingness-indicator method, a well-known but not especially popular strategy in the missing-data literature. We recommend it for its many advantages. We also propose modifications to the missingness-indicator method based on asymptotic and finite-sample considerations. To reconcile conflicting recommendations in the missing-data literature, we analyze and compare various strategies for analyzing randomized experiments with missing covariates under the design-based framework, which treats randomization as the basis for inference and imposes no modeling assumptions on the outcome-generating process or the missing-data mechanism.
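A minimal simulation makes the missingness-indicator idea concrete. This is an illustrative sketch, not the authors' implementation: the covariate is constant-imputed where missing, the missingness indicator is added as an extra (fully observed) covariate, and the treatment effect is estimated by regression adjustment with treatment-covariate interactions on centered covariates.

```python
# Toy illustration of the missingness-indicator method in a completely
# randomized experiment: zero-impute the missing covariate values and
# add the missingness indicator to the regression adjustment.
import numpy as np

rng = np.random.default_rng(1)
n = 2000

x = rng.normal(size=n)                    # pretreatment covariate
miss = rng.random(n) < 0.3                # missingness NOT affected by treatment
z = rng.permutation(np.r_[np.ones(n // 2), np.zeros(n // 2)])  # randomization
tau = 1.0                                 # true average treatment effect
y = tau * z + 2.0 * x + rng.normal(size=n)

# Observed covariate: zero-impute where missing; keep the indicator.
x_obs = np.where(miss, 0.0, x)
m = miss.astype(float)

# Benchmark: the always-unbiased difference in means.
diff_means = y[z == 1].mean() - y[z == 0].mean()

# Regression adjustment with centered covariates and their interactions
# with treatment; the coefficient on z then estimates the ATE.
Xc = np.c_[x_obs - x_obs.mean(), m - m.mean()]
D = np.c_[np.ones(n), z, Xc, z[:, None] * Xc]
beta = np.linalg.lstsq(D, y, rcond=None)[0]
tau_hat = beta[1]

print(f"difference in means:            {diff_means:.3f}")
print(f"missingness-indicator adjusted: {tau_hat:.3f}")
```

Both estimators center on the true effect of 1.0; the adjusted estimator has smaller variance because the imputed covariate and the indicator together absorb outcome variation that the difference in means leaves as noise.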
RCTs are potentially useful in many ways other than standard confirmatory intent-to-treat (ITT) analyses, but to succeed, difficult problems must be overcome. I will discuss some or (time permitting) all of the following problems:
1. The problem of transportability of the trial results to other populations: I will explain why transportability is much more difficult in trials comparing longitudinal dynamic treatment regimes than in simple point-treatment trials.
2. The problematic use of RCT data in micro-simulation models used in cost-benefit analyses
3. The problem of combining data from large, often confounded, administrative or electronic medical records with data from smaller, underpowered randomized trials when estimating individualized treatment strategies.
4. The problem of using the results of RCTs to benchmark the ability of observational analyses to 'get it right', with the goal of providing evidence that causal analyses of observational data are sufficiently reliable to contribute to decision making
5. The problem of noncompliance with the assigned protocol in trials in which the per-protocol effect, rather than the ITT effect, is of substantive importance.
6. The problem of leveraging the prior knowledge that diagnostic tests have "no direct effect on the outcome except through the treatment delivered" to greatly increase the power of trials designed to estimate the cost vs benefit of competing testing strategies.
We consider the problem of counterfactual inference in sequentially designed experiments, wherein a collection of units undergoes a sequence of interventions based on policies that adapt over time, and outcomes are observed for the assigned interventions. Our goal is counterfactual inference, i.e., to estimate what would have happened had alternate policies been used, a problem that is inherently challenging due to the heterogeneity in outcomes across users and time. In this work, we identify structural assumptions that allow us to impute the missing potential outcomes in sequential experiments, where the policy may adapt simultaneously to all users' past data. We prove that, under suitable assumptions on the latent factors and temporal dynamics, a variant of the nearest-neighbor strategy allows us to impute the missing information using the observed outcomes across time and users. Under mild assumptions on the adaptive policy and the underlying latent factor model, we prove that, using data up to time t for the N users in the study, our estimate of the missing potential outcome at time t+1 admits a mean squared error that scales as t^{-1/2+\delta} + N^{-1+\delta} for any \delta > 0, for any fixed user. We also provide an asymptotic confidence interval for each outcome under suitable growth conditions on N and t, which can then be used to build confidence intervals for individual treatment effects. Our work extends the recent literature on inference with adaptively collected data by allowing for policies that pool across users, the matrix completion literature for missing-at-random settings by allowing for adaptive sampling mechanisms, and missing-data problems in multivariate time series by allowing for a generic non-parametric model.
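The nearest-neighbor imputation idea can be sketched on a toy latent factor model. This is an illustrative simplification, not the authors' estimator: outcomes follow a rank-one factor model plus an additive action effect, and a user's missing potential outcome at time t is imputed by averaging the time-t outcomes of users with similar past trajectories who actually received the alternate action.

```python
# Toy sketch of nearest-neighbor counterfactual imputation: impute a
# user's missing potential outcome by averaging outcomes of "donor"
# users who received the alternate action and whose past observed
# outcomes are closest. The model and all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(2)
N, T = 300, 50

# Rank-1 latent factor model: outcome under action a is u_i * v_t + a * effect.
u = rng.uniform(1, 2, size=N)              # per-user latent factor
v = rng.uniform(1, 2, size=T)              # per-time latent factor
effect = 0.5
actions = rng.integers(0, 2, size=(N, T))  # assigned actions (adaptive in general)
Y = u[:, None] * v[None, :] + effect * actions + 0.1 * rng.normal(size=(N, T))

# Impute user 0's missing potential outcome at the final time step under
# the action it did NOT receive.
i, t = 0, T - 1
alt = 1 - actions[i, t]
dist = np.mean((Y[:, :t] - Y[i, :t]) ** 2, axis=1)   # distance on histories
donors = np.where((actions[:, t] == alt) & (np.arange(N) != i))[0]
nearest = donors[np.argsort(dist[donors])[:10]]      # 10 nearest donors
imputed = Y[nearest, t].mean()

truth = u[i] * v[t] + effect * alt                   # noiseless counterfactual
print(f"imputed: {imputed:.3f}  truth: {truth:.3f}")
```

Averaging over neighbors trades off the two error sources the abstract's rate reflects: more donors shrink the noise (the N term), while closer histories shrink the latent-factor mismatch (the t term).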
The field of quantum information provides fundamental insight into central open questions in quantum thermodynamics and quantum many-body physics, such as the characterization of how quantum effects influence the flow of energy and information. These insights have inspired new methods for cooling physical systems at the quantum scale using tools from quantum information processing. These protocols not only provide an essentially different way to cool, but also go beyond conventional cooling techniques, with important applications for quantum technologies. In this talk, I will first review the basic ideas of algorithmic cooling and give analytical results for the achievable cooling limits of the conventional heat-bath version. Then, I will show how these limits can be circumvented by using quantum correlations. In one algorithm I take advantage of correlations that can be created during the rethermalization step with the heat bath, and in another I use correlations present in the initial state, induced by the internal interactions of the system. Finally, I will present a recently characterized quantum property of many-body systems, in which entanglement in low-energy eigenstates can obstruct local outgoing energy flows.
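For context on the limits mentioned above, a sketch of the known asymptotic cooling limit of conventional heat-bath algorithmic cooling (as analyzed by Raeisi and Mosca for the partner-pairing algorithm; conventions may differ from the speaker's) for the target qubit in an n-qubit register with heat-bath polarization \epsilon_b:

```latex
% Asymptotic polarization limit of heat-bath algorithmic cooling for the
% target qubit; n = number of qubits, \epsilon_b = bath polarization.
\epsilon_{\max}
  = \frac{(1+\epsilon_b)^{2^{n-2}} - (1-\epsilon_b)^{2^{n-2}}}
         {(1+\epsilon_b)^{2^{n-2}} + (1-\epsilon_b)^{2^{n-2}}}
  \;\approx\; 2^{\,n-2}\,\epsilon_b
  \qquad (\epsilon_b \ll 1).
```

It is limits of this form that the talk's correlation-exploiting protocols are designed to circumvent.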
I will present recent work exploring how and when confounded offline data can be used to improve online reinforcement learning. We will explore conditions of partial observability and distribution shift between the offline and online environments, and present results for contextual bandits, imitation learning, and reinforcement learning.
In this talk, I will discuss recent work on reasoning and learning with soft interventions, including the problems of identification, extrapolation/transportability, and structural learning. I will also briefly discuss a new calculus, which generalizes the do-calculus, as well as algorithmic and graphical conditions.
Supporting material:
General Transportability of Soft Interventions: Completeness Results.
J. Correa, E. Bareinboim.
In Proceedings of the 34th Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.
https://causalai.net/r68.pdf
Causal Discovery from Soft Interventions with Unknown Targets: Characterization & Learning.
A. Jaber, M. Kocaoglu, K. Shanmugam, E. Bareinboim.
In Proceedings of the 34th Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.
https://causalai.net/r67.pdf
A Calculus For Stochastic Interventions: Causal Effect Identification and Surrogate Experiments.
J. Correa, E. Bareinboim.
In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI), 2020.
https://causalai.net/r55.pdf