- Causal Inference For Socio-Economic & Engineering Systems -- Anish Agarwal (MIT EECS)
- Learning In The Presence Of Strategic Agents: Dynamics, Equilibria, And Convergence -- Eric Mazumdar (Caltech)
- Algorithmic Advances For The Design And Analysis Of Randomized Control Trials -- Christopher Harshaw (Yale)
- Fundamental Limits Of Learning In Data-Driven Problems -- Yanjun Han (Simons Institute)
- Harish-Chandra Bimodules In Complex Rank -- Aleksandra Utiralova (MIT)
- Information, Learning, And Incentive Design For Societal Networks -- Manxi Wu (UC Berkeley)
- Fair And Reliable Machine Learning For High-Stakes Applications: Approaches Using Information Theory -- Sanghamitra Dutta (JP Morgan)

Causal Inference For Socio-Economic & Engineering Systems
Anish Agarwal (MIT EECS)

What will happen to Y if we do A? A variety of meaningful socio-economic and engineering questions can be formulated this way. To name a few: What will happen to a patient's health if they are given a new therapy? What will happen to a country's economy if policy-makers legislate a new tax? What will happen to a data center's latency if a new congestion control protocol is used? In this talk, we will explore how to answer such counterfactual questions using observational data---which is increasingly available due to digitization and pervasive sensors---and/or very limited experimental data. The two key challenges in doing so are: (i) counterfactual prediction in the presence of latent confounders; (ii) estimation with modern datasets which are high-dimensional, noisy, and sparse. Towards this goal, the key framework we introduce connects causal inference with tensor completion, a very active area of research across a variety of fields. In particular, we show how to represent the various potential outcomes (i.e., counterfactuals) of interest through an order-3 tensor. The key theoretical results presented are: (i) formal identification results establishing under what missingness patterns, latent confounding, and structure on the tensor recovery of unobserved potential outcomes is possible; (ii) novel estimators to recover these unobserved potential outcomes, with proofs that they are finite-sample consistent and asymptotically normal. The efficacy of the proposed estimators is shown on high-impact real-world applications. These include working with: (i) TauRx Therapeutics to propose novel clinical trial designs that reduce the number of patients recruited for a trial and correct for bias from patient dropouts; (ii) Uber Technologies on evaluating the impact of certain driver engagement policies without having to run an A/B test.

Learning In The Presence Of Strategic Agents: Dynamics, Equilibria, And Convergence
Eric Mazumdar (Caltech)

The ability to learn from data and make decisions in real-time has led to the rapid deployment of machine learning algorithms across many aspects of everyday life. While this has enabled new services and technologies, the fact that algorithms are increasingly interacting with people and other algorithms marks a distinct shift away from the traditional machine learning paradigm. Indeed, little is known about how these algorithms---which were designed to operate in isolation---behave when confronted with strategic behaviors on the part of people, and the extent to which strategic agents can game the algorithms to achieve better outcomes. In this talk, I will give an overview of my work on learning in games, learning in the presence of strategic agents, and multi-agent reinforcement learning.

Multi-Agent Reinforcement Learning In Stochastic Games: From AlphaGo To Robust Control
Kaiqing Zhang (MIT)

Reinforcement learning (RL) has recently achieved tremendous successes in several artificial intelligence applications. Many of the forefront applications of RL involve "multiple agents", e.g., playing chess and Go games, autonomous driving, and robotics. In this talk, I will introduce several recent works on multi-agent reinforcement learning (MARL) with theoretical guarantees. Specifically, we focus on solving the most basic multi-agent RL setting: infinite-horizon zero-sum stochastic games (Shapley 1953), using three common RL approaches: model-based, value-based, and policy-based ones. We first show that for the tabular setting, "model-based multi-agent RL" (estimating the model first and then planning) can achieve near-optimal sample complexity when a generative model of the game environment is available. Second, we show that a simple variant of "Q-learning" (value-based) can find the Nash equilibrium of the game, even if the agents run it independently, in a "fully decentralized" fashion. Third, we show that "policy gradient" methods (policy-based) can solve zero-sum stochastic games with linear dynamics and quadratic costs, which equivalently solves a robust and risk-sensitive control problem. With this connection to robust control, we discover that our policy gradient methods automatically preserve the robustness of the system during iterations, a phenomenon we refer to as "implicit regularization". Time permitting, I will also discuss some ongoing and future directions along these lines.

Algorithmic Advances For The Design And Analysis Of Randomized Control Trials
Christopher Harshaw (Yale)

In this talk, I will survey some of my dissertation work on algorithmic problems arising in the design and analysis of randomized experiments. I hope to give a sense of the style of problems and technical work that I enjoy. During my dissertation work, I was asking: How can we design sampling algorithms to achieve desired levels of covariate balance in a randomized experiment? How can we estimate the variance of a treatment effect estimator in the presence of general interference? How should we analyze and design so-called "bipartite" experiments, where the units which receive treatment are distinct from the units on which outcomes are measured?

Fundamental Limits Of Learning In Data-Driven Problems
Yanjun Han (Simons Institute)

What is the best we can do with the amount of data at our disposal for a given learning task? Modern learning problems---with a modest amount of data or subject to data processing constraints---frequently raise the need to understand the fundamental limits and make judicious use of the available small or imperfect data. This talk will cover several examples of learning problems where exploiting key structure, and optimally trading off real-world resources, are vital to achieving statistical efficiency.

Choosing Between DAGs Can Be Hard
Richard Guo (University of Cambridge)

Current methods for causal discovery typically report a single directed acyclic graph (DAG). Through an example, I hope to convince you that this might not be the best practice. In fact, depending on how two DAGs intersect and the local geometry at the intersection, the hardness of this problem can vary dramatically.

Causal Inference With Corrupted Data: Measurement Error, Missing Values, Discretization, And Differential Privacy
Rahul Singh (MIT)

Even the most carefully curated economic data sets have variables that are noisy, missing, discretized, or privatized. The standard workflow for empirical research involves data cleaning followed by data analysis that typically ignores the bias and variance consequences of data cleaning. We formulate a semiparametric model for causal inference with corrupted data to encompass both data cleaning and data analysis. We propose a new end-to-end procedure for data cleaning, estimation, and inference with data cleaning-adjusted confidence intervals. We prove consistency, Gaussian approximation, and semiparametric efficiency for our estimator of the causal parameter by finite sample arguments. The rate of Gaussian approximation is $n^{-1/2}$ for global parameters such as the average treatment effect, and it degrades gracefully for local parameters such as the heterogeneous treatment effect for a specific demographic. Our key assumption is that the true covariates are approximately low rank. In our analysis, we provide nonasymptotic theoretical contributions to matrix completion, statistical learning, and semiparametric statistics. We verify the coverage of the data cleaning-adjusted confidence intervals in simulations calibrated to resemble differential privacy as implemented in the 2020 US Census.

Causal Inference And Data-Driven Decision-Making
Angela Zhou (UC Berkeley)

I will provide a brief overview of previous work on credible inference in the context of causal inference and statistical machine learning, and discuss ongoing directions on interfaces of causal inference embedded in complex operational systems.

Possibility of causal loops without superluminal signalling -- a general framework
Vilasini Venkatesh (University of York)
Causality is fundamental to science, but it appears in several different forms. One is relativistic causality, which is tied to a space-time structure and forbids signalling outside the future. On the other hand, causality can be defined operationally using causal models by considering the flow of information within a network of physical systems and interventions on them. From both a foundational and practical viewpoint, it is useful to establish the class of causal models that can coexist with relativistic principles such as no superluminal signalling, noting that causation and signalling are not equivalent. We develop such a general framework that allows these different notions of causality to be independently defined and for connections between them to be established. The framework first provides an operational way to model causation in the presence of cyclic, fine-tuned and non-classical causal influences. We then consider how a causal model can be embedded in a space-time structure and propose a mathematical condition (compatibility) for ensuring that the embedded causal model does not allow signalling outside the space-time future. We identify several distinct classes of causal loops that can arise in our framework, showing that compatibility with a space-time can rule out only some of them. We then demonstrate the mathematical possibility of causal loops embedded in Minkowski space-time that can be operationally detected through interventions, without leading to superluminal signalling. Our framework provides conditions for preventing superluminal signalling within arbitrary (possibly cyclic) causal models and also allows us to model causation in post-quantum theories admitting jamming correlations. Applying our framework to such scenarios, we show that post-quantum jamming can indeed lead to superluminal signalling, contrary to previous claims.
Finally, this work introduces a new causal modelling concept of "higher-order affects relations" and several related technical results, which have applications for causal discovery in fine-tuned causal models.
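The compatibility condition in the abstract is formal, but its intuition (every causal influence, once embedded in space-time, must point into the future lightcone, and a causal loop forces at least one edge to point into the past) can be sketched with a toy check. The events, edges, and lightcone test below are invented for illustration and are not the framework's actual mathematics:

```python
def in_future_lightcone(p, q, c=1.0):
    """Is event q in the closed future lightcone of event p? Events are (t, x)."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt >= 0 and c * dt >= abs(dx)

def compatible(events, edges):
    """Toy 'compatibility' check: every causal edge must respect the lightcone."""
    return all(in_future_lightcone(events[a], events[b]) for a, b in edges)

# Hypothetical embedding of three systems in 1+1 dimensional Minkowski space-time.
events = {"A": (0.0, 0.0), "B": (1.0, 0.5), "C": (2.0, 0.0)}

print(compatible(events, [("A", "B"), ("B", "C")]))  # acyclic, timelike ordered
print(compatible(events, [("A", "B"), ("B", "A")]))  # loop: B -> A points to the past
```

A real treatment must also handle cyclic, fine-tuned, and non-classical influences, which this naive pointwise check cannot capture; it only illustrates why embedding constrains, but does not eliminate, causal loops.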
Harish-Chandra bimodules in complex rank
Aleksandra Utiralova (Massachusetts Institute of Technology)
Deligne tensor categories are defined as an interpolation of the categories of representations of groups GL_n, O_n, Sp_{2n} or S_n to the complex values of the parameter n. One can extend many classical representation-theoretic notions and constructions to this context. These complex rank analogs of classical objects provide insights into their stable behavior patterns as n goes to infinity.
I will talk about some of my results on Harish-Chandra bimodules in Deligne categories. It is known that in the classical case simple Harish-Chandra bimodules admit a classification in terms of W-orbits of certain pairs of weights. However, the notion of weight is not well-defined in the setting of Deligne categories. I will explain how in complex rank the above-mentioned classification translates to a condition on the corresponding (left and right) central characters.
Another interesting phenomenon arising in complex rank is that there are two ways to define Harish-Chandra bimodules. That is, one can either require that the center acts locally finitely on a bimodule M or that M has a finite K-type. The two conditions are known to be equivalent for a semi-simple Lie algebra in the classical setting; however, in Deligne categories that is no longer the case. I will talk about a way to construct examples of Harish-Chandra bimodules of finite K-type using the ultraproduct realization of Deligne categories.

Zoom Link: https://pitp.zoom.us/j/93951304913?pwd=WVk1Uk54ODkyT3ZIT2ljdkwxc202Zz09
Information, Learning, And Incentive Design For Societal Networks
Manxi Wu (UC Berkeley)

Today's data-rich platforms are reshaping the operations of societal networks by providing information, recommendations, and matching services to a large number of users. How can we model the behavior of human agents in response to services provided by these platforms, and develop tools to improve the aggregate outcomes in a socially desirable manner? In this talk, I will briefly summarize our work that tackles this question from three aspects: 1) game-theoretic analysis of the impact of information platforms (navigation apps) on the strategic behavior and learning processes of travelers in uncertain networks; 2) market mechanism design for efficient carpooling and toll pricing in the presence of autonomous driving technology; 3) security analysis and resource allocation for robustness under random or adversarial disruptions.

Fair And Reliable Machine Learning For High-Stakes Applications: Approaches Using Information Theory
Sanghamitra Dutta (JP Morgan)

How do we make machine learning (ML) algorithms fair and reliable? This is particularly important today as ML enters high-stakes applications such as hiring and education, often adversely affecting people's lives with respect to gender, race, etc., and also violating anti-discrimination laws. When it comes to resolving legal disputes or even informing policies and interventions, only identifying bias/disparity in a model's decision is insufficient. We really need to dig deeper into how it arose. For example, disparities in hiring that can be explained by an occupational necessity (code-writing skills for software engineering) may be exempt by law, but disparity arising due to an aptitude test may not be (Ref: Griggs v. Duke Power '71). This leads us to a question that bridges the fields of fairness, explainability, and law: How can we identify and explain the sources of disparity in ML models, e.g., did the disparity entirely arise due to the critical occupational necessities? In this talk, I propose a systematic measure of "non-exempt disparity," i.e., the bias which cannot be explained by the occupational necessities. To arrive at a measure of the non-exempt disparity, I adopt a rigorous axiomatic approach that brings together concepts in information theory (in particular, an emerging body of work called Partial Information Decomposition) with causality.
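As a back-of-the-envelope companion to this abstract: the simplest information-theoretic disparity measure is the total mutual information between the protected attribute Z and the model's decision, which Partial Information Decomposition then splits further into exempt and non-exempt parts. The joint tables below are made up for illustration, and this sketch computes only the total, not the decomposition:

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits from a 2-D joint probability table (rows = A, cols = B)."""
    joint = joint / joint.sum()
    pa = joint.sum(axis=1, keepdims=True)   # marginal of A
    pb = joint.sum(axis=0, keepdims=True)   # marginal of B
    nz = joint > 0                          # avoid log(0) on empty cells
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

# Hypothetical joint distributions of protected attribute Z (rows)
# and hiring decision (cols: hired, rejected).
biased = np.array([[0.40, 0.10],    # Z=0: mostly hired
                   [0.10, 0.40]])   # Z=1: mostly rejected
fair   = np.array([[0.25, 0.25],
                   [0.25, 0.25]])

print(mutual_information(biased))   # positive: decisions carry information about Z
print(mutual_information(fair))     # zero: statistical parity holds
```

Total mutual information cannot distinguish disparity caused by an occupational necessity from disparity caused by, say, an aptitude test; separating those is exactly what the talk's axiomatic, causally-informed measure is for.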