Abstract
In many domains, we are interested in estimating the total causal treatment effect in the presence of network interference, where the outcome of one individual or unit is affected by the treatment assignment of those in its local network. Additional challenges arise when complex cluster randomized designs are not feasible to implement, or when the network is unknown and costly to estimate. We propose a new measure of model complexity that characterizes the difficulty of estimating the total treatment effect under the standard A/B testing setup. We provide a class of unbiased estimators whose variance is optimal with respect to the population size and the treatment budget. Furthermore, we show that when the network is completely unknown, we can still estimate the total treatment effect under a richer yet simple staggered rollout experimental design. The proposed design principles and the related estimators work with a broad class of outcome models. Our solution and statistical guarantees do not rely on restrictive network properties, allowing for highly connected networks that may not be easily clustered. This is joint work with Edoardo Airoldi, Christian Borgs, Jennifer Chayes, Mayleen Cortez, and Matthew Eichhorn.
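For concreteness, one standard formalization of the target (stated generically here, not as the talk's exact model): with a treatment vector z in {0,1}^n and potential outcomes Y_i(z) that may depend on the treatments of i's network neighbours, the total treatment effect compares everyone treated with no one treated,
\[
  \mathrm{TTE} \;=\; \frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i(\mathbf{1}) - Y_i(\mathbf{0}) \bigr).
\]
The dependence of Y_i on neighbours' treatments is what makes this quantity hard to estimate from a standard A/B test, i.e., a Bernoulli or completely randomized assignment with a limited treatment budget.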
Abstract
A random structure exhibits symmetry if its law remains invariant under a group of transformations. Exchangeability (of graphs, sequences, etc.) and stationarity are examples. Under suitable conditions, the transformation group can be used to define an estimator that averages over an instance of the structure, and such estimators turn out to satisfy a law of large numbers, a central limit theorem, and further convergence results. Loosely speaking: the large-sample theory of i.i.d. averages still holds if the i.i.d. assumption is substituted by a suitable symmetry assumption.
Joint work with Morgane Austern.
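Schematically, and as a generic illustration of the construction described above (the notation is ours): if a finite group G_n acts on the structure X_n and the law of X_n is G_n-invariant, one can estimate \theta = \mathbb{E} f(X_n) by averaging f over the orbit of a single realization,
\[
  \hat{\theta}_n \;=\; \frac{1}{|G_n|} \sum_{g \in G_n} f\bigl(g \cdot X_n\bigr).
\]
For an exchangeable sequence, with G_n the symmetric group and f a function of the first k coordinates, this orbit average is exactly a U-statistic, and the limit theorems referred to in the abstract recover the classical i.i.d. results.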
Abstract
Dense graph limit theory is mainly concerned with law-of-large-numbers type results. We propose a corresponding central limit theorem, or rather a fluctuation theory, based on Janson's theory of Gaussian Hilbert spaces and generalised U-statistics from the 1990s. Our approach provides rates and allows for proper statistical inference based on subgraph counts.
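For orientation (generic notation, not necessarily the talk's): dense graph limit theory tracks homomorphism densities of fixed graphs F, and a fluctuation theory asks for the limiting law of these densities after centering and rescaling,
\[
  t(F, G_n) \;=\; \frac{\hom(F, G_n)}{n^{|V(F)|}},
  \qquad
  Z_n(F) \;=\; a_n \bigl( t(F, G_n) - \mathbb{E}\, t(F, G_n) \bigr),
\]
where the appropriate normalization a_n (typically of order n^{1/2} or n) and the nature of the limit, Gaussian or an element of a higher Wiener chaos, depend on possible degeneracies; this is where machinery of the Gaussian Hilbert space type enters.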
Abstract
The problem of completing a large low-rank matrix using a subset of revealed entries has received much attention in the last ten years. I will give a necessary and sufficient condition, stated in the language of graph limit theory, for a sequence of matrix completion problems with arbitrary missing patterns to be asymptotically solvable. I will also present an algorithm that is able to approximately recover the matrix whenever recovery is possible.
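The algorithm of the talk is not reproduced here; as a generic, purely illustrative baseline for the same task (recovering a low-rank matrix from an arbitrary pattern of revealed entries), here is a minimal soft-impute-style sketch, with all names and parameters hypothetical.

```python
import numpy as np

def soft_impute(M, mask, lam=0.5, n_iter=200):
    """Minimal soft-impute-style completion (illustrative only):
    repeatedly fill the missing entries with the current estimate,
    then shrink the singular values of the filled-in matrix.
    M    : matrix holding the observed entries (values outside mask are ignored)
    mask : boolean array, True where the entry is revealed
    """
    X = np.where(mask, M, 0.0)
    for _ in range(n_iter):
        filled = np.where(mask, M, X)                 # keep observed entries fixed
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)                  # soft-threshold singular values
        X = (U * s) @ Vt                              # low-rank reconstruction
    return X

# Toy usage: a rank-3 matrix with roughly 40% of its entries revealed at random.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80))   # rank-3 ground truth
mask = rng.uniform(size=A.shape) < 0.4                      # arbitrary missing pattern
A_hat = soft_impute(A, mask)
print(np.linalg.norm(A_hat - A) / np.linalg.norm(A))        # relative recovery error
```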
Abstract
Ramsey's Theorem states that for every graph H, there is an integer R(H) such that every 2-edge-coloring of the complete graph on R(H) vertices contains a monochromatic copy of H. In this talk, we focus on a natural quantitative extension: how many monochromatic copies of H can we find in every 2-edge-coloring of K_N, and which graphs H are so-called common, i.e., such that the number of monochromatic copies of H is asymptotically minimized by a random 2-edge-coloring. A classical result of Goodman from 1959 states that the triangle is a common graph. On the other hand, Thomason proved in 1989 that no clique of order at least four is common, and the existence of a common graph with chromatic number larger than three was open until 2012, when Hatami, Hladky, Kral, Norin and Razborov proved that the 5-wheel is common. In this talk, we show that for every k>4 there exists a common graph with chromatic number k.
This is joint work with D. Kral and F. Wei.
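In graphon language (a standard reformulation, included here for the reader's convenience), a 2-edge-coloring of a large complete graph is encoded by a graphon W (the red graph) together with 1 - W (the blue graph), and H is common precisely when the uniformly random coloring, W \equiv 1/2, asymptotically minimizes the monochromatic H-density:
\[
  t(H, W) + t(H, 1 - W) \;\ge\; 2^{\,1 - e(H)} \qquad \text{for every graphon } W.
\]
Goodman's theorem is the case H = K_3, where the right-hand side equals 1/4.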
Abstract
Motifs (small subgraph patterns), such as edges and triangles, encode important structural information about the geometry of a network. Consequently, counting motifs in a large network is an important statistical and computational problem. In this talk we will consider the problem of estimating motif densities and fluctuations of subgraph counts in an inhomogeneous random graph sampled from a graphon. We will show that the limiting distributions of subgraph counts can be Gaussian or non-Gaussian, depending on a notion of regularity of subgraphs with respect to the graphon. Using these results and a novel multiplier bootstrap for graphons, we will construct joint confidence sets for the motif densities. Finally, we will discuss various structure theorems and open questions about degeneracies of the limiting distribution.
Joint work with Anirban Chatterjee, Soham Dan, and Svante Janson.
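As a small illustrative simulation of the sampling model in the abstract (the function names and the particular graphon are this note's own choices, not the talk's), one can draw a W-random graph and compute an empirical motif density:

```python
import numpy as np

def sample_w_random_graph(n, W, rng=None):
    """Draw an n-vertex W-random graph: latent U_i ~ Uniform[0,1],
    and edge {i, j} appears independently with probability W(U_i, U_j)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=n)
    probs = W(u[:, None], u[None, :])
    upper = np.triu(rng.uniform(size=(n, n)) < probs, 1).astype(int)
    return upper + upper.T                        # symmetric adjacency, zero diagonal

def triangle_hom_density(A):
    """Homomorphism density of the triangle: hom(K3, G) / n^3 = tr(A^3) / n^3."""
    n = A.shape[0]
    return np.trace(A @ A @ A) / n**3

# Example with the rank-one graphon W(x, y) = x * y, for which t(K3, W) = 1/27.
W = lambda x, y: x * y
A = sample_w_random_graph(1500, W, rng=np.random.default_rng(1))
print(triangle_hom_density(A))                    # concentrates near 1/27 ≈ 0.037
```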
Abstract
We initiate a study of large deviations for block model random graphs in the dense regime. Following Chatterjee and Varadhan (2011), we establish an LDP for dense block models, viewed as random graphons. As an application of our result, we study upper tail large deviations for homomorphism densities of regular graphs. We identify the existence of a "symmetric" phase, where the graph, conditioned on the rare event, looks like a block model with the same block sizes as the generating graphon. In specific examples, we also identify the existence of a "symmetry breaking" regime, where the conditional structure is not a block model with compatible dimensions. This identifies a "reentrant phase transition" phenomenon for this problem, analogous to the one established for Erdős–Rényi random graphs (Chatterjee and Dey (2010), Chatterjee and Varadhan (2011)). Finally, extending the analysis of Lubetzky and Zhao (2015), we identify the precise boundary between the symmetry and symmetry breaking regimes for homomorphism densities of regular graphs and the operator norm on Erdős–Rényi bipartite graphs.
Joint work with Christian Borgs, Jennifer Chayes, Subhabrata Sen, and Samantha Petti.
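For context, and only schematically: in the Erdős–Rényi case the Chatterjee–Varadhan framework expresses the upper tail for a homomorphism density as a variational problem over graphons; the block-model version studied in the talk replaces the constant reference parameter p by a block-structured one,
\[
  -\lim_{n \to \infty} \frac{2}{n^2} \log \mathbb{P}\bigl( t(H, G_n) \ge (1+\delta)\, p^{e(H)} \bigr)
  \;=\; \inf\Bigl\{ \iint I_p\bigl(W(x,y)\bigr)\,dx\,dy \;:\; t(H, W) \ge (1+\delta)\, p^{e(H)} \Bigr\},
\]
with I_p(u) = u \log\frac{u}{p} + (1-u)\log\frac{1-u}{1-p}. The "symmetric" phase corresponds to the infimum being attained by a graphon with the same block structure as the reference.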
Abstract
Variational approximations provide an attractive computational alternative to MCMC-based strategies for approximating the posterior distribution in Bayesian inference. Despite their popularity in applications, supporting theoretical guarantees are limited, particularly in high-dimensional settings. In this talk, we will study Bayesian inference in the context of a linear model with product priors, and derive sufficient conditions for the correctness (to leading order) of the naive mean-field approximation. To this end, we will utilize recent advances in the theory of non-linear large deviations (Chatterjee and Dembo, 2014). Next, we analyze the naive mean-field variational problem, and precisely characterize the asymptotic properties of the posterior distribution in this setting. The theory of graph limits provides a crucial ingredient to study this high-dimensional variational problem. This is based on joint work with Sumit Mukherjee (Columbia University).
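As a reminder of the object under study (generic notation, not necessarily the talk's): the naive mean-field approximation replaces the posterior by the closest product measure in Kullback–Leibler divergence, equivalently it maximizes the evidence lower bound over product laws,
\[
  \hat{Q} \;=\; \operatorname*{arg\,min}_{Q = \prod_i q_i} \mathrm{KL}\bigl(Q \,\big\|\, \pi(\cdot \mid y)\bigr),
  \qquad
  \log p(y) \;\ge\; \mathbb{E}_Q\bigl[\log p(y, \theta)\bigr] + H(Q),
\]
and "correctness to leading order" refers, roughly, to this lower bound matching \log p(y) up to lower-order terms in the high-dimensional limit.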
Abstract
We discuss recent theorems on both smooth and singular responses of large dense graphs to changes in edge and triangle density constraints. Smoothness requires control over typical (exponentially most) graphs with given sharp values of those two densities.
In particular, we prove the existence of a connected open set S in the plane of edge and triangle densities, cut into two pieces S' and S'' by the curve C corresponding to graphs with independent edges. For typical graphs G with given edge and triangle densities, every subgraph density of G is real analytic on S' and S'' as a function of the edge and triangle densities. However, these subgraph densities are not analytic, or even differentiable, on C.
Joint work with Joe Neeman and Lorenzo Sadun.
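For reference, the curve C mentioned above is the independent-edge (Erdős–Rényi) curve: a constant graphon W \equiv e has edge density e and triangle density e^3, so
\[
  C \;=\; \bigl\{ (e, t) : t = e^3,\ 0 < e < 1 \bigr\},
\]
and S' and S'' are the parts of S lying on either side of this curve.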
Abstract
In the theory of random graphs, the giant component has remained a guiding theme since the seminal paper of Erdős and Rényi. It has long been observed that the emergence of the giant component is analogous to the survival of a Galton–Watson branching process. This analogy crystallises the interplay between the local and the global structure of a sparse random graph. In fact, the notion that the Galton–Watson tree is the limiting object of the 'local structure' of the Erdős–Rényi random graph can be formalised nicely in the language of 'local weak convergence' introduced by Benjamini and Schramm and by Aldous and Steele. The local structure and local weak limits have also found applications in message-passing algorithms, such as Belief Propagation and Warning Propagation, to mention a few.
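The analogy can be made explicit in a standard worked example (not specific to the talk): for the Erdős–Rényi graph G(n, c/n), the local weak limit is the Galton–Watson tree with Poisson(c) offspring, and the giant component appears exactly when that branching process survives,
\[
  \frac{\bigl| \mathcal{C}_{\max}\bigl(G(n, c/n)\bigr) \bigr|}{n} \;\xrightarrow{\ \mathbb{P}\ }\; \zeta(c),
  \qquad \zeta(c) = 1 - e^{-c\,\zeta(c)},
\]
where \zeta(c) is the survival probability of the Poisson(c) Galton–Watson process, and \zeta(c) > 0 if and only if c > 1.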
Turning our attention to a random graph with a topological constraint (e.g., a random planar graph) where the independence of edges is lost, we will show that the planarity constraint affects the global component structure substantially, which in turn affects the local structure.
Abstract
A combinatorial structure is said to be quasirandom if it resembles a random structure in a certain robust sense. The notion of quasirandom graphs, developed in the work of Rödl, Thomason, Chung, Graham and Wilson in the 1980s, is particularly robust as several different properties of truly random graphs, e.g., subgraph density, edge distribution and spectral properties, are satisfied by a large graph if and only if one of them is. We will discuss recent results on quasirandomness of various kinds of combinatorial structures (in particular, directed graphs, permutations and Latin squares) obtained using analytic tools provided by the theory of combinatorial limits.
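A concrete instance of the robustness referred to above is the Chung–Graham–Wilson criterion for graphs (included for orientation): a sequence of graphs G_n with edge density tending to p is quasirandom if and only if its 4-cycle density is asymptotically minimal,
\[
  t(C_4, G_n) \;\longrightarrow\; p^{4},
\]
in which case t(H, G_n) \to p^{e(H)} for every fixed graph H, and many further properties (edge distribution, spectral gap) follow automatically.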
Abstract
In this talk we study the random cluster model on graphs of essentially large girth and on random regular graphs. We give an explicit formula for the limiting free entropy of the random cluster model. Our result extends the work of Dembo, Montanari, Sly and Sun on the Potts model, and we prove a conjecture of Helmuth, Jenssen and Perkins about the phase transition of the random cluster model. This is joint work with Ferenc Bencs and Márton Borbényi.
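For the reader's convenience, the random cluster (Fortuin–Kasteleyn) model on a finite graph G = (V, E) with parameters p in [0,1] and q > 0, and the free entropy whose limit the talk computes, are (in one common normalization, which may differ from the talk's):
\[
  Z_{G}(p, q) \;=\; \sum_{A \subseteq E} p^{|A|} (1-p)^{|E| - |A|} q^{k(A)},
  \qquad
  \phi \;=\; \lim_{n \to \infty} \frac{1}{|V(G_n)|} \log Z_{G_n}(p, q),
\]
where k(A) is the number of connected components of (V, A). For integer q \ge 2 this partition function coincides, up to an explicit reparametrization, with that of the q-state Potts model, which is the link to the Dembo–Montanari–Sly–Sun results.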