In this talk I will introduce the Fully Constrained Formulation (FCF) of General Relativity. In this formulation one has a hyperbolic sector and an elliptic one. The constraint equations are solved at each time step and are encoded in the elliptic sector; this set of equations has to be solved to compute initial data even if a free evolution scheme is used for the subsequent dynamical evolution. Other formulations (like the XCTS formulation) share a similar elliptic sector. I will comment on the local uniqueness issue of the elliptic sector in the FCF. I will also briefly describe the hyperbolic sector. I will finish with a recent reformulation of the equations which keeps the good local uniqueness properties, improves the numerical accuracy of the system, and provides some additional information.
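As a toy analogue of the elliptic sector (purely illustrative and far simpler than the actual FCF constraint equations), the sketch below solves a 1-d Poisson boundary-value problem by finite differences; in a fully constrained scheme, a system of this elliptic character has to be solved at every time step:

```python
import numpy as np

# Toy stand-in for an elliptic solve: u'' = f on [0,1], u(0) = u(1) = 0,
# discretized by second-order finite differences. The source f is an
# arbitrary illustrative choice.
N = 100
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
f = np.sin(np.pi * x)

# Tridiagonal discrete Laplacian on the interior nodes
A = (np.diag(-2.0 * np.ones(N - 1)) +
     np.diag(np.ones(N - 2), 1) +
     np.diag(np.ones(N - 2), -1)) / h**2
u_int = np.linalg.solve(A, f[1:-1])          # the elliptic solve itself
u = np.concatenate(([0.0], u_int, [0.0]))

# Exact solution of u'' = sin(pi x) is -sin(pi x)/pi^2; check the error
err = np.max(np.abs(u + np.sin(np.pi * x) / np.pi**2))
print(err)
```

The discretization error is O(h^2), so refining the grid improves the solve; in the FCF the analogous elliptic system is nonlinear and coupled, but the per-time-step workflow is the same in spirit.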
Modern neural networks are often regarded as complex black-box functions whose behavior is difficult to understand owing to their nonlinear dependence on the data and the nonconvexity in their loss landscapes. In this work, we show that these common perceptions can be completely false in the early phase of learning. In particular, we formally prove that, for a class of well-behaved input distributions in high dimension, the early-time learning dynamics of a two-layer fully-connected neural network can be mimicked by training a simple linear model on the inputs. We additionally argue that this surprising simplicity can persist in networks with more layers and with convolutional architecture, which we verify empirically. Key to our analysis is to bound the spectral norm of the difference between the Neural Tangent Kernel (NTK) at initialization and an affine transform of the data kernel; however, unlike many previous results utilizing the NTK, we do not require the network to have disproportionately large width, and the network is allowed to escape the kernel regime later in training. Link to paper: https://arxiv.org/abs/2006.14599
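A rough numerical illustration of the setting (an informal sketch, not the paper's construction; the data model, network sizes, and step counts below are arbitrary choices): train a two-layer ReLU network and a plain linear model on the same high-dimensional Gaussian data with full-batch gradient descent and compare their loss curves in the early phase:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 256, 64, 512                   # samples, input dim, hidden width
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sign(X[:, 0])                     # target depending on one direction

W = rng.standard_normal((h, d))          # standard random initialization
v = rng.standard_normal(h)
w_lin = np.zeros(d)

lr, T = 2.0, 200
net_losses, lin_losses = [], []
for _ in range(T):
    # two-layer net f(x) = v^T relu(Wx) / sqrt(h), squared loss
    pre = X @ W.T
    act = np.maximum(pre, 0.0)
    r = act @ v / np.sqrt(h) - y
    net_losses.append(0.5 * np.mean(r ** 2))
    G = (r[:, None] * (pre > 0)) * v[None, :]        # backprop through relu
    W -= lr * (G.T @ X) / (n * np.sqrt(h))
    v -= lr * (act.T @ r) / (n * np.sqrt(h))

    # linear model trained on the same inputs
    r_lin = X @ w_lin - y
    lin_losses.append(0.5 * np.mean(r_lin ** 2))
    w_lin -= lr * (X.T @ r_lin) / n

print(net_losses[-1], lin_losses[-1])
```

This only sets up the comparison the paper makes precise (there, the linear model includes an affine transform of the data kernel, and the agreement of the two trajectories is proven, not just plotted).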
Understanding deep learning calls for addressing the questions of: (i) optimization --- the effectiveness of simple gradient-based algorithms in solving neural network training programs that are non-convex and thus seemingly difficult; and (ii) generalization --- the phenomenon of deep learning models not overfitting despite having many more parameters than examples to learn from. Existing analyses of optimization and/or generalization typically adopt the language of classical learning theory, abstracting away many details on the setting at hand. In this talk I will argue that a more refined perspective is in order, one that accounts for the dynamics of the optimizer. I will then demonstrate a manifestation of this approach, analyzing the dynamics of gradient descent over linear neural networks. We will derive what is, to the best of my knowledge, the most general guarantee to date for efficient convergence to global minimum of a gradient-based algorithm training a deep network. Moreover, in stark contrast to conventional wisdom, we will see that sometimes, adding (redundant) linear layers to a classic linear model significantly accelerates gradient descent, despite the introduction of non-convexity. Finally, we will show that such addition of layers induces an implicit bias towards low rank (different from any type of norm regularization), and by this explain generalization of deep linear neural networks for the classic problem of low rank matrix completion. Works covered in this talk were in collaboration with Sanjeev Arora, Noah Golowich, Elad Hazan, Wei Hu, Yuping Luo and Noam Razin.
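The low-rank implicit bias can be seen in a few lines (a minimal sketch under arbitrary choices of sizes, step counts, and initialization scale; not the talk's formal setting): run gradient descent on a two-layer linear parametrization X = AB with small initialization, fitting a subset of entries of a rank-1 matrix, and inspect the singular values of the result:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
M = np.outer(rng.standard_normal(n), rng.standard_normal(n))  # rank-1 target
mask = rng.random((n, n)) < 0.6                               # observed entries

scale = 0.01                         # small initialization drives the bias
A = scale * rng.standard_normal((n, n))
B = scale * rng.standard_normal((n, n))

lr = 0.02
for _ in range(8000):
    R = mask * (A @ B - M)           # residual on observed entries only
    A, B = A - lr * R @ B.T, B - lr * A.T @ R

X = A @ B
s = np.linalg.svd(X, compute_uv=False)
fit = np.max(np.abs(mask * (X - M)))
print(fit, s[1] / s[0])              # small fit error, near-rank-1 output
```

Despite never being told the rank, gradient descent on the deep (here depth-2) linear parametrization returns a nearly rank-1 matrix, which is the flavor of implicit bias used to explain generalization in low-rank matrix completion.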
Generative adversarial networks (GANs) are a widely used framework for learning generative models. Wasserstein GANs (WGANs), one of the most successful variants of GANs, require solving a min-max optimization problem to global optimality but are in practice successfully trained using stochastic gradient descent-ascent. In this talk, we show that, when the generator is a one-layer network, stochastic gradient descent-ascent converges to a global solution in polynomial time and with polynomial sample complexity.
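A deterministic toy version of gradient descent-ascent (much simpler than the WGAN setting of the talk, and with an arbitrarily chosen objective): simultaneous GDA on the strongly-convex-strongly-concave function f(x, y) = x^2 - y^2 + 2xy, whose unique saddle point is (0, 0):

```python
# Simultaneous gradient descent-ascent: descend in x, ascend in y.
x, y = 1.0, 1.0
lr = 0.1
for _ in range(200):
    gx = 2 * x + 2 * y      # df/dx
    gy = -2 * y + 2 * x     # df/dy
    x, y = x - lr * gx, y + lr * gy

print(x, y)                  # both spiral in towards the saddle at (0, 0)
```

The iterates spiral into the saddle point because of the rotational component the coupling term xy introduces; for a purely bilinear objective the same scheme would diverge, which is part of what makes min-max convergence results like the talk's nontrivial.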
The past five years have witnessed a sequence of successes in designing efficient algorithms for statistical estimation tasks when the input data is corrupted with a constant fraction of fully malicious outliers. The Sum-of-Squares (SoS) method has been an integral part of this story and is behind robust learning algorithms for tasks such as estimating the mean, covariance, and higher moment tensors of a broad class of distributions, clustering and parameter estimation for spherical and non-spherical mixture models, linear regression, and list-decodable learning.
In this talk, I will attempt to demystify this (unreasonable?) effectiveness of the SoS method in robust statistics. I will argue that the utility of the SoS algorithm in robust statistics can be directly attributed to its capacity (via low-degree SoS proofs) to "reason about" analytic properties of probability distributions such as sub-gaussianity, hypercontractivity, and anti-concentration. I will discuss precise formulations of such statements, show how they lead to a principled blueprint for problems in robust statistics including the applications mentioned above, and point out natural gaps in our understanding of analytic properties within SoS which, if resolved, would yield improved guarantees for basic tasks in robust statistics.
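The following is not an SoS algorithm, just a one-dimensional illustration of the robust-statistics setting (sizes and outlier placement are arbitrary choices): with an eps-fraction of fully malicious outliers, the empirical mean is badly corrupted, while a simple robust estimator such as the median moves only by O(eps). SoS methods extend guarantees of this kind to high dimensions and to far richer tasks:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 1000, 0.1
inliers = rng.standard_normal(int(n * (1 - eps)))   # true mean is 0
outliers = np.full(int(n * eps), 50.0)              # adversarial placement
sample = np.concatenate([inliers, outliers])

naive = sample.mean()        # dragged by roughly eps * 50 = 5
robust = np.median(sample)   # stays within O(eps) of the true mean
print(naive, robust)
```

In high dimensions the coordinate-wise median is no longer enough, and certifying analytic properties of the inlier distribution (sub-gaussianity, etc.) via low-degree SoS proofs is what replaces this simple trick.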
Random graphs with latent geometric structure are popular models of social and biological networks, with applications ranging from network user profiling to circuit design. These graphs are also of purely theoretical interest within computer science, probability and statistics. A fundamental initial question regarding these models is: when are these random graphs affected by their latent geometry and when are they indistinguishable from simpler models without latent structure, such as the Erdős-Rényi graph G(n,p)? We address this question for two of the most well-studied models of random graphs with latent geometry -- the random intersection graph and the random geometric graph. Joint work with Matt Brennan and Dheeraj Nagaraj.
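A quick empirical look at the distinguishing question (an illustrative sketch in a parameter regime chosen arbitrarily, not the talk's regime of interest): at the same edge density, a 2-d random geometric graph has far more triangles than G(n,p), because geometry makes neighborhoods overlap:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 0.05

# Erdos-Renyi G(n, p)
U = rng.random((n, n))
A_er = np.triu((U < p), 1).astype(int)
A_er = A_er + A_er.T

# Random geometric graph on the unit torus with matching expected density:
# connect points at toroidal distance < r, where pi * r^2 = p.
pts = rng.random((n, 2))
diff = np.abs(pts[:, None, :] - pts[None, :, :])
diff = np.minimum(diff, 1.0 - diff)                 # torus metric
dist = np.sqrt((diff ** 2).sum(-1))
r = np.sqrt(p / np.pi)
A_geo = (dist < r).astype(int)
np.fill_diagonal(A_geo, 0)

def triangles(A):
    return np.trace(A @ A @ A) // 6

t_er, t_geo = triangles(A_er), triangles(A_geo)
print(t_er, t_geo)          # the geometric graph has many more triangles
```

Triangle counts are only the crudest statistic; the interesting regimes are precisely those (e.g. high latent dimension) where such simple statistics stop separating the two models.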
In high-dimensional statistical problems (including planted clique, sparse PCA, community detection, etc.), the class of "low-degree polynomial algorithms" captures many leading algorithmic paradigms such as spectral methods, approximate message passing, and local algorithms on sparse graphs. As such, lower bounds against low-degree algorithms constitute concrete evidence for average-case hardness of statistical problems. This method has been widely successful at explaining and predicting statistical-to-computational gaps in these settings. While prior work has understood the power of low-degree algorithms for problems with a "planted" signal, we consider here the setting of "random optimization problems" (with no planted signal), including the problem of finding a large independent set in a random graph, as well as the problem of optimizing the Hamiltonian of mean-field spin glass models. Focusing on the independent set problem, I will define low-degree algorithms in this setting, argue that they capture the best known algorithms, and explain new proof techniques that give sharp lower bounds against low-degree algorithms in this setting. The proof involves a generalization of the so-called "overlap gap property", which is a structural property of the solution space.
Based on arXiv:2004.12063 (joint with David Gamarnik and Aukosh Jagannath) and arXiv:2010.06563
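The simplest algorithmic baseline for the independent set problem, shown here in the dense case G(n, 1/2) as an analogue of the talk's sparse setting (an illustrative sketch, not an algorithm from the papers): a greedy scan finds an independent set of size about log2(n), roughly half the maximum of ~2 log2(n), and the low-degree framework asks whether any "simple" algorithm can do better:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
U = rng.random((n, n))
A = np.triu(U < 0.5, 1)
A = A | A.T                                  # symmetric adjacency of G(n, 1/2)

indep = []
for v in rng.permutation(n):                 # scan vertices in random order
    if not any(A[v, u] for u in indep):      # keep v if it has no neighbor
        indep.append(v)

print(len(indep))                            # typically near log2(500) ~ 9
```

The gap between what greedy-style algorithms achieve and the true optimum is exactly the kind of computational barrier that the overlap gap property and the low-degree lower bounds formalize.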
This talk will be split into two distinct halves. The first half will be based on the paper arXiv:2007.03662 and will suggest that an interplay between microscopic and macroscopic physics can lead to undulations on time scales not related to celestial dynamics. By searching for such undulations, the discovery potential of light DM search experiments can be enhanced.
The second half will look at some currently unpublished work on finding all the semi-simple subalgebras of u(48) which contain the SM. Such algebras (in a loose sense of the term) form GUTs, and studying them has relevance to family unification, proton decay, etc. Although there has been previous work on the classification of GUTs, we believe this is the first time this broad question has been answered.
In this talk I will briefly survey recent developments in approximate subgraph counting and sampling in sublinear time. Counting and sampling small subgraphs are basic primitives, well studied both in theory and in practice. We consider these problems in the sublinear-time setting, where access to the graph $G$ is given via queries. We will consider both general graphs and graphs of bounded arboricity, which can be viewed as ``sparse everywhere'' graphs, and we will see how this property can be used to obtain substantially faster algorithms.
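To make the query model concrete, here is a standard warm-up (a sketch of the general idea, not an algorithm from the talk): estimate the number of edges of a graph using only s << n degree queries, via m_hat = n * (average sampled degree) / 2. This naive estimator works well when degrees are not too skewed; coping with skewed degree sequences is where structural assumptions such as bounded arboricity become useful:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 0.01
U = rng.random((n, n))
A = np.triu(U < p, 1)
A = (A | A.T).astype(int)            # a G(n, p) instance to query
deg = A.sum(axis=1)                  # degree oracle: deg[v] answers a query
m_true = deg.sum() // 2

s = 200                              # number of degree queries, s << n
sample = rng.choice(n, size=s, replace=True)
m_hat = n * deg[sample].mean() / 2   # scale the sampled average degree

print(m_true, round(m_hat))
```

Only s degree queries are issued, yet the estimate concentrates around the true edge count; real sublinear-time algorithms combine such sampling with neighbor and pair queries to handle heavy-tailed degrees.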
We present lower bounds for the generalization error of gradient descent on free initializations, reducing the problem to testing the algorithm's output under different data models. We then discuss lower bounds for random initialization and present the problem of learning communities in the pruned-block-model, where it is conjectured that GD fails.