CDS Affiliated Professor & Silver Professor of Mathematics at Courant Gérard Ben Arous Wins 2022 NeurIPS Outstanding Paper Award

NYU Center for Data Science
Dec 20, 2022

The honor recognizes his research in high-dimensional limit theorems for stochastic gradient descent

CDS Affiliated Professor, Gérard Ben Arous

CDS Affiliated Professor & Silver Professor of Mathematics at Courant Gérard Ben Arous won a 2022 Conference on Neural Information Processing Systems (NeurIPS) Outstanding Paper Award for the paper “High-dimensional limit theorems for SGD: Effective dynamics and critical scaling.” The paper was co-authored with Reza Gheissari of Northwestern University, an NYU Courant PhD graduate, and Aukosh Jagannath of the University of Waterloo, who also received his PhD in mathematics from NYU under Gérard’s supervision.

The NeurIPS conference strives to bolster intellectual exchange and advance research in artificial intelligence and machine learning. This year, the annual academic conference was held at the New Orleans Convention Center in Louisiana from Monday, November 28th through Friday, December 9th.

The winning paper explores stochastic gradient descent (SGD). “Stochastic” refers to randomness: instead of computing the gradient over the entire data set, SGD updates the model using samples drawn at random. In data science, SGD is the predominant method for large-scale optimization problems and is used to train complicated parametric models on high-dimensional data. Specifically, the research studies scaling limits of SGD with constant step size in the high-dimensional regime. The step size, also called the learning rate, controls how far each update moves the model’s parameters along the estimated gradient. The paper shows that when the step size is small, the dynamics of SGD are well approximated by an ordinary differential equation (ODE), while at a larger, critically scaled step size the effective dynamics acquire correction terms; comparing the two regimes yields insights into the nonconvex optimization landscape.
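For readers who have not seen the algorithm written out, here is a minimal, illustrative sketch of constant-step-size SGD on a toy least-squares problem; the loss, data, and step sizes are hypothetical choices for illustration, not taken from the paper. Smaller step sizes make the iterates track the gradient flow of the full loss more closely, at the cost of slower progress:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem (illustrative only): minimize the average of
# per-sample losses (a_i . x - y_i)^2 / 2 over x.
n, d = 1000, 20
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
y = A @ x_true + 0.1 * rng.normal(size=n)

def sgd(step_size, n_steps=5000):
    """SGD with a constant step size, using one randomly drawn sample per step."""
    x = np.zeros(d)
    for _ in range(n_steps):
        i = rng.integers(n)                    # draw one sample at random
        grad_i = (A[i] @ x - y[i]) * A[i]      # gradient of that sample's loss
        x -= step_size * grad_i                # constant-step-size update
    return x

for eta in (0.001, 0.01, 0.05):                # hypothetical step sizes
    x_hat = sgd(eta)
    loss = np.mean((A @ x_hat - y) ** 2) / 2
    print(f"step size {eta}: final loss {loss:.4f}")
```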

“This work shows how the very high-dimensional dynamics of optimization may be reduced to a finite dimensional dynamical system, depending on the step size,” said Gérard. “The performance of the SGD algorithm is then related to the escape time from the projection of high-entropy regions in this finite dimensional system.”
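As a loose illustration of that reduction (a toy model chosen for simplicity, not the setting of the paper): for SGD on the quadratic loss f(x) = ||x||^2 / 2 in dimension d, with Gaussian noise in the gradients, the one-dimensional summary statistic q_k = ||x_k||^2 / d concentrates around a deterministic recursion as d grows, so the effective dynamics are one-dimensional even though the iterates live in dimension d:

```python
import numpy as np

rng = np.random.default_rng(1)

def summary_trajectory(d, eta=0.05, n_steps=200):
    """Run SGD on the toy loss f(x) = ||x||^2 / 2 with Gaussian gradient
    noise, recording the summary statistic q_k = ||x_k||^2 / d each step."""
    x = rng.normal(size=d)                  # random initialization, q_0 ~ 1
    traj = []
    for _ in range(n_steps):
        noise = rng.normal(size=d)          # stochastic part of the gradient
        x = x - eta * (x + noise)           # constant-step-size SGD update
        traj.append(x @ x / d)
    return np.array(traj)

# Effective one-dimensional dynamics of the same statistic for this toy model:
# q_{k+1} = (1 - eta)^2 * q_k + eta^2 (exact in expectation here).
eta, n_steps = 0.05, 200
q, effective = 1.0, []
for _ in range(n_steps):
    q = (1 - eta) ** 2 * q + eta ** 2
    effective.append(q)

for d in (10, 100, 10_000):
    gap = np.max(np.abs(summary_trajectory(d, eta, n_steps) - np.array(effective)))
    print(f"d = {d:>6}: max gap from effective dynamics = {gap:.4f}")
```

As d increases, the printed gap shrinks, mirroring in a much simpler setting how high-dimensional SGD trajectories can be summarized by a low-dimensional deterministic system.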

Gérard’s research focuses on probability theory and its connections to other areas of mathematics, physics, and industrial applications. In addition to his teaching, he has served as Vice Provost for Science and Engineering Development at NYU and as Director of the Courant Institute of Mathematical Sciences.

by Meryl Phair
