Algorithms & Randomness Center (ARC)
Ilias Diakonikolas (USC)
Monday, September 18, 2017
Klaus 1116 East - 11:00 am
Title: Statistical Query Lower Bounds for High-Dimensional Unsupervised Learning
We describe a general technique that yields the first Statistical Query (SQ) lower bounds for a range of fundamental high-dimensional learning problems. Our main results are for the problems of (1) learning Gaussian mixture models, and (2) robust learning of a single Gaussian distribution. For these problems, we show a super-polynomial gap between the sample complexity and the computational complexity of any SQ algorithm. SQ algorithms are a class of algorithms that may only query expectations of functions of the distribution rather than access samples directly. This class is quite broad: a wide range of algorithmic techniques in machine learning are known to be implementable using SQs.
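For readers new to the SQ model, the following is a minimal illustrative sketch and is not taken from the talk. It simulates a statistical query oracle with tolerance tau: the learner submits a bounded function q and receives an approximation of the expectation of q(X), without ever seeing a sample. The names StatOracle and estimate_mean, the default sample count, and the clipping bound are assumptions made for this example, and the simulated oracle is only approximately tau-accurate because it also carries sampling error.

```python
import random


class StatOracle:
    """Simulated statistical query oracle (hypothetical helper, not from the talk)."""

    def __init__(self, sampler, tau, n_samples=20000, seed=0):
        self._rng = random.Random(seed)
        self._tau = tau
        # The samples stay private to the oracle; the learner never sees them.
        self._samples = [sampler(self._rng) for _ in range(n_samples)]

    def query(self, q):
        """Return the empirical mean of q, clipped to [-1, 1], plus noise of size at most tau."""
        total = sum(max(-1.0, min(1.0, q(x))) for x in self._samples)
        mean = total / len(self._samples)
        # An SQ oracle may answer with any value within tau of the expectation;
        # here that freedom is modelled by bounded random noise.
        return mean + self._rng.uniform(-self._tau, self._tau)


def estimate_mean(oracle, dim, bound=5.0):
    """Estimate each coordinate of the mean with one query per coordinate.

    Assumes the coordinates rarely exceed `bound` in absolute value, so that
    clipping x_i / bound to [-1, 1] introduces negligible bias.
    """
    return [bound * oracle.query(lambda x, i=i: x[i] / bound) for i in range(dim)]


# Example: an identity-covariance Gaussian in 3 dimensions with a hidden mean.
mu = (0.3, -0.2, 0.0)
oracle = StatOracle(lambda rng: tuple(rng.gauss(m, 1.0) for m in mu), tau=0.01)
est = estimate_mean(oracle, dim=3)
print(est)
```

The learner in this sketch interacts with the distribution only through query, which is the restriction the SQ model imposes. A lower bound in this model says that any such learner needs either many queries or a very small tolerance.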
Our SQ lower bounds are attained via a unified moment-matching technique that is useful in other contexts. Our method yields tight lower bounds for a number of related unsupervised estimation problems, including robust covariance estimation in spectral norm and robust sparse mean estimation. Finally, for the classical problem of robustly testing an unknown-mean Gaussian, we show a sample complexity lower bound that scales linearly in the dimension. This matches the sample complexity of the corresponding robust learning problem and separates the sample complexity of robust testing from standard testing. This separation is surprising because such a gap does not exist for the corresponding learning problem.
(Based on joint work with Daniel Kane (UCSD) and Alistair Stewart (USC).)
Bio: Ilias Diakonikolas is an Assistant Professor and Andrew and Erna Viterbi Early Career Chair in the Department of Computer Science at USC. He obtained a Diploma in electrical and computer engineering from the National Technical University of Athens and a Ph.D. in computer science from Columbia University where he was advised by Mihalis Yannakakis. Before moving to USC, he was a faculty member at the University of Edinburgh, and prior to that he was the Simons postdoctoral fellow in theoretical computer science at the University of California, Berkeley. His research is on the algorithmic foundations of massive data sets, in particular on designing efficient algorithms for fundamental problems in machine learning. He is a recipient of a Sloan Fellowship, an NSF Career Award, a Google Faculty Research Award, a Marie Curie Fellowship, the IBM Research Pat Goldberg Best Paper Award, and an honorable mention in the George Nicholson competition from the INFORMS society.
Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836