Tutorial: Recent Advances in High-Dimensional Robust Statistics

Location: STOC 2019, June 23, 2019

Presenters: Ilias Diakonikolas and Daniel Kane

Description and Outline

Fitting a model to a collection of observations is one of the quintessential questions in statistics and machine learning. The typical assumption is that the data was generated by a model of a given type (e.g., a mixture model). This is a simplifying assumption that is only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted/noisy data. Classical work in robust statistics, starting with the pioneering works of Tukey and Huber in the 1960s, pinned down the fundamental information-theoretic aspects of high-dimensional robust estimation. In contrast, until very recently, the computational aspects have been poorly understood. In particular, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the accuracy of the known heuristics (e.g., RANSAC) degrades polynomially as the dimension increases. This state of affairs prompted the following natural question:

Can we reconcile robustness and computational efficiency in high-dimensional estimation?

A recent line of work in theoretical computer science obtained the first computationally efficient robust estimators for a range of high-dimensional estimation tasks. In this tutorial talk, we will survey the algorithmic techniques underlying these estimators and the connections between them. We will illustrate these techniques for the following problems and settings: robust mean and covariance estimation, robust stochastic optimization, robust estimation under sparsity assumptions, list-decodable learning and mixture models, robust estimation of higher moments, computational-robustness tradeoffs. Finally, we will discuss new directions and opportunities for future work.

Schedule and Slides

Time Talk
2:00 - 3:15 Ilias Diakonikolas
Introduction; Robust Mean Estimation and Applications
3:30- 4:30 Daniel Kane
Four Vignettes: Robust Covariance Estimation, List-Decodable Learning, Robust Estimation of High-Degree Moments, Robust Sparse Estimation
4:30- 5:00 Ilias Diakonikolas
Computational-Statistical Tradeoffs and Open Problems

Related References