ISIT 2019 Tutorial: Robust Statistics - From Information Theory to Algorithms

Description and Outline

Fitting a model to a collection of observations is one of the quintessential questions in statistics and machine learning. The typical assumption is that the data was generated by a model of a given type (e.g., a mixture model). This is a simplifying assumption that is only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted/noisy data. Classical work in robust statistics, starting with the pioneering works of Tukey and Huber in the 1960s, pinned down the fundamental information-theoretic aspects of high-dimensional robust estimation. In contrast, until very recently, the computational aspects have been poorly understood. In particular, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the accuracy of the known heuristics (e.g., RANSAC) degrades polynomially as the dimension increases. This state of affairs prompted the following natural question:

Can we reconcile robustness and computational efficiency in high-dimensional estimation?

A recent line of work in theoretical computer science obtained the first computationally efficient robust estimators for a range of high-dimensional estimation tasks. In this tutorial, we will survey the key algorithmic ideas in these recent developments. We will complement these positive results by discussing the fundamental limits of robust estimation. Finally, we will discuss new directions and opportunities for future work.

Slides

[pdf]

Main References

Robust Estimators in High Dimensions without the Computational Intractability

Agnostic Estimation of Mean and Covariance

Being Robust (in High Dimensions) Can Be Practical

Statistical Query Lower Bounds for Robust Estimation of High-dimensional Gaussians and Gaussian Mixtures

Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

Sever: A Robust Meta-Algorithm for Stochastic Optimization

Tutorial: Robust Statistics-From Information Theory to Algorithms

Location: ISIT 2019, July 7, 2019

Presenter: Ilias Diakonikolas

Description and Outline

Slides

Main References