TTI-Chicago Summer Workshop:

Computational Efficiency & High-Dimensional Robust Statistics

August 13-17, 2018

Organizers:

Ilias Diakonikolas (diakonik@usc.edu), Daniel Kane (dakane@ucsd.edu)

Topic Description

Fitting a model to a collection of observations is one of the quintessential problems in machine learning. The standard assumption is that the data was generated by a model of a given type (e.g., a mixture model). This simplifying assumption is at best only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must, if it is to be used in practice, also be robust in the presence of model misspecification. This is the prototypical goal in robust statistics, a field that took shape in the 1960s with the pioneering works of Tukey and Huber. Until recently, even for the basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. Moreover, the quality of common heuristics (e.g., RANSAC) degrades badly as the dimension increases; the sketch after the question below makes both failure modes concrete. This state of affairs prompted the following natural question:

Can we reconcile robustness and computational efficiency in high-dimensional estimation?
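To make the difficulty concrete, consider the following minimal numerical sketch (a hypothetical illustration in numpy, not taken from [1, 2]; all constants are illustrative): an adversary who controls an ε-fraction of Gaussian samples can drag the empirical mean arbitrarily far from the truth, while the coordinate-wise median, a classical robust heuristic, incurs error that grows roughly as ε√d with the dimension d.

    import numpy as np

    rng = np.random.default_rng(0)

    def contaminated_gaussian(n, d, eps, magnitude):
        # n samples from N(0, I_d); an eps-fraction is replaced by
        # adversarial points at magnitude * u for a fixed unit vector u
        X = rng.standard_normal((n, d))
        u = np.ones(d) / np.sqrt(d)
        X[: int(eps * n)] = magnitude * u
        return X

    n, eps, magnitude = 20000, 0.1, 1000.0
    for d in (10, 100, 1000):
        X = contaminated_gaussian(n, d, eps, magnitude)
        # the true mean is 0, so the norm of each estimate is its error
        print(f"d={d:5d}  "
              f"sample-mean error: {np.linalg.norm(X.mean(axis=0)):8.2f}  "
              f"coordinate-median error: {np.linalg.norm(np.median(X, axis=0)):6.3f}")

The sample-mean error here is about ε · magnitude regardless of d (the adversary can make it as large as desired), while the coordinate-median error visibly grows with the dimension.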

A recent line of work in theoretical computer science has sought to circumvent this apparent computational barrier for a range of natural generative models. Specifically, [1, 2] gave the first provably robust and efficiently computable estimators for several fundamental estimation tasks that were previously thought to be intractable. These include robust estimation of mean and covariance in high dimensions, and robust learning of various latent variable models.
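At the core of several of these algorithms is a simple spectral observation: if an ε-fraction of corrupted points shifts the empirical mean noticeably, those points must also create excess variance in some direction, which the top eigenvector of the empirical covariance exposes. The following is a heavily simplified sketch of this filtering idea for identity-covariance Gaussian inliers; the threshold and removal rule are illustrative stand-ins for the more careful (weighted and randomized) procedures analyzed in [1, 2].

    import numpy as np

    def filter_mean(X, eps, threshold=1.5, max_iter=50):
        # Simplified spectral filter in the spirit of [1, 2], assuming
        # inliers drawn from N(mu, I_d); the constants are illustrative.
        X = np.asarray(X, dtype=float)
        for _ in range(max_iter):
            mu = X.mean(axis=0)
            centered = X - mu
            cov = centered.T @ centered / len(X)
            evals, evecs = np.linalg.eigh(cov)
            if evals[-1] <= threshold:
                break                       # no direction of excess variance left
            v = evecs[:, -1]                # direction of largest variance
            scores = np.abs(centered @ v)   # outliers stick out along v
            k = max(1, int(0.5 * eps * len(X)))
            X = X[np.argsort(scores)[:-k]]  # drop the k most extreme points
        return X.mean(axis=0)

On data such as the contaminated Gaussian sample above, this filter recovers the mean up to a small error that does not blow up with the outlier magnitude or the dimension; the actual algorithms in [1, 2] come with rigorous error guarantees with no (or only mild) dependence on the dimension.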

Since the dissemination of [1, 2], there has been a flurry of research activity on algorithmic robust high-dimensional estimation in a range of settings. Moreover, the developed algorithmic ideas have found applications in exploratory data analysis and adversarial machine learning.

Objectives

This workshop will focus on recent developments in high-dimensional robust statistics. A first objective is to introduce the machine learning community to the new insights and techniques in the exciting area of algorithmic robust statistics. Moreover, by bringing a number of experts together, we expect to foster collaborations that push the boundaries of the field. This includes both circumventing technical obstacles and discussing new directions and opportunities for future work.

Specifically, we will examine concrete questions such as: What are the inherent limitations of the recently developed algorithmic techniques, and can we design new techniques that yield efficient robust estimators for richer families of problems and models? Is there a formal connection between adversarial robustness and other notions of stability? Can we leverage insights from the theory of robustness to improve the reliability of machine learning systems?

Further details about this workshop will be posted in due course.

References

[1] Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart: Robust Estimators in High Dimensions without the Computational Intractability. Proceedings of the 57th IEEE Symposium on Foundations of Computer Science (FOCS 2016).

[2] Kevin A. Lai, Anup B. Rao, Santosh Vempala: Agnostic Estimation of Mean and Covariance. Proceedings of the 57th IEEE Symposium on Foundations of Computer Science (FOCS 2016).