Talk by Shai Ben-David (DSAS Colloquium)
Room: 320
Shai Ben-David - University of Waterloo
Title: Unsupervised learning; what can, what can't and what should not be done.
Abstract: Unsupervised learning refers to the process of finding patterns and drawing conclusions from raw data (in contrast to supervised learning, where the training data is labeled, or scored, and the learner is expected to figure out a labeling/scoring rule for use on yet-unseen examples). Unlabeled data is, naturally, more readily available than supervised examples, and there is therefore much to gain from being able to utilize such data. However, our understanding of unsupervised learning is much less satisfactory than the established theory of supervised learning.
In this talk I will discuss several aspects of the theory of unsupervised learning and describe some recent results and insights, as well as provide my idiosyncratic advice about how the research and practice of this important task should (and should not) be carried out.
In particular, I will highlight joint work with Hassan Ashtiani, Nick Harvey, Chris Liaw, Abbas Mehrabian and Yaniv Plan resolving the sample complexity of learning mixtures of Gaussians (that paper won a Best Paper Award at last year's NeurIPS), work with Shay Moran, Pavel Hrubeš, Amir Shpilka and Amir Yehudayoff showing that a basic statistical learnability problem is independent of the axioms of set theory (this paper was featured last January in Nature Magazine), and a critical overview of current research and practice of clustering algorithms.