Scientific Lectures
Title: Visualizing Data using t-SNE
Laurens van der Maaten, Ph.D. - Delft University of Technology
Presented: October 15, 2013
ABSTRACT: Visualization techniques are essential tools for every data scientist. Unfortunately, most visualization techniques can only be used to inspect a limited number of variables of interest simultaneously. As a result, these techniques are not suitable for big data that is very high-dimensional. An effective way to visualize high-dimensional data is to represent each data object by a two-dimensional point in such a way that similar objects are represented by nearby points and dissimilar objects are represented by distant points. The resulting two-dimensional points can be visualized in a scatter plot. This leads to a map of the data that reveals the underlying structure of the objects, such as the presence of clusters. We present a new technique for embedding high-dimensional objects in a two-dimensional map, called t-Distributed Stochastic Neighbor Embedding (t-SNE), that produces substantially better results than alternative techniques. We demonstrate the value of t-SNE in domains such as computer vision and bioinformatics. In addition, we show how to scale up t-SNE to big data sets with millions of objects, and we present an approach to visualize objects whose similarities are non-metric (such as semantic similarities).
This talk describes joint work with Geoffrey Hinton (Google / University of Toronto).
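As a rough illustration of the idea in the abstract (similar objects should land on nearby map points), the sketch below scores two hand-made two-dimensional maps of four toy objects. It follows the general t-SNE recipe of Gaussian similarities between the original objects, Student-t similarities between the map points, and a Kullback-Leibler divergence between the two. The function names and the toy data are hypothetical, and the fixed bandwidth `sigma` is a simplifying assumption: the published method tunes a bandwidth per object from a perplexity setting, and it optimizes the map by gradient descent rather than merely scoring given maps.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gaussian_affinities(points, sigma=1.0):
    """Symmetrized Gaussian similarities p_ij between the original objects.

    Assumption: one fixed sigma for all points (no perplexity search).
    """
    n = len(points)
    conditional = []
    for i in range(n):
        weights = [
            0.0 if i == j
            else math.exp(-squared_distance(points[i], points[j]) / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        conditional.append([w / total for w in weights])
    # p_ij = (p_{j|i} + p_{i|j}) / (2n), so the whole matrix sums to 1.
    return [
        [(conditional[i][j] + conditional[j][i]) / (2 * n) for j in range(n)]
        for i in range(n)
    ]


def student_t_affinities(embedding):
    """Student-t (one degree of freedom) similarities q_ij between map points."""
    n = len(embedding)
    weights = [
        [
            0.0 if i == j
            else 1.0 / (1.0 + squared_distance(embedding[i], embedding[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def kl_divergence(p, q):
    """KL(P || Q), skipping entries where p_ij is zero (including the diagonal)."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
        if p_ij > 0.0
    )


# Toy data: two tight pairs of objects in four dimensions.
high_dim = [(0, 0, 0, 0), (0.5, 0, 0.5, 0), (3, 3, 3, 3), (3.5, 3, 3, 3.5)]
# A map that keeps each pair together, and one that splits both pairs.
good_map = [(0, 0), (0.5, 0), (5, 5), (5.5, 5)]
bad_map = [(0, 0), (5, 5), (0.5, 0), (5.5, 5)]

p = gaussian_affinities(high_dim)
cost_good = kl_divergence(p, student_t_affinities(good_map))
cost_bad = kl_divergence(p, student_t_affinities(bad_map))
print("cost of good map:", cost_good)
print("cost of bad map:", cost_bad)
```

The map that keeps similar objects together should receive the lower cost. A full implementation would start from a random map and move the points to reduce this cost.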
BIOGRAPHY: Laurens van der Maaten is an Assistant Professor at Delft University of Technology, The Netherlands. Previously, he worked as a post-doctoral researcher at the University of California, San Diego and as a Ph.D. student at Tilburg University, The Netherlands. He also worked as a visiting researcher at the University of Toronto and at Imperial College London. Prof. van der Maaten's research interests are in machine learning and computer vision; his work focuses on high-dimensional data analysis, object tracking, regularization of classifiers, and structured prediction, as well as on applications of machine learning and computer vision to, among other things, the analysis of paintings.