Emblaze: Illuminating Machine Learning Representations through Interactive Comparison of Embedding Spaces
Published at IUI 2022 | Helsinki, Finland
Abstract
Modern machine learning techniques commonly rely on complex, high-dimensional
embedding representations to capture underlying structure in the data and
improve performance. In order to characterize model flaws and choose a desirable
representation, model builders often need to compare across multiple embedding
spaces, a challenging analytical task supported by few existing tools. We first
interviewed nine embedding experts in a variety of fields to characterize the
diverse challenges they face and techniques they use when analyzing embedding
spaces. Informed by these perspectives, we developed a novel system called
Emblaze that integrates embedding space comparison within a computational
notebook environment. Emblaze uses an animated, interactive scatter plot with a
novel Star Trail augmentation to enable visual comparison. It also employs novel
neighborhood analysis and clustering procedures to dynamically suggest groups of
points with interesting changes between spaces. Through a series of case studies
with ML experts, we demonstrate how interactive comparison with Emblaze can help
users gain new insights into embedding space structure. Emblaze is open source and
available to install (see our GitHub repository for details).