User Study: Embedding Space Visualization

We are looking for domain experts in a variety of fields, such as natural language processing, computational biology, and computer vision, who are currently developing or analyzing embedding spaces in their work. If you participate, you will be given an embedding analysis tool and asked to think aloud as you investigate data specific to your domain of expertise. You are encouraged, but not required, to prepare your own dataset(s) to visualize during the study. The study session will last 1-2 hours, after which you may opt to use the tool independently and return within 7-10 days for a follow-up interview. Participants will be paid $20 per hour spent in sessions with the investigators. Participants must be at least 18 years old.

If you are interested in participating, please fill out this form, or email the researchers with any questions.

Background: Many machine learning models work by generating a “latent space” that maps out (or “embeds”) the input data so that the model can better perform a downstream task, such as classification. Visualizing these embedding spaces is an important step in checking that the model has learned the desired attributes (e.g., correctly separating dogs from cats, or cancer cells from non-cancer cells). However, most existing visualizations are static and make it difficult to compare one model with another. We have designed a system, called Emblaze, that allows experts to visualize these embedding spaces within a Jupyter notebook and to visually compare and inspect multiple spaces. Emblaze is publicly available on GitHub. If you are interested in using this tool in your research, we encourage you to participate in our study or contact us to discuss how it can better support your needs!