Evaluating Systemic Error Detection Methods using Synthetic Images
Published at
SCIS at ICML
| Baltimore, MD
2022
Abstract
We introduce SpotCheck, a framework for generating synthetic datasets to use for
evaluating methods for discovering blindspots (i.e., systemic errors) in image
classifiers. We use SpotCheck to run controlled studies of how various factors
influence the performance of blindspot discovery methods. Our experiments reveal
several shortcomings of existing methods, such as relatively poor performance in
settings with multiple blindspots and sensitivity to hyperparameters. Further,
we find that PlaneSpot, a method based on dimensionality reduction, is
competitive with existing methods, a result with promising implications for the
development of interactive tools.