Database Benchmarking for Supporting Real-Time Interactive Querying of Large Data

Philipp Eichmann
Marco Angelini
Tiziana Catarci
Giuseppe Santucci
Yukun Zheng
Carsten Binnig
Jean-Daniel Fekete
Published at SIGMOD 2020, Portland
In this paper, we present a new benchmark to validate the suitability of database systems for interactive visualization workloads. While proposals exist for evaluating database systems on interactive data exploration workloads, none rely on real user traces for database benchmarking. To this end, our long-term goal is to collect user traces that represent workloads with different exploration characteristics. As a first step, we present an initial benchmark that focuses on "crossfilter"-style applications, a popular interaction type for data exploration and a particularly demanding scenario for testing database system performance. To foster further research in this area, we make our benchmark materials, including input datasets, interaction sequences, corresponding SQL queries, and analysis code, freely available as a community resource: