Network Report: A Structured Description for Network Datasets

Ryan A. Rossi
Nesreen K. Ahmed
Published at CIKM | Atlanta, Georgia 2022
The rapid development of network science and technologies depends on shareable datasets. Currently, there is no standard practice for reporting and sharing network datasets. Some network dataset providers only share links, while others provide some context or basic statistics. As a result, critical information may be unintentionally dropped, and network dataset consumers may misunderstand or overlook critical aspects. Inappropriately using a network dataset can lead to severe consequences (e.g., discrimination), especially when machine learning models on networks are deployed in high-stakes domains. Challenges arise because networks are often used across different domains (e.g., network science, physics, etc.) and have complex structures. To facilitate communication between network dataset providers and consumers, we propose the network report. A network report is a structured description that summarizes and contextualizes a network dataset. The network report extends the idea of dataset reports (e.g., Datasheets for Datasets) from prior work with network-specific descriptions of the non-i.i.d. nature, demographic information, network characteristics, etc. We hope network reports encourage transparency and accountability in network research and development across different fields.