Collaborative Research: HCC: Medium: Intelligent support for non-experts to navigate large information spaces

Apparently we never posted the news of our latest Gravity Spy NSF award, HCC 21-06865, Collaborative Research: HCC: Medium: Intelligent support for non-experts to navigate large information spaces, which was awarded in October 2021. This is a joint project with Kevin Crowston and Carsten Østerlund (Syracuse), Corey Jackson (Wisconsin), Aggelos Katsaggelos, Vassiliki Kalogera, Christopher Berry and Scott Coughlin (Northwestern), and Marissa Walker (Christopher Newport).

This project will build our understanding of how to enable non-expert volunteers in a citizen-science project to contribute to analyses of large volumes of data by searching for potentially causal relations. The increasing use of automated scientific-data-collection instruments has led to an explosion in the amount of scientific data collected, challenging the ability of scientists to analyze them. Volunteers have less background knowledge than experts about the purpose, context, content, provenance and processes associated with the data. A system that provides such background knowledge will enable non-experts to make sense of the data. The research plan also includes building system support to augment the capabilities of the volunteers, for example by searching for related data and by performing causal inference in conjunction with them. Citizen-science projects provide a vehicle to disseminate scientific practice, knowledge and findings to the general public to increase awareness and understanding of the practices and techniques of data-intensive science. Findings should be directly applicable to the target context of involving citizen-science volunteers in navigating and analyzing large quantities of science data, and should generalize to other settings with big data.

In this research, volunteers classify noise events (glitches) produced by the Laser Interferometer Gravitational-wave Observatory (LIGO). Along with glitches observed in the main gravitational-wave channel, the detectors record around 400,000 auxiliary channels of data that may provide information about the origins of each glitch. The research will test hypotheses about the kind of additional information needed to enable non-experts to productively navigate this large dynamic dataset to find related information and will develop processes, techniques and tools to allow the volunteers to manage and efficiently process the data. It will develop our understanding of which types of background knowledge about the data to introduce, and how and when to introduce them, to enable non-experts to work on a task, for example by providing maps and visualizations of particular data and relationships at the time they are most needed in the volunteers' work process. The gravitational physics and astronomy communities will directly benefit from advances in LIGO detector characterization, data quality vetoes and hence signal searches.