%0 Journal Article
%J European Physical Journal Plus
%D 2024
%T Gravity Spy: Lessons Learned and a Path Forward
%A Michael Zevin
%A Corey B. Jackson
%A Zoheyr Doctor
%A Yunan Wu
%A Carsten Østerlund
%A L. Clifton Johnson
%A Christopher P. L. Berry
%A Kevin Crowston
%A Scott B. Coughlin
%A Vicky Kalogera
%A Sharan Banagiri
%A Derek Davis
%A Jane Glanzer
%A Renzhi Hao
%A Aggelos K. Katsaggelos
%A Oli Patane
%A Jennifer Sanchez
%A Joshua Smith
%A Siddharth Soni
%A Laura Trouille
%A Marissa Walker
%A Irina Aerith
%A Wilfried Domainko
%A Victor-Georges Baranowski
%A Gerhard Niklasch
%A Barbara Téglás
%X

The Gravity Spy project aims to uncover the origins of glitches, transient bursts of noise that hamper analysis of gravitational-wave data. By using both the work of citizen-science volunteers and machine-learning algorithms, the Gravity Spy project enables reliable classification of glitches. Citizen science and machine learning are intrinsically coupled within the Gravity Spy framework, with machine-learning classifications providing a rapid first-pass classification of the dataset and enabling tiered volunteer training, and volunteer-based classifications verifying the machine classifications, bolstering the machine-learning training set and identifying new morphological classes of glitches. These classifications are now routinely used in studies characterizing the performance of the LIGO gravitational-wave detectors. Providing the volunteers with a training framework that teaches them to classify a wide range of glitches, as well as additional tools to aid their investigations of interesting glitches, empowers them to make discoveries of new classes of glitches. This demonstrates that, when given suitable support, volunteers can go beyond simple classification tasks to identify new features in data at a level comparable to domain experts. The Gravity Spy project is now providing volunteers with more complicated data that includes auxiliary monitors of the detector to identify the root cause of glitches.

%B European Physical Journal Plus
%V 139
%P Article 100
%8 01/2024
%G eng
%R 10.1140/epjp/s13360-023-04795-4

%0 Journal Article
%J IEEE Transactions on Learning Technologies
%D 2020
%T Knowledge Tracing to Model Learning in Online Citizen Science Projects
%A Kevin Crowston
%A Carsten Østerlund
%A Tae Kyoung Lee
%A Corey Brian Jackson
%A Mahboobeh Harandi
%A Sarah Allen
%A Sara Bahaadini
%A Scott Coughlin
%A Aggelos Katsaggelos
%A Shane Larson
%A Neda Rohani
%A Joshua Smith
%A Laura Trouille
%A Michael Zevin
%X

We present the design of a citizen science system that uses machine learning to guide the presentation of image classification tasks to newcomers to help them more quickly learn how to do the task while still contributing to the work of the project. A Bayesian model for tracking volunteer learning for training with tasks with uncertain outcomes is presented and fit to data from 12,986 volunteer contributors. The model can be used both to estimate the ability of volunteers and to decide the classification of an image. A simulation of the model applied to volunteer promotion and image retirement suggests that the model requires fewer classifications than the current system.

%B IEEE Transactions on Learning Technologies
%V 13
%P 123-134
%G eng
%6 1
%R 10.1109/TLT.2019.2936480
%> https://crowston.syr.edu/sites/crowston.syr.edu/files/transaction%20paper%20final%20figures%20in%20text.pdf

%0 Journal Article
%J Computers in Human Behavior
%D 2020
%T Teaching Citizen Scientists to Categorize Glitches using Machine-Learning-Guided Training
%A Corey Jackson
%A Carsten Østerlund
%A Kevin Crowston
%A Mahboobeh Harandi
%A Sarah Allen
%A Sara Bahaadini
%A Scott Coughlin
%A Vicky Kalogera
%A Aggelos Katsaggelos
%A Shane Larson
%A Neda Rohani
%A Joshua Smith
%A Laura Trouille
%A Michael Zevin
%X

Training users in online communities is important for making high-performing contributors. However, several conundrums exist in choosing the most effective approaches to training users. For example, if it takes time to learn to do the task correctly, then the initial contributions may not be of high enough quality to be useful. We conducted an online field experiment where we recruited users (N = 386) in a web-based citizen-science project to evaluate two training approaches. In one training regime, users received one-time training and were asked to learn and apply twenty classes to the data. In the other approach, users were gradually exposed to classes of data that were selected by trained machine-learning algorithms as being members of particular classes. The results of our analysis revealed that the gradual training produced “high-performing contributors”. In our comparison of the treatment and control groups, we found that users who experienced gradual training performed significantly better on the task (an average accuracy of 90% vs. 54%), contributed more work (an average of 228 vs. 121 classifications), and were retained in the project for a longer period of time (an average of 2.5 vs. 2 sessions). The results suggest that online production communities seeking to train newcomers would benefit from training regimes that gradually introduce them to the work of the project using real tasks.

%B Computers in Human Behavior
%V 105
%P 106198
%G eng
%R 10.1016/j.chb.2019.106198
%> https://crowston.syr.edu/sites/crowston.syr.edu/files/MLGT-preprint.pdf

%0 Journal Article
%J Physical Review D
%D 2019
%T Classifying the unknown: Discovering novel gravitational-wave detector glitches using similarity learning
%A Scott Coughlin
%A Sara Bahaadini
%A Neda Rohani
%A Michael Zevin
%A Oli Patane
%A Mahboobeh Harandi
%A Corey Brian Jackson
%A V. Noroozi
%A Sarah Allen
%A J. Areeda
%A M. Coughlin
%A P. Ruiz
%A C. P. L. Berry
%A Kevin Crowston
%A Aggelos Katsaggelos
%A Andrew Lundgren
%A Carsten Østerlund
%A Joshua Smith
%A Laura Trouille
%A Vicky Kalogera
%X

The observation of gravitational waves from compact binary coalescences by LIGO and Virgo has begun a new era in astronomy. A critical challenge in making detections is determining whether loud transient features in the data are caused by gravitational waves or by instrumental or environmental sources. The citizen-science project Gravity Spy has been demonstrated as an efficient infrastructure for classifying known types of noise transients (glitches) through a combination of data analysis performed by both citizen volunteers and machine learning. We present the next iteration of this project, using similarity indices to empower citizen scientists to create large data sets of unknown transients, which can then be used to facilitate supervised machine-learning characterization. This new evolution aims to alleviate a persistent challenge that plagues both citizen-science and instrumental detector work: the ability to build large samples of relatively rare events. Using two families of transient noise that appeared unexpectedly during LIGO's second observing run, we demonstrate the impact that the similarity indices could have had on finding these new glitch types in the Gravity Spy program.

%B Physical Review D
%V 99
%P 082002
%G eng
%N 8
%R 10.1103/PhysRevD.99.082002

%0 Journal Article
%J Classical and Quantum Gravity
%D 2017
%T Gravity Spy: Integrating Advanced LIGO Detector Characterization, Machine Learning, and Citizen Science
%A Michael Zevin
%A Scott Coughlin
%A Sara Bahaadini
%A Emre Besler
%A Neda Rohani
%A Sarah Allen
%A Miriam Cabero
%A Kevin Crowston
%A Aggelos Katsaggelos
%A Shane Larson
%A Tae Kyoung Lee
%A Chris Lintott
%A Tyson Littenberg
%A Andrew Lundgren
%A Carsten Østerlund
%A Joshua Smith
%A Laura Trouille
%A Vicky Kalogera
%B Classical and Quantum Gravity
%V 34
%P 064003
%G eng
%9 Journal Article
%R 10.1088/1361-6382/aa5cea