Scalable Variational Gaussian Processes for Crowdsourcing: Glitch Detection in LIGO

Cited by: 0

Authors:

Morales-Alvarez, Pablo [1]

Ruiz, Pablo [2]

Coughlin, Scott [3,4]

Molina, Rafael [1]

Katsaggelos, Aggelos K. [2]

Affiliations:

[1] Univ Granada, Dept Comp Sci & Artificial Intelligence, Granada 18010, Spain

[2] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA

[3] Northwestern Univ, Ctr Interdisciplinary Explorat & Res Astrophys CI, Evanston, IL 60208 USA

[4] Cardiff Univ, Dept Phys & Astron, Cardiff CF10 3AT, Wales

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022, Vol. 44, No. 3

Funding:

U.S. National Science Foundation

Keywords:

Crowdsourcing; Training; Probabilistic logic; Gaussian processes; Machine learning; Uncertainty; Bayes methods; citizen science; laser interferometer gravitational waves observatory; sparse Gaussian processes; scalability; uncertainty quantification; deep learning; classification

DOI:

10.1109/TPAMI.2020.3025390

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In recent years, crowdsourcing has been transforming the way classification training sets are obtained. Instead of relying on a single expert annotator, crowdsourcing shares the labelling effort among a large number of collaborators. For instance, this is being applied in the laureate Laser Interferometer Gravitational-Wave Observatory (LIGO) to detect glitches that might hinder the identification of true gravitational waves. The crowdsourcing scenario poses new challenges, as it has to deal with differing opinions from a heterogeneous group of annotators with unknown degrees of expertise. Probabilistic methods, such as Gaussian processes (GPs), have proven successful in modeling this setting. However, GPs do not scale well to large data sets, which hampers their broad adoption in real-world problems (in particular LIGO). This has led to the very recent introduction of deep-learning-based crowdsourcing methods, which have become the state of the art for this type of problem. However, the accurate uncertainty quantification provided by GPs has been partially sacrificed. This is an important aspect for astrophysicists in LIGO, since a glitch detection system should provide very accurate probability distributions of its predictions. In this work, we first leverage a standard sparse GP approximation (SVGP) to develop a GP-based crowdsourcing method that factorizes into mini-batches. This makes it able to cope with previously prohibitive data sets. This first approach, which we refer to as scalable variational Gaussian processes for crowdsourcing (SVGPCR), brings GP-based methods back to a state-of-the-art level and excels at uncertainty quantification. SVGPCR is shown to outperform deep-learning-based methods and previous probabilistic ones when applied to the LIGO data. Its behavior and main properties are carefully analyzed in a controlled experiment based on the MNIST data set. Moreover, recent GP inference techniques are also adapted to crowdsourcing and evaluated experimentally.
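
As a rough illustration of what "factorizes into mini-batches" can mean in a crowdsourcing setting, the following Python sketch computes a data term that is a sum over training points and then estimates it from a subset of points. It is not taken from the paper. It assumes, purely for illustration, that each annotator is described by a confusion matrix and that per-point class probabilities are already available (a stand-in for the output of a GP classifier). All function and variable names are hypothetical.

```python
import numpy as np

def expected_loglik_per_point(probs, labels, confusion):
    """Expected log-likelihood of the crowd labels, one value per data point.

    probs:     (N, K) class probabilities per point (assumed given).
    labels:    (N, A) integer labels per annotator, -1 where no label exists.
    confusion: (A, K, K), confusion[a, k, j] = P(annotator a says j | true class k).
    """
    n_points, n_annotators = labels.shape
    log_conf = np.log(confusion)
    out = np.zeros(n_points)
    for a in range(n_annotators):
        seen = labels[:, a] >= 0  # points this annotator actually labelled
        # Column j of log_conf[a] holds log P(says j | true k) for every k.
        log_p = log_conf[a][:, labels[seen, a]].T  # shape (n_seen, K)
        out[seen] += (probs[seen] * log_p).sum(axis=1)
    return out

def minibatch_estimate(probs, labels, confusion, batch_idx):
    """Estimate the full-data sum from a mini-batch, rescaled by N / batch size."""
    n_points = labels.shape[0]
    terms = expected_loglik_per_point(probs[batch_idx], labels[batch_idx], confusion)
    return (n_points / len(batch_idx)) * terms.sum()

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
N, K, A = 12, 3, 4
probs = rng.dirichlet(np.ones(K), size=N)
confusion = rng.dirichlet(np.ones(K), size=(A, K))
labels = rng.integers(0, K, size=(N, A))
labels[rng.random((N, A)) < 0.5] = -1  # each annotator labels only some points

full_sum = expected_loglik_per_point(probs, labels, confusion).sum()
estimate = minibatch_estimate(probs, labels, confusion, np.arange(4))
```

Because the data term is a sum over points, a rescaled mini-batch sum is an unbiased estimate of it when batches are drawn uniformly, which is what allows stochastic optimization on data sets too large to process at once.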


Pages: 1534-1551

Number of pages: 18
