Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited by: 4
Authors
Jimenez, Rafael Zequeira [1 ]
Llagostera, Anna [2 ]
Naderi, Babak [1 ]
Moeller, Sebastian [3 ]
Berger, Jens [2 ]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektbüro Berlin, Berlin, Germany
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing
DOI
10.1145/3308560.3317084
CLC Number
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Crowdsourcing is a powerful tool for conducting subjective user studies with large numbers of participants. However, collecting reliable annotations about the quality of speech stimuli is challenging: the task is highly subjective, and crowdsourcing users work without supervision. This work investigates intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted both in the laboratory and in crowdsourcing, in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and ITU-T Rec. P.808, respectively. The speech samples were taken from the database of ITU-T Rec. P.501 Annex D and were presented four times to each listener. Finally, the crowdsourcing results were contrasted with the ratings collected in the laboratory. A strong and significant Spearman correlation was obtained between the ratings collected in the two environments. Our analysis shows that while inter-rater agreement increased the more often listeners performed the assessment task, intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can, to some extent, represent a source of information.
Pages: 1138-1143
Page count: 6
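The abstract summarizes two analyses: a Spearman rank correlation contrasting lab and crowdsourcing ratings, and intra-rater agreement across the four repeated presentations of each stimulus. The following Python sketch illustrates how such statistics could be computed. It is not the authors' code: the data are synthetic placeholders, the stimulus count is assumed, and the mean pairwise correlation used as an intra-rater proxy is an illustrative choice, not necessarily the paper's exact statistic.

```python
# A minimal, hypothetical sketch of the two analyses described in the
# abstract: (1) a Spearman correlation between per-stimulus quality ratings
# collected in the lab and in crowdsourcing, and (2) a simple intra-rater
# agreement measure over the four repeated presentations.
# All numbers below are synthetic placeholders, not the study's data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_stimuli = 48  # assumed stimulus count; the paper's exact number may differ

# Hypothetical per-stimulus mean opinion scores (MOS) on the 5-point ACR
# scale used by ITU-T Rec. P.800 (lab) and P.808 (crowdsourcing).
lab_mos = rng.uniform(1.0, 5.0, n_stimuli)
cs_mos = np.clip(lab_mos + rng.normal(0.0, 0.3, n_stimuli), 1.0, 5.0)

rho, p = spearmanr(lab_mos, cs_mos)
print(f"lab vs. crowdsourcing: Spearman rho = {rho:.2f}, p = {p:.3g}")

# Intra-rater agreement: each stimulus was presented four times, so one
# listener yields four rating passes. A simple proxy for intra-rater
# reliability is the mean pairwise Spearman correlation between passes
# (the paper may use a different statistic).
passes = np.clip(np.round(lab_mos + rng.normal(0.0, 0.5, (4, n_stimuli))), 1, 5)
pairwise = [spearmanr(passes[i], passes[j])[0]
            for i in range(4) for j in range(i + 1, 4)]
print(f"mean intra-rater rho across the four passes: {np.mean(pairwise):.2f}")
```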