Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited by: 4
Authors
Jimenez, Rafael Zequeira [1 ]
Llagostera, Anna [2 ]
Naderi, Babak [1 ]
Moeller, Sebastian [3 ]
Berger, Jens [2 ]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektbüro Berlin, Berlin, Germany
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing
DOI
10.1145/3308560.3317084
CLC Number
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Crowdsourcing is a powerful tool for conducting subjective user studies with large numbers of participants. However, collecting reliable annotations about the quality of speech stimuli is challenging: the task is highly subjective, and crowdsourcing users work without supervision. This work investigates intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted both in the laboratory and in crowdsourcing, in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and ITU-T Rec. P.808, respectively. The speech samples were taken from the database of ITU-T Rec. P.501 Annex D and were presented four times to each listener. Finally, the crowdsourcing results were contrasted with the ratings collected in the laboratory. A strong and significant Spearman correlation was obtained between the ratings collected in the two environments. Our analysis shows that while inter-rater agreement increased the more often listeners performed the assessment task, intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can, to some extent, represent a source of information.
Pages: 1138-1143
Page count: 6
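The abstract summarizes two analyses: a Spearman rank correlation contrasting lab and crowdsourcing ratings, and intra-rater agreement across the four repeated presentations of each stimulus. The following Python sketch illustrates how such statistics could be computed. It is not the authors' code: the data are synthetic placeholders, the stimulus count is assumed, and the mean pairwise correlation used as an intra-rater proxy is an illustrative choice, not necessarily the paper's exact statistic.

```python
# A minimal, hypothetical sketch of the two analyses described in the
# abstract: (1) a Spearman correlation between per-stimulus quality ratings
# collected in the lab and in crowdsourcing, and (2) a simple intra-rater
# agreement measure over the four repeated presentations.
# All numbers below are synthetic placeholders, not the study's data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_stimuli = 48  # assumed stimulus count; the paper's exact number may differ

# Hypothetical per-stimulus mean opinion scores (MOS) on the 5-point ACR
# scale used by ITU-T Rec. P.800 (lab) and P.808 (crowdsourcing).
lab_mos = rng.uniform(1.0, 5.0, n_stimuli)
cs_mos = np.clip(lab_mos + rng.normal(0.0, 0.3, n_stimuli), 1.0, 5.0)

rho, p = spearmanr(lab_mos, cs_mos)
print(f"lab vs. crowdsourcing: Spearman rho = {rho:.2f}, p = {p:.3g}")

# Intra-rater agreement: each stimulus was presented four times, so one
# listener yields four rating passes. A simple proxy for intra-rater
# reliability is the mean pairwise Spearman correlation between passes
# (the paper may use a different statistic).
passes = np.clip(np.round(lab_mos + rng.normal(0.0, 0.5, (4, n_stimuli))), 1, 5)
pairwise = [spearmanr(passes[i], passes[j])[0]
            for i in range(4) for j in range(i + 1, 4)]
print(f"mean intra-rater rho across the four passes: {np.mean(pairwise):.2f}")
```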