Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited by: 4
Authors
Jimenez, Rafael Zequeira [1]
Llagostera, Anna [2]
Naderi, Babak [1]
Moeller, Sebastian [3]
Berger, Jens [2]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektburo Berlin, Berlin, Germany
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing
DOI
10.1145/3308560.3317084
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Crowdsourcing is a great tool for conducting subjective user studies with large numbers of users. Collecting reliable annotations about the quality of speech stimuli is challenging: the task itself is highly subjective, and users in crowdsourcing work without supervision. This work investigates the intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted in the laboratory and in crowdsourcing in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and P.808, respectively. The speech samples were taken from the database in ITU-T Rec. P.501 Annex D and were presented four times to the listeners. Finally, the crowdsourcing results were contrasted with the ratings collected in the laboratory. A strong and significant Spearman's correlation was achieved when contrasting the ratings collected in both environments. Our analysis shows that while the inter-rater agreement increased the more the listeners conducted the assessment task, the intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can represent a source of information to some extent.
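The lab-versus-crowdsourcing comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis code: the ratings are made up, the helper names (`average_ranks`, `pearson`, `spearman`, `mos_per_stimulus`) are hypothetical, and aggregating to a mean opinion score (MOS) per stimulus before correlating is an assumption about how such data is typically handled. It covers the correlation coefficient only, not the significance test.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mos_per_stimulus(ratings):
    """Map each stimulus id to the mean of its 1..5 ratings (assumed aggregation)."""
    return {stimulus: mean(scores) for stimulus, scores in ratings.items()}


# Hypothetical ratings on a 5-point scale; not data from the study.
lab = {"s1": [4, 5, 4], "s2": [2, 3, 2], "s3": [3, 3, 4], "s4": [1, 2, 1]}
crowd = {"s1": [5, 4, 4], "s2": [3, 2, 2], "s3": [4, 3, 3], "s4": [2, 1, 2]}

lab_mos = mos_per_stimulus(lab)
crowd_mos = mos_per_stimulus(crowd)
stimuli = sorted(lab_mos)
rho = spearman([lab_mos[s] for s in stimuli], [crowd_mos[s] for s in stimuli])
print(f"Spearman's rho between lab and crowd MOS: {rho:.3f}")
```

In practice a statistics library would normally be used for the coefficient and its p-value; the hand-written version only makes the rank-based definition explicit.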
Pages: 1138-1143
Number of pages: 6
Related papers
50 in total
  • [21] Investigating the intra- and inter-rater reliability of a panel of subjective and objective burn scar measurement tools
    Lee, K. C.
    Bamford, A.
    Gardiner, F.
    Agovino, A.
    ter Horst, B.
    Bishop, J.
    Sitch, A.
    Grover, L.
    Logan, A.
    Moiemen, N. S.
    BURNS, 2019, 45 (06) : 1311 - 1324
  • [22] Intra- and inter-rater reliability for transvaginal cervical strain elastography
    Janssen, Matthew
    Koelper, Nathanael C.
    Weatherby, Michele
    Werth, Christina
    Schwartz, Nadav
    AMERICAN JOURNAL OF OBSTETRICS AND GYNECOLOGY, 2023, 228 (01) : S188 - S189
  • [23] Intra- and inter-rater reliability in ultrasonographic measurements of coracohumeral distance
    Guerra-Rodriguez, Diego
    Guerrero-Henriquez, Juan
    Basilio, Daniel
    Mendez-Rebolledo, Guillermo
    MUSCULOSKELETAL SCIENCE AND PRACTICE, 2024, 69
  • [24] INTRA- AND INTER-RATER RELIABILITY OF THE SELECTIVE FUNCTIONAL MOVEMENT ASSESSMENT (SFMA) IN HEALTHY PARTICIPANTS
    Stanek, Justin M.
    Smith, Joshua
    Petrie, Jake
    INTERNATIONAL JOURNAL OF SPORTS PHYSICAL THERAPY, 2019, 14 (01) : 107 - 116
  • [25] INTRA- AND INTER-RATER RELIABILITY OF HYOID DISPLACEMENT MEASURED BY ULTRASOUND
    Macrae, Phoebe
    Huckabee, M.
    Doeltgen, S.
    Jones, R.
    DYSPHAGIA, 2009, 24 (04) : 482 - 483
  • [26] Intra- and inter-rater reliability of the Multiple Sclerosis Functional Composite
    Fischer, JS
    Cohen, JA
    Cutter, GR
    Mertz, LA
    Bolibrush, DM
    Skaramagas, TT
    Jak, AJ
    Kniker, JE
    NEUROLOGY, 1999, 52 (06) : A548 - A548
  • [27] Intra- and Inter-rater Objectivity of the Frequency Speed of Kick Test
    Antonaccio, Romulo Fernandes
    Mansur Machado, Frederico Sander
    Da Silva Santos, Jonatas Ferreira
    IDO MOVEMENT FOR CULTURE-JOURNAL OF MARTIAL ARTS ANTHROPOLOGY, 2022, 22 (03) : 1 - 5
  • [28] The Intra- and Inter-rater Reliabilities of the Forward Head Posture Assessment of Normal Healthy Subjects
    Nam, Seok Hyun
    Son, Sung Min
    Kwon, Jung Won
    Lee, Na Kyung
    JOURNAL OF PHYSICAL THERAPY SCIENCE, 2013, 25 (06) : 737 - 739
  • [29] INTRA- AND INTER-RATER RELIABILITY OF FUGL-MEYER ASSESSMENT OF UPPER EXTREMITY IN STROKE
    Hernandez, Edgar D.
    Galeano, Claudia P.
    Barbosa, Nubia E.
    Forero, Sandra M.
    Nordin, Asa
    Sunnerhagen, Katharina S.
    Alt Murphy, Margit
    JOURNAL OF REHABILITATION MEDICINE, 2019, 51 (09) : 652 - 659
  • [30] Intra- and inter-rater reliability and agreement of stimulus electrodiagnostic tests in post-COVID-19 patients
    Almeida, Isabella da Silva
    Ferreira, Leandro Gomes de Jesus
    Ventura, Alvaro de Almeida
    Mansur, Henrique
    Babault, Nicolas
    Marqueti, Rita de Cassia
    Durigan, Joao Luiz Quagliotti
    PHYSIOLOGICAL MEASUREMENT, 2023, 44 (05)