Intra- and Inter-rater Agreement in a Subjective Speech Quality Assessment Task in Crowdsourcing

Cited by: 4
Authors
Jimenez, Rafael Zequeira [1]
Llagostera, Anna [2]
Naderi, Babak [1]
Moeller, Sebastian [3]
Berger, Jens [2]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
[2] Rohde & Schwarz SwissQual AG, Zuchwil, Switzerland
[3] Tech Univ Berlin, DFKI Projektburo Berlin, Berlin, Germany
Keywords
inter-rater reliability; speech quality assessment; crowdsourcing; listeners' agreement; subjectivity in crowdsourcing;
DOI
10.1145/3308560.3317084
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Crowdsourcing is a great tool for conducting subjective user studies with large numbers of users. Collecting reliable annotations about the quality of speech stimuli is challenging: the task itself is highly subjective, and users in crowdsourcing work without supervision. This work investigates the intra- and inter-listener agreement within a subjective speech quality assessment task. To this end, a study was conducted in the laboratory and in crowdsourcing in which listeners were asked to rate speech stimuli with respect to their overall quality. Ratings were collected on a 5-point scale in accordance with ITU-T Rec. P.800 and P.808, respectively. The speech samples were taken from the database ITU-T Rec. P.501 Annex D and were presented four times to the listeners. Finally, the crowdsourcing results were compared with the ratings collected in the laboratory. A strong and significant Spearman's correlation was achieved between the ratings collected in the two environments. Our analysis shows that while the inter-rater agreement increased the more the listeners conducted the assessment task, the intra-rater reliability remained constant. Our study setup helped to overcome the subjectivity of the task, and we found that disagreement can represent a source of information to some extent.
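The abstract compares laboratory and crowdsourcing ratings using Spearman's correlation. As a minimal sketch of what such a comparison could look like, the Python below computes Spearman's rank correlation between two lists of per-stimulus mean opinion scores. The function names (`average_ranks`, `pearson`, `spearman`) and the score values are hypothetical and purely illustrative; they are not taken from the paper, and the paper's actual analysis procedure is not described in this record.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-stimulus mean opinion scores on a 5-point scale
# (made-up numbers for illustration only, not data from the study).
lab_mos = [1.8, 2.6, 3.1, 3.9, 4.4]
crowd_mos = [2.0, 2.4, 3.3, 3.7, 4.5]

rho = spearman(lab_mos, crowd_mos)
print(f"Spearman's rho between lab and crowd MOS: {rho:.3f}")
```

This sketch covers only the correlation coefficient; it does not test significance, and it fails with a division by zero if either list is constant. In practice a statistics library would normally be used instead of a hand-written ranking routine.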
Pages: 1138-1143
Page count: 6
Related papers
50 in total
  • [41] Intra- and inter-rater reliability of ultrasound measures of the anterior cruciate ligament
    Sievert, Zachary A.
    Bennett, Hunter J.
    Weinhandl, Joshua T.
    JOURNAL OF ULTRASOUND, 2021, 24 (01) : 49 - 55
  • [42] Intra- and inter-rater reliability of ultrasound measures of the anterior cruciate ligament
    Zachary A. Sievert
    Hunter J. Bennett
    Joshua T. Weinhandl
    Journal of Ultrasound, 2021, 24 : 49 - 55
  • [43] Intra- and inter-rater reliability of the BASRI, SASSS and mSASSS in ankylosing spondylitis
    Myckatyn, SO
    Oswald, AE
    Lambert, RG
    Spady, B
    Maksymowych, WP
    ANNALS OF THE RHEUMATIC DISEASES, 2004, 63 : 414 - 414
  • [44] Intra- and inter-rater reliability and validity of the tandem gait test for the assessment of dynamic gait balance
    Koyama, Soichiro
    Tanabe, Shigeo
    Itoh, Norihide
    Saitoh, Eiichi
    Takeda, Kazuya
    Hirano, Satoshi
    Ohtsuka, Kei
    Mukaino, Masahiko
    Yanohara, Ryuzo
    Sakurai, Hiroaki
    Kanada, Yoshikiyo
    EUROPEAN JOURNAL OF PHYSIOTHERAPY, 2018, 20 (03) : 135 - 140
  • [45] Intra- and inter-rater reliability of the Italian Fugl-Meyer assessment of upper and lower extremity
    Hochleitner, Ines
    Pellicciari, Leonardo
    Castagnoli, Chiara
    Paperini, Anita
    Politi, Angela Maria
    Campagnini, Silvia
    Pancani, Silvia
    Basagni, Benedetta
    Gerli, Filippo
    Carrozza, Maria Chiara
    Macchi, Claudio
    Murphy, Margit Alt
    Cecchi, Francesca
    DISABILITY AND REHABILITATION, 2023, 45 (18) : 2989 - 2999
  • [46] Intra- and inter-rater reliability of the Assessment of Children's Hand Skills based on video recordings
    Chien, Chi-Wen
    Scanlon, Clare
    Rodger, Sylvia
    Copley, Jodie
    BRITISH JOURNAL OF OCCUPATIONAL THERAPY, 2014, 77 (02) : 82 - 90
  • [47] Intra- and inter-rater reliability of jumping mechanography muscle function assessments
    Matheson, L. A.
    Duffy, S.
    Maroof, A.
    Gibbons, R.
    Duffy, C.
    Roth, J.
    JOURNAL OF MUSCULOSKELETAL & NEURONAL INTERACTIONS, 2013, 13 (04) : 480 - 486
  • [48] Gauging the Quality of Relevance Assessments using Inter-Rater Agreement
    Damessie, Tadele T.
    Nghiem, Thao P.
    Scholer, Falk
    Culpepper, J. Shane
    SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 1089 - 1092
  • [49] Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis
    Vaz, Gilvan Ferreira
    Freire, Felipe Florencio
    Goncalves, Henrique Mansur M.
    de Aviz, Marcus Alexandre Brito M.
    Martins, Wagner Rodrigues M.
    Durigan, Joao Luiz Quagliotti M.
    PLOS ONE, 2023, 18 (06):
  • [50] Contrast-enhanced US Bosniak Classification: intra- and inter-rater agreement, confounding features, and diagnostic performance
    Jin, Dong-dong
    Zhuang, Bo-wen
    Lin, Ke
    Zhang, Nan
    Qiao, Bin
    Xie, Xiao-yan
    Xie, Xiao-hua
    Wang, Yan
    INSIGHTS INTO IMAGING, 2024, 15 (01):