Intelligibility Rating with Automatic Speech Recognition, Prosodic, and Cepstral Evaluation

Cited by: 0
Authors
Haderlein, Tino [1, 2]
Moers, Cornelia [3]
Moebius, Bernd [4]
Rosanowski, Frank [2]
Noeth, Elmar [1]
Affiliations
[1] Univ Erlangen Nurnberg, Pattern Recognit Lab Informat 5, Martensstr 3, D-91058 Erlangen, Germany
[2] Univ Erlangen Nurnberg, Dept Phoniatr & Pedaudiol, D-91054 Erlangen, Germany
[3] Univ Bonn, Dept Speech & Commun, D-53115 Bonn, Germany
[4] Univ Saarland, Dept Computat Linguist & Phonet, D-66041 Saarbrucken, Germany
Keywords
SUSTAINED VOWELS; VOICE; DISORDERS; QUALITY;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For voice rehabilitation, speech intelligibility is an important criterion. Automatic evaluation of intelligibility has been shown to work well when automatic speech recognition is combined with prosodic analysis. In this paper, this method is extended with measures based on the Cepstral Peak Prominence (CPP). Seventy-three hoarse patients (aged 48.3 ± 16.8 years) uttered the vowel /e/ and read the German version of the text "The North Wind and the Sun". Their intelligibility was evaluated perceptually by five speech therapists and physicians on a 5-point scale. Support Vector Regression (SVR) revealed a feature set with a human-machine correlation of r = 0.85, consisting of the word accuracy, the smoothed CPP computed from a speech section, and three prosodic features (normalized energy of word-pause-word intervals, F0 value at voice offset in a word, and standard deviation of jitter). The average human-human correlation was r = 0.82. Hence, the automatic method can be a meaningful objective support for perceptual analysis.
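The two correlation figures in the abstract can be illustrated with a minimal sketch. The abstract does not say how the values were computed, so the following are assumptions: Pearson's r is used, the machine score is compared with the mean of all raters, and each rater is compared with the mean of the remaining raters. The function names and the toy ratings are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mean_rating(ratings):
    """Average the scores per speaker; ratings[rater][speaker]."""
    n_raters = len(ratings)
    return [sum(col) / n_raters for col in zip(*ratings)]


def human_machine_r(ratings, predicted):
    """Correlation of predicted scores with the mean human rating (assumed)."""
    return pearson_r(mean_rating(ratings), predicted)


def human_human_r(ratings):
    """Average correlation of each rater with the mean of the others (assumed)."""
    rs = []
    for i, own in enumerate(ratings):
        others = ratings[:i] + ratings[i + 1:]
        rs.append(pearson_r(own, mean_rating(others)))
    return sum(rs) / len(rs)


# Made-up toy data: 3 raters, 6 speakers, 5-point scale (not from the paper).
ratings = [
    [1, 2, 3, 4, 5, 2],
    [1, 3, 3, 4, 4, 2],
    [2, 2, 4, 5, 5, 1],
]
# Made-up regression outputs standing in for SVR predictions.
predicted = [1.4, 2.2, 3.1, 4.3, 4.6, 1.9]

if __name__ == "__main__":
    print("human-machine r:", round(human_machine_r(ratings, predicted), 3))
    print("human-human r:", round(human_human_r(ratings), 3))
```

Comparing each rater against the mean of the others, rather than against the mean of all raters, keeps a rater's own scores out of the reference and so avoids inflating the human-human value.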
Pages: 195-202
Number of pages: 8
Related papers
50 in total
  • [1] Automatic Rating of Hoarseness by Text-based Cepstral and Prosodic Evaluation
    Haderlein, Tino
    Moers, Cornelia
    Moebius, Bernd
    Noeth, Elmar
    [J]. TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 573 - 580
  • [2] Intelligibility of laryngectomees' substitute speech: automatic speech recognition and subjective rating
    Schuster, M
    Haderlein, T
    Nöth, E
    Lohscheller, J
    Eysholdt, U
    Rosanowski, F
    [J]. EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY, 2006, 263 (02) : 188 - 193
  • [3] Intelligibility of laryngectomees’ substitute speech: automatic speech recognition and subjective rating
    Maria Schuster
    Tino Haderlein
    Elmar Nöth
    Jörg Lohscheller
    Ulrich Eysholdt
    Frank Rosanowski
    [J]. European Archives of Oto-Rhino-Laryngology and Head & Neck, 2006, 263 : 188 - 193
  • [4] Influence of Reverberation on Automatic Evaluation of Intelligibility with Prosodic Features
    Haderlein, Tino
    Doellinger, Michael
    Schuetzenberger, Anne
    Noeth, Elmar
    [J]. TEXT, SPEECH, AND DIALOGUE, 2016, 9924 : 461 - 469
  • [5] Evaluation of speech intelligibility for children with cleft lip and palate by means of automatic speech recognition
    Schuster, Maria
    Maier, Andreas
    Haderlein, Tino
    Nkenke, Emeka
    Wohlleben, Ulrike
    Rosanowski, Frank
    Eysholdt, Ulrich
    Noeth, Elmar
    [J]. INTERNATIONAL JOURNAL OF PEDIATRIC OTORHINOLARYNGOLOGY, 2006, 70 (10) : 1741 - 1747
  • [6] Prosodic and accentual information for automatic speech recognition
    Milone, DH
    Rubio, AJ
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (04) : 321 - 333
  • [7] Prosodic knowledge sources for automatic speech recognition
    Vergyri, D
    Stolcke, A
    Gadde, VRR
    Ferrer, L
    Shriberg, E
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 208 - 211
  • [8] CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Rehr, Robert
    Gerkmann, Timo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 375 - 378
  • [9] Autonomous measurement of speech intelligibility utilizing automatic speech recognition
    Meyer, Bernd T.
    Kollmeier, Birger
    Ooster, Jasper
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2982 - 2986
  • [10] Effect of prosodic changes on speech intelligibility
    Mayo, Catherine
    Aubanel, Vincent
    Cooke, Martin
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1706 - 1709