Phoneme Confusions in Human and Automatic Speech Recognition

Authors
Meyer, Bernd T. [1]
Waechter, Matthias [1]
Brand, Thomas [1]
Kollmeier, Birger [1]
Affiliation
[1] Carl von Ossietzky Univ Oldenburg, Med Phys Sect, D-2900 Oldenburg, Germany
Keywords
human speech recognition; automatic speech recognition; dialect; accent; phoneme confusions; MFCC;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A comparison between automatic speech recognition (ASR) and human speech recognition (HSR) is performed as a prerequisite for identifying sources of errors and improving feature extraction in ASR. HSR and ASR experiments are carried out with the same logatome database, which consists of nonsense syllables. Two different kinds of signals are presented to human listeners: First, noisy speech samples are converted to Mel-frequency cepstral coefficients, which are resynthesized to speech, with information about voicing and fundamental frequency being discarded. Second, the original signals with added noise are presented, which serves to evaluate the loss of information caused by the process of resynthesis. The analysis also covers the degradation of ASR caused by dialect or accent and shows that different error patterns emerge for ASR and HSR. The information loss induced by the calculation of ASR features has the same effect as a deterioration of the SNR by 10 dB.
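To put the 10 dB figure in the abstract into concrete terms, the following is a minimal sketch of how noise can be scaled to reach a target signal-to-noise ratio (SNR). It is not the authors' procedure: the function names (`power`, `scale_noise_to_snr`, `measured_snr_db`), the sine-wave stand-in for speech, and the Gaussian noise are all assumptions made for illustration. The sketch shows that lowering the SNR by 10 dB corresponds to a tenfold increase in noise power.

```python
import math
import random


def power(signal):
    """Mean power (mean of squared samples) of a sequence of floats."""
    return sum(v * v for v in signal) / len(signal)


def scale_noise_to_snr(speech, noise, snr_db):
    """Return a copy of `noise` scaled so that speech/noise power equals snr_db."""
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [gain * n for n in noise]


def measured_snr_db(speech, noise):
    """SNR in dB computed from the two separate components."""
    return 10 * math.log10(power(speech) / power(noise))


# Hypothetical test signals: a 200 Hz tone stands in for speech (16 kHz, 1 s).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noise_at_0_db = scale_noise_to_snr(speech, noise, 0.0)
noise_at_minus_10_db = scale_noise_to_snr(speech, noise, -10.0)

# Noisy signal as it would be presented: speech plus the scaled noise.
noisy_at_0_db = [s + n for s, n in zip(speech, noise_at_0_db)]

# A 10 dB drop in SNR means ten times the noise power.
noise_power_ratio = power(noise_at_minus_10_db) / power(noise_at_0_db)
```

Under these assumptions, `noise_power_ratio` comes out at 10 up to floating-point error, which is the sense in which a 10 dB SNR deterioration is a substantial loss.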
Pages: 2740-2743
Page count: 4