Automatic accent identification as an analytical tool for accent robust automatic speech recognition

Cited by: 22
Authors
Najafian, Maryam [1, 2]
Russell, Martin [3]
Affiliations
[1] Univ Birmingham, Sch Engn, Birmingham, W Midlands, England
[2] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] Univ Birmingham, Sch Comp Sci, Birmingham, W Midlands, England
Keywords
Speech recognition; Accent identification; British accents; I-vector
DOI
10.1016/j.specom.2020.05.003
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a novel study of relationships between automatic accent identification (AID) and accent-robust automatic speech recognition (ASR), using i-vector based AID and deep neural network-hidden Markov model (DNN-HMM) based ASR. A visualization of the AID i-vector space and a novel analysis of the accent content of the WSJCAM0 corpus are presented. Accents that occur at the periphery of AID space are referred to as "extreme". We demonstrate a negative correlation, with respect to accent, between AID and ASR accuracy, where extreme accents exhibit the highest AID and lowest ASR performance. These relationships between accents inform a set of ASR experiments in which a generic training set (WSJCAM0) is supplemented with a fixed amount of accented data from the ABI (Accents of the British Isles) corpus. The best performance across all accents, a 32% relative reduction in errors compared with the baseline ASR system, is obtained when the supplementary data comprises extreme accented speech, even though this accent accounts for just 14% of the test data. We conclude that i-vector based AID analysis provides a principled approach to the selection of training material for accent-robust ASR. We speculate that this may generalize to other detection technologies and other types of variability, such as Speaker Identification (SI) and speaker variability.
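As a reading aid for the "32% relative reduction in errors" figure, the following minimal sketch shows how a relative error reduction is conventionally computed from two error rates. The function name and the two error-rate values are hypothetical and chosen only so that the arithmetic gives 32%; the abstract does not state the underlying baseline or improved error rates.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction by which the error rate falls relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates (in percent), NOT taken from the paper.
baseline_wer = 25.0
improved_wer = 17.0

reduction = relative_error_reduction(baseline_wer, improved_wer)
print(f"Relative error reduction: {reduction:.0%}")
```

Note that a relative reduction is distinct from an absolute one: in this illustrative case the absolute drop is 8 percentage points, while the relative drop is 32%.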
Pages: 44-55
Page count: 12