Evaluating automatic speech recognition systems as quantitative models of cross-lingual phonetic category perception

Cited by: 4
Authors
Schatz, Thomas [1, 2]
Bach, Francis [3]
Dupoux, Emmanuel [4]
Affiliations
[1] Univ Maryland, Dept Linguist, College Pk, MD 20742 USA
[2] Univ Maryland, UMIACS, College Pk, MD 20742 USA
[3] PSL Res Univ, CNRS, Ecole Normale Super, Dept Informat ENS, SIERRA Project Team, INRIA, 45 Rue Ulm, F-75005 Paris, France
[4] PSL Res Univ, CNRS, Ecole Normale Super, Dept Etud Cognit ENS, EHESS, LSCP, 29 Rue Ulm, F-75005 Paris, France
Funding
European Research Council; National Science Foundation (USA);
Keywords
JAPANESE;
DOI
10.1121/1.5037615
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Theories of cross-linguistic phonetic category perception posit that listeners perceive foreign sounds by mapping them onto their native phonetic categories, but, until now, no way to effectively implement this mapping has been proposed. In this paper, Automatic Speech Recognition systems trained on continuous speech corpora are used to provide a fully specified mapping between foreign sounds and native categories. The authors show how the machine ABX evaluation method can be used to compare predictions from the resulting quantitative models with empirically attested effects in human cross-linguistic phonetic category perception. © 2018 Acoustical Society of America
Pages: EL372-EL378
Number of pages: 7
Related papers
50 in total
  • [1] Cross-Lingual Automatic Speech Recognition Using Tandem Features
    Lal, Partha
    King, Simon
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21(12): 2506-2515
  • [2] Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features
    Zhan, Qingran
    Motlicek, Petr
    Du, Shixuan
    Shan, Yahui
    Ma, Sifan
    Xie, Xiang
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 1912-1916
  • [3] A Preliminary Study of Cross-lingual Emotion Recognition from Speech: Automatic Classification versus Human Perception
    Jeon, Je Hun
    Le, Duc
    Xia, Rui
    Liu, Yang
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 2836-2839
  • [4] Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition
    Farooq, Muhammad Umar
    Hain, Thomas
    [J]. INTERSPEECH 2022, 2022: 3849-3853
  • [5] Speech Emotion Recognition with Cross-lingual Databases
    Chiou, Bo-Chang
    Chen, Chia-Ping
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014: 558-561
  • [6] IMPROVING LUXEMBOURGISH SPEECH RECOGNITION WITH CROSS-LINGUAL SPEECH REPRESENTATIONS
    Le Minh Nguyen
    Nayak, Shekhar
    Coler, Matt
    [J]. 2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022: 792-797
  • [7] Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition
    Chatzoudis, Gerasimos
    Plitsis, Manos
    Stamouli, Spyridoula
    Dimou, Athanasia-Lida
    Katsamanis, Nassos
    Katsouros, Vassilis
    [J]. INTERSPEECH 2022, 2022: 2178-2182
  • [8] CLIoS: Cross-lingual Induction of Speech Recognition Grammars
    Perera, Nadine
    Pitz, Michael
    Pinkal, Manfred
    [J]. SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008: 2487-2494
  • [9] Unsupervised Cross-lingual Representation Learning for Speech Recognition
    Conneau, Alexis
    Baevski, Alexei
    Collobert, Ronan
    Mohamed, Abdelrahman
    Auli, Michael
    [J]. INTERSPEECH 2021, 2021: 2426-2430
  • [10] Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22(1): 17-27