Discriminating Native from Non-Native Speech Using Fusion of Visual Cues

Cited by: 3
Authors
Georgakis, Christos [1]
Petridis, Stavros [1]
Pantic, Maja [1, 2]
Affiliations
[1] Imperial College London, Department of Computing, London, England
[2] University of Twente, EEMCS, Enschede, Netherlands
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); EU Seventh Framework Programme (FP7)
Keywords
Non-Native Speech; Visual-only Accent Classification; Foreign Accent Detection; Visual Speech Processing;
DOI
10.1145/2647868.2655026
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The task of classifying an accent as belonging to a native or a non-native speaker has so far been addressed using the audio modality only. However, features extracted from the visual modality have been successfully used to extend or substitute audio-only approaches developed for speech or language recognition. This paper presents a fully automated approach to discriminating native from non-native speech in English, based exclusively on visual appearance features from speech. Long Short-Term Memory Neural Networks (LSTMs) are employed to model accent-related speech dynamics and yield accent-class predictions. Subject-independent experiments are conducted on speech episodes captured by mobile phones from the challenging MOBIO Database. We establish a text-dependent scenario, using only those recordings in which all subjects read the same paragraph. Our results show that decision-level fusion of networks trained with complementary appearance descriptors consistently improves performance over single-feature systems, with the highest gain in accuracy reaching 7.3%. The best feature combinations achieve a classification accuracy of 75%, making the proposed method a useful accent classification tool when the audio stream is missing or noisy.
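The abstract does not say which rule is used for decision-level fusion, so the following is only a minimal sketch of one common choice: a weighted average of the per-class scores produced by each network, followed by an arg-max. The function names `fuse_scores` and `predict`, the class order in `LABELS`, and the example scores are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Optional, Sequence

# Class order is an assumption made for this sketch.
LABELS = ("native", "non-native")


def fuse_scores(
    scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted mean of per-class scores (one vector per network)."""
    if not scores:
        raise ValueError("need at least one score vector")
    n_classes = len(scores[0])
    if any(len(s) != n_classes for s in scores):
        raise ValueError("all score vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(scores)  # equal weights: plain averaging
    if len(weights) != len(scores):
        raise ValueError("one weight per score vector is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * s[c] for w, s in zip(weights, scores)) / total
        for c in range(n_classes)
    ]


def predict(
    scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Fuse the score vectors and return the label with the highest fused score."""
    fused = fuse_scores(scores, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LABELS[best]


# Hypothetical outputs of two networks trained on different appearance descriptors.
net_a = [0.60, 0.40]
net_b = [0.30, 0.70]
print(predict([net_a, net_b]))  # prints "non-native"
```

With equal weights the two networks disagree and the more confident one wins; passing `weights` lets one descriptor count for more, which is one simple way complementary features could be combined.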
Pages: 1177-1180
Number of pages: 4