Discriminating Native from Non-Native Speech Using Fusion of Visual Cues

Cited by: 3
Authors
Georgakis, Christos [1]
Petridis, Stavros [1]
Pantic, Maja [1, 2]
Affiliations
[1] Imperial Coll London, Dept Comp, London, England
[2] Univ Twente, EEMCS, Enschede, Netherlands
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); European Union Seventh Framework Programme (FP7)
Keywords
Non-Native Speech; Visual-only Accent Classification; Foreign Accent Detection; Visual Speech Processing;
DOI
10.1145/2647868.2655026
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The task of classifying accent, as belonging to a native language speaker or a foreign language speaker, has been so far addressed by means of the audio modality only. However, features extracted from the visual modality have been successfully used to extend or substitute audio-only approaches developed for speech or language recognition. This paper presents a fully automated approach to discriminating native from non-native speech in English, based exclusively on visual appearance features from speech. Long Short-Term Memory Neural Networks (LSTMs) are employed to model accent-related speech dynamics and yield accent-class predictions. Subject-independent experiments are conducted on speech episodes captured by mobile phones from the challenging MOBIO Database. We establish a text-dependent scenario, using only those recordings in which all subjects read the same paragraph. Our results show that decision-level fusion of networks trained with complementary appearance descriptors consistently leads to performance improvement over single-feature systems, with the highest gain in accuracy reaching 7.3%. The best feature combinations achieve classification accuracy of 75%, rendering the proposed method a useful accent classification tool in cases of missing or noisy audio stream.
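As an illustration of the decision-level fusion mentioned in the abstract, the following is a minimal sketch that combines the class posteriors of several single-feature classifiers by a weighted average and then picks the most probable class. The abstract does not state which fusion rule the authors used, so the averaging rule, the function names `fuse_posteriors` and `predict`, the label order in `CLASSES`, and the optional weights are all assumptions made for illustration only.

```python
from typing import List, Optional, Sequence

# Assumed label order; the record does not specify one.
CLASSES = ("native", "non-native")


def fuse_posteriors(
    posteriors: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-model class posteriors (hypothetical fusion rule).

    posteriors[i][c] is the probability that model i assigns to class c.
    """
    if not posteriors:
        raise ValueError("need at least one model output")
    n_classes = len(posteriors[0])
    if any(len(p) != n_classes for p in posteriors):
        raise ValueError("all models must score the same classes")
    if weights is None:
        weights = [1.0] * len(posteriors)  # equal trust in every model
    if len(weights) != len(posteriors):
        raise ValueError("one weight per model is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Average each class score across models, normalised by the weight sum.
    return [
        sum(w * p[c] for w, p in zip(weights, posteriors)) / total
        for c in range(n_classes)
    ]


def predict(
    posteriors: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Return the label with the highest fused score."""
    fused = fuse_posteriors(posteriors, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best]
```

For example, if one network outputs `[0.6, 0.4]` and another `[0.2, 0.8]`, equal weighting gives fused scores of about `[0.4, 0.6]`, so `predict` returns `"non-native"`. Averaging is only one possible choice; product or majority-vote rules would slot into the same structure.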
Pages: 1177-1180
Page count: 4