L2-ARCTIC: A Non-Native English Speech Corpus

Cited by: 59
Authors
Zhao, Guanlong [1]
Sonsaat, Sinem [2]
Silpachai, Alif [2]
Lucic, Ivana [2]
Chukharev-Hudilainen, Evgeny [2]
Levis, John [2]
Gutierrez-Osuna, Ricardo [1]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci & Engn, College Stn, TX 77843 USA
[2] Iowa State Univ, Dept English, Ames, IA USA
Keywords
speech corpus; voice conversion; accent conversion; mispronunciation detection; STRESS;
DOI
10.21437/Interspeech.2018-1110
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we introduce L2-ARCTIC, a speech corpus of non-native English that is intended for research in voice conversion, accent conversion, and mispronunciation detection. This initial release includes recordings from ten non-native speakers of English whose first languages (L1s) are Hindi, Korean, Mandarin, Spanish, and Arabic, each L1 containing recordings from one male and one female speaker. Each speaker recorded approximately one hour of read speech from the Carnegie Mellon University ARCTIC prompts, from which we generated orthographic and forced-aligned phonetic transcriptions. In addition, we manually annotated 150 utterances per speaker to identify three types of mispronunciation errors: substitutions, deletions, and additions, making it a valuable resource not only for research in voice conversion and accent conversion but also in computer-assisted pronunciation training. The corpus is publicly accessible at https://psi.engr.tamu.edu/l2-arctic-corpus/.
Pages: 2783-2787
Number of pages: 5