JAPANESE-ENGLISH CODE-SWITCHING SPEECH DATA CONSTRUCTION

Authors
Nakayama, Sahoko [1]
Kano, Takatomo [1]
Quoc Truong Do [1]
Sakti, Sakriani [1]
Nakamura, Satoshi [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Augmented Human Commun Lab, Ikoma, Nara, Japan
Keywords
Data construction; code-switching; Japanese and English languages
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As the number of Japanese-English bilingual speakers continues to increase, code-switching also occurs more frequently. The units and locations of switches vary widely, from single words to whole phrases (beyond the length of loanword units). Speech recognition systems must therefore handle not only Japanese or English alone but also Japanese-English code-switching, and training such models requires a large-scale code-switching speech database. However, collecting natural Japanese-English conversational dialogues is both time-consuming and expensive. This paper presents the construction of Japanese-English code-switching speech data using a Japanese and English text-to-speech system from a bilingual speaker. Various switching units are investigated, including words and phrases. As a result, we constructed over 280k speech utterances of Japanese-English code-switching.
Pages: 67-71
Number of pages: 5