JAPANESE-ENGLISH CODE-SWITCHING SPEECH DATA CONSTRUCTION

Times cited: 0
Authors
Nakayama, Sahoko [1 ]
Kano, Takatomo [1 ]
Quoc Truong Do [1 ]
Sakti, Sakriani [1 ]
Nakamura, Satoshi [1 ]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Augmented Human Commun Lab, Ikoma, Nara, Japan
Keywords
Data construction; code-switching; Japanese and English languages;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the number of Japanese-English bilingual speakers continues to increase, code-switching occurs more frequently. The units and locations of switches vary widely, from single-word switches to whole phrases (beyond the length of loanword units). Speech recognition systems must therefore handle not only Japanese or English alone but also Japanese-English code-switching, and training such models requires a large-scale code-switching speech database. However, collecting natural Japanese-English conversational data is both time-consuming and expensive. This paper presents the construction of Japanese-English code-switching speech data using a Japanese and English text-to-speech system built from a bilingual speaker. Various switching units are also investigated, including words and phrases. As a result, we successfully constructed over 280k speech utterances of Japanese-English code-switching.
Pages: 67-71
Number of pages: 5