MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION

Cited by: 3
Authors
Li, Xinjian [1]
Mortensen, David R. [1]
Metze, Florian [1]
Black, Alan W. [1]
Affiliation
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
Multilingual Phonetic Dataset; Multilingual Speech Alignment; Low-Resource Speech Recognition
DOI
10.1109/ICASSP39728.2021.9413720
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Phone recognition is one of the most important tasks in multilingual speech recognition, especially for low-resource languages whose orthographies are not available. However, most speech recognition datasets to date focus only on high-resource languages; very few are available for low-resource languages, and fewer still include detailed phone annotation. In this work, we present a large multilingual phonetic dataset, preprocessed and aligned from the UCLA phonetic dataset. The dataset contains around 100 low-resource languages and 7000 utterances in total. It would provide an ideal training/evaluation set for universal phone recognition.
Pages: 6958-6962
Page count: 5