End-to-End Mispronunciation Detection with Simulated Error Distance

Cited by: 2
Authors
Zhang, Zhan [1]
Wang, Yuehai [1]
Yang, Jianyi [1]
Affiliations
[1] Zhejiang Univ, Dept Informat & Elect Engn, Hangzhou, Zhejiang, Peoples R China
Source
Keywords
mispronunciation detection; second language learning; speech recognition; TRANSFORMER; SPEECH
DOI
10.21437/Interspeech.2022-870
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
With the development of deep learning, the performance of mispronunciation detection models has improved greatly. However, annotating mispronunciations is expensive, because experts must carefully judge the error in each pronounced phoneme. As a result, supervised end-to-end mispronunciation detection models face a shortage of training data. Text-based data augmentation can partially alleviate this problem, but our analysis shows that it simulates only categorical phoneme errors, which makes it an inefficient simulation of real mispronunciations. In this paper, we propose a novel unit-based data augmentation method. Our method converts the continuous audio signal into robust audio vectors and then into a discrete unit sequence. By modifying this unit sequence, we generate more reasonable mispronunciations and obtain the vector distance as an error indicator. Experiments on L2Arctic show that training on such simulated data improves mispronunciation detection performance compared with the text-based method.
Pages: 4327-4331
Page count: 5
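
The unit-based augmentation described in the abstract can be illustrated with a toy sketch. The abstract does not specify the audio encoder, the codebook, or the rule for modifying units, so everything below is an assumption made for illustration: the frame vectors and the three-entry codebook are invented numbers, the helper names `quantize` and `simulate_error` are hypothetical, and the replacement unit is chosen uniformly at random. The sketch only shows the general idea of mapping vectors to discrete units, swapping one unit, and reading off the distance between the two centroids as a continuous error indicator.

```python
import math
import random


def nearest_unit(vec, codebook):
    """Return the index of the codebook centroid closest to vec (Euclidean)."""
    return min(range(len(codebook)), key=lambda k: math.dist(vec, codebook[k]))


def quantize(frames, codebook):
    """Map each frame vector to the index of its nearest centroid (its 'unit')."""
    return [nearest_unit(frame, codebook) for frame in frames]


def simulate_error(units, codebook, position, rng):
    """Swap the unit at `position` for a different, randomly chosen unit.

    Returns the modified unit sequence and the Euclidean distance between
    the original and replacement centroids. Random choice is an assumption
    of this sketch, not a strategy taken from the paper.
    """
    original = units[position]
    candidates = [k for k in range(len(codebook)) if k != original]
    replacement = rng.choice(candidates)
    modified = list(units)
    modified[position] = replacement
    distance = math.dist(codebook[original], codebook[replacement])
    return modified, distance


# Toy data (invented): three 2-D centroids and three frame vectors.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
frames = [[0.1, -0.1], [0.9, 0.2], [0.2, 2.7]]

units = quantize(frames, codebook)
modified, distance = simulate_error(units, codebook, position=1, rng=random.Random(0))
print(units, modified, round(distance, 3))
```

Unlike a purely categorical label (correct or incorrect phoneme), the returned distance varies with how far apart the two units lie in the vector space, which is the property the abstract attributes to the simulated error distance.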