End-to-End Mispronunciation Detection with Simulated Error Distance

Cited by: 2
Authors
Zhang, Zhan [1]
Wang, Yuehai [1]
Yang, Jianyi [1]
Affiliations
[1] Zhejiang Univ, Dept Informat & Elect Engn, Hangzhou, Zhejiang, Peoples R China
Source
Keywords
mispronunciation detection; second language learning; speech recognition; TRANSFORMER; SPEECH
DOI
10.21437/Interspeech.2022-870
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
With the development of deep learning, the performance of mispronunciation detection models has improved greatly. However, annotating mispronunciation is expensive, as it requires experts to carefully judge the error in each pronounced phoneme. As a result, supervised end-to-end mispronunciation detection models face a shortage of training data. Although text-based data augmentation can partially alleviate this problem, our analysis shows that it simulates only categorical phoneme errors. Such a simulation is insufficient for real-world situations. In this paper, we propose a novel unit-based data augmentation method. Our method converts the continuous audio signal into robust audio vectors and then into a discrete unit sequence. By modifying this unit sequence, we generate more plausible mispronunciations and obtain the vector distance as an error indicator. Experiments on L2Arctic show that training on such simulated data improves mispronunciation detection performance compared with the text-based method.
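The unit-based augmentation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the units are obtained or modified, so everything here is an assumption made for illustration: frames are quantized by nearest-centroid lookup in a toy codebook, units are substituted uniformly at random, and the error indicator is the Euclidean distance between the original and substituted centroids. The names `quantize`, `simulate_errors`, and `error_rate` are hypothetical and not taken from the paper.

```python
import math
import random

def nearest_unit(vec, codebook):
    """Return the index of the codebook centroid closest to vec (Euclidean)."""
    return min(range(len(codebook)), key=lambda k: math.dist(vec, codebook[k]))

def quantize(frames, codebook):
    """Map each continuous frame vector to a discrete unit index."""
    return [nearest_unit(f, codebook) for f in frames]

def simulate_errors(units, codebook, error_rate, rng):
    """Randomly substitute units and record a distance-based error label.

    Returns (modified_units, distances). A distance of 0.0 means the unit
    was left unchanged; a larger distance means a more severe simulated error.
    """
    modified, distances = [], []
    for u in units:
        if rng.random() < error_rate:
            # Assumption: pick any other unit uniformly at random.
            v = rng.choice([k for k in range(len(codebook)) if k != u])
            modified.append(v)
            distances.append(math.dist(codebook[u], codebook[v]))
        else:
            modified.append(u)
            distances.append(0.0)
    return modified, distances

# Toy 2-D codebook and frame vectors (made-up values, not real audio features).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
frames = [(0.1, -0.1), (0.9, 0.2), (0.2, 2.7), (1.1, 0.1)]

units = quantize(frames, codebook)
modified, distances = simulate_errors(
    units, codebook, error_rate=0.5, rng=random.Random(0)
)
```

In this sketch a substitution between nearby centroids yields a small distance and one between distant centroids yields a large one, which is the kind of graded label that a purely categorical text-based substitution cannot provide.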
Pages: 4327-4331
Number of pages: 5