End-to-End Mispronunciation Detection with Simulated Error Distance

Cited by: 2

Authors:

Zhang, Zhan [1]

Wang, Yuehai [1]

Yang, Jianyi [1]

Affiliations:

[1] Zhejiang Univ, Dept Informat & Elect Engn, Hangzhou, Zhejiang, Peoples R China

Source:

INTERSPEECH 2022 | 2022

Keywords:

mispronunciation detection; second language learning; speech recognition; TRANSFORMER; SPEECH;

DOI:

10.21437/Interspeech.2022-870

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

With the development of deep learning, the performance of mispronunciation detection models has improved greatly. However, annotation for mispronunciation is expensive because it requires experts to carefully judge the error in each pronounced phoneme. As a result, supervised end-to-end mispronunciation detection models face a shortage of data. Although text-based data augmentation can partially alleviate this problem, our analysis shows that it only simulates categorical phoneme errors. Such a simulation is inefficient for real-world mispronunciations. In this paper, we propose a novel unit-based data augmentation method. Our method converts the continuous audio signal into robust audio vectors and then into a discrete unit sequence. By modifying this unit sequence, we generate more plausible mispronunciations and obtain the vector distance as an error indicator. Experiments on L2Arctic show that training on such simulated data improves mispronunciation detection performance compared with the text-based method.
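
The pipeline the abstract describes (audio vectors, then discrete units, then a modified unit sequence with a vector distance as the error indicator) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes random vectors in place of real audio features, a small NumPy k-means as the quantizer, and random unit substitution as the modification; the names `kmeans`, `simulate_errors`, and `error_rate` are hypothetical.

```python
import numpy as np

def kmeans(frames, k, iters=20, seed=0):
    """Tiny k-means quantizer (assumed stand-in for the unit extractor)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct frames chosen at random.
    centroids = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=-1)
        units = dists.argmin(axis=1)
        for j in range(k):
            members = frames[units == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids.
    dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=-1)
    return centroids, dists.argmin(axis=1)

def simulate_errors(units, centroids, error_rate=0.2, seed=0):
    """Replace some units with a different unit; record the centroid distance."""
    rng = np.random.default_rng(seed)
    k = len(centroids)
    modified = units.copy()
    distance = np.zeros(len(units))  # 0.0 means "unchanged frame"
    for i in np.flatnonzero(rng.random(len(units)) < error_rate):
        choices = [u for u in range(k) if u != units[i]]
        new_unit = rng.choice(choices)
        modified[i] = new_unit
        # Continuous error indicator: how far the substituted unit is
        # from the original one in vector space.
        distance[i] = np.linalg.norm(centroids[new_unit] - centroids[units[i]])
    return modified, distance

# Random vectors stand in for frame-level audio features (assumption).
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 8))
centroids, units = kmeans(frames, k=5)
modified, distance = simulate_errors(units, centroids)
```

In this sketch the per-frame `distance` is a graded target, in contrast to the binary correct/incorrect label that a categorical, text-based substitution would give.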


Pages: 4327-4331

Number of pages: 5
