A study on fine-tuning wav2vec2.0 Model for the task of Mispronunciation Detection and Diagnosis

Cited by: 19
Authors
Peng, Linkai [1 ]
Fu, Kaiqi [1 ]
Lin, Binghuai [2 ]
Ke, Dengfeng [1 ]
Zhan, Jinsong [1 ]
Affiliations
[1] Beijing Language and Culture University, Beijing, People's Republic of China
[2] Tencent Technology Co., Ltd., Smart Platform Product Department, Shenzhen, People's Republic of China
Source
Keywords
self-supervised; mispronunciation detection and diagnosis (MDD); computer-aided pronunciation training (CAPT); wav2vec 2.0; pre-training
DOI
10.21437/Interspeech.2021-1344
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Mispronunciation detection and diagnosis (MDD) is a key component of computer-assisted pronunciation training (CAPT) systems. The mainstream approach is based on deep-neural-network automatic speech recognition, which unfortunately requires massive amounts of human-annotated speech recordings for training. Because second-language learners vary widely in mother tongue, age, and proficiency level, it is difficult to gather a large amount of matched data for acoustic model training, which greatly limits model performance. In this paper, we explore the use of the self-supervised pre-training (SSP) model wav2vec 2.0 for MDD tasks. SSP uses a large unlabelled dataset to learn general representations that can be applied to downstream tasks. We conduct experiments on two publicly available datasets (TIMIT and L2-ARCTIC), and our best system achieves an F1-score of 60.44%. Moreover, our method achieves an F1-score of 55.52% with 3 times less data, which demonstrates the effectiveness of SSP for MDD.
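The abstract reports F1-scores but does not say how they are computed. As a minimal sketch, assuming the convention common in MDD work (a "true rejection" is a mispronounced phone correctly flagged, a "false rejection" is a correct phone wrongly flagged, and a "false acceptance" is a mispronounced phone that was missed), the score can be derived from three counts. The function name `mdd_f1` and the counts in the usage line are hypothetical, not taken from the paper.

```python
def mdd_f1(true_rejections: int, false_rejections: int, false_acceptances: int) -> float:
    """F1-score for mispronunciation detection (assumed convention, see lead-in).

    precision = TR / (TR + FR)  -> how many flagged phones were really mispronounced
    recall    = TR / (TR + FA)  -> how many mispronounced phones were flagged
    """
    flagged = true_rejections + false_rejections
    mispronounced = true_rejections + false_acceptances
    if flagged == 0 or mispronounced == 0:
        return 0.0  # no detections or no mispronunciations: define F1 as 0
    precision = true_rejections / flagged
    recall = true_rejections / mispronounced
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
score = mdd_f1(true_rejections=6, false_rejections=2, false_acceptances=4)
print(f"F1 = {score:.4f}")
```

With these illustrative counts, precision is 0.75 and recall is 0.60, so the harmonic mean is lower than the arithmetic mean of the two, which is the usual reason F1 is preferred when detections are imbalanced.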
Pages: 4448-4452
Page count: 5