A study on fine-tuning wav2vec2.0 Model for the task of Mispronunciation Detection and Diagnosis

Cited by: 19
Authors
Peng, Linkai [1]
Fu, Kaiqi [1]
Lin, Binghuai [2]
Ke, Dengfeng [1]
Zhan, Jinsong [1]
Affiliations
[1] Beijing Language & Culture Univ, Beijing, Peoples R China
[2] Tencent Technol Co Ltd, Smart Platform Prod Dept, Shenzhen, Peoples R China
Source
Keywords
self-supervised; mispronunciation detection and diagnosis (MDD); computer-aided pronunciation training (CAPT); wav2vec 2.0; pre-training
DOI
10.21437/Interspeech.2021-1344
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Mispronunciation detection and diagnosis (MDD) technology is a key component of computer-assisted pronunciation training (CAPT) systems. The mainstream approach is based on deep neural network automatic speech recognition. Unfortunately, this technique requires massive amounts of human-annotated speech recordings for training. Because second language learners vary widely in mother tongue, age, and proficiency level, it is difficult to gather a large amount of matching data for acoustic model training, which greatly limits model performance. In this paper, we explore the use of the self-supervised pre-training (SSP) model wav2vec2.0 for MDD tasks. SSP uses a large unlabelled dataset to learn general representations that can be applied to downstream tasks. We conduct experiments on two publicly available datasets (TIMIT, L2-ARCTIC), and our best system achieves a 60.44% F1-score. Moreover, our method achieves a 55.52% F1-score with 3 times less data, which demonstrates the effectiveness of SSP on MDD.
Pages: 4448-4452
Page count: 5