Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

Cited by: 137
Authors
Hu, Wenping [1, 2]
Qian, Yao [2]
Soong, Frank K. [2]
Wang, Yong [1]
Affiliations
[1] University of Science and Technology of China, Hefei 230026, People's Republic of China
[2] Microsoft Research Asia, Beijing 100080, People's Republic of China
Keywords
Computer-aided language learning; Mispronunciation detection; Deep neural network; Logistic regression; Transfer learning; ERROR; KNOWLEDGE;
DOI
10.1016/j.specom.2014.12.008
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Mispronunciation detection is an important part of a Computer-Aided Language Learning (CALL) system. By automatically pointing out where mispronunciations occur in an utterance, such a system gives language learners informative, to-the-point feedback. In this paper, we improve mispronunciation detection performance with a Deep Neural Network (DNN) trained acoustic model and transfer learning based Logistic Regression (LR) classifiers. The acoustic model trained by the conventional GMM-HMM based approach is refined by DNN training with enhanced discrimination. The corresponding Goodness Of Pronunciation (GOP) scores are revised to evaluate the pronunciation quality of non-native language learners robustly. A Neural Network (NN) based Logistic Regression (LR) classifier is proposed for mispronunciation detection. In the sense of transfer learning, a general neural network with shared hidden layers for extracting useful speech features is first pre-trained on pooled training data, and then phone-dependent, 2-class logistic regression classifiers are trained as phone-specific output layer nodes. The new LR classifier streamlines the training of multiple individual classifiers by learning a common feature representation via the shared hidden layer. Experimental results on an isolated English word corpus recorded by non-native (L2) English learners show that the proposed GOP measure improves the performance of the GOP based mispronunciation detection approach: both precision and recall improve by 7.4% compared with the conventional GOP estimated from GMM-HMM. The NN-based LR classifier improves the equal precision-recall rate by 25% over the best GOP based approach. It also outperforms the state-of-the-art Support Vector Machine (SVM) based classifier, improving the equal precision-recall rate by 2.2%. Our approaches also achieve similar results on a continuously read L2 Mandarin language learning corpus. © 2014 Elsevier B.V. All rights reserved.
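The two-stage training described in the abstract (a shared hidden layer pre-trained on pooled data, then phone-dependent 2-class logistic regression output nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SharedHiddenLR`, the phone-independent `"*"` output node used during pre-training, the single hidden layer, the toy features, and the choice to freeze the shared layer in the second stage are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SharedHiddenLR:
    """One shared hidden layer feeding a 2-class logistic output node per phone."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden
        # "*" is a hypothetical phone-independent node used only for
        # pooled pre-training (an assumption, not from the paper).
        self.out = {"*": ([rng.uniform(-0.1, 0.1) for _ in range(n_hidden)], 0.0)}

    def add_phone(self, phone):
        # Initialise a phone-specific output node from the pooled node.
        v, c = self.out["*"]
        self.out[phone] = (list(v), c)

    def hidden(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def predict(self, x, node):
        # Probability that the segment is mispronounced, per output node.
        v, c = self.out[node]
        h = self.hidden(x)
        return sigmoid(sum(vj * hj for vj, hj in zip(v, h)) + c)

    def sgd_step(self, x, y, node, lr=0.1, update_shared=True):
        h = self.hidden(x)
        v, c = self.out[node]
        p = sigmoid(sum(vj * hj for vj, hj in zip(v, h)) + c)
        err = p - y  # cross-entropy gradient w.r.t. the output pre-activation
        if update_shared:
            # Backpropagate into the shared layer using the old output weights.
            for j, row in enumerate(self.W):
                g = err * v[j] * h[j] * (1.0 - h[j])
                for i in range(len(row)):
                    row[i] -= lr * g * x[i]
                self.b[j] -= lr * g
        new_v = [vj - lr * err * hj for vj, hj in zip(v, h)]
        self.out[node] = (new_v, c - lr * err)


# Toy data: (feature vector, phone, label), label 1 = mispronounced.
data = [([1.0, 0.0], "ae", 1), ([0.0, 1.0], "ae", 0),
        ([1.0, 1.0], "th", 1), ([0.0, 0.0], "th", 0)]

model = SharedHiddenLR(n_in=2, n_hidden=4)
for _ in range(200):                 # stage 1: pooled pre-training
    for x, _phone, y in data:
        model.sgd_step(x, y, "*")
for phone in ("ae", "th"):           # stage 2: phone-specific output nodes
    model.add_phone(phone)
for _ in range(200):
    for x, phone, y in data:
        model.sgd_step(x, y, phone, update_shared=False)
print(model.predict([1.0, 0.0], "ae"))
```

Because every phone shares the hidden layer, each phone-specific classifier adds only one weight vector and one bias, which is one way to read the abstract's remark that the approach streamlines training many individual classifiers.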
Pages: 154 - 166
Page count: 13