CNN-RNN-CTC BASED END-TO-END MISPRONUNCIATION DETECTION AND DIAGNOSIS

Cited by: 0
Authors
Leung, Wai-Kim [1]
Liu, Xunying [1]
Meng, Helen [1]
Affiliations
[1] Chinese Univ Hong Kong, Human Comp Commun Lab, Dept Syst Engn & Engn Management, BDDA Res Ctr, Hong Kong, Peoples R China
Keywords
Computer Assisted Pronunciation Training (CAPT); Mispronunciation Detection and Diagnosis (MDD); Connectionist Temporal Classification (CTC); Convolutional Neural Network (CNN); e-learning
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper uses a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) and Connectionist Temporal Classification (CTC) to build an end-to-end speech recognition model for the Mispronunciation Detection and Diagnosis (MDD) task. Because the approach is end-to-end, it requires neither phonemic or graphemic information nor forced alignment between different linguistic units. We conduct experiments that compare the proposed CNN-RNN-CTC approach with alternative MDD approaches. The F-measure of our approach is 74.65%, which significantly outperforms the Extended Recognition Network (ERN) by 44.75% and the State-level Acoustic Model (S-AM) by 32.28%, both in relative terms. The relative improvements in F-measure over the Acoustic-Phonemic Model (APM), Acoustic-Graphemic Model (AGM) and Acoustic-Phonemic-Graphemic Model (APGM) are 9.57%, 5.04% and 2.77%, respectively.
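As a rough illustration of why a CTC-trained model needs no forced alignment, the sketch below shows the standard CTC collapsing rule applied to per-frame labels: consecutive repeats are merged, then blank symbols are dropped. This is a generic sketch of best-path CTC decoding, not the authors' implementation; the function name `ctc_collapse`, the blank symbol name and the example phone labels are all hypothetical.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical name for the CTC blank symbol


def ctc_collapse(frame_labels, blank=BLANK):
    """Map per-frame labels to an output sequence using the CTC rule.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove every blank symbol.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# Hypothetical per-frame best labels for a short utterance.
frames = [BLANK, "ih", "ih", BLANK, "t", "t", BLANK, BLANK]
print(ctc_collapse(frames))  # ['ih', 't']
```

A blank between two identical labels keeps them distinct, so `["s", BLANK, "s"]` collapses to two separate `s` phones. In an MDD setting, a collapsed phone sequence like this could then be compared with the canonical pronunciation to flag mismatches.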
Pages: 8132-8136
Number of pages: 5