Automatic Miscue Detection using RNN Based Models With Data Augmentation

Authors
Hong, Yoon Seok [1]
Ki, Kyung Seo [1]
Gweon, Gahgene [1]
Affiliations
[1] Seoul National University, Graduate School of Convergence Science and Technology, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
automatic miscue detection; phoneme classification; data augmentation; recurrent neural network
DOI
10.21437/Interspeech.2018-1644
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes using data augmentation to address the shortage of labeled data in miscue detection tasks. The method has three main steps. First, a phoneme classifier was developed to produce force-aligned data for use in miscue classification and data augmentation. To build the phoneme classifier, phonetic features of the "Seoul Reading Speech" (SRS) corpus were extracted using grapheme-to-phoneme (G2P) conversion and used to train CNN-based models. Second, to obtain a miscue-labeled corpus, we performed data augmentation on the phoneme classifier output, producing an artificially generated miscue corpus of SRS (modified-SRS). This miscue corpus was created by randomly deleting or modifying sound sections according to three miscue categories: extension (EXT), pause (PAU), and pre-correction (PRE). Third, three types of RNN-based models (LSTM, BiLSTM, BiGRU) were trained on the modified-SRS corpus, and the performance of the resulting miscue classifiers was tested. The results show that the BiGRU model performed best on augmented data, with an F1-score of 0.819, while the BiLSTM model performed best on real data, at 0.512.
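The augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how each miscue category is synthesized, so everything below is an assumption made for illustration: the function name `augment`, the `pause_len` parameter, the "OK" label, and the choice to model EXT as a stretched phoneme, PAU as inserted silence, and PRE as a truncated false start before the full phoneme. It is not the authors' implementation.

```python
import random

# Labels named after the three miscue categories in the abstract.
MISCUE_TYPES = ("EXT", "PAU", "PRE")


def augment(samples, segments, miscue, rng, pause_len=4):
    """Insert one synthetic miscue and return (new_samples, labels).

    samples  : list of floats standing in for a waveform
    segments : list of (start, end) index pairs, one per aligned phoneme
               (hypothetical stand-in for forced-alignment output)
    miscue   : one of MISCUE_TYPES
    rng      : random.Random instance used to pick the phoneme
    """
    start, end = rng.choice(segments)
    chunk = samples[start:end]

    if miscue == "EXT":
        # Assumed model of extension: repeat every sample of the phoneme twice.
        inserted = [s for s in chunk for _ in range(2)]
        new = samples[:start] + inserted + samples[end:]
        pos = start
    elif miscue == "PAU":
        # Assumed model of pause: silence inserted right after the phoneme.
        inserted = [0.0] * pause_len
        new = samples[:end] + inserted + samples[end:]
        pos = end
    elif miscue == "PRE":
        # Assumed model of pre-correction: a truncated copy (false start)
        # placed before the complete phoneme.
        inserted = chunk[: max(1, len(chunk) // 2)]
        new = samples[:start] + inserted + samples[start:]
        pos = start
    else:
        raise ValueError(f"unknown miscue type: {miscue}")

    # Per-sample labels: the modified region carries the miscue label.
    labels = ["OK"] * len(new)
    for i in range(pos, pos + len(inserted)):
        labels[i] = miscue
    return new, labels


# Usage: build one augmented example per miscue category.
rng = random.Random(0)
wave = [float(i) for i in range(10)]
aligned = [(2, 6)]
examples = {m: augment(wave, aligned, m, rng) for m in MISCUE_TYPES}
```

Per-sample labels of this kind could serve as targets for a sequence model such as the RNN variants named in the abstract, although the actual label granularity used in the paper is not stated here.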
Pages: 1646-1650
Number of pages: 5