ASR ERROR DETECTION AND RECOGNITION RATE ESTIMATION USING DEEP BIDIRECTIONAL RECURRENT NEURAL NETWORKS

Cited by: 0
Authors
Ogawa, Atsunori [1]
Hori, Takaaki [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan
Keywords
Automatic speech recognition; error detection; recognition rate estimation; deep bidirectional recurrent neural networks; generalization ability; LSTM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recurrent neural networks (RNNs) have recently been applied as classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three ASR error detection tasks: confidence estimation, out-of-vocabulary word detection, and error type classification. We also estimate recognition rates from the error type classification results. Experimental results show that the DBRNNs greatly outperform conditional random fields (CRFs), especially in detecting infrequent error labels. The DBRNNs also slightly outperform the CRFs in recognition rate estimation. In addition, experiments with a reduced amount of training data suggest that the DBRNNs generalize better than the CRFs, owing to their representation of words as vectors in a low-dimensional continuous space. Indeed, the DBRNNs trained on only 20% of the training data achieve higher error detection performance than the CRFs trained on the full training data.
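The abstract does not say how recognition rates are derived from the error type classification results. The following is a minimal sketch under stated assumptions, not the authors' method: it assumes each recognized word is tagged as correct (`C`), substitution (`S`), or insertion (`I`), that the number of deleted words is estimated separately, and that the standard percent-correct and word-accuracy definitions apply. The function name `estimate_recognition_rates` and the parameter `num_deletions` are hypothetical.

```python
from collections import Counter


def estimate_recognition_rates(labels, num_deletions=0):
    """Estimate percent correct and word accuracy from error type labels.

    labels: predicted error type per recognized word, one of
        'C' (correct), 'S' (substitution), 'I' (insertion).
        This label set is an assumption made for illustration.
    num_deletions: estimated number of deleted reference words
        (assumed to be supplied separately).
    """
    counts = Counter(labels)
    c, s, i = counts["C"], counts["S"], counts["I"]

    # Reference length: every reference word is either recognized
    # correctly, substituted, or deleted. Insertions are not in the reference.
    n_ref = c + s + num_deletions
    if n_ref == 0:
        raise ValueError("estimated reference length is zero")

    percent_correct = 100.0 * c / n_ref
    # Word accuracy additionally penalizes insertions.
    accuracy = 100.0 * (c - i) / n_ref
    return percent_correct, accuracy


# Example: 6 correct words, 1 substitution, 1 insertion, 1 estimated deletion.
example_labels = ["C", "C", "S", "C", "I", "C", "C", "C"]
pc, acc = estimate_recognition_rates(example_labels, num_deletions=1)
print(f"percent correct = {pc:.1f}, accuracy = {acc:.1f}")
# prints "percent correct = 75.0, accuracy = 62.5"
```

Because these rates are computed from predicted labels rather than from a reference transcript, they are only as reliable as the error type classifier that produced the labels.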
Pages: 4370-4374
Page count: 5