Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum

Cited by: 0
Authors
Santoso, Jennifer [1]
Yamada, Takeshi [1]
Makino, Shoji [1]
Affiliation
[1] Univ Tsukuba, Tsukuba, Ibaraki, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
NETWORKS;
DOI
Not available
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
In this paper, we address the problem of classifying four common utterance characteristics related to utterance speed that cause speech recognition errors. We previously proposed using bidirectional long short-term memory (BLSTM) as a classifier and the modulation spectrum as an acoustic feature. However, its performance is still insufficient, because BLSTM classifies the utterance characteristics from the utterance as a whole, whereas most recognition errors caused by utterance characteristics occur in only a small part of the utterance. We therefore propose enhancing the classifier with an attention mechanism (attention-based BLSTM). Attention-based BLSTM enables the classifier to weight each frame according to its importance instead of directly measuring overall information from the speech. Furthermore, we investigate how utterance characteristics correspond to different modulation spectrum block lengths. To evaluate the proposed method, we conducted a classification experiment on Japanese conversational speech with four utterance characteristics: 'fast', 'slow', 'filler', and 'stutter'. The proposed method improved the F-score by 0.033-0.129 compared with the previously proposed BLSTM-based method. This result confirms the effectiveness of attention-based BLSTM in classifying the causes of errors based on utterance characteristics.
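The frame-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's attention formulation, so everything here is an assumption: dot-product scoring against a hypothetical learned vector `query`, plain Python lists standing in for frame-level BLSTM outputs, and the function names `attention_pool` and `mean_pool`. It only contrasts attention-weighted pooling with uniform averaging over the whole utterance.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each frame by its relevance, then sum.

    frames: list of T vectors, standing in for frame-level BLSTM outputs.
    query:  hypothetical learned vector (an assumption, not from the paper).
    Returns (weights, pooled) where weights sum to 1.
    """
    # One scalar relevance score per frame (dot product with the query).
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Utterance-level vector: weighted sum of the frame vectors.
    pooled = [sum(w * h[d] for w, h in zip(weights, frames)) for d in range(dim)]
    return weights, pooled


def mean_pool(frames):
    """Baseline: every frame contributes equally to the utterance vector."""
    dim = len(frames[0])
    return [sum(h[d] for h in frames) / len(frames) for d in range(dim)]


if __name__ == "__main__":
    # Toy utterance: only the third frame carries a strong cue.
    frames = [[0.1, 0.0], [0.2, 0.1], [3.0, 2.0], [0.0, 0.1]]
    weights, pooled = attention_pool(frames, query=[1.0, 1.0])
    print("attention weights:", weights)
    print("attention-pooled: ", pooled)
    print("mean-pooled:      ", mean_pool(frames))
```

In this toy input the third frame receives most of the weight, so the pooled vector stays close to that frame, while the mean-pooled vector dilutes it with the other frames. In a trained model the scoring parameters would be learned jointly with the classifier.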
Pages: 302-306
Page count: 5
Related papers
50 in total
  • [1] Classification of causes of speech recognition errors using attention-based bidirectional long short-term memory and modulation spectrum
    Santoso, Jennifer
    Yamada, Takeshi
    Makino, Shoji
    [J]. 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019, 2019, : 302 - 306
  • [2] Attention-Based Convolution Skip Bidirectional Long Short-Term Memory Network for Speech Emotion Recognition
    Zhang, Huiyun
    Huang, Heming
    Han, Henry
    [J]. IEEE ACCESS, 2021, 9 : 5332 - 5342
  • [3] Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification
    Zhou, Peng
    Shi, Wei
    Tian, Jun
    Qi, Zhenyu
    Li, Bingchen
    Hao, Hongwei
    Xu, Bo
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2016), VOL 2, 2016, : 207 - 212
  • [4] Hyperspectral Image Classification Using Attention-Based Bidirectional Long Short-Term Memory Network
    Mei, Shaohui
    Li, Xingang
    Liu, Xiao
    Cai, Huimin
    Du, Qian
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [5] Attention-based Bidirectional Long Short-Term Memory Networks for Relation Classification Using Knowledge Distillation from BERT
    Wang, Zihan
    Yang, Bo
    [J]. 2020 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2020, : 562 - 568
  • [6] Image Captioning with Bidirectional Semantic Attention-Based Guiding of Long Short-Term Memory
    Cao, Pengfei
    Yang, Zhongyi
    Sun, Liang
    Liang, Yanchun
    Yang, Mary Qu
    Guan, Renchu
    [J]. NEURAL PROCESSING LETTERS, 2019, 50 (01) : 103 - 119
  • [7] Image Captioning with Bidirectional Semantic Attention-Based Guiding of Long Short-Term Memory
    Pengfei Cao
    Zhongyi Yang
    Liang Sun
    Yanchun Liang
    Mary Qu Yang
    Renchu Guan
    [J]. Neural Processing Letters, 2019, 50 : 103 - 119
  • [8] Speech emotion recognition based on convolutional neural network with attention-based bidirectional long short-term memory network and multi-task learning
    Liu, Zhen-Tao
    Han, Meng-Ting
    Wu, Bao-Han
    Rehman, Abdul
    [J]. APPLIED ACOUSTICS, 2023, 202
  • [9] Forecasting Teleconsultation Demand with an Ensemble Attention-Based Bidirectional Long Short-Term Memory Model
    Chen, Wenjia
    Yu, Lean
    Li, Jinlin
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01) : 821 - 833
  • [10] Attention-based bidirectional-long short-term memory for abnormal human activity detection
    Manoj Kumar
    Anoop Kumar Patel
    Mantosh Biswas
    S. Shitharth
    [J]. Scientific Reports, 13