Performance prediction of automatic speech recognition systems using convolutional neural networks

Authors
Elloumi, Zied [1, 2]
Lecouteux, Benjamin [2 ]
Galibert, Olivier [1 ]
Besacier, Laurent [2 ]
Affiliations
[1] Lab Natl Metrol & Essais LNE, Paris, France
[2] Univ Grenoble Alpes, CNRS, Grenoble INP, LIG, F-38000 Grenoble, France
Source
TRAITEMENT AUTOMATIQUE DES LANGUES | 2018 / Vol. 59 / No. 02
Keywords
performance prediction; large vocabulary continuous speech recognition; convolutional neural networks
DOI
Not available
Chinese Library Classification
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
This paper focuses on the ASR performance prediction task. Two prediction approaches are compared: a state-of-the-art performance prediction method based on engineered features and a new strategy based on learnt features using convolutional neural networks. We also try to better understand which information is captured by the deep model and its relation to different conditioning factors. To take advantage of this analysis, we then try to leverage these three types of information at training time through multi-task learning, which is slightly more efficient on the ASR performance prediction task.
Pages: 49-76
Number of pages: 28