Continuous Speech Recognition based on Convolutional Neural Network

Cited by: 2
Authors
Zhang, Qing-qing [1]
Liu, Yong [1]
Pan, Jie-lin [1]
Yan, Yong-hong [1]
Affiliations
[1] Chinese Acad Sci, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
Keywords
Convolutional Neural Networks; Continuous speech recognition; Local convolution; Weight-sharing; Sub-sampling
DOI
10.1117/12.2197152
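Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809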
Abstract
Convolutional Neural Networks (CNNs), which have been successful at achieving translation invariance in many image processing tasks, are investigated for continuous speech recognition in this paper. Compared with Deep Neural Networks (DNNs), which have proven successful in many speech recognition tasks, CNNs can significantly reduce neural network model size while achieving even better recognition accuracy. Experiments on the standard TIMIT speech corpus showed that CNNs outperformed DNNs in terms of accuracy even with a smaller model size.
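As a rough illustration of why weight sharing and sub-sampling can shrink a model, the sketch below counts the parameters of a fully connected first layer and of a weight-shared 1-D convolution over the frequency axis, then the positions left after pooling. Every layer size here is an assumption chosen for illustration, not a value reported in the paper, and the paper's actual architecture may differ.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(filter_width, in_channels, out_channels):
    """Weights plus biases of a 1-D convolution with full weight sharing:
    the same filters are reused at every frequency position."""
    return filter_width * in_channels * out_channels + out_channels


def conv_output_width(n_bands, filter_width, pool_size):
    """Positions left after a 'valid' convolution and non-overlapping pooling."""
    conv_width = n_bands - filter_width + 1
    return conv_width // pool_size


# Hypothetical sizes, not taken from the paper.
N_BANDS = 40        # filter-bank bands per frame
IN_CHANNELS = 11    # context frames treated as input channels
FILTER_WIDTH = 8    # local convolution window, in bands
OUT_CHANNELS = 64   # number of feature maps
POOL_SIZE = 3       # sub-sampling factor
HIDDEN_UNITS = 1024 # width of the fully connected alternative

dense = dense_params(N_BANDS * IN_CHANNELS, HIDDEN_UNITS)
conv = conv_params(FILTER_WIDTH, IN_CHANNELS, OUT_CHANNELS)
pooled = conv_output_width(N_BANDS, FILTER_WIDTH, POOL_SIZE)

print("dense first-layer parameters:", dense)
print("conv first-layer parameters: ", conv)
print("pooled positions per feature map:", pooled)
```

This compares a single layer only. The total model size also depends on the layers stacked on top of the pooled feature maps, so the ratio shown here is not a prediction of the savings reported in the paper.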
Pages: 6
Related papers
  • [1] Speech emotion recognition based on spiking neural network and convolutional neural network
    Du, Chengyan
    Liu, Fu
    Kang, Bing
    Hou, Tao
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 147
  • [2] Continuous speech recognition by convolutional neural networks
    Zhang, Qing-Qing
    Liu, Yong
    Pan, Jie-Lin
    Yan, Yong-Hong
    Gongcheng Kexue Xuebao/Chinese Journal of Engineering, 2015, 37 (09): 1212-1217
  • [3] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda S.
    Patilkulkarni S.
    Ravi V.
    H.L. G.
    Krichen M.
    Data Science and Management, 2024, 7 (01): 25-34
  • [4] Speech Emotion Recognition based on Interactive Convolutional Neural Network
    Cheng, Huihui
    Tang, Xiaoyu
    2020 IEEE 3RD INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP 2020), 2020: 163-167
  • [5] Implementation of Convolutional Neural Network for Speech Recognition
    Wang, Zhichao
    Na, Xingyu
    Liu, Yong
    Pan, Jielin
    Yan, Yonghong
    INTERNATIONAL ACADEMIC CONFERENCE ON THE INFORMATION SCIENCE AND COMMUNICATION ENGINEERING (ISCE 2014), 2014: 239-243
  • [6] Continuous Speech Emotion Recognition with Convolutional Neural Networks
    Vryzas, Nikolaos
    Vrysis, Lazaros
    Matsiola, Maria
    Kotsakis, Rigas
    Dimoulas, Charalampos
    Kalliris, George
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2020, 68 (1-2): 14-24
  • [7] Continuous speech emotion recognition with convolutional neural networks
    Vryzas, Nikolaos
    Vrysis, Lazaros
    Matsiola, Maria
    Kotsakis, Rigas
    Dimoulas, Charalampos
    Kalliris, George
    AES: Journal of the Audio Engineering Society, 2020, 68 (1-2): 14-24
  • [8] Residual Convolutional Neural Network-Based Dysarthric Speech Recognition
    Kumar, Raj
    Tripathy, Manoj
    Anand, R. S.
    Kumar, Niraj
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024, 49 (12): 16241-16251
  • [9] Constructing Speech Emotion Recognition Model Based on Convolutional Neural Network
    Kuo, Jong-Yih
    Chen, Zhao-Ming
    Lin, Hui-Chi
    2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021), 2021: 52-56
  • [10] CONVOLUTIONAL NEURAL NETWORKS-BASED CONTINUOUS SPEECH RECOGNITION USING RAW SPEECH SIGNAL
    Palaz, Dimitri
    Magimai-Doss, Mathew
    Collobert, Ronan
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 4295-4299