Continuous Speech Recognition based on Convolutional Neural Network

Cited by: 2

Authors:

Zhang, Qing-qing [1]

Liu, Yong [1]

Pan, Jie-lin [1]

Yan, Yong-hong [1]

Affiliations:

[1] Chinese Academy of Sciences, Key Laboratory of Speech Acoustics and Content Understanding, Beijing 100190, People's Republic of China

Source:

SEVENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2015) | 2015 / Vol. 9631

Keywords:

Convolutional Neural Networks; Continuous speech recognition; Local convolution; Weight-sharing; Sub-sampling

DOI:

10.1117/12.2197152

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808 ; 0809 ;

Abstract:

Convolutional Neural Networks (CNNs), which have been successful at achieving translation invariance in many image processing tasks, are investigated for continuous speech recognition in this paper. Compared with Deep Neural Networks (DNNs), which have proven successful in many speech recognition tasks, CNNs can significantly reduce model size while achieving even better recognition accuracy. Experiments on the standard TIMIT speech corpus showed that CNNs outperformed DNNs in accuracy even when the CNNs had a smaller model size.
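The model-size claim follows from the local convolution, weight-sharing, and sub-sampling named in the keywords. The sketch below is only a rough illustration of that arithmetic: the layer dimensions (40 filter-bank bands, 11 context frames, 64 filters, kernel width 8, pooling size 3) and the helper names are hypothetical and are not taken from the paper.

```python
# Illustrative parameter counts; all sizes below are assumed, not the paper's.

def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights plus biases of a 1-D convolution whose filters are shared
    across every position along the frequency axis (weight-sharing)."""
    return kernel_size * in_channels * out_channels + out_channels

def pooled_length(n_positions: int, pool_size: int) -> int:
    """Number of outputs per filter after non-overlapping sub-sampling."""
    return n_positions // pool_size

n_bands = 40      # assumed number of frequency bands per frame
n_frames = 11     # assumed context window, treated as input channels
n_filters = 64    # assumed number of convolution filters
kernel = 8        # assumed local receptive field along frequency
pool = 3          # assumed sub-sampling factor

# Local convolution without padding yields one output per valid position.
conv_positions = n_bands - kernel + 1

# A dense layer producing the same number of outputs from the same input.
dense_count = dense_params(n_bands * n_frames, conv_positions * n_filters)
conv_count = conv_params(kernel, n_frames, n_filters)
pooled = pooled_length(conv_positions, pool)

print("dense layer parameters:", dense_count)
print("conv layer parameters: ", conv_count)
print("positions after pooling:", pooled)
```

Under these assumed sizes, the shared filters need far fewer parameters than a dense layer with the same number of outputs, and sub-sampling further shrinks the input seen by the next layer.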

Pages: 6

50 items in total

[41] Speech Emotion Recognition from Spectrograms with Deep Convolutional Neural Network
Badshah, Abdul Malik
Ahmad, Jamil
Rahim, Nasir
Baik, Sung Wook
2017 INTERNATIONAL CONFERENCE ON PLATFORM TECHNOLOGY AND SERVICE (PLATCON), 2017, : 125 - 129
[42] Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network
Zisad, Sharif Noor
Hossain, Mohammad Shahadat
Andersson, Karl
BRAIN INFORMATICS, BI 2020, 2020, 12241 : 287 - 296
[43] Visual Speech Recognition of Korean Words Using Convolutional Neural Network
Lee, Sung-Won
Yu, Je-Hun
Park, Seung Min
Sim, Kwee-Bo
INTERNATIONAL JOURNAL OF FUZZY LOGIC AND INTELLIGENT SYSTEMS, 2019, 19 (01) : 1 - 9
[44] Exploring Convolutional Neural Network Structures and Optimization Techniques for Speech Recognition
Abdel-Hamid, Ossama
Deng, Li
Yu, Dong
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3365 - 3369
[45] Convolutional Time Delay Neural Network for Khmer Automatic Speech Recognition
Srun, Nalin
Leang, Sotheara
Thu, Ye Kyaw
Sam, Sethserey
2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022,
[46] Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network
Alluhaidan, Ala Saleh
Saidani, Oumaima
Jahangir, Rashid
Nauman, Muhammad Asif
Neffati, Omnia Saidani
APPLIED SCIENCES-BASEL, 2023, 13 (08):
[47] Convolutional Neural Network with Spectrogram and Perceptual Features for Speech Emotion Recognition
Zhang, Linjuan
Wang, Longbiao
Dang, Jianwu
Guo, Lili
Guan, Haotian
NEURAL INFORMATION PROCESSING (ICONIP 2018), PT IV, 2018, 11304 : 62 - 71
[48] Speech Enhancement based on Deep Convolutional Neural Network
Nuthakki, Ramesh
Masanta, Payel
Yukta, T. N.
PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021, : 770 - 775
[49] Convolutional Neural Networks for Speech Recognition
Abdel-Hamid, Ossama
Mohamed, Abdel-Rahman
Jiang, Hui
Deng, Li
Penn, Gerald
Yu, Dong
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (10) : 1533 - 1545
[50] Deep and shallow features fusion based on deep convolutional neural network for speech emotion recognition
Sun L.
Chen J.
Xie K.
Gu T.
International Journal of Speech Technology, 2018, 21 (04) : 931 - 940
