Convolutional support vector machines for speech recognition

Cited by: 14

Authors:

Passricha, Vishal [1]

Aggarwal, Rajesh Kumar [1]

Affiliations:

[1] Natl Inst Technol, Comp Engn Dept, Kurukshetra, Haryana, India

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2019 / Vol. 22 / Issue 03

Keywords:

ASR; CNN; SVM; Maximum margin; CSVM; NEURAL-NETWORKS; FEATURES; MODELS;

DOI:

10.1007/s10772-018-09584-4

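Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code: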

0808 ; 0809 ;

Abstract:

Convolutional neural networks (CNNs) have demonstrated state-of-the-art performance on automatic speech recognition. Most CNNs use a softmax activation function for prediction and are trained by minimizing the cross-entropy loss. This paper proposes a new deep architecture that combines two heterogeneous classification techniques, CNNs and support vector machines (SVMs). In the proposed model, features are learned by the convolution layers and classified by SVMs. The last layer of the CNN, i.e. the softmax layer, is replaced by SVMs to deal efficiently with high-dimensional features. This model should be interpreted as a special form of structured SVM and is named the convolutional support vector machine (CSVM). Instead of training each component separately, the parameters of the CNN and SVMs are trained jointly using frame-level max-margin, sequence-level max-margin, and state-level minimum Bayes risk criteria. The performance of CSVM is evaluated on the TIMIT and Wall Street Journal datasets for phone recognition. By combining the strengths of CNNs and SVMs, CSVM improves the result by 13.33% and 2.31% over the baseline CNN and segmental recurrent neural networks, respectively.
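As a rough illustration of swapping a softmax output layer for a max-margin one, the sketch below compares softmax cross-entropy with a multiclass (Crammer-Singer style) hinge loss computed on the same linear scores. It covers only a frame-level criterion. The sequence-level max-margin and minimum Bayes risk criteria named in the abstract are not shown, and this is not the authors' implementation. The function names, array shapes, random features, and margin value are all assumptions made for the example.

```python
import numpy as np

def softmax_cross_entropy(scores, labels):
    """Mean cross-entropy of softmax(scores) against integer class labels."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def multiclass_hinge(scores, labels, margin=1.0):
    """Mean hinge loss: max(0, margin + best wrong-class score - true-class score)."""
    idx = np.arange(len(labels))
    true_scores = scores[idx, labels]
    wrong = scores.copy()
    wrong[idx, labels] = -np.inf  # mask out the correct class
    return np.maximum(0.0, margin + wrong.max(axis=1) - true_scores).mean()

# Hypothetical setup: 4 frames, 8-dimensional features standing in for the
# output of a convolutional front end, and 3 classes.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8))
weights = rng.standard_normal((8, 3))
labels = np.array([0, 2, 1, 0])
scores = features @ weights  # linear output layer shared by both losses

ce_loss = softmax_cross_entropy(scores, labels)
hinge_loss = multiclass_hinge(scores, labels)
```

The hinge loss is zero for any frame whose correct class already beats every other class by at least the margin, whereas the cross-entropy loss stays positive for every frame, so the two losses respond differently to frames that are already classified correctly.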


Pages: 601-609

Page count: 9
