Convolutional support vector machines for speech recognition

Cited by: 14
Authors
Passricha, Vishal [1]
Aggarwal, Rajesh Kumar [1]
Affiliations
[1] Natl Inst Technol, Comp Engn Dept, Kurukshetra, Haryana, India
Keywords
ASR; CNN; SVM; Maximum margin; CSVM; NEURAL-NETWORKS; FEATURES; MODELS;
DOI
10.1007/s10772-018-09584-4
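Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code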
0808 ; 0809 ;
Abstract
Convolutional neural networks (CNNs) have demonstrated state-of-the-art performance on automatic speech recognition. Most CNNs use a softmax activation function for prediction and are trained by minimizing the cross-entropy loss. This paper proposes a new deep architecture that combines two heterogeneous classification techniques, CNNs and support vector machines (SVMs). In the proposed model, features are learned by the convolutional layers and classified by SVMs. The last layer of the CNN, i.e. the softmax layer, is replaced by SVMs to deal efficiently with high-dimensional features. This model should be interpreted as a special form of structured SVM and is named the convolutional support vector machine (CSVM). Instead of training each component separately, the parameters of the CNN and the SVMs are trained jointly using frame-level max-margin, sequence-level max-margin, and state-level minimum Bayes risk criteria. The performance of the CSVM is evaluated on the TIMIT and Wall Street Journal datasets for phone recognition. By combining the strengths of CNNs and SVMs, the CSVM improves results by 13.33% and 2.31% over the baseline CNN and segmental recurrent neural networks, respectively.
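As a rough illustration of what replacing the softmax layer with an SVM could look like at the frame level, the sketch below compares a generic multiclass hinge (max-margin) loss with softmax cross-entropy on the same linear class scores. The abstract does not give the exact training criteria, so the Crammer-Singer style hinge used here is an assumption, and the names `linear_scores`, `multiclass_hinge_loss`, and `softmax_cross_entropy` are hypothetical.

```python
import math
from typing import List, Sequence


def linear_scores(features: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """Score each class as w_k . x + b_k for one frame's feature vector."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, features)) + b
            for w, b in zip(weights, biases)]


def multiclass_hinge_loss(scores: Sequence[float], target: int,
                          margin: float = 1.0) -> float:
    """Assumed Crammer-Singer style hinge: the target score must beat the
    best competing class score by at least `margin`, else pay the gap."""
    best_wrong = max(s for k, s in enumerate(scores) if k != target)
    return max(0.0, margin + best_wrong - scores[target])


def softmax_cross_entropy(scores: Sequence[float], target: int) -> float:
    """Standard softmax cross-entropy, computed with a stable log-sum-exp."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[target]


# Toy frame: 2 features (standing in for CNN outputs), 3 hypothetical classes.
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
biases = [0.0, 0.0, 0.0]
scores = linear_scores(features, weights, biases)

hinge_correct = multiclass_hinge_loss(scores, target=1)  # margin satisfied
hinge_wrong = multiclass_hinge_loss(scores, target=0)    # margin violated
xent_correct = softmax_cross_entropy(scores, target=1)
```

In this toy case the hinge loss falls to zero once the correct class wins by the margin, whereas the cross-entropy stays positive. In a jointly trained model, the gradient of such a loss would also be propagated back into the convolutional layers; that part is not shown here.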
Pages: 601-609
Number of pages: 9
Related papers
50 in total
  • [21] Speaker Recognition from Coded Speech Using Support Vector Machines
    Janicki, Artur
    Staroszczyk, Tomasz
    [J]. TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 291 - 298
  • [22] VISUAL SPEECH RECOGNITION USING OPTICAL FLOW AND SUPPORT VECTOR MACHINES
    Shaikh, Ayaz A.
    Kumar, Dinesh K.
    Gubbi, Jayavardhana
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2011, 10 (02) : 167 - 187
  • [23] Structured Support Vector Machines for Noise Robust Continuous Speech Recognition
    Zhang, Shi-Xiong
    Gales, M. J. F.
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 996 - 999
  • [24] VISUAL SPEECH RECOGNITION USING DYNAMIC FEATURES AND SUPPORT VECTOR MACHINES
    Yau, Wai Chee
    Kumar, Dinesh Kant
    Arjunan, Sridhar Poosapadi
    [J]. INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2008, 8 (03) : 419 - 437
  • [25] Tone recognition of continuous Cantonese speech based on support vector machines
    Peng, G
    Wang, WSY
    [J]. SPEECH COMMUNICATION, 2005, 45 (01) : 49 - 62
  • [26] A Support Vector Machines-based rejection technique for speech recognition
    Ma, CX
    Randolph, MA
    Drish, J
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 381 - 384
  • [27] Speech Emotion Recognition using Convolutional Long Short-Term Memory Neural Network and Support Vector Machines
    Kurpukdee, Nattapong
    Koriyama, Tomoki
    Kobayashi, Takao
    Kasuriya, Sawit
    Wutiwiwatchai, Chai
    Lamsrichan, Poonlap
    [J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1744 - 1749
  • [28] Cattle Brand Recognition using Convolutional Neural Network and Support Vector Machines
    Silva, C.
    Welfer, D.
    Gioda, F. P.
    Dornelles, C.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2017, 15 (02) : 310 - 316
  • [29] Speech Recognition using Wavelet Packets, Neural Networks and Support Vector Machines
    Kulkarni, Purva
    Kulkarni, Saili
    Mulange, Sucheta
    Dand, Aneri
    Cheeran, Alice N.
    [J]. 2014 INTERNATIONAL CONFERENCE ON SIGNAL PROPAGATION AND COMPUTER TECHNOLOGY (ICSPCT 2014), 2014, : 451 - 455
  • [30] Robust Noisy Speech Recognition Using Deep Neural Support Vector Machines
    Amami, Rimah
    Ben Ayed, Dorra
    [J]. DISTRIBUTED COMPUTING AND ARTIFICIAL INTELLIGENCE, 2019, 800 : 300 - 307