A scale-rate filter selection method in the spectro-temporal domain for phoneme classification

Cited by: 2
Authors
Fartash, Mehdi [1]
Setayeshi, Saeed [2, 3]
Razzazi, Farbod [1]
Affiliations
[1] Islamic Azad Univ, Dept Elect & Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Amirkabir Univ Technol, Dept Radiat Med, Tehran, Iran
[3] Amirkabir Univ Technol, Tehran Polytech, Dept Med Radiat Engn, Tehran, Iran
Keywords
RECEPTIVE-FIELDS; SPEECH; REPRESENTATIONS;
DOI
10.1016/j.compeleceng.2012.12.013
CLC number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Recently, studies employing auditory models in speech recognition systems have increased significantly. In this paper, we propose a new evolutionarily tuned feature extraction method based on spectro-temporal analysis. In the proposed model, each phoneme has its own subspace, defined by a best scale for the spectral filter and a best rate for the temporal filter. These two parameters are obtained with a genetic cellular automata evolutionary algorithm. The features extracted from each phoneme-specific subspace are classified by a binary one-versus-rest support vector machine. Finally, a multiclass classifier for all phonemes is built by combining these sub-models. The proposed method significantly improves phoneme discrimination, especially for highly confusable phonemes. To show the efficiency of the proposed feature sets, the method was empirically compared with two baseline models. Relative to the state-of-the-art baseline model, the relative improvements in classification rate are about 10% for voiced plosives, unvoiced plosives, and nasals, and about 7.38% for front vowels. (C) 2012 Elsevier Ltd. All rights reserved.
Pages: 1537-1548
Number of pages: 12
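
The abstract describes one binary sub-model per phoneme, each working on the features of its own scale-rate filter pair, with the sub-models combined into a multiclass decision. The sketch below illustrates only that combination step and is not the authors' implementation. It assumes the filter responses are already computed and keyed by (scale, rate), it stands in a plain linear decision function for each trained support vector machine, and it combines the sub-models by taking the highest score, which is one common one-versus-rest rule that the abstract does not confirm. The names `SubModel` and `classify`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (scale, rate) pair identifying one spectral/temporal filter combination.
ScaleRate = Tuple[float, float]


@dataclass
class SubModel:
    """One-versus-rest sub-model for a single phoneme (hypothetical structure)."""
    phoneme: str
    scale_rate: ScaleRate   # filter pair assumed to come from the evolutionary search
    weights: List[float]    # linear decision function standing in for a trained SVM
    bias: float

    def score(self, responses: Dict[ScaleRate, List[float]]) -> float:
        # Use only the features from this phoneme's own scale-rate subspace.
        features = responses[self.scale_rate]
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


def classify(sub_models: List[SubModel],
             responses: Dict[ScaleRate, List[float]]) -> str:
    # Assumed combination rule: pick the phoneme whose sub-model scores highest.
    return max(sub_models, key=lambda m: m.score(responses)).phoneme


# Toy data: placeholder filter responses for one speech segment.
responses = {
    (1.0, 4.0): [0.9, 0.1],
    (2.0, 8.0): [0.2, 0.7],
}
sub_models = [
    SubModel("b", (1.0, 4.0), [1.0, -1.0], -0.5),
    SubModel("d", (2.0, 8.0), [-1.0, 1.0], 0.0),
]
print(classify(sub_models, responses))
```

Each sub-model reads a different entry of `responses`, which mirrors the idea that every phoneme is scored in its own subspace rather than on one shared feature vector.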