Speech recognition using cepstral articulatory features

被引:4
|
作者
Najnin, Shamima [1 ]
Banerjee, Bonny [2 ,3 ]
机构
[1] Intel Corp, Hillsboro, OR 97124 USA
[2] Univ Memphis, Inst Intelligent Syst, Memphis, TN 38152 USA
[3] Univ Memphis, Dept Elect & Comp Engn, Memphis, TN 38152 USA
基金
美国国家科学基金会;
关键词
Phoneme recognition; Acoustic feature; Cepstral articulatory feature; Inversion mapping; General regression neural network; Deep neural network; RETINAL PROJECTIONS; DEEP;
D O I
10.1016/j.specom.2019.01.002
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
Though speech recognition has been widely investigated in the past decades, the role of articulation in recognition has received scant attention. Recognition accuracy increases when recognizers are trained with acoustic features in conjunction with articulatory ones. Traditionally, acoustic features are represented by mel-frequency cepstral coefficients (MFCCs) while articulatory features are represented by the locations or trajectories of the articulators. We propose the articulatory cepstral coefficients (ACCs) as features which are the cepstral coefficients of the time-location articulatory signal. We show that ACCs yield state-of-the-art results in phoneme classification and recognition on benchmark datasets over a wide range of experiments. The similarity of MFCCs and ACCs and their superior performance in isolation and conjunction indicate that common algorithms can be effectively used for acoustic and articulatory signals.
引用
收藏
页码:26 / 37
页数:12
相关论文
共 50 条
  • [1] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    [J]. PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [2] Whispery speech recognition using adapted articulatory features
    Jou, SC
    Schultz, T
    Waibel, A
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 1009 - 1012
  • [3] Articulatory Features for "Meeting" Speech Recognition
    Metze, Florian
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 581 - 584
  • [4] Speech Emotion Recognition Using Auditory Spectrogram and Cepstral Features
    Zhao, Shujie
    Yang, Yan
    Cohen, Israel
    Zhang, Lijun
    [J]. 29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 136 - 140
  • [5] Applying articulatory features to speech emotion recognition
    Zhou, Yu
    Sun, Yanqing
    Yang, Lin
    Yan, Yonghong
    [J]. 2009 INTERNATIONAL CONFERENCE ON RESEARCH CHALLENGES IN COMPUTER SCIENCE, ICRCCS 2009, 2009, : 73 - 76
  • [6] Speech recognition using syllable and pseudo-articulatory features modeling
    Zhang, L
    [J]. PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 137 - 141
  • [7] Combining Evidences from Mel Cepstral and Cochlear Cepstral Features for Speaker Recognition Using Whispered Speech
    Raikar, Aditya
    Gandhi, Ami
    Patil, Hemant A.
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2015), 2015, 9302 : 405 - 413
  • [8] Speech Emotion Recognition Using Gammatone Cepstral Coefficients and Deep Learning Features
    Sharan, Roneel, V
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES, ICMLANT, 2023, : 139 - 142
  • [9] Recognition of reverberant speech using full cepstral features and spectral missing data
    Palomaki, Kalle J.
    Brown, Guy J.
    Barker, Jon R.
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 289 - 292
  • [10] NMF-based Cepstral Features for Speech Emotion Recognition
    Lashkari, Milad
    Seyedin, Sanaz
    [J]. 2018 4TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2018, : 189 - 193