Phase based spectro-temporal features for building a robust ASR system

Cited by: 4
Authors
Dutta, Anirban [1]
Ashishkumar, G. [1]
Rao, Ch V. Rama [1]
Affiliations
[1] National Institute of Technology Meghalaya, Shillong, Meghalaya, India
Source
Keywords
phase; Gabor; spectro-temporal; recognition
DOI
10.21437/Interspeech.2020-2258
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Spectro-temporal feature extraction has proven robust in speech recognition. However, these features are derived from the magnitude spectrum of the complex Fourier transform (FT). In this work, we investigate whether phase information can substitute for magnitude-based spectro-temporal features. We compare several state-of-the-art phase spectra and evaluate their performance in different noisy environments. We find that the Modified Group Delay (MODGD) spectrum closely resembles the structure of the power spectrum. On average, the MODGD spectro-temporal features show a relative performance difference of 0.03% compared to the magnitude-based features. The analysis shows that phase can indeed carry information equivalent or complementary to that of magnitude-based spectro-temporal features.
Pages: 1668-1672
Page count: 5