Phase based spectro-temporal features for building a robust ASR system

Cited by: 4

Authors:

Dutta, Anirban [1]

Ashishkumar, G. [1]

Rao, Ch V. Rama [1]

Affiliations:

[1] National Institute of Technology Meghalaya, Shillong, Meghalaya, India

Source:

INTERSPEECH 2020 | 2020

Keywords:

phase; Gabor; spectro-temporal; recognition

DOI:

10.21437/Interspeech.2020-2258

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification codes:

100104; 100213

Abstract:

Spectro-temporal feature extraction has proven robust in speech recognition. However, these features are derived from the magnitude spectrum of the complex Fourier transform (FT). In this work, we investigate whether phase information can substitute for magnitude-based spectro-temporal features. We compare different state-of-the-art phase spectra and evaluate their performance. The experiments are carried out in different noisy environments. We found the Modified Group Delay (MODGD) spectrum to closely resemble the structure of the power spectrum. A relative performance difference of 0.03% on average is observed for the MODGD spectro-temporal features compared to the magnitude-based features. The analysis shows that phase can indeed carry information equivalent or complementary to that of magnitude-based spectro-temporal features.
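
For readers unfamiliar with the MODGD spectrum named in the abstract, the following is a minimal Python sketch of the group delay function and of one commonly used form of the modified group delay function. The abstract does not give the formulation or parameter values used in the paper, so the function names, the cepstral smoothing step, and the values of `alpha`, `gamma`, and `lifter` are illustrative assumptions, not the authors' settings.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Real part of the inverse DFT of a sequence."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def group_delay(x, eps=1e-12):
    """Group delay via tau = (X_R*Y_R + X_I*Y_I) / |X|^2, with Y = DFT(n * x[n])."""
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])
    return [
        (a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
        for a, b in zip(X, Y)
    ]


def cepstral_smooth(mag, lifter, eps=1e-12):
    """Smooth a magnitude spectrum by keeping only low-quefrency cepstral terms."""
    n = len(mag)
    cep = idft_real([math.log(max(m, eps)) for m in mag])
    # Symmetric low-time lifter: keep indices near 0 and their mirror images.
    cep = [v if (t < lifter or t > n - lifter) else 0.0 for t, v in enumerate(cep)]
    return [math.exp(z.real) for z in dft(cep)]


def modified_group_delay(x, alpha=0.4, gamma=0.9, lifter=8):
    """Modified group delay: replace |X|^2 by S^(2*gamma), then compress by alpha.

    alpha, gamma, and lifter are assumed, illustrative values.
    """
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])
    S = cepstral_smooth([abs(a) for a in X], lifter)
    out = []
    for a, b, s in zip(X, Y, S):
        tau = (a.real * b.real + a.imag * b.imag) / (s ** (2 * gamma))
        # Keep the sign of tau while compressing its magnitude.
        out.append(math.copysign(abs(tau) ** alpha, tau))
    return out


# Sanity check: an impulse delayed by 3 samples has a flat group delay of 3.
frame = [0.0] * 32
frame[3] = 1.0
print(group_delay(frame)[:4])
```

Replacing the raw squared magnitude in the denominator with a smoothed envelope is what suppresses the spikes that the plain group delay shows near spectral zeros. In a feature pipeline, a spectrum of this kind would stand in for the power spectrum before spectro-temporal (for example, Gabor) filtering.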


Pages: 1668-1672

Number of pages: 5
