Feature extraction based on perceptually non-uniform spectral compression for speech recognition

Cited by: 0

Authors:

Chu, KK [1]

Leung, SF [1]

Affiliations:

[1] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Hong Kong, Peoples R China

Source:

PROCEEDINGS OF THE 2003 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL III: GENERAL & NONLINEAR CIRCUITS AND SYSTEMS | 2003

Keywords:

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The power law of hearing, used to approximate the loudness function, has an exponent that decreases from about 0.3 for a narrow-band tone to 0.23 for a broadband uniform-exciting noise. Exploiting this psychoacoustic property of hearing, this paper proposes a new feature extraction method for robust speech recognition. In this method, larger energy compression is applied to the broadband-like high-frequency bands of each frame's power spectrum, instead of the fixed compression applied to all frequency bands in root cepstral analysis or PLP analysis. In addition, sound segments with broadband characteristics are also given larger compression, using frame energy as the measuring index. The scatter of the feature vectors and the class discrimination for phonemes are compared between the new method and traditional feature extraction techniques. The features derived from the new scheme show smaller variation and better class discrimination than the traditional features. Significant improvement in recognition accuracy is also obtained in white-noise environments, especially at very low SNR.
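The band-dependent compression described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear interpolation of the exponent across bands, the function names `band_exponents` and `compress_frame`, and the energy floor are all assumptions made for illustration. Only the two endpoint exponents (about 0.3 and 0.23) come from the abstract. The frame-energy-dependent compression is omitted because the abstract does not state how frame energy maps to the exponent.

```python
import numpy as np


def band_exponents(n_bands, low_freq_exp=0.3, high_freq_exp=0.23):
    """Hypothetical exponent schedule: one exponent per frequency band.

    Assumption: the exponent falls linearly from low_freq_exp at the
    lowest band to high_freq_exp at the highest band. A smaller
    exponent means stronger compression.
    """
    return np.linspace(low_freq_exp, high_freq_exp, n_bands)


def compress_frame(band_energies, exponents, floor=1e-12):
    """Apply a per-band power-law compression to one frame.

    band_energies: non-negative band energies of a single frame.
    exponents: per-band exponents, same length as band_energies.
    floor: assumed small constant that avoids raising zero to a power.
    """
    energies = np.maximum(np.asarray(band_energies, dtype=float), floor)
    return np.power(energies, exponents)


if __name__ == "__main__":
    exps = band_exponents(5)
    frame = np.full(5, 100.0)  # equal energy in every band
    print(compress_frame(frame, exps))
```

With equal input energy above 1 in every band, the high-frequency bands come out with smaller values than the low-frequency bands, which is the non-uniform behaviour the abstract contrasts with the single fixed exponent of root cepstral or PLP analysis.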

Pages: 726-729

Page count: 4
