Analysis and design of Wavelet-Packet Cepstral coefficients for automatic speech recognition

Cited by: 32

Authors:

Pavez, Eduardo [1]

Silva, Jorge F. [1]

Affiliations:

[1] Univ Chile, Dept Elect Engn, Santiago 4123, Chile

Source:

SPEECH COMMUNICATION | 2012 / Vol. 54 / Issue 06

Keywords:

Wavelet Packets; Filter-bank analysis; Automatic speech recognition; Filter-bank selection; Cepstral coefficients; The Gray code; SAMPLING THEOREM; MARKOV-MODELS; SIGNAL; REPRESENTATIONS; FILTERS;

DOI:

10.1016/j.specom.2012.02.002

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This work proposes using Wavelet-Packet Cepstral coefficients (WPPCs) as an alternative way to do filter-bank energy-based feature extraction (FE) for automatic speech recognition (ASR). The rich coverage of time-frequency properties of Wavelet Packets (WPs) is used to obtain new sets of acoustic features, in which competitive and better performances are obtained with respect to the widely adopted Mel-Frequency Cepstral coefficients (MFCCs) in the TIMIT corpus. In the analysis, concrete filter-bank design considerations are stipulated to obtain most of the phone-discriminating information embedded in the speech signal, where the filter-bank frequency selectivity, and better discrimination in the lower frequency range [200 Hz-1 kHz] of the acoustic spectrum are important aspects to consider. © 2012 Elsevier B.V. All rights reserved.


Pages: 814-835

Number of pages: 22
