Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification

Cited by: 190

Authors:

Valero, Xavier [1]

Alias, Francesc [1]

Affiliation:

[1] La Salle Univ Ramon Llull, GTM Grp Recerca Tecnol Media, Barcelona 08022, Spain

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2012 / Vol. 14 / Issue 06

Keywords:

Audio classification; audio scene recognition; environmental sound; feature extraction; Gammatone cepstral coefficients; ENVIRONMENTAL SOUND RECOGNITION; FREQUENCY;

DOI:

10.1109/TMM.2012.2199972

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In the context of non-speech audio recognition and classification for multimedia applications, it becomes essential to have a set of features able to accurately represent and discriminate among audio signals. Mel frequency cepstral coefficients (MFCC) have become a de facto standard for audio parameterization. Taking as a basis the MFCC computation scheme, the Gammatone cepstral coefficients (GTCCs) are a biologically inspired modification employing Gammatone filters with equivalent rectangular bandwidth bands. In this letter, the GTCCs, which have been previously employed in the field of speech research, are adapted for non-speech audio classification purposes. Their performance is evaluated on two audio corpora of 4 h each (general sounds and audio scenes), following two cross-validation schemes and four machine learning methods. According to the results, classification accuracies are significantly higher when employing GTCC rather than other state-of-the-art audio features. As a detailed analysis shows, with a similar computational cost, the GTCC are more effective than MFCC in representing the spectral characteristics of non-speech audio signals, especially at low frequencies.
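The scheme summarized in the abstract (the MFCC pipeline with the mel filterbank swapped for Gammatone filters on equivalent rectangular bandwidth bands) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`erb_bandwidth`, `erb_space`, `gammatone_ir`, `gtcc`), the frame and hop lengths, the number of bands and coefficients, and the frequency range are all assumptions chosen for illustration. The constants 24.7, 4.37, and 1.019 and the fourth-order filter come from commonly used ERB and Gammatone formulations, which may differ from those in the paper.

```python
import numpy as np

def erb_bandwidth(fc):
    """Equivalent rectangular bandwidth in Hz (a common approximation, assumed here)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_low, f_high, n_bands):
    """Center frequencies equally spaced on the ERB-rate scale."""
    to_erb = lambda f: 21.4 * np.log10(1.0 + 0.00437 * f)
    from_erb = lambda e: (10.0 ** (e / 21.4) - 1.0) / 0.00437
    return from_erb(np.linspace(to_erb(f_low), to_erb(f_high), n_bands))

def gammatone_ir(fc, sr, duration=0.05, order=4):
    """Truncated impulse response of a Gammatone filter centered at fc."""
    t = np.arange(int(round(duration * sr))) / sr
    b = 1.019 * erb_bandwidth(fc)  # bandwidth scaling, assumed value
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # normalize to unit energy

def gtcc(signal, sr, n_bands=32, n_coeffs=13, frame_len=0.03, hop=0.015, f_low=50.0):
    """Illustrative GTCC: Gammatone band energies -> log -> DCT-II."""
    signal = np.asarray(signal, dtype=float)
    n = int(round(frame_len * sr))   # samples per frame
    h = int(round(hop * sr))         # samples between frame starts
    n_frames = 1 + (len(signal) - n) // h
    centers = erb_space(f_low, 0.9 * sr / 2, n_bands)

    log_energy = np.empty((n_frames, n_bands))
    for k, fc in enumerate(centers):
        # Filter the whole signal with one Gammatone band.
        band = np.convolve(signal, gammatone_ir(fc, sr))[: len(signal)]
        for i in range(n_frames):
            frame = band[i * h : i * h + n]
            log_energy[i, k] = np.log(np.mean(frame ** 2) + 1e-12)

    # DCT-II over the band axis decorrelates the log energies, as in MFCC.
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), np.arange(n_bands) + 0.5) / n_bands)
    return log_energy @ basis.T

if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr // 5) / sr                 # 0.2 s test tone
    features = gtcc(np.sin(2 * np.pi * 440.0 * t), sr)
    print(features.shape)                        # (frames, coefficients)
```

Because ERB-spaced center frequencies are denser at low frequencies than mel-spaced ones, a filterbank of this kind devotes more bands to the low end of the spectrum, which is consistent with the abstract's remark about low-frequency behavior.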

Pages: 1684-1689

Page count: 6

50 items in total

[1] Speech Emotion Recognition Using Gammatone Cepstral Coefficients and Deep Learning Features
Sharan, Roneel, V
2023 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES, ICMLANT, 2023, : 139 - 142
[2] Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition
Adiga, Aniruddha
Magimai-Doss, Mathew
Seelamantula, Chandra Sekhar
2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
[3] Bottleneck Features based on Gammatone Frequency Cepstral Coefficients
Qi, Jun
Wang, Dong
Xu, Ji
Tejedor, Javier
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1750 - 1754
[4] Whispered speech recognition based on gammatone filterbank cepstral coefficients
B. Marković
J. Galić
Ð. Grozdić
S. T. Jovičić
M. Mijić
Journal of Communications Technology and Electronics, 2017, 62 : 1255 - 1261
[5] Whispered Speech Recognition Based on Gammatone Filterbank Cepstral Coefficients
Markovic, B.
Galic, J.
Grozdic, D.
Jovicic, S. T.
Mijic, M.
JOURNAL OF COMMUNICATIONS TECHNOLOGY AND ELECTRONICS, 2017, 62 (11) : 1255 - 1261
[6] Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs
Shi, Ziqiang
Han, Jiqing
Zheng, Tieran
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2412 - 2415
[7] Speech Based Features Applied to the Detection of Non-speech Audio Events
Vozarikova, Eva
Cizmar, Anton
12TH INTERNATIONAL CONFERENCE ON RESEARCH IN TELECOMMUNICATION TECHNOLOGIES (RTT 2010), 2010, : 125 - 128
[8] Call Analysis with Classification Using Speech and Non-Speech Features
Ju, Yun-Cheng
Wang, Ye-Yi
Acero, Alex
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1902 - 1905
[9] Boosting speech/non-speech classification using averaged Mel-frequency Cepstrum Coefficients features
Xiong, ZY
Huang, TS
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2002, PROCEEDING, 2002, 2532 : 573 - 580
[10] NON-SPEECH AUDIO EVENT DETECTION
Portelo, Jose
Bugalho, Miguel
Trancoso, Isabel
Neto, Joao
Abad, Alberto
Serralheiro, Antonio
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 1973 - 1976
