Feature selection and stacking for robust discrimination of speech, monophonic singing, and polyphonic music

Cited by: 10

Authors:

Schuller, B [1]

Schmitt, BJB [1]

Arsic, D [1]

Reiter, S [1]

Lang, M [1]

Rigoll, G [1]

Affiliation:

[1] Tech Univ Munich, Inst Human Machine Commun, D-80333 Munich, Germany

Source:

2005 IEEE International Conference on Multimedia and Expo (ICME), Vols 1 and 2 | 2005

DOI:

10.1109/ICME.2005.1521554

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this work we strive to find an optimal set of acoustic features for discriminating speech, monophonic singing, and polyphonic music, in order to robustly segment acoustic media streams for annotation and interaction purposes. Furthermore, we introduce ensemble-based classification approaches for this task. From a basis of 276 attributes we select the most efficient set by SVM-SFFS. Additionally, the relevance of single features, calculated as information gain ratio, is presented. As a basis of comparison we reduce dimensionality by PCA. We present an extensive analysis of different classifiers for this task, among them Kernel Machines, Decision Trees, and Bayesian Classifiers. Moreover, we improve single-classifier performance by Bagging and Boosting, and finally combine the strengths of the classifiers by StackingC. The database is formed by 2,114 samples of speech and singing of 58 persons. 1,000 music clips have been taken from the MTV-Europe-Top-20 1980-2000. The outstanding discrimination results of a working, real-time capable implementation underline the practicability of the proposed novel ideas.

Pages: 840-843

Page count: 4
