Feature selection and stacking for robust discrimination of speech, monophonic singing, and polyphonic music

Cited by: 10
Authors
Schuller, B [1]
Schmitt, BJB [1]
Arsic, D [1]
Reiter, S [1]
Lang, M [1]
Rigoll, G [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80333 Munich, Germany
DOI
10.1109/ICME.2005.1521554
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
In this work we strive to find an optimal set of acoustic features for discriminating speech, monophonic singing, and polyphonic music, in order to robustly segment acoustic media streams for annotation and interaction purposes. Furthermore, we introduce ensemble-based classification approaches for this task. From a basis of 276 attributes we select the most efficient set by SVM-SFFS. Additionally, the relevance of single features is presented by calculating the information gain ratio. As a basis of comparison, we reduce dimensionality by PCA. We present an extensive analysis of different classifiers on this task, among them Kernel Machines, Decision Trees, and Bayesian Classifiers. Moreover, we improve single-classifier performance by Bagging and Boosting, and finally combine the strengths of the classifiers by StackingC. The database consists of 2,114 samples of speech and singing from 58 persons. 1,000 music clips have been taken from the MTV-Europe-Top-20 1980-2000. The outstanding discrimination results of a working, real-time-capable implementation underline the practicability of the proposed novel ideas.
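The abstract ranks single features by information gain ratio. As a rough illustration only, the sketch below computes the gain ratio of one discrete feature against class labels: the information gain divided by the entropy of the feature's own value distribution. The function names and the toy data are hypothetical, and the sketch assumes features that are already discretized; the record does not say how the paper's continuous acoustic features were binned.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of a discrete feature w.r.t. class labels."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy H(class | feature), weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: three classes, two samples each.
labels = ["speech", "speech", "singing", "singing", "music", "music"]
informative = ["a", "a", "b", "b", "c", "c"]    # separates the classes
uninformative = ["x", "y", "x", "y", "x", "y"]  # carries no class information

ratio_informative = gain_ratio(informative, labels)
ratio_uninformative = gain_ratio(uninformative, labels)
```

On this toy data the first feature determines the class completely and the second is unrelated to it, so the two ratios sit at the opposite ends of the scale.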
Pages: 840-843
Page count: 4
Related papers
32 in total
  • [1] FUSING TRANSCRIPTION RESULTS FROM POLYPHONIC AND MONOPHONIC AUDIO FOR SINGING MELODY TRANSCRIPTION IN POLYPHONIC MUSIC
    Zhu, Bilei
    Wu, Fuzhang
    Li, Ke
    Wu, Yongjian
    Huang, Feiyue
    Wu, Yunsheng
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 296 - 300
  • [2] ROBUST FEATURE EXTRACTION FOR AUTOMATIC RECOGNITION OF VIBRATO SINGING IN RECORDED POLYPHONIC MUSIC
    Weninger, Felix
    Amir, Noam
    Amir, Ofer
    Ronen, Irit
    Eyben, Florian
    Schuller, Bjoern
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 85 - 88
  • [3] Feature Extraction Using Time-Frequency Analysis for Monophonic-Polyphonic Wheeze Discrimination
    Ulukaya, Sezer
    Sen, Ipek
    Kahya, Yasemin P.
    [J]. 2015 37TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2015, : 5412 - 5415
  • [4] A New Feature for Speech/Music Discrimination
    Huang, Houjun
    Xu, Yunfei
    Zhou, Ruohua
    [J]. INTERNATIONAL ACADEMIC CONFERENCE ON THE INFORMATION SCIENCE AND COMMUNICATION ENGINEERING (ISCE 2014), 2014, : 133 - 137
  • [5] Feature extraction for speech and music discrimination
    Hou, Huiyu
    Sadka, Abdul
    Jiang, Richard M.
    [J]. 2008 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2008, : 154 - 157
  • [6] Robust singing detection in speech/music discriminator design
    Chou, W
    Gu, L
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 865 - 868
  • [7] Speech/music discrimination for robust speech recognition in robots
    Choi, Mu Yeol
    Song, Hwa Jeon
    Kim, Hyung Soon
    [J]. 2007 RO-MAN: 16TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, VOLS 1-3, 2007, : 118 - +
  • [8] A fast and robust speech/music discrimination approach
    Wang, WQ
    Gao, W
    Ying, DW
    [J]. ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1325 - 1329
  • [9] Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines
    Schuller, B
    Rigoll, G
    Lang, M
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3, 2004, : 1655 - 1658
  • [10] Adaptive feature selection for speech/music classification
    Abu-El-Quran, A. R.
    Goubran, R. A.
    Chan, A. D. C.
    [J]. 2006 IEEE WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2006, : 212 - +