A comparison of features for speech, music discrimination.

Cited by: 79
Authors
Carey, MJ [1 ]
Parris, ES [1 ]
Lloyd-Thomas, H [1 ]
Affiliations
[1] Ensigma Ltd, Chepstow, Mons, England
DOI
10.1109/ICASSP.1999.758084
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Several approaches have previously been taken to the problem of discriminating between speech and music signals. These have used different features as the input to the classifier and have been trained and tested on different material. In this paper we examine the discrimination achieved by several different features using common training and test sets and the same classifier. The database assembled for these tests includes speech from thirteen languages and music from all over the world. In each case the distributions in the feature space were modelled by a Gaussian mixture model. Experiments were carried out on four types of feature: amplitude, cepstra, pitch and zero-crossings. In each case the derivative of the feature was also used and found to improve performance. The best performance resulted from using the cepstra and delta cepstra, which gave an equal error rate (EER) of 1.2%. This was closely followed by normalised amplitude and delta amplitude, which, however, used a much less complex model. The pitch and delta pitch gave an EER of 4%, which was better than the zero-crossing feature, which produced an EER of 6%.
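The abstract names three ideas that are easy to illustrate in isolation: a zero-crossing feature, a derivative ("delta") of a feature track, and the equal error rate used to compare features. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, the delta is a plain first difference (the paper's exact delta computation is not given here), the Gaussian mixture model is not reproduced, and the EER routine simply assumes that some classifier has already produced one score per segment, with higher scores meaning "more speech-like".

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def delta(values):
    """First-order difference of a feature track.

    Assumption: a simple difference stands in for the feature derivative;
    the first element is set to 0.0 so the output has the same length.
    """
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


def equal_error_rate(speech_scores, music_scores):
    """Approximate EER for scores where higher means 'more speech-like'.

    Every observed score is tried as a threshold. The function returns the
    mean of the miss rate and the false-alarm rate at the threshold where
    the two rates are closest to each other.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(speech_scores) | set(music_scores)):
        # Speech segments scored below the threshold are missed.
        miss = sum(s < t for s in speech_scores) / len(speech_scores)
        # Music segments scored at or above the threshold are false alarms.
        false_alarm = sum(m >= t for m in music_scores) / len(music_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer
```

With perfectly separated scores the sketch returns an EER of 0; with finite score sets the two error rates may never coincide exactly, which is why the closest pair is averaged.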
Pages: 149-152
Page count: 4
Related papers
50 in total
  • [1] MUSIC TONALITY FEATURES FOR SPEECH/MUSIC DISCRIMINATION
    Sell, Gregory
    Clark, Pascal
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [2] Novel features for effective speech and music discrimination
    Mubarak, Omer Mohsin
    Ambikairajah, Eliathamby
    Epps, Julien
    2006 IEEE INTERNATIONAL CONFERENCE ON ENGINEERING OF INTELLIGENT SYSTEMS, 2006, : 343 - +
  • [3] Speech music discrimination using class-specific features
    Beierholm, T
    Baggenstoss, PM
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, 2004, : 379 - 382
  • [4] Spatial discrimination.
    Wirt, SK
    Greisheimer, EM
    PROCEEDINGS OF THE SOCIETY FOR EXPERIMENTAL BIOLOGY AND MEDICINE, 1930, 28 (03): : 341 - 342
  • [5] Histogram Equalization-Based Features for Speech, Music, and Song Discrimination
    Gallardo-Antolin, Ascension
    Montero, Juan M.
    IEEE SIGNAL PROCESSING LETTERS, 2010, 17 (07) : 659 - 662
  • [6] Enhancing Speech and Music Discrimination Through the Integration of Static and Dynamic Features
    Chen, Liangwei
    Zhou, Xiren
    Tut, Qiang
    Chen, Huanhuan
    INTERSPEECH 2024, 2024, : 4318 - 4322
  • [7] Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination
    Duzenli, Timur
    Ozkurt, Nalan
    ISTANBUL UNIVERSITY-JOURNAL OF ELECTRICAL AND ELECTRONICS ENGINEERING, 2011, 11 (01): : 1355 - 1362
  • [8] NOISE ROBUST FEATURES FOR SPEECH/MUSIC DISCRIMINATION IN REAL-TIME TELECOMMUNICATION
    Fu, Zhong-Hua
    Wang, Jhing-Fa
    Xie, Lei
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 574 - +
  • [9] Improvement to speech-music discrimination using sinusoidal model based features
    Shirazi, Jalil
    Ghaemmaghami, Shahrokh
    MULTIMEDIA TOOLS AND APPLICATIONS, 2010, 50 (02) : 415 - 435