A comparison of features for speech, music discrimination.

Cited by: 79
Authors
Carey, MJ [1]
Parris, ES [1]
Lloyd-Thomas, H [1]
Affiliations
[1] Ensigma Ltd, Chepstow, Mons, England
DOI
10.1109/ICASSP.1999.758084
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Several approaches have previously been taken to the problem of discriminating between speech and music signals. These have used different features as the input to the classifier and have been trained and tested on different material. In this paper we examine the discrimination achieved by several different features using common training and test sets and the same classifier. The database assembled for these tests includes speech from thirteen languages and music from all over the world. In each case the distributions in the feature space were modelled by a Gaussian mixture model. Experiments were carried out on four types of feature: amplitude, cepstra, pitch and zero-crossings. In each case the derivative of the feature was also used and found to improve performance. The best performance resulted from using the cepstra and delta cepstra, which gave an equal error rate (EER) of 1.2%. This was closely followed by normalised amplitude and delta amplitude, which, however, used a much less complex model. The pitch and delta pitch gave an EER of 4%, which was better than the zero-crossings, which produced an EER of 6%.
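As a rough illustration of three quantities named in the abstract (a zero-crossing feature, a delta feature, and the equal error rate), the following is a minimal Python sketch. It is not the authors' implementation: the function names, the first-difference form of the delta, the threshold sweep, and the convention that a higher score means "more speech-like" are all assumptions made for the example. The scores are assumed to come from some classifier, for instance a log-likelihood ratio between a speech model and a music model.

```python
from typing import List, Sequence, Tuple


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs in a frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def delta(track: Sequence[float]) -> List[float]:
    """First-order difference of a per-frame feature track.

    Assumption: a simple difference, with the first value padded by 0.0
    so the output has the same length as the input.
    """
    return [0.0] + [b - a for a, b in zip(track, track[1:])]


def equal_error_rate(
    speech_scores: Sequence[float], music_scores: Sequence[float]
) -> Tuple[float, float]:
    """Return (eer, threshold) at the point where the miss rate and the
    false-alarm rate are closest.

    Assumption: a segment is labelled speech when its score >= threshold.
    Only thresholds equal to observed scores are tried.
    """
    best = None
    for t in sorted(set(speech_scores) | set(music_scores)):
        # Speech segments scored below the threshold are missed.
        miss = sum(s < t for s in speech_scores) / len(speech_scores)
        # Music segments scored at or above the threshold are false alarms.
        false_alarm = sum(m >= t for m in music_scores) / len(music_scores)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical toy scores, not data from the paper.
    speech = [0.9, 0.8, 0.7, 0.4]
    music = [0.1, 0.2, 0.3, 0.6]
    eer, threshold = equal_error_rate(speech, music)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With the toy scores above, one speech segment and one music segment fall on the wrong side of the chosen threshold, so both error rates are 25%. The figures reported in the abstract (1.2%, 4%, 6%) come from the authors' own data and models and cannot be reproduced from this sketch.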
Pages: 149-152
Page count: 4