Improvement to speech-music discrimination using sinusoidal model based features

Cited by: 0
Authors
Jalil Shirazi
Shahrokh Ghaemmaghami
Affiliations
[1] Islamic Azad University, Science & Research Branch
[2] Sharif University of Technology
Keywords
Audio classification; Sinusoidal model
DOI
Not available
Abstract
This paper addresses model-based audio content analysis for classifying mixed speech-music audio signals into speech and music. A new feature set based on sinusoidal modeling of audio signals is presented and evaluated. The set, which includes the variance of the birth frequencies and the duration of the longest frequency track in the sinusoidal model as measures of harmony and signal continuity, is introduced and discussed in detail. These features are used as inputs to an audio classifier and compared with typical features. Their performance is evaluated by classifying audio into speech and music with both GMM (Gaussian Mixture Model) and SVM (Support Vector Machine) classifiers. Experimental results show that the proposed features are effective for speech/music discrimination. Using only two sinusoidal model features, extracted from 1-s segments of the signal, we achieved 96.84% accuracy in the audio classification. Experimental comparisons also confirm that the sinusoidal model features outperform popular time-domain and frequency-domain features in audio classification.
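The two features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not describe their peak picking or track-linking procedure, so the greedy nearest-frequency matching below, the `max_jump_hz` threshold, and the function names `track_peaks` and `sinusoidal_features` are all assumptions made for illustration. The input is assumed to be a list of per-frame spectral peak frequencies in Hz, already extracted by some sinusoidal analysis front end.

```python
from statistics import pvariance


def track_peaks(frames, max_jump_hz=50.0):
    """Link per-frame peak frequencies into tracks (hypothetical greedy matcher).

    frames: list of lists of peak frequencies in Hz, one list per analysis frame.
    Returns a list of dicts with 'birth_frame', 'birth_freq', 'last_freq', 'length'.
    """
    finished = []
    active = []
    for t, peaks in enumerate(frames):
        unused = sorted(peaks)  # copy, so the caller's list is not modified
        next_active = []
        for tr in active:
            # Closest peak in this frame that no other track has claimed yet.
            best = min(unused, key=lambda f: abs(f - tr["last_freq"]), default=None)
            if best is not None and abs(best - tr["last_freq"]) <= max_jump_hz:
                unused.remove(best)
                tr["last_freq"] = best
                tr["length"] += 1
                next_active.append(tr)
            else:
                finished.append(tr)  # no continuation: the track ends here
        for f in unused:
            # Every unclaimed peak starts ("gives birth to") a new track.
            next_active.append(
                {"birth_frame": t, "birth_freq": f, "last_freq": f, "length": 1}
            )
        active = next_active
    return finished + active


def sinusoidal_features(frames, max_jump_hz=50.0):
    """Return (variance of birth frequencies, longest track length in frames)."""
    tracks = track_peaks(frames, max_jump_hz)
    if not tracks:
        return 0.0, 0
    birth_var = pvariance([tr["birth_freq"] for tr in tracks])
    longest = max(tr["length"] for tr in tracks)
    return birth_var, longest


if __name__ == "__main__":
    steady_tone = [[440.0]] * 10                # one partial held for 10 frames
    jumping = [[100.0], [300.0], [500.0]]       # a new track born in every frame
    print(sinusoidal_features(steady_tone))
    print(sinusoidal_features(jumping))
```

In this toy input, a sustained tone yields one long track with zero birth-frequency variance, while peaks that jump between frames yield many short tracks with widely spread birth frequencies. A feature pair like this, computed per 1-s segment, would then be passed to a classifier such as a GMM or SVM, as the abstract describes.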
Pages: 415-435
Page count: 20
Related papers
50 in total
  • [31] On the Discrimination of Speech/Music using a Time Series Regularity
    Swe, Ei Mon Mon
    Pwint, Moe
    Sattar, Farook
    ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008, : 53 - +
  • [32] Speech/music classification using speech-specific features
    Khonglah, Banriskhem K.
    Prasanna, S. R. Mahadeva
    DIGITAL SIGNAL PROCESSING, 2016, 48 : 71 - 83
  • [33] Speech/music discrimination based on wavelets for broadcast programs
    Didiot, E.
    Illina, I.
    Mella, O.
    Fohr, D.
    Haton, J.-P.
    SIGMAP 2006: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATIONS, 2006, : 151 - +
  • [34] A wavelet-based parameterization for speech/music discrimination
    Didiot, E.
    Illina, I.
    Fohr, D.
    Mella, O.
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (02): : 341 - 357
  • [35] Speech/Music Discrimination Based on Discrete Wavelet Transform
    Ntalampiras, Stavros
    Fakotakis, Nikos
    ARTIFICIAL INTELLIGENCE: THEORIES, MODELS AND APPLICATIONS, SETN 2008, 2008, 5138 : 205 - 211
  • [36] CATALOG-BASED SINGLE-CHANNEL SPEECH-MUSIC SEPARATION WITH THE ITAKURA-SAITO DIVERGENCE
    Demir, Cemil
    Cemgil, A. Taylan
    Saraclar, Murat
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 2812 - 2816
  • [37] Speech/music classification using phase-based and magnitude-based features
    Bhattacharjee, Mrinmoy
    Prasanna, S. R. Mahadeva
    Guha, Prithwijit
    SPEECH COMMUNICATION, 2022, 142 : 34 - 48
  • [38] Speech/music discrimination-based audio characterization using blind watermarking scheme
    Mezghani, Eya
    Charfeddine, Maha
    Nicolas, Henri
    Ben Amar, Chokri
    JOURNAL OF INFORMATION ASSURANCE AND SECURITY, 2016, 11 (06): : 311 - 321
  • [39] A Voice Activity Detection Using Cyclic Statistics Based on Sinusoidal Speech Model
    Dou, Hui-jing
    Bao, Chang-chun
    Li, Ru-wei
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 1240 - 1243
  • [40] NOISE ROBUST FEATURES FOR SPEECH/MUSIC DISCRIMINATION IN REAL-TIME TELECOMMUNICATION
    Fu, Zhong-Hua
    Wang, Jhing-Fa
    Xie, Lei
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 574 - +