Improvement to speech-music discrimination using sinusoidal model based features

Cited by: 0
Authors
Jalil Shirazi
Shahrokh Ghaemmaghami
Affiliations
[1] Islamic Azad University, Science & Research Branch
[2] Sharif University of Technology
Keywords
Audio classification; Sinusoidal model;
DOI
Not available
Abstract
This paper presents a model-based approach to audio content analysis for classifying mixed speech-music audio signals into speech and music. A new feature set based on sinusoidal modeling of audio signals is presented and evaluated. It includes the variance of the birth frequencies and the duration of the longest frequency track in the sinusoidal model, which serve as measures of harmony and signal continuity, and is discussed in detail. These features are used as inputs to an audio classifier and compared with typical features. Their performance is evaluated by classifying audio into speech and music with both GMM (Gaussian Mixture Model) and SVM (Support Vector Machine) classifiers. Experimental results show that the proposed features are effective for speech/music discrimination: using only two sinusoidal model features, extracted from 1-s segments of the signal, we achieved 96.84% classification accuracy. Experimental comparisons also confirm the superiority of the sinusoidal model features over popular time-domain and frequency-domain features in audio classification.
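To make the two features named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the sinusoidal analysis, so this sketch assumes simple STFT peak picking followed by greedy nearest-frequency tracking. The function names (`frame_peaks`, `track_features`) and all parameters (`n_fft`, `hop`, `max_peaks`, `rel_thresh_db`, `max_jump_hz`) are hypothetical choices made for illustration.

```python
import numpy as np

def frame_peaks(x, sr, n_fft=1024, hop=256, max_peaks=10, rel_thresh_db=-20.0):
    """Per frame, return the frequencies (Hz) of the strongest spectral peaks.

    Assumption: peaks are local maxima of the Hann-windowed magnitude spectrum
    that lie within rel_thresh_db of the frame maximum.
    """
    window = np.hanning(n_fft)
    peaks = []
    for start in range(0, len(x) - n_fft + 1, hop):
        mag = np.abs(np.fft.rfft(x[start:start + n_fft] * window))
        # Indices of local maxima (excluding the first and last bins).
        idx = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:]))[0] + 1
        if idx.size:
            floor = mag.max() * 10.0 ** (rel_thresh_db / 20.0)
            idx = idx[mag[idx] >= floor]
            idx = idx[np.argsort(mag[idx])[::-1][:max_peaks]]
        peaks.append(np.sort(idx * sr / n_fft))
    return peaks

def track_features(peaks, hop, sr, max_jump_hz=50.0):
    """Greedy frequency tracking over frames.

    Returns (variance of track birth frequencies in Hz^2,
             duration of the longest track in seconds).
    """
    active = []   # each entry: [last frequency, length in frames]
    births = []   # frequency at which each track started
    longest = 0
    for freqs in peaks:
        unused = list(freqs)
        next_active = []
        for last_freq, length in active:
            if not unused:
                continue  # no peak left to continue this track; it dies
            j = int(np.argmin([abs(f - last_freq) for f in unused]))
            if abs(unused[j] - last_freq) <= max_jump_hz:
                next_active.append([unused.pop(j), length + 1])
                longest = max(longest, length + 1)
        for f in unused:  # unmatched peaks give birth to new tracks
            births.append(f)
            next_active.append([f, 1])
            longest = max(longest, 1)
        active = next_active
    variance = float(np.var(births)) if births else 0.0
    return variance, longest * hop / sr

# Synthetic 1-s segments: a steady tone versus a tone whose pitch jumps every 0.1 s.
sr, hop = 16000, 256
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
steps = np.repeat([300, 900, 1500, 2100, 600, 1800, 1200, 2400, 450, 1650], sr // 10)
stepped = np.sin(2 * np.pi * np.cumsum(steps) / sr)

var_tone, dur_tone = track_features(frame_peaks(tone, sr, hop=hop), hop, sr)
var_step, dur_step = track_features(frame_peaks(stepped, sr, hop=hop), hop, sr)
```

In this toy comparison the steady tone yields one long track, while the stepped signal yields many short tracks born at widely spread frequencies. A real system would pass such per-segment feature pairs to a classifier such as a GMM or an SVM, as the abstract describes.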
Pages: 415-435
Page count: 20
Related papers
50 in total
  • [41] Co-channel speech separation based on sinusoidal model for speech
    Zhao, HM
    Zhou, XD
    Yu, YB
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 815 - 818
  • [42] Speech vs Music Discrimination using Empirical Mode Decomposition
    Khonglah, Banriskhem K.
    Sharma, Rajib
    Prasanna, S. R. Mahadeva
    2015 TWENTY FIRST NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2015,
  • [43] ANALYSIS OF EMOTIONAL SPEECH USING AN ADAPTIVE SINUSOIDAL MODEL
    Kafentzis, George P.
    Yakoumaki, Theodora
    Mouchtaris, Athanasios
    Stylianou, Yannis
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1492 - 1496
  • [44] Discrimination between Speech and Music Using Time Series Events
    Alnadabi, Muhammad
    Johnstone, Sherri
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 565 - +
  • [45] Speech/Music Discrimination Using Spectrum Analysis and Neural Network
    Keum, Ji-Soo
    Lim, Sung-Kit
    Lee, Hyon-Soo
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2007, 26 (05): : 207 - 213
  • [46] Enhancement of spectral contrast to speech using a sinusoidal model
    Aguilera, CM
    Navas, A
    Tejero, JC
    Gago, A
    ELECTRONICS LETTERS, 1999, 35 (23) : 1997 - 1998
  • [47] SPEAKER DEPENDENT SPEECH ENHANCEMENT USING SINUSOIDAL MODEL
    Mowlaee, Pejman
    Nachbar, Christian
    2014 14TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2014, : 80 - 84
  • [48] Voiced speech excitation synthesis using a sinusoidal model
    Pollard, MP
    Cheetham, BMG
    Goodyear, CC
    Edgington, MD
    ELECTRONICS LETTERS, 1998, 34 (06) : 531 - 532
  • [49] Speech enhancement using a constrained iterative sinusoidal model
    Jensen, J
    Hansen, JHL
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (07): : 731 - 740
  • [50] Speech/music discrimination based on a modified low energy ratio
    Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
    Qinghua Daxue Xuebao, 2008, SUPPL. (720-724):