Improvement to speech-music discrimination using sinusoidal model based features

Authors
Jalil Shirazi
Shahrokh Ghaemmaghami
Affiliations
[1] Islamic Azad University, Science & Research Branch
[2] Sharif University of Technology
Keywords
Audio classification; Sinusoidal model
DOI
Not available
Abstract
This paper presents a model-based approach to audio content analysis for classifying mixed speech-music audio signals into speech and music. A new feature set based on sinusoidal modeling of audio signals is presented and evaluated. The set includes the variance of the birth frequencies and the duration of the longest frequency track in the sinusoidal model, which serve as measures of harmony and signal continuity, and it is discussed in detail. These features are used as inputs to an audio classifier and compared with typical features. Their performance is evaluated by classifying audio into speech and music with both Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) classifiers. Experimental results show that the proposed features discriminate speech from music effectively: using only two sinusoidal model features, extracted from 1-s segments of the signal, we achieved 96.84% classification accuracy. Experimental comparisons also confirm that the sinusoidal model features outperform popular time-domain and frequency-domain features in audio classification.
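The two features named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption made for illustration: the function names `frame_peaks` and `track_features`, the frame length, hop size, peak count, relative amplitude threshold, and the greedy nearest-frequency track continuation rule are all hypothetical choices, not the authors' method. The classifier stage (GMM or SVM) is not shown.

```python
import numpy as np

def frame_peaks(frame, sr, n_peaks=5, rel_thresh=0.1):
    """Frequencies (Hz) of the strongest spectral peaks in one frame.

    n_peaks and rel_thresh are illustrative assumptions.
    """
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    # Local maxima: bins strictly larger than both neighbours.
    idx = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:]))[0] + 1
    if idx.size == 0:
        return np.array([])
    # Keep only peaks within a relative threshold of the frame maximum.
    idx = idx[mag[idx] >= rel_thresh * mag.max()]
    # Keep the n_peaks strongest, then sort by frequency.
    idx = idx[np.argsort(mag[idx])[::-1][:n_peaks]]
    return np.sort(idx * sr / len(frame))

def track_features(x, sr, frame_len=512, hop=256, max_jump_hz=50.0):
    """Return (variance of birth frequencies, longest track length in frames).

    Tracks are continued greedily: each peak joins the nearest unclaimed
    track from the previous frame if it lies within max_jump_hz; otherwise
    it starts ("gives birth to") a new track.
    """
    active = []            # [last_frequency, length] per track alive in the previous frame
    births, longest = [], 0
    for start in range(0, len(x) - frame_len + 1, hop):
        peaks = frame_peaks(x[start:start + frame_len], sr)
        next_active, used = [], set()
        for f in peaks:
            best = None
            for i, (tf, _) in enumerate(active):
                if i in used or abs(tf - f) > max_jump_hz:
                    continue
                if best is None or abs(tf - f) < abs(active[best][0] - f):
                    best = i
            if best is None:
                births.append(float(f))          # new track is born at frequency f
                next_active.append([float(f), 1])
            else:
                used.add(best)                   # continue an existing track
                next_active.append([float(f), active[best][1] + 1])
        active = next_active
        longest = max([longest] + [n for _, n in active])
    birth_var = float(np.var(births)) if births else 0.0
    return birth_var, longest
```

Under these assumptions, a steady tone should produce one long track with little spread in birth frequencies, while white noise should produce many short tracks born across the spectrum. In a full pipeline, the two values computed per 1-s segment would form the feature vector passed to a classifier.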
Pages: 415-435
Number of pages: 20