Improvement to speech-music discrimination using sinusoidal model based features

Cited by: 0

Authors:

Jalil Shirazi

Shahrokh Ghaemmaghami

Affiliations:

[1] Islamic Azad University, Science & Research Branch

[2] Sharif University of Technology

Source:

Multimedia Tools and Applications | 2010 / Vol. 50

Keywords:

Audio classification; Sinusoidal model

DOI:

Not available

Abstract:

This paper addresses model-based audio content analysis for classifying mixed speech-music audio signals into speech and music. A new set of features, based on sinusoidal modeling of audio signals, is presented and evaluated. The feature set, which includes the variance of the birth frequencies and the duration of the longest frequency track in the sinusoidal model as measures of harmony and signal continuity, is introduced and discussed in detail. These features are used as inputs to an audio classifier and compared with typical features. The performance of the sinusoidal model features is evaluated by classifying audio into speech and music using both GMM (Gaussian Mixture Model) and SVM (Support Vector Machine) classifiers. Experimental results show that the proposed features are effective for speech/music discrimination: using only two sinusoidal model features, extracted from 1-s segments of the signal, we achieved 96.84% classification accuracy. Experimental comparisons also confirm that the sinusoidal model features outperform popular time-domain and frequency-domain features in audio classification.
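
The two features named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions or its tracking algorithm, so everything here is an assumption made for illustration: the greedy peak tracker `track_peaks`, the `max_jump_hz` threshold, the track representation, and the choice to measure duration as frame count times hop size are all hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def track_peaks(frames, max_jump_hz=50.0):
    """Greedy frame-to-frame peak matching (a simplified, hypothetical tracker).

    frames: list of lists of spectral peak frequencies in Hz, one list per frame.
    Returns a list of tracks; each track is a dict holding its birth frame
    index and the sequence of frequencies it passes through.
    """
    finished, active = [], []
    for t, peaks in enumerate(frames):
        unmatched = sorted(peaks)
        still_active = []
        for track in active:
            last = track["freqs"][-1]
            # Continue the track with the closest unmatched peak, if close enough.
            best = min(unmatched, key=lambda f: abs(f - last), default=None)
            if best is not None and abs(best - last) <= max_jump_hz:
                track["freqs"].append(best)
                unmatched.remove(best)
                still_active.append(track)
            else:
                finished.append(track)  # no continuation: the track ends here
        # Every peak left unmatched starts ("gives birth to") a new track.
        for f in unmatched:
            still_active.append({"birth": t, "freqs": [f]})
        active = still_active
    return finished + active


def sinusoidal_features(tracks, hop_seconds):
    """Return (variance of birth frequencies, duration of longest track in s).

    Assumption: the birth frequency is a track's first frequency, and duration
    is approximated as (number of frames in the track) * hop_seconds.
    """
    birth_freqs = [tr["freqs"][0] for tr in tracks]
    birth_var = pvariance(birth_freqs) if birth_freqs else 0.0
    longest = max((len(tr["freqs"]) for tr in tracks), default=0) * hop_seconds
    return birth_var, longest


# Toy input: four frames of peak frequencies (made-up values, not real data).
frames = [[440.0, 880.0], [442.0, 878.0], [441.0], [1000.0]]
tracks = track_peaks(frames)
birth_var, longest = sinusoidal_features(tracks, hop_seconds=0.01)
print(len(tracks), round(birth_var, 2), round(longest, 3))
```

In a setup like the one the abstract describes, a pair of values of this kind would be computed for each 1-s segment and passed to a GMM or SVM classifier; the classifier stage is omitted here.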

Pages: 415-435

Page count: 20
