Speech/music discrimination for multimedia applications

Cited by: 0

Authors:

El-Maleh, K [1]

Klein, M [1]

Petrucci, G [1]

Kabal, P [1]

Affiliations:

[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 2A7, Canada

Source:

2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI | 2000

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on using long-term features such as differential parameters, variances, and time-averages of spectral parameters. These classifiers use features estimated over windows of 0.5-5 seconds, and are relatively complex. In this paper, we present our results of combining the line spectral frequencies (LSFs) and zero-crossing-based features for frame-level narrowband speech/music discrimination. Our classification results for different types of music and speech show the good discriminating power of these features. Our classification algorithms operate using only a frame delay of 20 ms, making them suitable for real-time multimedia applications.
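The abstract names zero-crossing-based features computed at the frame level but does not define them in this record. As a rough illustration only, the sketch below computes a per-frame zero-crossing rate over 20 ms frames. The 8 kHz sampling rate (a common narrowband rate), the function name `frame_zero_crossing_rate`, and the normalization are assumptions, not details taken from the paper. The LSF features and the classifier itself are not shown.

```python
import numpy as np


def frame_zero_crossing_rate(signal, sample_rate=8000, frame_ms=20):
    """Return the zero-crossing rate of each non-overlapping frame.

    Hypothetical helper: sample_rate=8000 is an assumed narrowband rate;
    frame_ms=20 matches the 20 ms frame delay mentioned in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)  # 160 samples at 8 kHz
    n_frames = len(signal) // frame_len
    # Drop trailing samples that do not fill a whole frame.
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # A zero crossing is a sign change between adjacent samples in a frame.
    signs = np.signbit(frames)
    crossings = np.sum(signs[:, 1:] != signs[:, :-1], axis=1)
    # Normalize by the number of adjacent sample pairs per frame.
    return crossings / (frame_len - 1)


if __name__ == "__main__":
    # One second of a 500 Hz tone and of a 2000 Hz tone at 8 kHz.
    t = np.arange(8000) / 8000.0
    low = frame_zero_crossing_rate(np.sin(2 * np.pi * 500 * t + 0.1))
    high = frame_zero_crossing_rate(np.sin(2 * np.pi * 2000 * t + 0.1))
    print(low.shape, low.mean() < high.mean())
```

Each output value depends only on the samples of its own 20 ms frame, which is the property that keeps the decision delay short. A real discriminator would combine such values with spectral features before classification.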


Pages: 2445-2448

Page count: 4

