Rhythm detection for speech-music discrimination in MPEG compressed domain

Cited by: 3

Authors:

Jarina, R [1]

O'Connor, N [1]

Marlow, S [1]

Murphy, N [1]

Affiliation:

[1] Dublin City Univ, Ctr Digital Video Proc, Dublin 9, Ireland

Source:

DSP 2002: 14TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING PROCEEDINGS, VOLS 1 AND 2 | 2002

Keywords:

DOI:

10.1109/ICDSP.2002.1027851

Chinese Library Classification:

TP31 [Computer Software];

Subject classification code:

081202 ; 0835 ;

Abstract:

A novel approach to speech-music discrimination based on rhythm (or beat) detection is introduced. Rhythmic pulses are detected by applying a long-term autocorrelation method to band-passed signals. This approach is combined with another, in which the features describe the energy peaks of the signal. The discriminator uses just three features, computed from data taken directly from an MPEG-I bitstream. The discriminator was tested on more than 3 hours of audio data. The average recognition rate is 97.7%.
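The idea of finding rhythmic pulses by long-term autocorrelation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-frame energy envelope for a single frequency band is already available (the abstract does not say how it is extracted from the bitstream), and the function name `rhythm_strength`, the frame rate, and the tempo range are hypothetical choices made for illustration only.

```python
import numpy as np

def rhythm_strength(envelope, frame_rate, min_bpm=40.0, max_bpm=200.0):
    """Return (strength, bpm) of the strongest periodicity in an envelope.

    envelope   : 1-D sequence of band energies, one value per frame (assumed input)
    frame_rate : frames per second of the envelope (assumed known)
    """
    x = np.asarray(envelope, dtype=float)
    x = x - x.mean()                      # remove DC so silence does not correlate
    energy = float(np.dot(x, x))
    if energy == 0.0:
        return 0.0, 0.0
    # Autocorrelation at non-negative lags, normalised so that lag 0 equals 1.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:] / energy
    # Restrict the search to lags that correspond to plausible tempi.
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(len(x) - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if max_lag < min_lag:
        return 0.0, 0.0
    best = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
    return float(acf[best]), 60.0 * frame_rate / best

# Synthetic check: a pulse every 0.5 s in a 10 s envelope sampled at 40 frames/s.
frame_rate = 40.0
pulses = np.zeros(400)
pulses[::20] = 1.0
strength, bpm = rhythm_strength(pulses, frame_rate)
```

A strongly periodic envelope yields a high autocorrelation peak at the beat period, while an aperiodic one does not; a value like `strength` could then serve as one feature for a speech-music classifier, though the decision rule and thresholds are not given in the abstract.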


Pages: 129-132

Page count: 4
