Rhythm detection for speech-music discrimination in MPEG compressed domain

Times cited: 3
Authors
Jarina, R [1 ]
O'Connor, N [1 ]
Marlow, S [1 ]
Murphy, N [1 ]
Affiliation
[1] Dublin City Univ, Ctr Digital Video Proc, Dublin 9, Ireland
DOI
10.1109/ICDSP.2002.1027851
CLC number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
A novel approach to speech-music discrimination based on rhythm (or beat) detection is introduced. Rhythmic pulses are detected by applying a long-term autocorrelation method to band-passed signals. This approach is combined with a second one whose features describe the energy peaks of the signal. The discriminator uses just three features, all computed from data taken directly from an MPEG-1 bitstream. It was tested on more than 3 hours of audio data, with an average recognition rate of 97.7%.
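The abstract does not spell out the autocorrelation step, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a per-frame energy envelope for one frequency band is already available (for example, derived from subband data in the bitstream), and it looks for the strongest periodicity within a plausible tempo range. The function name `rhythm_strength`, the `frame_rate` parameter, and the 40 to 200 BPM search range are all hypothetical choices made for illustration.

```python
import numpy as np


def rhythm_strength(envelope, frame_rate, min_bpm=40.0, max_bpm=200.0):
    """Return (strength, bpm) of the strongest periodicity in an envelope.

    envelope   : 1-D sequence of per-frame band energies (assumed input)
    frame_rate : envelope frames per second (assumed parameter)
    strength   : normalised autocorrelation at the best lag (1.0 at lag 0)
    """
    x = np.asarray(envelope, dtype=float)
    x = x - x.mean()  # remove the DC offset so it does not dominate
    energy = float(np.dot(x, x))
    if energy == 0.0:
        return 0.0, 0.0  # constant envelope: no periodicity to report

    # Autocorrelation over the whole excerpt; keep non-negative lags only
    # and normalise so that the value at lag 0 equals 1.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:] / energy

    # Convert the tempo range (illustrative values) into a lag range.
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(len(x) - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if max_lag < min_lag:
        return 0.0, 0.0  # excerpt too short for the requested tempo range

    lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
    return float(acf[lag]), 60.0 * frame_rate / lag


if __name__ == "__main__":
    # Synthetic check: one pulse every 0.5 s at 100 frames per second.
    pulses = np.zeros(1000)
    pulses[::50] = 1.0
    print(rhythm_strength(pulses, frame_rate=100.0))
```

A strongly rhythmic excerpt should give a pronounced autocorrelation peak, while speech-like or noisy envelopes should give a weak one; a value of this kind could serve as one feature of a discriminator, alongside features describing energy peaks.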
Pages: 129-132
Page count: 4
Related papers
50 in total
  • [31] Semi-supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition
    Demir, Cemil
    Cemgil, A. Taylan
    Saraclar, Murat
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 688 - +
  • [32] Scene change detection using shape information in MPEG-4 compressed domain
    Park, IS
    Shin, DH
    Park, RH
    CISST '04: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGING SCIENCE, SYSTEMS, AND TECHNOLOGY, 2004, : 534 - 540
  • [33] ANALYSIS OF EFFECT OF SINGLE-CHANNEL SPEECH-MUSIC SEPARATION USING NMF TO AUTOMATIC SPEECH RECOGNITION
    Demir, Cemil
    Cemgil, A. Taylan
    Saraclar, Murat
    2014 22ND SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2014, : 1818 - 1821
  • [34] Detection of AMR double compression using compressed-domain speech features
    Sampaio, Jose F. P.
    Nascimento, Francisco A. de O.
    FORENSIC SCIENCE INTERNATIONAL-DIGITAL INVESTIGATION, 2020, 33
  • [35] MUSIC TONALITY FEATURES FOR SPEECH/MUSIC DISCRIMINATION
    Sell, Gregory
    Clark, Pascal
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [36] Music and speech prosody: a common rhythm
    Hausen, Maija
    Torppa, Ritva
    Salmela, Viljami R.
    Vainio, Martti
    Sarkamo, Teppo
    FRONTIERS IN PSYCHOLOGY, 2013, 4
  • [37] Cortical tracking of rhythm in music and speech
    Harding, Eleanor E.
    Sammler, Daniela
    Henry, Molly J.
    Large, Edward W.
    Kotz, Sonja A.
    NEUROIMAGE, 2019, 185 : 96 - 101
  • [38] Compressed domain MPEG-2 video editing
    Wang, K
    Woods, JW
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 225 - 228
  • [39] Semantic video summarization in compressed domain MPEG video
    Yu, JCS
    Kankanhalli, MS
    Mulhem, P
    2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003, : 329 - 332
  • [40] Generation of the MPEG-7 descriptor in compressed domain
    Jin, SH
    Kim, CS
    Kang, HK
    Ro, YM
    IMAGE PROCESSING: ALGORITHMS AND SYSTEMS, 2002, 4667 : 160 - 169