Robust singing detection in speech/music discriminator design

Authors
Chou, W [1]
Gu, L [1]
Affiliation
[1] Bell Labs, Lucent Technologies, Murray Hill, NJ 07974 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In this paper, an approach for robust singing signal detection in speech/music discrimination is proposed and applied to audio indexing. Conventional approaches to speech/music discrimination perform reasonably well on regular music signals but often perform poorly on singing segments. This is mainly because speech and singing signals are extremely close, and the traditional features used in speech recognition do not provide a reliable cue for discriminating between them. To improve the robustness of speech/music discrimination, a new set of features derived from the harmonic coefficient and its 4 Hz modulation values is developed in this paper; these features provide additional, reliable cues for separating speech from singing. In addition, a rule-based post-filtering scheme is described that leads to further improvements in speech/music discrimination. Source-independent audio indexing experiments on the PBS Skills database indicate that the proposed approach can greatly reduce the classification error rate on singing segments in the audio stream. Compared with existing approaches, the overall segmentation error rate is reduced by more than 30%, averaged over all shows in the database.
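The abstract names two ingredients, a 4 Hz modulation measure on a frame-level feature and a rule-based post-filter, without giving their definitions. The sketch below is only an illustration of those two ideas under stated assumptions, not the authors' method: `modulation_ratio_4hz`, `merge_short_runs`, the 2 to 6 Hz band, and the minimum-run rule are all hypothetical choices, and the input trajectory stands in for whatever per-frame feature (for example, a harmonic coefficient) is being tracked.

```python
import numpy as np


def modulation_ratio_4hz(trajectory, frame_rate, band=(2.0, 6.0)):
    """Fraction of a feature trajectory's non-DC energy near 4 Hz.

    trajectory: one feature value per analysis frame (assumed input).
    frame_rate: frames per second of that trajectory.
    band: assumed modulation band around 4 Hz, in Hz.
    """
    x = np.asarray(trajectory, dtype=float)
    if x.size < 2:
        return 0.0
    x = x - x.mean()  # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / frame_rate)
    total = power[1:].sum()  # energy excluding the DC bin
    if total == 0.0:
        return 0.0
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(power[in_band].sum() / total)


def merge_short_runs(labels, min_run):
    """Hypothetical post-filter rule: relabel any run shorter than
    min_run frames with the label of the frame just before it."""
    out = list(labels)
    n = len(out)
    i = 0
    while i < n:
        j = i
        while j < n and out[j] == out[i]:
            j += 1  # advance to the end of the current run
        if j - i < min_run and i > 0:
            for k in range(i, j):
                out[k] = out[i - 1]
        i = j
    return out
```

A trajectory that fluctuates at the syllabic rate of speech would score close to 1 on the first function, while a slowly varying or rapidly varying one would score close to 0; the second function removes isolated label flips from a frame-level speech/singing decision sequence.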
Pages: 865-868
Number of pages: 4