Singing voice detection for karaoke application

Cited by: 2
Authors
Shenoy, A [1]
Wu, YS [1]
Wang, Y [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
Keywords
karaoke; singing voice; vocal segmentation; tonic; key; inverse comb filtering; rhythm; lyrics
DOI
10.1117/12.631645
Chinese Library Classification (CLC) number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
We present a framework to detect the regions of singing voice in musical audio signals. This work is oriented towards the development of a robust transcriber of lyrics for karaoke applications. The technique leverages a combination of low-level audio features and higher-level musical knowledge of rhythm and tonality. Musical knowledge of the key is used to create a song-specific filterbank that attenuates the pitched musical instruments. This is followed by subband processing of the audio to detect the musical octaves in which the vocals are present. Text processing is employed to approximate the duration of the sung passages using freely available lyrics. This estimate is used to obtain a dynamic threshold for vocal/non-vocal segmentation. This pairing of audio and text processing helps create a more accurate system. Experimental evaluation on a small database of popular songs shows the validity of the proposed approach. Holistic and per-component evaluation of the system is conducted and various improvements are discussed.
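The key-specific filtering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the filter design, so the code below assumes a simple FIR inverse comb filter, y[n] = x[n] - x[n - D] with D = round(fs / f0), cascaded over the seven notes of a major scale. The function names, the major-scale assumption, and the cascade structure are all hypothetical choices made for illustration.

```python
import numpy as np

# Semitone offsets of a major scale relative to the tonic (assumed scale type).
MAJOR_SCALE_SEMITONES = (0, 2, 4, 5, 7, 9, 11)


def inverse_comb(x, fs, f0):
    """FIR inverse comb filter y[n] = x[n] - x[n - D], D = round(fs / f0).

    The filter has nulls at integer multiples of fs / D, so a pitched tone
    whose fundamental is close to f0 is attenuated along with its harmonics.
    The first D output samples are passed through unchanged.
    """
    x = np.asarray(x, dtype=float)
    delay = int(round(fs / f0))
    if delay < 1 or delay >= len(x):
        raise ValueError("f0 gives a delay outside the signal length")
    y = x.copy()
    y[delay:] -= x[:-delay]
    return y


def major_scale_frequencies(tonic_hz):
    """Equal-tempered frequencies of the major scale built on tonic_hz."""
    return [tonic_hz * 2.0 ** (k / 12.0) for k in MAJOR_SCALE_SEMITONES]


def attenuate_key_notes(x, fs, tonic_hz):
    """Hypothetical 'song-specific filterbank': cascade one inverse comb
    filter per scale note. Because D is rounded to an integer number of
    samples, the notch frequencies are only approximate."""
    y = np.asarray(x, dtype=float)
    for f0 in major_scale_frequencies(tonic_hz):
        y = inverse_comb(y, fs, f0)
    return y
```

Under these assumptions, a tone at 200 Hz sampled at 8000 Hz (D = 40) is cancelled after the first 40 samples, while a 300 Hz tone, which falls halfway between two notches, passes through with doubled amplitude. A practical system would likely need fractional-delay filters or a higher sample rate to place the notches accurately on the scale notes.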
Pages: 752-762
Page count: 11