Robust Query-by-Singing/Humming System against Background Noise Environments

Cited by: 14
Authors
Kim, Kichul [1]
Park, Kang Ryoung [2]
Park, Sung-Joo [3]
Lee, Seok-Pil [3]
Kim, Moo Young [1]
Affiliations
[1] Sejong Univ, Dept Informat & Commun Engn, Human Comp Interact Lab, Seoul, South Korea
[2] Dongguk Univ, Div Elect & Elect Engn, Seoul, South Korea
[3] Korea Elect Technol Inst, Digital Media Res Ctr, Seoul, South Korea
Keywords
Query-by-Singing/Humming; pitch estimation; background noise; dynamic time warping; classification; speech
DOI
10.1109/TCE.2011.5955213
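Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract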
In environments with background noise, the performance of a Query-by-Singing/Humming (QbSH) system degrades considerably. Since human pitch information is used as the feature vector for the QbSH system, a noise-robust pitch-estimation algorithm is essential. Thus, a novel pitch-estimation method is proposed that integrates temporal-autocorrelation and spectral-salience methods. As a pre-processing block, spectral smoothing is applied to enhance the stationarity of the noisy input signal. To calculate the similarity between the MIDI database and the input humming signal, the dynamic time warping (DTW) algorithm is used. Jang's corpus and the AURORA2 database are selected as the humming and background noise signals, respectively. Compared with the standard pitch-estimation algorithm in the ITU-T G.729 speech codec, the proposed pitch-estimation method improves the average accuracy by 11.7% for the 0 dB signal-to-noise ratio (SNR) noise case. It also improves the top-20 ratio and mean reciprocal rank (MRR) of the proposed QbSH system, on average, by 7.4% and 0.13, respectively.
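As a rough illustration of the matching and evaluation steps named in the abstract, the sketch below computes a textbook DTW distance between two pitch sequences and then derives the top-N ratio and MRR from a list of ranks. It is not the authors' implementation: the function names, the use of MIDI note numbers as the pitch representation, the absolute-difference local cost, and the unconstrained warping path are all assumptions made for this example.

```python
import math


def dtw_distance(query, reference):
    """Textbook DTW distance between two pitch sequences (assumed: MIDI note numbers)."""
    n, m = len(query), len(reference)
    # cost[i][j] = minimum accumulated cost of aligning query[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # query frame repeats a reference frame
                cost[i][j - 1],      # reference frame repeats a query frame
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def rank_of_target(query, database, target):
    """1-based rank of the target song when the database is sorted by DTW distance."""
    ranked = sorted(database, key=lambda name: dtw_distance(query, database[name]))
    return ranked.index(target) + 1


def mean_reciprocal_rank(ranks):
    """MRR: the average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def top_n_ratio(ranks, n=20):
    """Fraction of queries whose target song appears within the top n results."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


if __name__ == "__main__":
    # Hypothetical two-song database of pitch contours.
    database = {"song_a": [60, 62, 64], "song_b": [70, 72, 74]}
    hummed = [60, 62, 62, 64]  # song_a with one frame held longer
    print(rank_of_target(hummed, database, "song_a"))
    print(mean_reciprocal_rank([1, 2, 4]), top_n_ratio([1, 25, 3], n=20))
```

Because DTW lets one frame align with several frames of the other sequence, a hummed query that holds a note longer than the reference can still match it at zero cost, which is why it suits tempo-varying humming input.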
Pages: 720-725
Page count: 6