Single-channel speech separation using empirical mode decomposition and multi pitch information with estimation of number of speakers

Cited by: 7
Authors
Prasanna Kumar M.K. [1]
Kumaraswamy R. [2]
Affiliations
[1] BMS College of Engineering, Bangalore, 560019, Karnataka
[2] Siddaganga Institute of Technology, Tumkur, 572103, Karnataka
Keywords
BSS; EMD; Excitation source; IMF; LP analysis; Multi pitch information; SCSS; SIFT
DOI
10.1007/s10772-016-9392-y
Abstract
Speech separation is an essential part of voice-processing systems such as speaker recognition, speech recognition, and hearing aids. Applying speech separation at the front end of such a system improves that system's performance. In this paper we propose a system for single-channel speech separation that combines empirical mode decomposition (EMD) with multi-pitch information. The proposed method is completely unsupervised and requires no knowledge of the underlying speakers. We apply EMD to short frames of the mixed speech to obtain a better estimate of the speech-specific information, which is derived through multi-pitch tracking. To track multi-pitch information in the mixed signal, we apply simplified inverse filter tracking (SIFT) and histogram-based pitch estimation to the excitation source information, and we also estimate the number of speakers present in the mixed signal. © 2016, Springer Science+Business Media New York.
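The abstract does not give the details of the histogram-based step, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that per-frame pitch candidates (in Hz, with 0 marking unvoiced frames) are already available, bins them into a histogram, and treats each dominant local peak as one speaker. The function names, the 10 Hz bin width, the 20% share threshold, and the sample values are all hypothetical.

```python
from collections import Counter


def pitch_histogram(pitches_hz, bin_width_hz=10.0):
    """Count voiced pitch candidates per fixed-width bin (bin index -> count)."""
    return Counter(int(p // bin_width_hz) for p in pitches_hz if p > 0)


def estimate_speakers(pitches_hz, bin_width_hz=10.0, min_share=0.2):
    """Return the centre frequency (Hz) of each dominant histogram peak.

    A bin is a peak if it holds at least `min_share` of all voiced
    candidates, is not smaller than its lower neighbour, and is larger
    than its upper neighbour. Taking the number of peaks as the number
    of speakers is an assumption made for this illustration.
    """
    hist = pitch_histogram(pitches_hz, bin_width_hz)
    total = sum(hist.values())
    peaks = []
    for b, count in sorted(hist.items()):
        if count < min_share * total:
            continue  # too few candidates to count as a speaker
        if count >= hist.get(b - 1, 0) and count > hist.get(b + 1, 0):
            peaks.append((b + 0.5) * bin_width_hz)
    return peaks


# Hypothetical per-frame pitch candidates from a two-speaker mixture.
candidates = [114, 116, 113, 117, 0, 213, 216, 212, 215, 115, 214, 0, 118]
peaks = estimate_speakers(candidates)
print(len(peaks), peaks)
```

With these made-up candidates the histogram has two clusters, one near 115 Hz and one near 215 Hz, so the sketch reports two speakers. Real pitch tracks are noisier, and the bin width and threshold would need tuning.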
Pages: 109-125
Number of pages: 16
Related papers
50 in total
  • [1] Source-Filter-Based Single-Channel Speech Separation Using Pitch Information
    Stark, Michael
    Wohlmayr, Michael
    Pernkopf, Franz
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (02): : 242 - 255
  • [2] Single-channel Multi-speakers Speech Separation Based on Isolated Speech Segments
    Shanfa Ke
    Zhongyuan Wang
    Ruimin Hu
    Xiaochen Wang
    [J]. Neural Processing Letters, 2023, 55 : 385 - 400
  • [3] SINGLE-CHANNEL SPEECH SEPARATION INTEGRATING PITCH INFORMATION BASED ON A MULTI TASK LEARNING FRAMEWORK
    Li, Xiang
    Liu, Rui
    Song, Tao
    Wu, Xihong
    Chen, Jing
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7279 - 7283
  • [4] Single-channel Multi-speakers Speech Separation Based on Isolated Speech Segments
    Ke, Shanfa
    Wang, Zhongyuan
    Hu, Ruimin
    Wang, Xiaochen
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (01) : 385 - 400
  • [5] Pitch Estimation of Noisy Speech Signals using Empirical Mode Decomposition
    Molla, Md. Khademul Islam
    Hirose, Keikichi
    Minematsu, Nobuaki
    Hasan, Md. Kamrul
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2177 - +
  • [6] SINGLE-CHANNEL SPEECH SEPARATION BY USING A SPARSE DECOMPOSITION WITH PERIODIC STRUCTURE
    Nakashizuka, Makoto
    Okumura, Hiroyuki
    Iiguni, Youji
    [J]. 2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATIONS SYSTEMS (ISPACS 2008), 2008, : 339 - 342
  • [7] A PITCH-AWARE APPROACH TO SINGLE-CHANNEL SPEECH SEPARATION
    Wang, Ke
    Soong, Frank
    Xie, Lei
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 296 - 300
  • [8] EEG single-channel seizure recognition using Empirical Mode Decomposition and Normalized Mutual Information
    Guarnizo, Cristian
    Delgado, Edilson
    [J]. 2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 1749 - 1752
  • [9] Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information
    Mowlaee, Pejman
    Kulmer, Josef
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (09) : 1521 - 1532
  • [10] Single Channel speech separation based on empirical mode decomposition and Hilbert Transform
    Krishna, Prasanna Kumar Mundodu
    Ramaswamy, Kumaraswamy
    [J]. IET SIGNAL PROCESSING, 2017, 11 (05) : 579 - 586