Single-channel speech separation using combined EMD and speech-specific information

Cited by: 9
Authors
Prasanna Kumar M.K. [1 ]
Kumaraswamy R. [2 ]
Affiliations
[1] BMS College of Engineering, Bangalore, 560019, Karnataka
[2] Siddaganga Institute of Technology, Tumkur, 572103, Karnataka
Keywords
BSS; EMD; IMF; Multi pitch information; SCSS; SIFT
DOI
10.1007/s10772-017-9468-3
Abstract
Multi-channel blind source separation (BSS) methods require more than one microphone, so there is a need for speech separation algorithms that work in the single-microphone scenario. In this paper we propose a method for single-channel speech separation (SCSS) that combines empirical mode decomposition (EMD) with speech-specific information. Speech-specific information is derived in the form of source-filter features: source features are obtained from multi-pitch information, and filter information is estimated using formant analysis. To track multi-pitch information in the mixed signal, we apply simplified inverse filter tracking (SIFT) and histogram-based pitch estimation to the excitation source information. Formant estimation is done using linear predictive (LP) analysis. Pitch and formant estimation are performed both with and without EMD for better extraction of the individual speakers in the mixture. Combining EMD with speech-specific information provides encouraging results for single-channel speech separation. © 2017, Springer Science+Business Media, LLC.
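As a rough illustration of one building block named in the abstract, formant estimation by linear predictive (LP) analysis can be sketched as follows. This is not the authors' implementation: the function names, the Hamming window, the LP order, and the frequency and bandwidth thresholds are all illustrative assumptions, and the sketch handles a single frame without any EMD or pitch tracking.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Return [1, a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k.

    Uses the autocorrelation method; names and windowing are assumptions.
    """
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation at lags 0..order.
    full = np.correlate(x, x, mode="full")
    r = full[len(x) - 1 : len(x) + order]
    # Solve the Toeplitz normal equations R a = -r[1:].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def estimate_formants(frame, fs, order=10, max_bandwidth_hz=400.0):
    """Return candidate formant frequencies in Hz, sorted ascending.

    The 90 Hz floor and the bandwidth limit are hypothetical thresholds.
    """
    roots = np.roots(lp_coefficients(frame, order))
    roots = roots[np.imag(roots) > 0]  # keep one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > 90.0) & (bandwidths < max_bandwidth_hz)
    return np.sort(freqs[keep])
```

Calling `estimate_formants(frame, fs)` on a short voiced frame returns the pole frequencies of the fitted all-pole model that are narrow enough to be plausible resonances. For a two-speaker mixture the poles of both vocal tracts are superimposed, which is presumably why the abstract pairs formant analysis with pitch information and EMD.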
Pages: 1037-1047
Page count: 10
Related papers
50 in total
  • [31] Linear regression on sparse features for single-channel speech separation
    Schmidt, Mikkel N.
    Olsson, Rasmus K.
    [J]. 2007 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, 2007, : 149 - 152
  • [32] Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge
    Mowlaee, P.
    Saeidi, R.
    Tan, Z. -H.
    Christensen, M. G.
    Kinnunen, T.
    Franti, P.
    Jensen, S. H.
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 684 - +
  • [33] A Joint Approach for Single-Channel Speaker Identification and Speech Separation
    Mowlaee, Pejman
    Saeidi, Rahim
    Christensen, Mads Græsbøll
    Tan, Zheng-Hua
    Kinnunen, Tomi
    Franti, Pasi
    Jensen, Soren Holdt
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (09): : 2586 - 2601
  • [34] Weak Speech Recovery for Single-Channel Speech Enhancement
    Wong, Arthur
    Ming, Kok
    Low, Siow Yong
    [J]. 2012 4TH INTERNATIONAL CONFERENCE ON INTELLIGENT AND ADVANCED SYSTEMS (ICIAS), VOLS 1-2, 2012, : 627 - 631
  • [35] Single-channel Speech Separation Using Dictionary-updated Orthogonal Matching Pursuit and Temporal Structure Information
    Haiyan Guo
    Xiaoxiong Li
    Lin Zhou
    Zhenyang Wu
    [J]. Circuits, Systems, and Signal Processing, 2015, 34 : 3861 - 3882
  • [36] Single-channel Speech Separation Using Dictionary-updated Orthogonal Matching Pursuit and Temporal Structure Information
    Guo, Haiyan
    Li, Xiaoxiong
    Zhou, Lin
    Wu, Zhenyang
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2015, 34 (12) : 3861 - 3882
  • [37] Single-channel speech separation using empirical mode decomposition and multi pitch information with estimation of number of speakers
    Prasanna Kumar M.K.
    Kumaraswamy R.
    [J]. International Journal of Speech Technology, 2017, 20 (1) : 109 - 125
  • [38] CATALOG-BASED SINGLE-CHANNEL SPEECH-MUSIC SEPARATION FOR AUTOMATIC SPEECH RECOGNITION
    Demir, Cemil
    Cemgil, A. Taylan
    Saraclar, Murat
    [J]. 19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011, : 2133 - 2137
  • [39] Semi-supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition
    Demir, Cemil
    Cemgil, A. Taylan
    Saraclar, Murat
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 688 - +
  • [40] SINGLE-CHANNEL SPEECH SEPARATION INTEGRATING PITCH INFORMATION BASED ON A MULTI TASK LEARNING FRAMEWORK
    Li, Xiang
    Liu, Rui
    Song, Tao
    Wu, Xihong
    Chen, Jing
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7279 - 7283