Single channel speech separation in modulation frequency domain based on a novel pitch range estimation method

Cited by: 0

Authors:

Azar Mahmoodzadeh

Hamid Reza Abutalebi

Hamid Soltanian-Zadeh

Hamid Sheikhzadeh

Affiliations:

[1] Yazd University, Speech Processing Research Lab (SPRL), Electrical and Computer Engineering Department

[2] University of Tehran, Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering

[3] Henry Ford Health System, Image Analysis Laboratory, Department of Radiology

[4] Amirkabir University of Technology, Electrical Engineering Department

Source:

EURASIP Journal on Advances in Signal Processing | Vol. 2012

Keywords:

acoustic frequency; modulation frequency; onset and offset algorithm; pitch range estimation; speech separation

DOI:

Not available

Abstract:

Computational Auditory Scene Analysis (CASA) has been a focus of recent literature on speech separation from monaural mixtures. The performance of current CASA systems on voiced speech separation depends strongly on the robustness of the pitch frequency estimation algorithm. We propose a new system that estimates the pitch (frequency) range of a target utterance and separates the voiced portions of the target speech. The algorithm first estimates the pitch range of the target speech in each frame of data in the modulation frequency domain and then uses the estimated pitch range to segregate the target speech. Pitch range estimation is based on an onset and offset algorithm. Speech separation is performed by filtering the mixture signal with a mask extracted from the modulation spectrogram. A systematic evaluation shows that the proposed system extracts the majority of the target speech signal with minimal interference and outperforms previous systems in both pitch extraction and voiced speech separation.
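
The idea of a modulation-domain mask driven by a pitch range can be illustrated with a small sketch. This is not the authors' system: the onset and offset algorithm is not reproduced, and the pitch range is simply assumed to be known (80 to 160 Hz here). All function names, the synthetic two-source mixture, the frame settings, and the energy threshold are hypothetical choices made for illustration. The sketch computes a magnitude STFT, takes a second Fourier transform along time in each acoustic band to find the dominant modulation frequency of that band's envelope, and keeps the bands whose dominant modulation frequency lies inside the assumed pitch range.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Magnitude STFT with a Hann window; result[t][k] is frame t, acoustic bin k."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        frames.append(row)
    return frames


def dominant_modulation_hz(envelope, frame_rate):
    """Frequency (Hz) of the strongest non-DC Fourier component of a band envelope."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]
    best_bin, best_power = 0, 0.0
    for m in range(1, n // 2 + 1):
        acc = sum(centered[t] * cmath.exp(-2j * math.pi * m * t / n) for t in range(n))
        power = abs(acc) ** 2
        if power > best_power:
            best_bin, best_power = m, power
    return best_bin * frame_rate / n


def pitch_range_mask(magnitudes, frame_rate, pitch_lo, pitch_hi, rel_energy=0.2):
    """Binary mask over acoustic bins (assumption: one decision per bin, not per frame).

    A bin is kept when it carries enough energy and its envelope is modulated
    mainly at a frequency inside [pitch_lo, pitch_hi].
    """
    n_bins = len(magnitudes[0])
    envelopes = [[frame[k] for frame in magnitudes] for k in range(n_bins)]
    mean_level = [sum(env) / len(env) for env in envelopes]
    floor = rel_energy * max(mean_level)
    mask = []
    for env, level in zip(envelopes, mean_level):
        rate = dominant_modulation_hz(env, frame_rate)
        mask.append(1 if level >= floor and pitch_lo <= rate <= pitch_hi else 0)
    return mask


# Synthetic mixture (hypothetical): two harmonics of a 100 Hz "target" pitch
# and two harmonics of a 220 Hz "interferer" pitch, sampled at 8 kHz.
fs = 8000
n_samples = 856  # gives exactly 100 frames with the settings below
target = [math.cos(2 * math.pi * 500 * n / fs) + math.cos(2 * math.pi * 600 * n / fs)
          for n in range(n_samples)]
interferer = [math.cos(2 * math.pi * 1760 * n / fs) + math.cos(2 * math.pi * 1980 * n / fs)
              for n in range(n_samples)]
mixture = [a + b for a, b in zip(target, interferer)]

frame_len, hop = 64, 8  # 125 Hz acoustic resolution, 1000 Hz envelope sampling rate
mags = stft_magnitude(mixture, frame_len, hop)
mask = pitch_range_mask(mags, fs / hop, pitch_lo=80.0, pitch_hi=160.0)
```

In this toy setting, neighbouring harmonics of one source fall into the same acoustic bin and beat at that source's pitch, so the envelope of a target-dominated bin is modulated near 100 Hz and that of an interferer-dominated bin near 220 Hz. Multiplying each STFT frame by `mask` before resynthesis would then suppress the interferer-dominated bins; a practical system would need a time-varying mask and a data-driven pitch range estimate.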
