Single Channel Speech Source Separation Using Hierarchical Deep Neural Networks

Cited: 0
Authors
Noorani, Seyed Majid [1]
Seyedin, Sanaz [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
Keywords
Speech source separation; Deep neural networks; Time-frequency masks; Blind separation
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Single-channel speech source separation is a well-known task for preparing speech signals for applications such as speech recognition and speech enhancement. In this paper, we introduce a novel design for separating sources with the help of hierarchical deep neural networks and time-frequency masks. In the first hierarchy, the proposed method classifies the mixture signals into three categories according to the genders of the mixed speakers. Three further networks, one for each mixture type, then use these categorized data for speech separation. Finally, an enhancement stage improves the quality of the separated voices using an improved cost function that reduces the interference from the sources estimated in the previous stage. The required data are drawn from the TSP corpus, and the outputs of the systems are evaluated with several metrics: signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and perceptual evaluation of speech quality (PESQ). Compared with other methods, the proposed architecture performs considerably better.
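To make the hierarchy described in the abstract concrete, the sketch below shows one possible structure: a gender-mixture classifier routes each mixture to one of three mask-estimating separation networks, and an enhancement stage refines both estimates jointly. The use of PyTorch, all class and function names, layer sizes, and the STFT dimension are illustrative assumptions, not the authors' implementation; only the three-way gender classification, the per-type time-frequency-mask separators, and the enhancement stage come from the abstract.

    # Hedged sketch of the hierarchical separation pipeline (assumptions noted above).
    import torch
    import torch.nn as nn

    N_FREQ = 513        # assumed magnitude-spectrum size (1024-point STFT)
    MIX_CLASSES = 3     # male-male, female-female, male-female mixtures

    class MixtureClassifier(nn.Module):
        """First hierarchy: classify the mixture by the genders of the speakers."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(N_FREQ, 512), nn.ReLU(),
                nn.Linear(512, MIX_CLASSES))

        def forward(self, mag):                 # mag: (batch, N_FREQ)
            return self.net(mag)                # logits over the three mixture types

    class MaskSeparator(nn.Module):
        """Second hierarchy: a per-mixture-type network that predicts
        time-frequency masks for the two sources and applies them."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(N_FREQ, 1024), nn.ReLU(),
                nn.Linear(1024, 2 * N_FREQ), nn.Sigmoid())

        def forward(self, mag):
            masks = self.net(mag).view(-1, 2, N_FREQ)
            return masks * mag.unsqueeze(1)     # masked magnitude estimates

    class Enhancer(nn.Module):
        """Final stage: refine both estimates jointly so the network can
        suppress residual interference left by the separation stage
        (a stand-in for the paper's improved cost function)."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * N_FREQ, 1024), nn.ReLU(),
                nn.Linear(1024, 2 * N_FREQ), nn.ReLU())

        def forward(self, est):                 # est: (batch, 2, N_FREQ)
            return self.net(est.flatten(1)).view(-1, 2, N_FREQ)

    def separate(mag, classifier, separators, enhancer):
        """Route each frame to the separator matching its predicted mixture
        type, then pass both source estimates through the enhancer."""
        mix_type = classifier(mag).argmax(dim=-1)
        est = torch.stack([separators[int(t)](m.unsqueeze(0)).squeeze(0)
                           for m, t in zip(mag, mix_type)])
        return enhancer(est)

In a typical time-frequency masking setup of this kind, the three separators and the enhancer would be trained on the gender-labelled subsets produced by the classifier, and the estimated magnitudes would be combined with the mixture phase before inverse-STFT resynthesis.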
Pages: 466-470
Number of pages: 5
Related Papers
50 records in total (first 10 shown)
  • [1] DEEP NEURAL NETWORKS FOR SINGLE CHANNEL SOURCE SEPARATION
    Grais, Emad M.
    Sen, Mehmet Umut
    Erdogan, Hakan
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [2] Single Channel Speech Separation Using Deep Neural Network
    Chen, Linlin
    Ma, Xiaohong
    Ding, Shuxue
    [J]. ADVANCES IN NEURAL NETWORKS, PT I, 2017, 10261 : 285 - 292
  • [3] Discriminative Enhancement for Single Channel Audio Source Separation Using Deep Neural Networks
    Grais, Emad M.
    Roma, Gerard
    Simpson, Andrew J. R.
    Plumbley, Mark D.
    [J]. LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2017), 2017, 10169 : 236 - 246
  • [4] Combining Mask Estimates for Single Channel Audio Source Separation using Deep Neural Networks
    Grais, Emad M.
    Roma, Gerard
    Simpson, Andrew J. R.
    Plumbley, Mark D.
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3339 - 3343
  • [5] Towards Automated Single Channel Source Separation using Neural Networks
    Gang, Arpita
    Biyani, Pravesh
    Soni, Akshay
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3494 - 3498
  • [6] Two-Stage Single-Channel Audio Source Separation Using Deep Neural Networks
    Grais, Emad M.
    Roma, Gerard
    Simpson, Andrew J. R.
    Plumbley, Mark D.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (09) : 1469 - 1479
  • [7] JOINT TRAINING OF DEEP NEURAL NETWORKS FOR MULTI-CHANNEL DEREVERBERATION AND SPEECH SOURCE SEPARATION
    Togami, Masahito
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3032 - 3036
  • [8] SINGLE-CHANNEL MIXED SPEECH RECOGNITION USING DEEP NEURAL NETWORKS
    Weng, Chao
    Yu, Dong
    Seltzer, Michael L.
    Droppo, Jasha
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [9] BITWISE NEURAL NETWORKS FOR EFFICIENT SINGLE-CHANNEL SOURCE SEPARATION
    Kim, Minje
    Smaragdis, Paris
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 701 - 705
  • [10] A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks
    Wang, Yannan
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (07) : 1535 - 1546