Mask Estimation Using Phase Information and Inter-channel Correlation for Speech Enhancement

Cited by: 1
Authors
Sowjanya, Devi [1]
Sivapatham, Shoba [2]
Kar, Asutosh [1]
Mladenovic, Vladimir [3]
Affiliations
[1] IIITDM Kancheepuram, Dept Elect & Commun Engn, Chennai 600127, Tamil Nadu, India
[2] VIT Univ, Sch Elect Engn, Chennai 600127, Tamil Nadu, India
[3] Univ Kragujevac, Fac Tech Sci Cacak, Cacak, Serbia
Keywords
Inter-channel correlation; Phase difference; Deep neural network; Speech enhancement; Ideal ratio mask; Objective metrics; TRAINING TARGETS; NOISE; RATIO; INTELLIGIBILITY; SEPARATION; ALGORITHM; DATABASE;
DOI
10.1007/s00034-022-01981-0
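Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract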
Masking-based approaches are the most commonly used training targets; they map noisy speech to time-frequency (T-F) units and have a marked impact on the performance of supervised learning algorithms. Traditional T-F masks such as the ideal ratio mask (IRM) perform strongly but are limited to the magnitude domain. The bounded IRM with phase constraint (BIRMP) includes the phase difference but does not exploit channel correlation, whereas the proposed ratio mask (pRM) considers channel correlation but is computed only in the magnitude domain. This work proposes a new mask, the phase correlation ideal ratio mask (PCIRM), which includes both the inter-channel correlation and the phase difference between the noisy speech (N-S), noise (N), and clean speech (C-S). Considering these factors increases the proportion of C-S and decreases the proportion of unwanted noise in the speech components, and conversely for the noise components, making the mask more precise. Experiments are conducted at different SNR levels using the TIMIT and NOISEX-92 datasets, and the results are compared with existing state-of-the-art approaches. The results show that the proposed mask outperforms BIRMP and pRM in terms of speech quality and intelligibility.
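For orientation, the conventional IRM that the abstract takes as its baseline can be sketched as follows. This is only the widely used magnitude-domain definition, (S² / (S² + N²))^β with β = 0.5; it is not the PCIRM, whose formula the abstract does not give. The function name, the sample magnitudes, and the handling of silent units are assumptions made for illustration.

```python
def ideal_ratio_mask(clean_mag, noise_mag, beta=0.5):
    """Conventional IRM for a single T-F unit (illustrative helper, not from the paper).

    clean_mag and noise_mag are the magnitudes of clean speech and noise in
    that unit. Phase is ignored, which is the limitation the abstract notes.
    """
    clean_pow = clean_mag ** 2
    noise_pow = noise_mag ** 2
    total = clean_pow + noise_pow
    if total == 0.0:
        # Silent unit: return 0 by convention (an assumption of this sketch).
        return 0.0
    return (clean_pow / total) ** beta


# Made-up 2x2 grid of T-F magnitudes (rows: frequency bins, columns: frames).
clean = [[3.0, 0.0], [1.0, 2.0]]
noise = [[4.0, 1.0], [1.0, 0.0]]

mask = [
    [ideal_ratio_mask(c, n) for c, n in zip(clean_row, noise_row)]
    for clean_row, noise_row in zip(clean, noise)
]
```

Each mask value lies between 0 (noise only) and 1 (speech only), so multiplying it with the noisy magnitude attenuates noise-dominated units while leaving the noisy phase untouched.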
Pages: 4117-4135
Page count: 19
Related papers
50 in total
  • [1] Mask Estimation Using Phase Information and Inter-channel Correlation for Speech Enhancement
    Devi Sowjanya
    Shoba Sivapatham
    Asutosh Kar
    Vladimir Mladenovic
    [J]. Circuits, Systems, and Signal Processing, 2022, 41 : 4117 - 4135
  • [2] A Single Image Enhancement using Inter-channel Correlation
    Kim, Jin
    Jeong, Soowoong
    Kim, Yong-Ho
    Lee, Sangkeun
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2012, : 496 - 497
  • [3] Multiple Speech Source Separation Using Inter-Channel Correlation and Relaxed Sparsity
    Jia, Maoshen
    Sun, Jundai
    Zheng, Xiguang
    [J]. APPLIED SCIENCES-BASEL, 2018, 8 (01):
  • [4] Mask estimation incorporating phase-sensitive information for speech enhancement
    Wang, Xianyun
    Bao, Changchun
    [J]. APPLIED ACOUSTICS, 2019, 156 : 101 - 112
  • [5] Demosaicking using inter-channel correlation in wavelet domain
    Kim, Hyuk Su
    Jeong, Bo Gyu
    Kim, Sang Soo
    Eom, Il Kyu
    [J]. PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING, 2007, : 109 - 114
  • [6] Target Speech Detection Based on Microphone Array Using Inter-channel Phase Differences
    Guo, Yanmeng
    Li, Kai
    Fu, Qiang
    Yan, Yonghong
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2012, : 247 - 248
  • [7] Estimation of Inter-Channel Phase Differences using Non-Negative Matrix Factorization
    Kayser, Hendrik
    Anemueller, Joern
    Adiloglu, Kamil
    [J]. 2014 IEEE 8TH SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP (SAM), 2014, : 77 - 80
  • [8] Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information
    Mowlaee, Pejman
    Kulmer, Josef
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (09) : 1521 - 1532
  • [9] Phase Estimation in Single Channel Speech Enhancement Using Phase Decomposition
    Kulmer, Josef
    Mowlaee, Pejman
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (05) : 598 - 602
  • [10] A Demosaicking Algorithm with Adaptive Inter-Channel Correlation
    Duran, Joan
    Buades, Antoni
    [J]. IMAGE PROCESSING ON LINE, 2015, 5 : 311 - 327