Mask Estimation Using Phase Information and Inter-channel Correlation for Speech Enhancement

Cited by: 1
Authors
Sowjanya, Devi [1]
Sivapatham, Shoba [2]
Kar, Asutosh [1]
Mladenovic, Vladimir [3]
Affiliations
[1] IIITDM Kancheepuram, Dept Elect & Commun Engn, Chennai 600127, Tamil Nadu, India
[2] VIT Univ, Sch Elect Engn, Chennai 600127, Tamil Nadu, India
[3] Univ Kragujevac, Fac Tech Sci Cacak, Cacak, Serbia
Keywords
Inter-channel correlation; Phase difference; Deep neural network; Speech enhancement; Ideal ratio mask; Objective metrics; TRAINING TARGETS; NOISE; RATIO; INTELLIGIBILITY; SEPARATION; ALGORITHM; DATABASE;
DOI
10.1007/s00034-022-01981-0
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
The most commonly used training target is the masking-based approach, which maps noisy speech to time-frequency (T-F) units and has a remarkable impact on the performance of supervised learning algorithms. Traditional T-F masks such as the ideal ratio mask (IRM) perform strongly but are limited to the magnitude domain. The bounded IRM with phase constraint (BIRMP) includes the phase difference but does not exploit channel correlation, whereas the proposed ratio mask (pRM) considers channel correlation but is computed only in the magnitude domain. This work proposes a new mask, the phase correlation ideal ratio mask (PCIRM), which includes both the inter-channel correlation and the phase difference between the noisy speech (N-S), noise (N) and clean speech (C-S). Considering these factors increases the percentage of C-S and decreases the percentage of unwanted noise in the speech components, and conversely for the noise components, which makes the mask more precise. Experiments are conducted at different SNR levels using the TIMIT and NOISEX-92 datasets, and the results are compared with existing state-of-the-art approaches. The results show that the proposed mask performs better than BIRMP and pRM in terms of speech quality and intelligibility.
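For readers unfamiliar with the baseline, the following is a minimal sketch of the conventional IRM that the abstract contrasts against, not of the proposed PCIRM, whose formula this record does not give. It assumes the commonly used definition IRM = (|S|^2 / (|S|^2 + |N|^2))^beta with beta = 0.5; the function names, the `eps` guard, and the random stand-in spectrograms are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def ideal_ratio_mask(clean_spec, noise_spec, beta=0.5, eps=1e-12):
    """Conventional IRM per T-F unit (assumed definition, beta=0.5)."""
    clean_pow = np.abs(clean_spec) ** 2
    noise_pow = np.abs(noise_spec) ** 2
    # eps only guards against division by zero in silent T-F units.
    return (clean_pow / (clean_pow + noise_pow + eps)) ** beta

def apply_mask(noisy_spec, mask):
    """Scale the noisy magnitude by the mask; the noisy phase is reused as-is."""
    return mask * np.abs(noisy_spec) * np.exp(1j * np.angle(noisy_spec))

# Random complex arrays stand in for STFTs (frequency bins x frames).
rng = np.random.default_rng(0)
clean = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
noise = 0.5 * (rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5)))
noisy = clean + noise

mask = ideal_ratio_mask(clean, noise)
enhanced = apply_mask(noisy, mask)
```

Only magnitudes enter the mask here, and the enhanced signal inherits the noisy phase unchanged, which is the magnitude-domain limitation the abstract attributes to the IRM.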
Pages: 4117-4135
Number of pages: 19