On supervised LPC estimation training targets for augmented Kalman filter-based speech enhancement

Cited by: 3
Authors
Roy, Sujan Kumar [1]
Nicolson, Aaron [2]
Paliwal, Kuldip K. [1]
Affiliations
[1] Griffith Univ, Signal Proc Lab, Nathan, Qld 4111, Australia
[2] CSIRO, Australian eHlth Res Ctr, Herston, Qld 4006, Australia
Keywords
Speech enhancement; Augmented Kalman filter; Linear prediction coefficients; Training targets; Temporal convolutional network; Multi-head attention network; DEEP LEARNING APPROACH; HEAD SELF-ATTENTION; COLORED-NOISE; QUALITY;
DOI
10.1016/j.specom.2022.06.004
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The performance of speech coding, speech recognition, and speech enhancement systems that rely on the augmented Kalman filter (AKF) depends largely on the accuracy of clean speech and noise linear prediction coefficient (LPC) estimation. The formulation of clean speech and noise LPC estimation as a supervised learning task has recently shown considerable promise. Generally, a deep neural network (DNN) learns to map noisy speech features to a training target that can be used for clean speech and noise LPC estimation. Such training targets fall into four categories: line spectrum frequency (LSF), LPC power spectrum (LPC-PS), power spectrum (PS), and magnitude spectrum (MS) training targets. The choice of training target can have a significant impact on LPC estimation accuracy. Motivated by this, we perform a comprehensive study of the training targets with the aim of determining which is best for LPC estimation. To this end, we evaluate each training target using a temporal convolutional network (TCN) and a multi-head attention-based network. A large training set constructed from a wide variety of conditions, including real-world non-stationary and coloured noise sources over a range of signal-to-noise ratio (SNR) levels, is used for training. Testing on the NOIZEUS corpus demonstrates that using the LPC-PS as the training target produces the lowest clean speech LPC spectral distortion (SD) level. We also construct the AKF with the estimated speech and noise LPC parameters of each training target. Subjective AB listening tests and seven objective quality and intelligibility evaluation measures (CSIG, CBAK, COVL, PESQ, STOI, SegSNR, and SI-SDR) revealed that the LPC-PS training target produced enhanced speech with the highest quality and intelligibility amongst the training targets.
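To make the quantities named in the abstract concrete, the following is a minimal Python sketch of how LPCs, an LPC power spectrum (LPC-PS), and a spectral distortion (SD) value can be computed for a single frame. It is an illustration under stated assumptions, not the authors' implementation: the function names, the autocorrelation (Levinson-Durbin) method, the FFT size, the synthetic AR(2) test signal, and the root-mean-square log-spectral form of SD are all choices made here, since the abstract does not specify them.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Estimate LPCs a = [1, a_1, ..., a_p] and the prediction error
    variance with the autocorrelation method (Levinson-Durbin).
    Convention assumed here: e[n] = x[n] + sum_k a_k * x[n - k]."""
    n = len(frame)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(frame[: n - lag], frame[lag:])
                  for lag in range(order + 1)]) / n
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        refl = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + refl * a_prev[i - 1:0:-1]
        a[i] = refl
        err *= 1.0 - refl * refl
    return a, err


def lpc_power_spectrum(a, err, n_fft=512):
    """LPC power spectrum: err / |A(e^{jw})|^2 on n_fft // 2 + 1 bins."""
    return err / np.abs(np.fft.rfft(a, n_fft)) ** 2


def spectral_distortion_db(ps_ref, ps_est):
    """RMS difference between two power spectra in dB (assumed SD form)."""
    diff = 10.0 * np.log10(ps_ref) - 10.0 * np.log10(ps_est)
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical test signal: a stable AR(2) process with known coefficients.
rng = np.random.default_rng(0)
true_a = np.array([1.0, -1.5, 0.7])
e = rng.standard_normal(20000)
x = np.zeros_like(e)
for t in range(len(e)):
    x[t] = e[t]
    if t >= 1:
        x[t] -= true_a[1] * x[t - 1]
    if t >= 2:
        x[t] -= true_a[2] * x[t - 2]

a_hat, err_hat = lpc_autocorr(x, order=2)
ps_true = lpc_power_spectrum(true_a, 1.0)
ps_hat = lpc_power_spectrum(a_hat, err_hat)
sd = spectral_distortion_db(ps_true, ps_hat)
print("estimated LPCs:", a_hat)
print("SD (dB):", sd)
```

In a supervised setting such as the one the abstract describes, a network would predict a target like `ps_hat` from noisy speech, and the LPCs would then be derived from that prediction; the sketch above only shows the signal-processing side on clean synthetic data.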
Pages: 49-60
Page count: 12
Related papers
50 in total
  • [1] Iterative and sequential Kalman filter-based speech enhancement algorithms
    Gannot, S
    Burshtein, D
    Weinstein, E
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (04) : 373 - 385
  • [2] Speech Enhancement by Kalman Filtering with a Particle Filter-Based Preprocessor
    Lee, Yun-Kyung
    Jung, Gyeo-Woon
    Kwon, Oh-Wook
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2013 : 340 - 341
  • [3] DeepLPC: A Deep Learning Approach to Augmented Kalman Filter-Based Single-Channel Speech Enhancement
    Roy, Sujan Kumar
    Nicolson, Aaron
    Paliwal, Kuldip K.
    [J]. IEEE ACCESS, 2021, 9 : 64524 - 64538
  • [4] Improved colored noise handling in Kalman filter-based speech enhancement algorithms
    Mustiere, Frederic
    Bolic, Miodrag
    Bouchard, Martin
    [J]. 2008 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1-4, 2008 : 472 - 475
  • [5] DeepLPC-MHANet: Multi-Head Self-Attention for Augmented Kalman Filter-Based Speech Enhancement
    Roy, Sujan Kumar
    Nicolson, Aaron
    Paliwal, Kuldip K.
    [J]. IEEE ACCESS, 2021, 9 : 70516 - 70530
  • [6] Improved Kalman filter-based speech enhancement with perceptual post-filtering
    Wei, JQ
    Du, LM
    Yan, ZL
    Hui, Z
    [J]. CHINESE JOURNAL OF ELECTRONICS, 2004, 13 (02) : 300 - 304
  • [7] Deep Residual Network-Based Augmented Kalman Filter for Speech Enhancement
    Roy, Sujan Kumar
    Paliwal, Kuldip K.
    [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020 : 667 - 673
  • [8] LPC-based formant enhancement method in Kalman filtering for speech enhancement
    Mellahi, Tarek
    Hamdi, Rachid
    [J]. AEU-INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATIONS, 2015, 69 (02) : 545 - 554
  • [9] Extended Kalman Filter-Based Parallel Dynamic State Estimation
    Karimipour, Hadis
    Dinavahi, Venkata
    [J]. IEEE TRANSACTIONS ON SMART GRID, 2015, 6 (03) : 1539 - 1549
  • [10] Extended Kalman filter-based state estimation of MOSFET circuit
    Bansal, Rahul
    Majumdar, Sudipta
    [J]. COMPEL-THE INTERNATIONAL JOURNAL FOR COMPUTATION AND MATHEMATICS IN ELECTRICAL AND ELECTRONIC ENGINEERING, 2019, 38 (06) : 1885 - 1903