COMPLEX-VALUED GAUSSIAN PROCESS LATENT VARIABLE MODEL FOR PHASE-INCORPORATING SPEECH ENHANCEMENT

Cited by: 0
Authors
Chen, Sih-Huei [1]
Lee, Yuan-Shan [1]
Wang, Jia-Ching [1]
Affiliations
[1] Natl Cent Univ, Dept Comp Sci & Informat Engn, Taoyuan, Taiwan
Keywords
Phase; complex-valued Gaussian process latent variable model; binary mask; QUALITY;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Traditional speech enhancement techniques modify the magnitude of a speech signal in the time-frequency domain and reuse the phase of the noisy speech to resynthesize the time-domain signal. This work proposes a complex-valued Gaussian process latent variable model (CGPLVM) that directly enhances the complex-valued noisy spectrum, modifying not only the magnitude but also the phase. The main idea underlying the method is to model the short-time Fourier transform (STFT) coefficients across the time frames of a speech signal as a proper complex Gaussian process (GP) with additive noise. The proposed method is based on projecting the spectrum into a low-dimensional subspace. Experiments were carried out on the CHTTL database, which contains the Mandarin digits zero to nine. Several standard measures show that the proposed method outperforms the baselines across various noise types and SNR levels.
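The contrast drawn in the abstract, between reusing the noisy phase and modifying the whole complex spectrum, can be illustrated with a toy sketch. This is not the CGPLVM of the paper: a truncated complex SVD is swapped in as a stand-in for a learned low-dimensional subspace, and the test tone, frame length, hop size, gain, and rank are all assumptions chosen only for illustration.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Return complex STFT coefficients with shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def magnitude_only_enhance(Y, gain):
    """Traditional route: scale the magnitude, reuse the noisy phase."""
    return gain * np.abs(Y) * np.exp(1j * np.angle(Y))

def low_rank_complex_projection(Y, rank):
    """Stand-in for a learned subspace (assumption): truncated complex SVD."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vh[:rank]

# Hypothetical test signal: a 440 Hz tone in white noise, sampled at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

S, Y = stft(clean), stft(noisy)
Y_mag = magnitude_only_enhance(Y, gain=0.8)
Y_low = low_rank_complex_projection(Y, rank=2)

# Compare unit phasors rather than raw angles to avoid wrap-around at +/- pi.
def phasor(Z):
    return np.exp(1j * np.angle(Z))

phase_unchanged = np.allclose(phasor(Y_mag), phasor(Y))    # noisy phase reused
phase_changed = not np.allclose(phasor(Y_low), phasor(Y))  # phase is modified
err_noisy = np.linalg.norm(Y - S)    # distance of the noisy spectrum from clean
err_low = np.linalg.norm(Y_low - S)  # distance after the complex projection
```

The magnitude-only route leaves every phase value of the noisy spectrum in place, whereas any operation on the complex coefficients themselves, here the low-rank projection, changes magnitude and phase together.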
Pages: 5439-5443
Number of pages: 5