ARTIFICIAL BANDWIDTH EXTENSION USING CONDITIONAL VARIATIONAL AUTO-ENCODERS AND ADVERSARIAL LEARNING

Authors
Bachhav, Pramod [1]
Todisco, Massimiliano [1]
Evans, Nicholas [1]
Affiliations
[1] EURECOM, Sophia Antipolis, France
Keywords
variational auto-encoder; generative adversarial network; latent variable; artificial bandwidth extension; speech quality; NETWORK
DOI
10.1109/icassp40776.2020.9053737
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Artificial bandwidth extension (ABE) algorithms have been developed to estimate missing highband frequency components (4-8 kHz) in order to improve the quality of narrowband (0-4 kHz) telephone calls. Most ABE solutions employ deep neural networks (DNNs) because of their well-known ability to model highly complex, non-linear relationships between narrowband and highband features. Generative models such as conditional variational auto-encoders (CVAEs) are capable of modelling complex data distributions via latent representation learning. This paper reports their application to ABE. CVAEs, a form of directed graphical model, are exploited to model the probability distribution of highband features conditioned on narrowband features. While CVAEs are trained with the standard mean square error (MSE) criterion, their combination with adversarial learning gives further improvements. When compared to results obtained with the baseline approach, the wideband PESQ is improved significantly by 0.21 points. Performance is also compared on an automatic speech recognition (ASR) task on the TIMIT dataset, where the word error rate (WER) is reduced by 0.3% absolute.
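To make the training criterion mentioned in the abstract more concrete, the following is a minimal sketch of a generic CVAE loss: a reconstruction MSE on the highband features plus a KL regulariser on the latent variable. It is not the authors' implementation. The diagonal-Gaussian posterior, the standard-normal prior, the `kl_weight` parameter and all function names are assumptions made for illustration; the encoder and decoder networks and the adversarial term are omitted.

```python
import math
import random

def kl_diag_gaussian_to_standard(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.
    Assumption: diagonal-Gaussian posterior and standard-normal prior."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), independently per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def mse(target, estimate):
    """Mean squared error between two equal-length feature vectors."""
    return sum((t - e) ** 2 for t, e in zip(target, estimate)) / len(target)

def cvae_loss(highband, reconstruction, mu, log_var, kl_weight=1.0):
    """Reconstruction MSE plus a weighted KL term.
    `kl_weight` is a hypothetical knob, not taken from the paper."""
    return mse(highband, reconstruction) + kl_weight * kl_diag_gaussian_to_standard(mu, log_var)

# Toy usage with made-up numbers: in a real system, mu and log_var would come
# from an encoder fed with highband features and the narrowband condition,
# and the reconstruction from a decoder fed with z and the same condition.
rng = random.Random(0)
mu, log_var = [0.1, -0.2], [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
loss = cvae_loss([1.0, 2.0], [0.9, 2.2], mu, log_var)
```

In this sketch the narrowband features enter only through the networks that would produce `mu`, `log_var` and `reconstruction`, which is where the conditioning happens in a CVAE. An adversarial variant would add a discriminator-based term to `cvae_loss`; its form is not given in the abstract, so it is left out here.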
Pages: 6924-6928
Number of pages: 5