Bone-Conducted Speech to Air-Conducted Speech Conversion Based on Cycle-Consistent Adversarial Networks

Authors
Pan, Qing [1]
Zhou, Jian [1]
Gao, Teng [1]
Tao, Liang [1]
Affiliations
[1] Anhui Univ, Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
bone-conducted speech conversion; CycleGAN; high-frequency reconstruction; bandwidth extension
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Compared with traditional Air-Conducted Microphone (ACM) speech, Bone-Conducted Microphone (BCM) speech is shielded from background noise, which helps improve communication quality in strongly noisy environments. Based on an analysis of the bandwidth difference between the two, this paper proposes a method that uses Cycle-Consistent Adversarial Networks (CycleGAN) to extend the bandwidth of BCM speech and thereby convert it to ACM speech. The proposed method learns the mapping between BCM speech and ACM speech without relying on parallel data, and it requires no additional data, modules, or alignment process. It also avoids the over-smoothing that tends to appear in many statistical models. The experimental results show that the method reconstructs the high-frequency components of BCM speech more effectively. Compared with the original speech, the converted speech improves both the subjective and the objective results and yields Mel-spectrum features that are more similar to those of the target speech.
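The abstract's claim that the mapping is learned without parallel data rests on the cycle-consistency idea behind CycleGAN: a BCM frame converted to ACM and back should return to itself, and likewise in the other direction, so no frame-level alignment between the two sets is needed. The following is a minimal sketch of that loss term only, not the paper's implementation. The two "generators" (`g_bcm_to_acm`, `f_acm_to_bcm`) are hypothetical fixed functions standing in for trained networks, the four-band frames are made-up toy values, and the adversarial losses are omitted.

```python
# Toy illustration of the cycle-consistency term used by CycleGAN-style models.
# ASSUMPTION: the generators below are hypothetical fixed functions, not the
# trained networks from the paper; the feature frames are invented toy values.

def g_bcm_to_acm(frame):
    """Hypothetical BCM -> ACM generator: boosts the upper half of the bands."""
    half = len(frame) // 2
    return [v if i < half else v * 2.0 for i, v in enumerate(frame)]

def f_acm_to_bcm(frame):
    """Hypothetical ACM -> BCM generator: attenuates the upper half of the bands."""
    half = len(frame) // 2
    return [v if i < half else v * 0.5 for i, v in enumerate(frame)]

def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(bcm_frames, acm_frames):
    """Average L1 error of the BCM->ACM->BCM and ACM->BCM->ACM round trips.

    The two lists are unpaired: they need not be aligned or equal in length.
    """
    forward = sum(
        l1_distance(f_acm_to_bcm(g_bcm_to_acm(x)), x) for x in bcm_frames
    ) / len(bcm_frames)
    backward = sum(
        l1_distance(g_bcm_to_acm(f_acm_to_bcm(y)), y) for y in acm_frames
    ) / len(acm_frames)
    return forward + backward

# Unpaired toy data: one BCM frame, two unrelated ACM frames.
bcm = [[1.0, 0.8, 0.1, 0.05]]
acm = [[1.0, 0.9, 0.6, 0.4], [0.7, 0.5, 0.3, 0.2]]
loss = cycle_consistency_loss(bcm, acm)
```

Because the two toy generators are exact inverses of each other, the round trips reproduce their inputs and the loss is zero. In training, this term would be minimized together with adversarial losses that push converted frames toward the target domain.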
Pages: 168-172
Page count: 5