An acoustic Doppler-based silent speech interface technology using generative adversarial networks

Times cited: 0
Authors
Lee, Ki-Seung [1 ]
Affiliations
[1] Konkuk Univ, Dept Elect Engn, 120 Neungdong Ro, Seoul 05029, South Korea
Source
Keywords
Silent speech interface; Generative adversarial networks; Ultrasonic Doppler; Speech synthesis
DOI
10.7776/ASK.2021.40.2.161
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, a Silent Speech Interface (SSI) technology is proposed in which the Doppler frequency shifts of a reflected 40 kHz ultrasonic signal, incident on the speaker's mouth region, are used to synthesize speech signals. In SSI, mapping rules from features derived from non-speech signals to features derived from audible speech are constructed, and speech signals are then synthesized from the non-speech signals using these mapping rules. In conventional SSI methods, the mapping rules are built by minimizing the overall error between the estimated and true speech parameters. In the present study, the mapping rules were instead constructed using Generative Adversarial Networks (GANs) so that the distribution of the estimated parameters becomes similar to that of the true parameters. Experimental results on 60 Korean words showed that, both objectively and subjectively, the performance of the proposed method was superior to that of conventional neural network-based methods.
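To make the adversarial mapping idea concrete, the following is a minimal sketch (in PyTorch) of training a generator that maps Doppler-derived features to speech parameters while a discriminator pushes the generated parameter distribution toward that of real speech. The network sizes, feature dimensions, loss weights, and the added L1 reconstruction term are illustrative assumptions taken from common conditional-GAN practice, not the architecture actually used in the paper.

```python
# Sketch of GAN-based feature mapping for an SSI-style task (assumptions noted above).
import torch
import torch.nn as nn

DOPPLER_DIM, SPEECH_DIM = 64, 25   # placeholder feature sizes, not the paper's values

# Generator: maps a Doppler feature vector to estimated speech parameters.
G = nn.Sequential(
    nn.Linear(DOPPLER_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, SPEECH_DIM),
)

# Discriminator: scores whether a speech-parameter vector looks real or generated,
# which drives the two distributions to become similar.
D = nn.Sequential(
    nn.Linear(SPEECH_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def train_step(doppler_feat, speech_param, lambda_rec=10.0):
    """One adversarial update on a paired mini-batch; lambda_rec weights the L1 term."""
    fake = G(doppler_feat)

    # Discriminator update: real speech parameters vs. generated ones.
    opt_d.zero_grad()
    loss_d = bce(D(speech_param), torch.ones(speech_param.size(0), 1)) + \
             bce(D(fake.detach()), torch.zeros(fake.size(0), 1))
    loss_d.backward()
    opt_d.step()

    # Generator update: fool the discriminator while staying close to the paired target.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(fake.size(0), 1)) + lambda_rec * l1(fake, speech_param)
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

# Toy usage with random tensors standing in for paired Doppler/speech training frames.
x = torch.randn(32, DOPPLER_DIM)
y = torch.randn(32, SPEECH_DIM)
print(train_step(x, y))
```

The key difference from a purely error-minimizing mapping is the discriminator term: even with the same generator, adding the adversarial loss encourages the estimated parameters to match the statistics of real speech rather than only their per-frame targets.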
Pages: 161-168
Number of pages: 8