Real-time, full-band, online DNN-based voice conversion system using a single CPU

Cited by: 0
Authors
Saeki, Takaaki [1]
Saito, Yuki [1]
Takamichi, Shinnosuke [1]
Saruwatari, Hiroshi [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
Keywords
voice conversion; full-band speech; real-time speech processing; online speech processing
DOI
Not available
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
We present a real-time, full-band, online voice conversion (VC) system that runs on a single CPU. For practical applications, VC must deliver high quality while performing real-time, online conversion with limited computational resources. Our system achieves this by combining non-linear conversion using a deep neural network with short-tap, sub-band filtering. Our evaluation shows that the system 1) has an estimated complexity of around 2.5 GFLOPS and a measured real-time factor (RTF) of around 0.5 on a single CPU, and 2) produces converted speech with a naturalness mean opinion score (MOS) of 3.4 out of 5.0.
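For readers unfamiliar with the real-time factor quoted in the abstract, it is conventionally the processing time divided by the duration of the audio processed, so a value below 1.0 means the system keeps up with incoming speech. The sketch below only illustrates that arithmetic; the function names and the 10-second example are hypothetical and are not taken from the paper.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean processing is faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_real_time(rtf: float) -> bool:
    """A system can run online only if it processes audio at least as fast as it arrives."""
    return rtf < 1.0


# Hypothetical example: 10 s of speech converted in 5 s of CPU time.
rtf = real_time_factor(5.0, 10.0)
print(rtf, is_real_time(rtf))
```

An RTF of around 0.5, as reported in the abstract, would leave roughly half of each time slice free on the CPU.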
Pages: 1021-1022
Page count: 2