Real-time, full-band, online DNN-based voice conversion system using a single CPU

Authors
Saeki, Takaaki [1]
Saito, Yuki [1]
Takamichi, Shinnosuke [1]
Saruwatari, Hiroshi [1]
Affiliation
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan
Keywords
voice conversion; full-band speech; real-time speech processing; online speech processing
DOI
Not available
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
We present a real-time, full-band, online voice conversion (VC) system that runs on a single CPU. For practical applications, VC must deliver high quality and perform real-time, online conversion with limited computational resources. Our system achieves this by combining non-linear conversion by a deep neural network with short-tap sub-band filtering. Our evaluation shows that the system 1) has an estimated complexity of around 2.5 GFLOPS and a measured real-time factor (RTF) of around 0.5 on a single CPU, and 2) produces converted speech with a naturalness mean opinion score (MOS) of 3.4 out of 5.0.
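For readers unfamiliar with the metric, a real-time factor is conventionally the processing time divided by the duration of the audio processed, so a value below 1 means the system keeps up with its input. The sketch below only illustrates that definition and is not the authors' measurement code; the function name `real_time_factor` and the example numbers are hypothetical.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (the conventional RTF)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical numbers: 5 s of compute spent on 10 s of speech.
rtf = real_time_factor(5.0, 10.0)
print(f"RTF = {rtf:.2f}")
```

Under this definition, an RTF of around 0.5 corresponds to converting speech in roughly half the time the speech itself lasts.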
Pages: 1021-1022
Number of pages: 2