Implementation of sequential real-time waveform generator for high-quality vocoder

Cited by: 0
Authors
Morise, Masanori [1, 2]
Affiliations
[1] Meiji Univ, Sch Interdisciplinary Math Sci, Tokyo, Japan
[2] JST, PRESTO, Saitama, Japan
Keywords
SPEECH; ESTIMATOR; STRAIGHT; SYSTEM;
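DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract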
We describe an implementation of real-time waveform generation from vocoded speech parameters. High-quality vocoders such as STRAIGHT and WORLD have been used for voice conversion and statistical parametric speech synthesis. The current implementations of such vocoders provide a function that generates the whole waveform at once from the speech parameters of all frames. Implementations such as realtime STRAIGHT have been proposed to generate short segments of the waveform sequentially, but the speech they generate is inferior in sound quality to that of the original vocoder. To achieve sequential real-time waveform generation, we implemented a struct named WorldSynthesizer (WS struct) and six functions. The implementation is based on the WORLD vocoder and generates exactly the same waveform as the original, apart from a few details such as the random seed used for generating the white noise. Because the output is essentially unchanged, we evaluated processing speed rather than sound quality, using the real-time factor (RTF). The results showed that the processing speed of the proposed implementation was 14.5% lower than that of the original WORLD. Nevertheless, the RTF of the proposed implementation calculated from female speech was below 0.1, which suggests that the implementation is capable of real-time synthesis.
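The two ideas in the abstract, chunk-by-chunk generation that matches one-shot generation and the real-time factor, can be illustrated with a minimal sketch. It assumes the common definition of RTF as processing time divided by the duration of the generated audio, so values below 1 mean faster than real time. The `ToneSynthesizer` class and `synthesize_in_chunks` helper are hypothetical toy stand-ins that generate a sine tone; they are not the WorldSynthesizer struct or any part of the WORLD API.

```python
import math
import time


def real_time_factor(processing_seconds, audio_seconds):
    """Return RTF = processing time / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


class ToneSynthesizer:
    """Hypothetical toy synthesizer that keeps its state between calls."""

    def __init__(self, fs, freq):
        self.fs = fs
        self.freq = freq
        self.index = 0  # number of samples generated so far

    def generate(self, n):
        """Generate the next n samples, continuing from the stored position."""
        start = self.index
        self.index += n
        return [
            math.sin(2.0 * math.pi * self.freq * (start + i) / self.fs)
            for i in range(n)
        ]


def synthesize_in_chunks(fs, freq, total, chunk):
    """Build a waveform of `total` samples from successive short chunks."""
    synth = ToneSynthesizer(fs, freq)
    out = []
    while len(out) < total:
        out.extend(synth.generate(min(chunk, total - len(out))))
    return out


fs = 16000   # sampling rate in Hz (assumed for the example)
total = fs   # one second of audio

t0 = time.perf_counter()
chunked = synthesize_in_chunks(fs, 220.0, total, 512)
elapsed = time.perf_counter() - t0

# One-shot generation of the same signal, for comparison.
whole = ToneSynthesizer(fs, 220.0).generate(total)

rtf = real_time_factor(elapsed, total / fs)
print(f"identical to one-shot output: {chunked == whole}")
print(f"RTF = {rtf:.4f}")
```

The toy keeps only a sample counter as state; a real vocoder would have to carry more between calls, which is presumably the role of the WS struct described above.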
Pages: 821-825
Page count: 5