High-Quality Analysis/Synthesis Method Based on Temporal Decomposition for Speech Modification

Authors
Nguyen, Binh Phu [1]
Shibata, Takeshi [1]
Akagi, Masato [1]
Affiliation
[1] Japan Advanced Institute of Science and Technology, School of Information Science, Nomi, Ishikawa 923-1292, Japan
Keywords
analysis/synthesis method; speech modification; temporal decomposition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The challenge in speech modification is to modify speech flexibly without degrading its quality. Conventional methods cannot flexibly control speech signals in both the time and frequency domains, which degrades the quality of the modified speech. This paper proposes a high-quality analysis/synthesis method for speech modification. To control temporal evolution, we use a speech analysis technique called temporal decomposition (TD), which decomposes a speech signal into event targets and event functions. The event functions describe the temporal evolution of the spectral and excitation parameters, and the event targets represent the "ideal" spectral parameters. The same event functions evaluated for the spectral parameters are also used to model the temporal evolution of the excitation parameters. To control speech signals flexibly in both the time and frequency domains, we propose new methods for modeling the event functions and the event targets. Experimental results show that the proposed analysis/synthesis method produces high-quality synthesized speech and allows speech signals to be modified flexibly.
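The abstract does not state the model's formulas. As a minimal sketch, assuming the commonly used TD formulation in which each frame of spectral parameters is approximated as y(n) ≈ Σ_k a_k φ_k(n) (a_k an event target, φ_k an event function), the Python below shows how the two components can be controlled separately. The triangular event functions, the toy targets, and every function name are hypothetical illustrations, not the authors' modeling method.

```python
def triangular_event_functions(centers, n_frames):
    """Hypothetical event functions: phi_k is 1 at its own centre and falls
    linearly to 0 at the neighbouring centres, so only adjacent events overlap."""
    phi = [[0.0] * n_frames for _ in centers]
    for n in range(n_frames):
        if n <= centers[0]:
            phi[0][n] = 1.0
        elif n >= centers[-1]:
            phi[-1][n] = 1.0
        else:
            for k in range(len(centers) - 1):
                left, right = centers[k], centers[k + 1]
                if left <= n < right:
                    w = (n - left) / (right - left)
                    phi[k][n] = 1.0 - w
                    phi[k + 1][n] = w
                    break
    return phi


def synthesize(targets, phi):
    """Rebuild the parameter track: y(n) = sum_k a_k * phi_k(n)."""
    n_frames = len(phi[0])
    dim = len(targets[0])
    return [
        [sum(a[d] * f[n] for a, f in zip(targets, phi)) for d in range(dim)]
        for n in range(n_frames)
    ]


def time_scale(centers, n_frames, factor):
    """Stretch the event timing only; the event targets are left untouched."""
    new_centers = [int(round(c * factor)) for c in centers]
    new_len = int(round((n_frames - 1) * factor)) + 1
    return new_centers, new_len


# Toy data (assumed): three events, two-dimensional "spectral" targets.
centers = [0, 4, 8]
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

frames = synthesize(targets, triangular_event_functions(centers, 9))

# Time-domain modification: play the same targets at half speed.
slow_centers, slow_len = time_scale(centers, 9, 2.0)
slow_frames = synthesize(
    targets, triangular_event_functions(slow_centers, slow_len)
)
```

In this sketch, changing the event timing alters only the temporal evolution, while editing the entries of `targets` would alter only the spectral content, which is the kind of separation between time-domain and frequency-domain control that the abstract describes.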
Pages: 662-665
Page count: 4