Spectral Envelope Recovery beyond the Nyquist Limit for High-Quality Manipulation of Speech Sounds

Cited by: 0
Authors
Kawahara, Hideki [1]
Morise, Masanori
Banno, Hideki
Takahashi, Toru
Nisimura, Ryuichi [1]
Irino, Toshio [1]
Affiliations
[1] Wakayama Univ, Dept Design Informat Sci, Wakayama, Japan
Keywords
speech analysis; sampling theory; speech modification;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A simple new method to recover details in a spectral envelope is proposed based on a recently introduced speech analysis, modification, and resynthesis framework called TANDEM-STRAIGHT. Spectral envelope recovery of voiced sounds is a discrete-to-analog conversion in the frequency domain. However, there is a fundamental problem because the spatial frequency contents of vocal tract functions generally exceed the Nyquist limit of the equivalent sampling rate determined by the fundamental frequency. TANDEM-STRAIGHT yields a method to recover a spectral envelope based on the consistent sampling theory and provides base information for exceeding this limit. At the final stage, the AR spectral envelope estimated from the TANDEM-STRAIGHT spectrum is divided by the F0-adaptively smoothed version of itself to supply the missing high-spatial-frequency details of the envelope. The underlying principle of the proposed method can also be applied to other speech synthesis frameworks.
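The final stage described in the abstract (an AR envelope divided by an F0-adaptively smoothed copy of itself) can be illustrated with a rough sketch. This is not the authors' implementation: the function names are hypothetical, the AR coefficients are supplied directly rather than estimated from a TANDEM-STRAIGHT spectrum, and the smoothing is assumed here to be a rectangular moving average about one F0 wide along the frequency axis.

```python
import numpy as np


def ar_power_envelope(ar_coeffs, n_bins):
    """Power response 1/|A(e^{jw})|^2 of an all-pole model on n_bins points in [0, pi]."""
    n_fft = 2 * (n_bins - 1)
    # A(z) = 1 + a1 z^-1 + a2 z^-2 + ...
    a = np.concatenate(([1.0], np.asarray(ar_coeffs, dtype=float)))
    return 1.0 / np.abs(np.fft.rfft(a, n_fft)) ** 2


def f0_adaptive_smooth(power_spec, f0_hz, fs_hz):
    """Moving average along frequency; window width is about one F0 (an assumption)."""
    n_bins = len(power_spec)
    bin_hz = (fs_hz / 2.0) / (n_bins - 1)
    width = max(1, int(round(f0_hz / bin_hz)))
    if width % 2 == 0:
        width += 1  # odd width keeps the window centred on each bin
    half = width // 2
    # Reflect at 0 Hz and at fs/2 so the output has the same length as the input.
    padded = np.pad(power_spec, half, mode="reflect")
    return np.convolve(padded, np.ones(width) / width, mode="valid")


def detail_ratio(ar_coeffs, f0_hz, fs_hz, n_bins=513):
    """AR envelope divided by its own F0-adaptively smoothed version."""
    envelope = ar_power_envelope(ar_coeffs, n_bins)
    return envelope / f0_adaptive_smooth(envelope, f0_hz, fs_hz)


# Illustrative single resonance near 1 kHz at fs = 16 kHz (made-up values).
fs_hz = 16000.0
radius, theta = 0.98, 2.0 * np.pi * 1000.0 / fs_hz
ar_coeffs = [-2.0 * radius * np.cos(theta), radius ** 2]
ratio = detail_ratio(ar_coeffs, f0_hz=200.0, fs_hz=fs_hz)
```

The ratio exceeds 1 where the AR envelope is sharper than its smoothed copy (for example at a narrow formant peak) and falls below 1 in the adjacent skirts. How this ratio is combined with the base envelope is not specified in the abstract, so the sketch stops at the ratio itself.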
Pages: 650-653
Number of pages: 4
Related papers
50 in total
  • [1] Cheap Trick, a spectral envelope estimator for high-quality speech synthesis
    Morise, Masanori
    [J]. SPEECH COMMUNICATION, 2015, 67 : 1 - 7
  • [2] Revisiting spectral envelope recovery from speech sounds generated by periodic excitation
    Kawahara, Hideki
    Morise, Masanori
    Hua, Kanru
    [J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1674 - 1683
  • [3] Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds
    Kawahara, Hideki
    Morise, Masanori
    Toda, Tomoki
    Nisimura, Ryuichi
    Irino, Toshio
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 34 - 38
  • [4] Simplified aperiodicity representation for high-quality speech manipulation systems
    Kawahara, Hideki
    Morise, Masanori
    [J]. PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 579 - +
  • [5] Underlying principles of a high-quality speech manipulation system STRAIGHT and its application to speech segregation
    Kawahara, H
    Irino, T
    [J]. SPEECH SEPARATION BY HUMANS AND MACHINES, 2005, : 167 - 180
  • [6] Efficient spectral magnitude quantisation for high-quality sinusoidal speech coders
    Cho, YD
    Villette, S
    Kondoz, A
    [J]. IEEE VTC 53RD VEHICULAR TECHNOLOGY CONFERENCE, SPRING 2001, VOLS 1-4, PROCEEDINGS, 2001, : 1315 - 1318
  • [7] High-quality waveform generator from fundamental frequency, spectral envelope, and band aperiodicity
    Morise, Masanori
    Shono, Takuro
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 613 - 617
  • [8] High-Quality Speech Recovery Through Soundproof Protections via mmWave Sensing
    Lin, Feng
    Wang, Chao
    Liu, Tiantian
    Liu, Ziwei
    Shen, Yijie
    Ba, Zhongjie
    Lu, Li
    Xu, Wenyao
    Ren, Kui
    [J]. IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2024, 21 (04) : 3065 - 3081
  • [9] HIGH-QUALITY PARCOR SPEECH SYNTHESIZER
    SAMPEI, T
    ASADA, A
    NAKATA, K
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1980, 26 (03) : 353 - 359
  • [10] High-quality speech processor for comms
    Unknown
    [J]. ELECTRONICS WORLD, 2001, 107 (1784) : 604 - 606