SIMULTANEOUS ESTIMATION OF VOCAL-TRACT AND VOICE SOURCE PARAMETERS BASED ON AN ARX MODEL

Authors
DING, W
KASUYA, H
ADACHI, S
Affiliations
Keywords
ARX MODEL; KALMAN FILTER; VOICING SOURCE MODEL; SIMULATED ANNEALING METHOD
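DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812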
Abstract
A novel adaptive pitch-synchronous analysis method is proposed for simultaneously estimating vocal-tract (formant/antiformant) and voice source parameters from speech waveforms. The parametric Rosenberg-Klatt (RK) model generates the glottal waveform, and an autoregressive-exogenous (ARX) model represents the voiced speech production process. A Kalman filter estimates the formant/antiformant parameters from the coefficients of the ARX model, while simulated annealing serves as the nonlinear optimizer for the voice source parameters. The two procedures work together in a system identification procedure to find the best joint set of parameters for both models. In comparisons on synthetic speech, the new method estimated parameter values more accurately than the other approaches tested. The method is also shown to estimate the parameters accurately from natural speech sounds. A major application of the analysis method is a concatenative formant synthesizer that allows flexible control of the voice quality of synthetic speech.
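The abstract gives no formulas, so the following is only a minimal Python sketch of how the two estimators it names could interact. It is not the authors' implementation. It makes several assumptions: the source is the derivative of a KLGLOTT88-style pulse `a*t^2 - b*t^3`, controlled here by a single open quotient `oq`; the ARX coefficients are held constant, so the Kalman filter has no process noise; and all function names, orders, and numeric values are hypothetical. An inner Kalman filter fits the ARX coefficients for a given source, and an outer simulated-annealing loop searches the source parameter by minimizing the remaining prediction error.

```python
import math
import random

def rk_source(oq, t0, fs, n_samples, av=1.0):
    """Derivative of an assumed KLGLOTT88-style pulse a*t^2 - b*t^3 (open phase only)."""
    a = 27.0 * av / (4.0 * oq ** 2 * t0)
    b = 27.0 * av / (4.0 * oq ** 3 * t0 ** 2)
    period = int(round(t0 * fs))
    out = []
    for n in range(n_samples):
        t = (n % period) / fs
        out.append(2 * a * t - 3 * b * t * t if t < oq * t0 else 0.0)
    return out

def arx_synthesize(a_coefs, b_coefs, u):
    """Noise-free ARX output: s(n) = -sum_i a_i s(n-i) + sum_j b_j u(n-j)."""
    s = []
    for n in range(len(u)):
        ar = sum(a_coefs[i] * s[n - 1 - i] for i in range(len(a_coefs)) if n - 1 - i >= 0)
        ex = sum(b_coefs[j] * u[n - j] for j in range(len(b_coefs)) if n - j >= 0)
        s.append(ex - ar)
    return s

def kalman_arx(s, u, p, q, p0=1e4, r=1.0):
    """Kalman filter with state theta = [a_1..a_p, b_0..b_q] (assumed constant).
    The observation row is the ARX regressor; returns theta and the
    a-priori prediction errors."""
    dim = p + q + 1
    theta = [0.0] * dim
    P = [[p0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    errors = []
    for n in range(len(s)):
        h = [-s[n - i] if n - i >= 0 else 0.0 for i in range(1, p + 1)]
        h += [u[n - j] if n - j >= 0 else 0.0 for j in range(q + 1)]
        err = s[n] - sum(hi * ti for hi, ti in zip(h, theta))
        Ph = [sum(P[i][k] * h[k] for k in range(dim)) for i in range(dim)]
        denom = r + sum(h[i] * Ph[i] for i in range(dim))
        K = [x / denom for x in Ph]  # Kalman gain
        theta = [theta[i] + K[i] * err for i in range(dim)]
        P = [[P[i][j] - K[i] * Ph[j] for j in range(dim)] for i in range(dim)]
        errors.append(err)
    return theta, errors

def source_cost(oq, s, t0, fs, p, q):
    """Squared prediction error over the second half of the signal,
    i.e. after the filter has had time to settle."""
    u = rk_source(oq, t0, fs, len(s))
    _, errors = kalman_arx(s, u, p, q)
    return sum(e * e for e in errors[len(errors) // 2:])

def anneal_oq(s, t0, fs, p, q, oq_init=0.4, lo=0.3, hi=0.9, steps=60, seed=0):
    """Simulated annealing over the open quotient with geometric cooling."""
    rng = random.Random(seed)
    oq = best_oq = oq_init
    cost = best_cost = source_cost(oq, s, t0, fs, p, q)
    temp = max(cost, 1e-12)  # start temperature scaled to the initial cost
    for _ in range(steps):
        cand = min(hi, max(lo, oq + rng.gauss(0.0, 0.05)))
        c = source_cost(cand, s, t0, fs, p, q)
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if c < cost or rng.random() < math.exp((cost - c) / temp):
            oq, cost = cand, c
            if c < best_cost:
                best_oq, best_cost = cand, c
        temp *= 0.9
    return best_oq, best_cost

# Hypothetical test case: one 500 Hz resonance, 100 Hz pitch, 8 kHz sampling.
FS, T0, TRUE_OQ = 8000, 0.01, 0.6
TRUE_A, TRUE_B = [-1.79, 0.939], [1.0]
u_true = rk_source(TRUE_OQ, T0, FS, 800)
speech = arx_synthesize(TRUE_A, TRUE_B, u_true)
theta_hat, _ = kalman_arx(speech, u_true, p=2, q=0)   # filter given the true source
oq_hat, cost_hat = anneal_oq(speech, T0, FS, p=2, q=0)  # search for the source parameter
```

In this sketch the source amplitude is fixed because it is interchangeable with the gain `b_0`. A fuller version would let the coefficients drift (nonzero process noise) and would anneal over all source parameters pitch-synchronously, as the abstract describes.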
Pages: 738-743
Number of pages: 6