Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation

Cited by: 43
Authors
Drugman, Thomas [1]
Bozkurt, Baris [2]
Dutoit, Thierry [1]
Affiliations
[1] Univ Mons, TCTS Lab, B-7000 Mons, Belgium
[2] Izmir Inst Technol, Dept Elect & Elect Engn, Izmir, Turkey
Keywords
Complex cepstrum; Homomorphic analysis; Glottal source estimation; Source-tract separation
DOI
10.1016/j.specom.2011.02.004
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using the complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives glottal estimates similar to those obtained with the ZZT method. However, as the complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally, in our tests on a large corpus of real expressive speech, we show that the proposed method has the potential to be used for voice quality analysis. (C) 2011 Elsevier B.V. All rights reserved.
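As a rough illustration of the idea in the abstract, the following Python sketch computes a complex cepstrum with FFTs and splits it into its causal (positive-quefrency) and anticausal (negative-quefrency) parts. It is not the authors' implementation: the function names, the FFT size, and the choice to assign the zero-quefrency gain term to the causal part are assumptions made for this example, and the windowing criteria studied in the paper are not reproduced here.

```python
import numpy as np


def complex_cepstrum(frame, nfft):
    """Complex cepstrum of a real frame, computed with FFTs only."""
    spectrum = np.fft.fft(frame, nfft)
    # Note: a spectral zero exactly on the unit circle would make the log diverge.
    phase = np.unwrap(np.angle(spectrum))
    # Remove the linear-phase term (a pure delay) so the log-spectrum is periodic.
    center = nfft // 2
    ndelay = np.round(phase[center] / np.pi)
    phase = phase - np.pi * ndelay * np.arange(nfft) / center
    log_spectrum = np.log(np.abs(spectrum)) + 1j * phase
    return np.fft.ifft(log_spectrum).real


def causal_anticausal_split(frame, nfft=1024):
    """Split a frame into causal and anticausal components (illustrative)."""
    ceps = complex_cepstrum(frame, nfft)
    half = nfft // 2
    causal_ceps = np.zeros(nfft)
    anticausal_ceps = np.zeros(nfft)
    # Quefrencies n >= 0; keeping the gain term ceps[0] here is an assumption.
    causal_ceps[:half] = ceps[:half]
    # Quefrencies n < 0 are stored at the end of the FFT buffer.
    anticausal_ceps[half + 1:] = ceps[half + 1:]
    # Invert each part: FFT, exponentiate, inverse FFT.
    causal = np.fft.ifft(np.exp(np.fft.fft(causal_ceps))).real
    anticausal = np.fft.ifft(np.exp(np.fft.fft(anticausal_ceps))).real
    return causal, anticausal


# Toy signal: a zero inside the unit circle (z = 0.5) convolved with
# a zero outside the unit circle (z = 2.5), delayed by one sample.
x = np.convolve([1.0, -0.5], [-0.4, 1.0])
causal, anticausal = causal_anticausal_split(x)
```

In this toy case the causal output holds the minimum-phase factor at the start of the buffer, while the anticausal output holds the maximum-phase factor with its negative-time sample wrapped to the end of the buffer. On real speech, the abstract notes that the quality of the separation depends on the window applied to the frame.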
Pages: 855-866
Number of pages: 12
Related papers
50 in total
  • [21] Glottal source shape parameter estimation using phase minimization variants
    Huber, Stefan
    Roebel, Axel
    Degottex, Gilles
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1642 - 1645
  • [22] Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm
    Thomas, Mark R. P.
    Gudnason, Jon
    Naylor, Patrick A.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01): : 82 - 91
  • [23] Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition
    Lal, G. Jyothish
    Gopalakrishnan, E. A.
    Govind, D.
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2018, 37 (02) : 810 - 830
  • [24] Accurate Estimation of Glottal Closure Instants and Glottal Opening Instants from Electroglottographic Signal Using Variational Mode Decomposition
    G. Jyothish Lal
    E. A. Gopalakrishnan
    D. Govind
    [J]. Circuits, Systems, and Signal Processing, 2018, 37 : 810 - 830
  • [25] Pitch and Formant Estimation of Bangla Speech Signal Using Autocorrelation, Cepstrum and LPC Algorithm
    Aadit, Muhammad Navid Anjum
    Kirtania, Sharadindu Gopal
    Mahin, Mehnaz Tabassum
    [J]. PROCEEDINGS OF THE 2016 19TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2016, : 371 - 376
  • [26] HMM-BASED SPEECH SYNTHESISER USING THE LF-MODEL OF THE GLOTTAL SOURCE
    Cabral, Joao P.
    Renals, Steve
    Yamagishi, Junichi
    Richmond, Korin
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4704 - 4707
  • [27] A time varying ARMAX speech modeling with phase compensation using glottal source model
    Funaki, K
    Miyanaga, Y
    Tochinai, K
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1299 - 1302
  • [28] Glottal closure instant estimation using an appropriateness measure of the source and continuity constraints
    Vincent, Damien
    Rosec, Olivier
    Chonavel, Thierry
    [J]. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 381 - 384
  • [29] Estimating the spectral tilt of the glottal source from telephone speech using a deep neural network
    [J]. Jokinen, Emma (emma.jokinen@aalto.fi), 1600, Acoustical Society of America (141):
  • [30] Estimating the spectral tilt of the glottal source from telephone speech using a deep neural network
    Jokinen, Emma
    Alku, Paavo
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (04): : EL327 - EL330