Speech analysis and synthesis with a refined adaptive sinusoidal representation

Cited by: 3
Authors
Tabet, Youcef [1]
Boughazi, Mohamed [1]
Afifi, Saddek [1]
Affiliations
[1] Univ Badji Mokhtar, Fac Sci Ingeniorat, Annaba, Algeria
Keywords
Speech representation; Speech analysis; Speech synthesis; Adaptive sinusoidal modeling;
DOI
10.1007/s10772-018-9519-4
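Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract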
This paper explores common speech signal representations along with a brief description of their corresponding analysis-synthesis stages. The main focus is on adaptive sinusoidal representations, for which a refined model of speech is suggested. This model is referred to as the Refined adaptive Sinusoidal Representation (R_aSR). Based on the performance of recently suggested adaptive Sinusoidal Models of speech, significant refinements are proposed at both the analysis and adaptive stages. First, a quasi-harmonic representation of speech is used in the analysis stage to obtain an initial estimate of the instantaneous model parameters. Next, in the adaptive stage, an adaptive scheme combined with an iterative frequency correction mechanism is used to allow a robust estimation of the model parameters (amplitudes, frequencies, and phases). Finally, after an interpolation scheme, the speech signal is reconstructed as the sum of its estimated time-varying instantaneous components. Objective evaluation tests show that the suggested R_aSR achieves high-quality reconstruction of voiced speech signals compared to state-of-the-art models. Moreover, listening evaluation tests indicate that the R_aSR attains transparent perceived quality.
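The reconstruction step described in the abstract (a sum of time-varying sinusoidal components after interpolation) can be illustrated with a minimal sketch. This is not the R_aSR algorithm itself: the abstract does not specify the interpolation scheme, so linear interpolation of amplitude and frequency with phase accumulation is assumed here, and the names `interpolate_track` and `synthesize` are hypothetical.

```python
import math


def interpolate_track(frame_values, hop):
    """Linearly interpolate frame-rate values to one value per sample."""
    samples = []
    for left, right in zip(frame_values, frame_values[1:]):
        for n in range(hop):
            samples.append(left + (right - left) * n / hop)
    samples.append(frame_values[-1])
    return samples


def synthesize(amp_tracks, freq_tracks, hop, sample_rate):
    """Sum sinusoids with time-varying amplitude and frequency.

    amp_tracks[k][i] and freq_tracks[k][i] hold the amplitude and the
    frequency (in Hz) of component k at frame i. Linear interpolation
    between frames is an assumption, not the paper's scheme.
    """
    n_samples = (len(amp_tracks[0]) - 1) * hop + 1
    signal = [0.0] * n_samples
    for amps, freqs in zip(amp_tracks, freq_tracks):
        a = interpolate_track(amps, hop)
        f = interpolate_track(freqs, hop)
        phase = 0.0  # instantaneous phase, accumulated from the frequency
        for n in range(n_samples):
            signal[n] += a[n] * math.cos(phase)
            phase += 2.0 * math.pi * f[n] / sample_rate
    return signal
```

For example, `synthesize([[1.0, 1.0, 1.0]], [[100.0, 100.0, 100.0]], hop=80, sample_rate=8000)` yields a steady 100 Hz cosine spanning three frames. A real analysis stage would supply many components per frame together with estimated phases, which this sketch leaves out.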
Pages: 581-588
Number of pages: 8
Related papers
50 in total
  • [1] SPEECH ANALYSIS SYNTHESIS BASED ON A SINUSOIDAL REPRESENTATION
    MCAULAY, RJ
    QUATIERI, TF
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1986, 34 (04): : 744 - 754
  • [2] Application of Cochlear Model in Speech Analysis/Synthesis Using Sinusoidal Representation
    Yuan Jingxian Wan Wanggen Yu Xiaoqing (School of Communication & Information Engineering
    [J]. Advances in Manufacturing, 1999, (01) : 47 - 52
  • [3] ROBUST FULL-BAND ADAPTIVE SINUSOIDAL ANALYSIS AND SYNTHESIS OF SPEECH
    Kafentzis, George P.
    Rosec, Olivier
    Stylianou, Yannis
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [4] High quality and low complexity speech analysis/synthesis based on sinusoidal representation
    Tan, JG
    Zhang, WJ
    Liu, PL
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (12): : 2893 - 2896
  • [5] ANALYSIS OF EMOTIONAL SPEECH USING AN ADAPTIVE SINUSOIDAL MODEL
    Kafentzis, George P.
    Yakoumaki, Theodora
    Mouchtaris, Athanasios
    Stylianou, Yannis
    [J]. 2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1492 - 1496
  • [6] SPEECH TRANSFORMATIONS BASED ON A SINUSOIDAL REPRESENTATION
    QUATIERI, TF
    MCAULAY, RJ
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1986, 34 (06): : 1449 - 1464
  • [7] Speech coding with an analysis-by-synthesis sinusoidal model
    Etemoglu, ÇÖ
    Cuperman, V
    Gersho, A
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1371 - 1374
  • [8] Sparse Sinusoidal Signal Representation for Speech and Music Signals
    Mowlaee, Pejman
    Froghani, Amirhossein
    Sayadiyan, Abolghasem
    [J]. ADVANCES IN COMPUTER SCIENCE AND ENGINEERING, 2008, 6 : 469 - 476
  • [9] On the perceptually irrelevant phase information in sinusoidal representation of speech
    Kim, DS
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (08): : 900 - 905
  • [10] ADAPTIVE ANALYSIS OF SPEECH BASED ON A POLE-ZERO REPRESENTATION
    MORIKAWA, H
    FUJISAKI, H
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1982, 30 (01): : 77 - 88