On a Robust F0 Estimation of Speech based on IRAPT using Robust TV-CAR Analysis

Cited by: 0
Authors
Hotta, Kazushi [1]
Funaki, Keiichi [2]
Affiliations
[1] Univ Ryukyus, Grad Sch Engn & Sci, Nishihara, Okinawa 90301, Japan
[2] Univ Ryukyus, Comp & Networking Ctr, Nishihara, Okinawa 90301, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Fundamental frequency (F0) estimation is important in speech processing tasks such as speech coding, synthesis, and recognition. Current F0 estimation methods perform well under clean conditions, but their performance deteriorates significantly in noisy environments. Consequently, F0 estimation that is robust against additive noise is in demand. We have previously proposed F0 estimation methods based on Time-Varying Complex AR (TV-CAR) analysis, whose criteria include the weighted correlation of the complex residual obtained by the TV-CAR analysis and the sum of the harmonics of the complex residual spectrum. Separately, E. Azarov et al. have proposed IRAPT (Instantaneous RAPT), an improved version of RAPT (Robust Algorithm for Pitch Tracking) that uses instantaneous harmonics and performs better estimation than RAPT. Since IRAPT uses a band-limited analytic signal to obtain harmonic frequencies, the complex residual signal obtained by the TV-CAR analysis can also be applied to IRAPT. In this paper, a novel F0 estimation method using the instantaneous frequency based on the robust ELS (Extended Least Squares) TV-CAR residual is proposed and evaluated.
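The abstract relies on taking an instantaneous frequency from a complex-valued signal (an analytic signal in IRAPT, a complex residual in the proposed method). The following is a minimal sketch of that single idea only, not the authors' method: it does not implement TV-CAR analysis, ELS estimation, band limiting, or the IRAPT tracking stage. The function names `analytic_signal` and `instantaneous_frequency` are hypothetical, and the naive DFT is chosen for self-containment rather than speed.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(N^2) DFT (illustration only)."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double positive bins, zero negative bins.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    # Inverse DFT gives the complex-valued analytic signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(z, fs):
    """Instantaneous frequency in Hz from the phase step between adjacent complex samples."""
    return [
        fs / (2.0 * math.pi) * cmath.phase(z[i] * z[i - 1].conjugate())
        for i in range(1, len(z))
    ]


# Assumed toy input: a clean 200 Hz cosine sampled at 8 kHz (10 full periods).
fs = 8000.0
f0 = 200.0
x = [math.cos(2.0 * math.pi * f0 * t / fs) for t in range(400)]
z = analytic_signal(x)
freqs = instantaneous_frequency(z, fs)
print(sum(freqs) / len(freqs))
```

For a single clean sinusoid the phase step between samples is constant, so the estimate sits at the sinusoid's frequency. Real speech contains many harmonics plus noise, which is why the signal must first be reduced to one harmonic band or to a residual before this step is meaningful.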
Pages: 4
Related papers
50 in total
  • [31] Multi-Microphone Periodicity Function for Robust F0 Estimation in Real Noisy and Reverberant Environments
    Flego, Federico
    Omologo, Maurizio
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2146 - 2149
  • [32] Noise robust F0 determination and epoch-marking algorithms
    Kotnik, Bojan
    Hoege, Harald
    Kacic, Zdravko
    SIGNAL PROCESSING, 2009, 89 (12) : 2555 - 2569
  • [33] Comparative evaluations of robust and accurate F0 estimates in reverberant environments
    Unoki, Masashi
    Hosorogiya, Toshihiro
    Ishimoto, Yuichi
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4569 - +
  • [34] F0 Estimation and Voicing Detection With Cascade Architecture in Noisy Speech
    Zhang, Yixuan
    Wang, Heming
    Wang, Deliang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3760 - 3770
  • [35] Investigation of Prosodic F0 Layers in Hierarchical F0 Modeling for HMM-based Speech Synthesis
    Lei, Ming
    Wu, Yi-Jian
    Ling, Zhen-Hua
    Dai, Li-Rong
    2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 613 - +
  • [36] F0 ESTIMATION FOR NOISY SPEECH BASED ON EXPLORING LOCAL TIME-FREQUENCY SEGMENT
    Wang, Dongmei
    Hansen, John H. L.
    Tobey, Emily
    2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2015,
  • [37] JOINT ANALYSIS OF F0 AND SPEECH RATE WITH FUNCTIONAL DATA ANALYSIS
    Gubian, Michele
    Boves, Lou
    Cangemi, Francesco
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4972 - 4975
  • [38] On a robust ASR based on robust complex speech analysis
    Higa, Keita
    Funaki, Keiichi
    2015 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ISPACS), 2015, : 129 - 133
  • [39] A Study of F0 Estimation Based on RAPT Framework using Sustained Vowel
    Karunaimathi, Prarthana, V
    Gladis, Dennis
    Dalvi, Usha
    2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 2290 - 2295
  • [40] Robust f0 extraction from monophonic signals using adaptive sub-band filtering
    Rengaswamy, Pradeep
    Reddy, M. Kiran
    Rao, Krothapalli Sreenivasa
    Dasgupta, Pallab
    SPEECH COMMUNICATION, 2020, 116 : 77 - 85