On a Robust F0 Estimation of Speech based on IRAPT using Robust TV-CAR Analysis

Cited by: 0

Authors:

Hotta, Kazushi [1]

Funaki, Keiichi [2]

Affiliations:

[1] Univ Ryukyus, Grad Sch Engn & Sci, Nishihara, Okinawa 90301, Japan

[2] Univ Ryukyus, Comp & Networking Ctr, Nishihara, Okinawa 90301, Japan

Source:

2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA) | 2014

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Fundamental frequency (F0) estimation is important in speech processing tasks such as speech coding, synthesis, and recognition. Present F0 estimation methods perform well under clean conditions; however, their performance deteriorates significantly in noisy environments. As a result, F0 estimation that is robust against additive noise is in demand. We have previously proposed F0 estimation methods based on Time-Varying Complex AR (TV-CAR) analysis, whose criteria include the weighted correlation of the complex residual obtained by TV-CAR analysis and the sum of the harmonics of the complex residual spectrum. Separately, E. Azarov et al. have proposed IRAPT (Instantaneous RAPT), an improved version of RAPT (Robust Algorithm for Pitch Tracking) that uses instantaneous harmonics. IRAPT can perform better estimation than RAPT. Since IRAPT uses a band-limited analytic signal to obtain harmonic frequencies, the complex residual signal obtained by TV-CAR analysis can also be applied to IRAPT. In this paper, a novel F0 estimation method using the instantaneous frequency based on the robust ELS (Extended Least Squares) TV-CAR residual is proposed and evaluated.
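
As a rough illustration of the idea that harmonic frequencies can be read from an analytic (complex) signal, the sketch below computes the instantaneous frequency of a pure tone from the phase difference between consecutive samples. It is not IRAPT and does not perform TV-CAR analysis: the analytic signal is built with a plain DFT-based Hilbert transform, and the function names, the sampling rate, and the 125 Hz test tone are assumptions chosen for the example.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a direct DFT (O(N^2), short frames only)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        # Keep DC (and the Nyquist bin when n is even) unchanged.
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue
        # Double positive-frequency bins, zero negative-frequency bins.
        spec[k] = 2 * spec[k] if k < (n + 1) // 2 else 0
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def instantaneous_frequency(z, fs):
    """Instantaneous frequency in Hz from the phase step between neighbouring samples."""
    return [cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2 * math.pi)
            for t in range(len(z) - 1)]


# Hypothetical test signal: a 125 Hz cosine sampled at 8 kHz (4 full cycles in 256 samples).
fs = 8000
f0 = 125.0
x = [math.cos(2 * math.pi * f0 * t / fs) for t in range(256)]
freqs = instantaneous_frequency(analytic_signal(x), fs)
```

In a method of the kind the abstract describes, the complex input would be a band-limited analytic signal or the complex TV-CAR residual rather than this synthetic tone, and the per-sample frequencies would feed a pitch-tracking stage.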


Pages: 4
