Noise Robust Fundamental Frequency Estimation of Speech using CNN-based discriminative modeling

Cited by: 0

Authors:

Kawamura, Tomonori [1]

Kai, Atsuhiko [1]

Nakagawa, Seiichi [2]

Affiliations:

[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, Hamamatsu, Shizuoka, Japan

[2] Chubu Univ, Kasugai, Aichi, Japan

Source:

2018 5TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS: CONCEPTS, THEORY AND APPLICATIONS (ICAICTA 2018) | 2018

Keywords:

Speech processing; Fundamental frequency estimation; Convolutional neural network

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

The fundamental frequency (F0) is a quantity representing the pitch of a periodic signal, and its estimation for time-variant quasi-periodic acoustic signals is a common problem in speech processing. Accurate F0 estimation contributes to the improvement of speech processing systems such as prosody analysis, text-to-speech, and speech recognition. Although many algorithms have been proposed and perform very well in clean environments, F0 estimation remains very difficult in noisy environments. Machine learning approaches are generally known to be effective as discriminative models for handling noisy data. In this paper, we propose a robust F0 estimation method for noisy speech signals that uses a convolutional neural network (CNN), a type of deep neural network (DNN). In the proposed method, the convolution and pooling layers serve as an approximator of autocorrelation analysis and are followed by discriminative modeling that classifies quantized F0 states. This process yields a discriminator that extracts noise-robust F0 features. Experimental results showed that our method outperforms conventional methods based on autocorrelation analysis and its combination with DNN modeling.
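The abstract refers to two building blocks: autocorrelation analysis as the conventional baseline, and quantized F0 states as classification targets. The following is a minimal Python sketch of both ideas, not the authors' implementation. The function names `autocorr_f0` and `quantize_f0`, the 60 to 400 Hz search range, and the log-scale grid of 24 bins per octave are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def autocorr_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate the F0 (Hz) of one frame from its autocorrelation peak.

    Baseline sketch only. The search range (f0_min, f0_max) is an
    assumption, not a value taken from the paper.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return 0.0                         # silent frame: treat as unvoiced

    # Candidate lags correspond to periods between 1/f0_max and 1/f0_min.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation, normalized by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return sample_rate / best_lag if best_lag else 0.0


def quantize_f0(f0, f0_min=60.0, f0_max=400.0, bins_per_octave=24):
    """Map an F0 value to a class index; index 0 is reserved for unvoiced.

    The log-scale grid and bins_per_octave are hypothetical choices.
    """
    if f0 <= 0.0:
        return 0
    f0 = min(max(f0, f0_min), f0_max)      # clamp into the modeled range
    return 1 + int(round(bins_per_octave * math.log2(f0 / f0_min)))


if __name__ == "__main__":
    # 40 ms of a clean 200 Hz sine sampled at 16 kHz.
    sr = 16000
    frame = [math.sin(2 * math.pi * 200.0 * i / sr) for i in range(640)]
    f0 = autocorr_f0(frame, sr)
    print(f0, quantize_f0(f0))
```

For the clean 200 Hz sine in the usage example, the autocorrelation peak falls at a lag of 80 samples, which gives an estimate of 200 Hz. Under additive noise, spurious peaks can exceed the true one, and that failure mode is what the learned, discriminative approach in the abstract is meant to address. In a classifier of the kind described, class indices such as those returned by `quantize_f0` would serve as training targets.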

Pages: 60-65

Number of pages: 6
