ROBUST PITCH TRACKING IN NOISY SPEECH USING SPEAKER-DEPENDENT DEEP NEURAL NETWORKS

Cited by: 0
Authors
Liu, Yuzhou [1]
Wang, DeLiang [1, 2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
Pitch estimation; deep neural network; hidden Markov model; speaker-dependent modeling; ADAPTATION; DATABASE;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
A reliable estimate of pitch in noisy speech is crucial for many speech applications. In this paper, we propose to use speaker-dependent (SD) deep neural networks (DNNs) to model the harmonic patterns of each speaker. Specifically, SD-DNNs take spectral features as input and estimate probabilistic pitch states at each time frame. We investigate two methods for SD-DNN training. The first one is direct training when speaker-dependent data is sufficient. The second one is speaker adaptation of a speaker-independent (SI) DNN with limited data. The Viterbi algorithm is then used to track pitch through time. Experiments show that both training methods of SD-DNNs outperform an SI-DNN based system as well as a state-of-the-art pitch tracking algorithm in all SNR conditions.
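The tracking step in the abstract, where the Viterbi algorithm links per-frame probabilistic pitch states through time, can be illustrated with a minimal sketch. The state layout (state 0 as unvoiced, states 1..K as quantized pitch bins), the function name `viterbi_pitch_track`, and the hand-set transition penalties are assumptions made for illustration only; the record does not describe the paper's actual state definition or transition model.

```python
import math


def viterbi_pitch_track(posteriors, jump_penalty=1.0, voicing_penalty=2.0):
    """Return the best-scoring pitch-state sequence for a list of frames.

    posteriors[t][s] stands in for a DNN output: the probability of pitch
    state s at frame t. State 0 is assumed to mean "unvoiced"; states 1..K
    are assumed to be quantized pitch bins.
    """
    n_states = len(posteriors[0])
    eps = 1e-12  # avoids log(0)

    def log_trans(i, j):
        # Assumed, unnormalized transition score (not the paper's model):
        # switching between voiced and unvoiced costs a fixed penalty,
        # otherwise the cost grows with the jump between pitch bins.
        if (i == 0) != (j == 0):
            return -voicing_penalty
        return -jump_penalty * abs(i - j)

    # Initialize with the first frame's log-probabilities.
    score = [math.log(p + eps) for p in posteriors[0]]
    backptrs = []

    # Forward pass: keep the best predecessor for every state.
    for frame in posteriors[1:]:
        new_score, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log_trans(i, j))
            ptr.append(best_i)
            new_score.append(score[best_i] + log_trans(best_i, j)
                             + math.log(frame[j] + eps))
        score = new_score
        backptrs.append(ptr)

    # Backward pass: follow the stored pointers from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy input: the middle frame favors a distant pitch bin (state 3).
toy_posteriors = [
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.3, 0.1, 0.5],
    [0.1, 0.7, 0.1, 0.1],
]
smoothed_path = viterbi_pitch_track(toy_posteriors)
```

With the default penalties, the isolated jump in the middle frame is smoothed away because moving two bins and back costs more than accepting the lower posterior. Setting both penalties to zero reduces the decoder to a frame-by-frame argmax.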
Pages: 5255-5259
Number of pages: 5
Related papers
50 in total
  • [1] Speaker-dependent Multipitch Tracking Using Deep Neural Networks
    Liu, Yuzhou
    Wang, DeLiang
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3279 - 3283
  • [2] Speaker-dependent multipitch tracking using deep neural networks
    Liu, Yuzhou
    Wang, DeLiang
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (02): : 710 - 721
  • [3] A UNIFIED SPEAKER-DEPENDENT SPEECH SEPARATION AND ENHANCEMENT SYSTEM BASED ON DEEP NEURAL NETWORKS
    Gao, Tian
    Du, Jun
    Xu, Li
    Liu, Cong
    Dai, Li-Rong
    Lee, Chin-Hui
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 687 - 691
  • [4] A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech
    Yan-Hui Tu
    Jun Du
    Chin-Hui Lee
    Journal of Signal Processing Systems, 2018, 90 : 963 - 973
  • [5] A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech
    Tu, Yan-Hui
    Du, Jun
    Lee, Chin-Hui
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): : 963 - 973
  • [6] Speaker-Dependent Voice Activity Detection Robust to Background Speech Noise
    Matsuda, Shigeki
    Ito, Naoya
    Tsujino, Kosuke
    Kashioka, Hideki
    Sagayama, Shigeki
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2625 - 2628
  • [7] Gender-dependent and speaker-dependent speech enhancement
    Potamitis, I
    Fakotakis, N
    Kokkinakis, G
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 249 - 252
  • [8] RECOGNITION OF SPEAKER-DEPENDENT CONTINUOUS SPEECH WITH KEAL
    MERCIER, G
    BIGORGNE, D
    MICLET, L
    LEGUENNEC, L
    QUERRE, M
    IEE PROCEEDINGS-I COMMUNICATIONS SPEECH AND VISION, 1989, 136 (02): : 145 - 154
  • [9] Speaker-dependent automatic helium speech normalisation
    Podhorski, A
    Sawicki, J
    Brykalski, A
    ICECS 2000: 7TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS & SYSTEMS, VOLS I AND II, 2000, : 282 - 285
  • [10] Neural Network Based Pitch Tracking in Very Noisy Speech
    Han, Kun
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 2158 - 2168