DICTIONARY LEARNING FOR PITCH ESTIMATION IN SPEECH SIGNALS

Authors
Huang, Feng [1 ]
Balazs, Peter [1 ]
Affiliations
[1] Austrian Acad Sci, Acoust Res Inst, Vienna, Austria
Funding
Austrian Science Fund (FWF)
Keywords
Pitch estimation; sparsity; harmonic structure; dictionary learning; dictionary post-processing; matrix factorization; algorithm
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
This paper presents an automatic approach to parameter training for a previously published sparsity-based pitch estimation method. In that method, the harmonic dictionary is a key parameter that must be carefully prepared beforehand, and constructing and labeling it originally required extensive human supervision. In this study, we propose to learn the dictionary directly from training data using dictionary learning algorithms. We apply and compare three typical dictionary learning algorithms, namely the method of optimal directions (MOD), K-SVD, and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can produce dictionaries that exhibit the harmonic structures desired for pitch estimation, and that the post-processing method significantly improves the performance of the learned dictionaries. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. These results indicate that dictionary learning is feasible and promising for this application.
Pages: 6
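Illustrative note
The abstract names MOD as one of the compared dictionary learning algorithms. As a rough illustration only, and not the authors' implementation, the generic textbook form of MOD alternates between sparse coding and a least-squares dictionary update. The sketch below assumes NumPy; the function names `omp` and `mod_learn`, the random initialization from training columns, and the use of a simple greedy pursuit for the coding step are all assumptions made for this example. The labeling and post-processing steps described in the abstract are not reproduced.

```python
import numpy as np


def omp(D, x, k):
    """Greedy orthogonal matching pursuit: code x with at most k atoms of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    a = np.zeros(D.shape[1])
    a[support] = coef
    return a


def mod_learn(X, n_atoms, k, n_iter=20, seed=0):
    """Hypothetical helper: learn a dictionary from the columns of X with MOD."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise atoms with randomly chosen, normalised training columns.
    idx = rng.choice(X.shape[1], size=n_atoms, replace=False)
    D = X[:, idx] / (np.linalg.norm(X[:, idx], axis=0) + 1e-12)
    errors = []
    for _ in range(n_iter):
        # Sparse coding step with the dictionary held fixed.
        A = np.column_stack([omp(D, x, k) for x in X.T])
        # Relative reconstruction error of the current dictionary and codes.
        errors.append(np.linalg.norm(X - D @ A) / np.linalg.norm(X))
        # MOD update: least-squares dictionary for fixed codes, D = X A^+.
        D = X @ np.linalg.pinv(A)
        # Rescale atoms to unit norm (an unused atom stays at zero).
        D = D / (np.linalg.norm(D, axis=0) + 1e-12)
    return D, errors
```

Here `X` holds one training vector per column, `n_atoms` is the dictionary size, and `k` is the number of atoms allowed per vector. The returned `errors` list gives the relative reconstruction error measured after each sparse coding step, which is a convenient way to watch whether the alternation is converging.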