Voice Feature Selection to Improve Performance of Machine Learning Models for Voice Production Inversion

Cited by: 5
Author
Zhang, Zhaoyan [1]
Affiliation
[1] Univ Calif Los Angeles, Dept Head & Neck Surg, 31-24 Rehabil Ctr,1000 Veteran Ave, Los Angeles, CA 90095 USA
Keywords
Voice inversion; Vocal fold geometry; Vocal fold stiffness; Machine learning; BODY-COVER MODEL; PARAMETERS; VIBRATION; ACOUSTICS; VARIABLES; STIFFNESS
DOI
10.1016/j.jvoice.2021.03.004
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Objective. Estimation of physiological control parameters of the vocal system from the produced voice outcome has important applications in clinical management of voice disorders. Previously, we developed an approach for estimating such parameters, including pressure, from voice outcome features that characterize the acoustics of the produced voice. The goals of this study are to (1) explore the possibility of improving the estimation accuracy of physiological control parameters by including voice outcome features characterizing vocal fold vibration; and (2) identify voice feature sets that optimize both estimation accuracy and robustness to measurement noise.
Methods. Feedforward neural networks are trained to solve the inversion problem of estimating the physiological control parameters of a three-dimensional body-cover vocal fold model from different sets of voice outcome features that characterize the simulated voice acoustics, glottal flow, and vocal fold vibration. A sensitivity analysis is then performed to evaluate the contribution of individual voice features to the overall performance of the neural networks in estimating the physiological control parameters.
Results and conclusions. Including voice outcome features that characterize vocal fold vibration increases estimation accuracy, but it also reduces the network's robustness to measurement noise. The cause is the high sensitivity of network performance to features measuring the absolute amplitudes of the glottal flow and area waveforms, which are also difficult to measure accurately in practical applications. By excluding such glottal flow-based features and replacing glottal area-based features with their normalized counterparts, we are able to significantly improve both estimation accuracy and robustness to noise. We further show that similar estimation accuracy and robustness can be achieved with an even smaller set of voice outcome features by excluding features of small sensitivity.
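The abstract does not specify how the sensitivity analysis is carried out. As a minimal sketch, assuming a perturbation-based measure, one could add noise to one input feature at a time and record how much the estimation error grows, then drop features whose sensitivity is small. Everything below is hypothetical and for illustration only: the synthetic data, the function name `feature_sensitivity`, the noise level, the selection threshold, and the least-squares linear inverse, which stands in for the trained feedforward neural network.

```python
import numpy as np

def feature_sensitivity(predict, X, y, noise_std=0.1, seed=0):
    """Return the baseline mean absolute error and, for each feature,
    the error increase when that feature alone is perturbed by noise."""
    rng = np.random.default_rng(seed)
    base_err = np.mean(np.abs(predict(X) - y))
    scale = X.std(axis=0)  # noise is scaled to each feature's spread
    increases = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_noisy = X.copy()
        X_noisy[:, j] += rng.normal(0.0, noise_std * scale[j], size=X.shape[0])
        increases[j] = np.mean(np.abs(predict(X_noisy) - y)) - base_err
    return base_err, increases

# Synthetic stand-in data (assumption): two control parameters generate
# two informative features; a third feature is unrelated noise.
rng = np.random.default_rng(42)
n = 2000
params = rng.uniform(-1.0, 1.0, size=(n, 2))
forward = np.array([[1.0, 0.5, 0.0],
                    [-0.5, 1.0, 0.0]])
features = params @ forward
features[:, 2] = rng.normal(size=n)

# Stand-in inverse model (assumption): a linear least-squares fit,
# used here in place of a trained neural network.
W, *_ = np.linalg.lstsq(features, params, rcond=None)

def predict(X):
    return X @ W

base_err, increases = feature_sensitivity(predict, features, params)

# Keep only features whose sensitivity is a non-negligible fraction
# of the largest one (threshold chosen arbitrarily for this sketch).
keep = increases > 1e-3 * increases.max()
print("error increase per feature:", increases)
print("features kept:", keep)
```

In this toy setup the unrelated third feature has near-zero sensitivity and is dropped. The same loop also indicates which features make a model fragile under measurement noise, which is the trade-off the abstract describes.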
Pages: 479-485
Page count: 7