TorchDIVA: An extensible computational model of speech production built on an open-source machine learning library

Cited by: 2
Authors
Kinahan, Sean P. [1, 2]
Liss, Julie M. [1]
Berisha, Visar [1, 2]
Affiliations
[1] Arizona State Univ, Coll Hlth Solut, Tempe, AZ 85281 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Source
PLOS ONE | 2023 / Vol. 18 / Issue 02
Keywords
DOI
10.1371/journal.pone.0281306
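Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract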
The DIVA model is a computational model of speech motor control that combines a simulation of the brain regions responsible for speech production with a model of the human vocal tract. The model is currently implemented in Matlab Simulink; however, this is less than ideal as most of the development in speech technology research is done in Python. This means there is a wealth of machine learning tools which are freely available in the Python ecosystem that cannot be easily integrated with DIVA. We present TorchDIVA, a full rebuild of DIVA in Python using PyTorch tensors. DIVA source code was directly translated from Matlab to Python, and built-in Simulink signal blocks were implemented from scratch. After implementation, the accuracy of each module was evaluated via systematic block-by-block validation. The TorchDIVA model is shown to produce outputs that closely match those of the original DIVA model, with a negligible difference between the two. We additionally present an example of the extensibility of TorchDIVA as a research platform. Speech quality enhancement in TorchDIVA is achieved through an integration with an existing PyTorch generative vocoder called DiffWave. A modified DiffWave mel-spectrum upsampler was trained on human speech waveforms and conditioned on the TorchDIVA speech production. The results indicate improved speech quality metrics in the DiffWave-enhanced output as compared to the baseline. This enhancement would have been difficult or impossible to accomplish in the original Matlab implementation. This proof-of-concept demonstrates the value TorchDIVA can bring to the research community. Researchers can download the new implementation at: https://github.com/skinahan/DIVA_PyTorch.
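The abstract mentions reimplementing Simulink signal blocks from scratch and validating each module block by block. The following is a minimal sketch of what such a check could look like, not the authors' code. The block chosen (a forward-Euler discrete-time integrator), the function names `discrete_integrator` and `validate_block`, and the tolerance are all assumptions made for illustration. The reference trace is generated by an explicit loop here, standing in for a signal exported from the original Simulink model.

```python
import torch


def discrete_integrator(u: torch.Tensor, dt: float, y0: float = 0.0) -> torch.Tensor:
    """Hypothetical tensor version of a forward-Euler integrator block.

    Implements y[n] = y[n-1] + dt * u[n-1] with y[0] = y0.
    """
    # Shift the input by one sample so each output depends only on past inputs.
    shifted = torch.cat([torch.zeros(1, dtype=u.dtype), u[:-1]])
    return y0 + dt * torch.cumsum(shifted, dim=0)


def validate_block(output: torch.Tensor, reference: torch.Tensor, atol: float = 1e-6) -> dict:
    """Compare one block's output against a reference trace (assumed tolerance)."""
    max_abs_err = (output - reference).abs().max().item()
    return {"max_abs_err": max_abs_err, "passed": max_abs_err <= atol}


# Input signal fed to the block under test.
u = torch.linspace(0.0, 1.0, steps=11, dtype=torch.float64)
dt = 0.1

# Stand-in reference: a sample-by-sample loop playing the role of the
# trace that would be exported from the original model.
reference = torch.zeros_like(u)
for n in range(1, len(u)):
    reference[n] = reference[n - 1] + dt * u[n - 1]

report = validate_block(discrete_integrator(u, dt), reference)
print(report)
```

Running the same comparison for every reimplemented block, each against its own exported reference signal, localizes a mismatch to a single block instead of only revealing it in the final model output.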
Pages: 15