TorchDIVA: An extensible computational model of speech production built on an open-source machine learning library

Cited by: 2
Authors
Kinahan, Sean P. [1, 2]
Liss, Julie M. [1]
Berisha, Visar [1, 2]
Affiliations
[1] Arizona State Univ, Coll Hlth Solut, Tempe, AZ 85281 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Source
PLOS ONE | 2023 / Vol. 18 / Issue 02
DOI
10.1371/journal.pone.0281306
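Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code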
07 ; 0710 ; 09 ;
Abstract
The DIVA model is a computational model of speech motor control that combines a simulation of the brain regions responsible for speech production with a model of the human vocal tract. The model is currently implemented in Matlab Simulink; however, this is less than ideal as most of the development in speech technology research is done in Python. This means there is a wealth of machine learning tools which are freely available in the Python ecosystem that cannot be easily integrated with DIVA. We present TorchDIVA, a full rebuild of DIVA in Python using PyTorch tensors. DIVA source code was directly translated from Matlab to Python, and built-in Simulink signal blocks were implemented from scratch. After implementation, the accuracy of each module was evaluated via systematic block-by-block validation. The TorchDIVA model is shown to produce outputs that closely match those of the original DIVA model, with a negligible difference between the two. We additionally present an example of the extensibility of TorchDIVA as a research platform. Speech quality enhancement in TorchDIVA is achieved through an integration with an existing PyTorch generative vocoder called DiffWave. A modified DiffWave mel-spectrum upsampler was trained on human speech waveforms and conditioned on the TorchDIVA speech production. The results indicate improved speech quality metrics in the DiffWave-enhanced output as compared to the baseline. This enhancement would have been difficult or impossible to accomplish in the original Matlab implementation. This proof-of-concept demonstrates the value TorchDIVA can bring to the research community. Researchers can download the new implementation at: https://github.com/skinahan/DIVA_PyTorch.
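The block-by-block validation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the TorchDIVA repository: the `DiscreteIntegrator` class, the `max_abs_error` helper, the forward-Euler update rule, and the 5 ms step size are all hypothetical. In a real validation the reference trace would be exported from the Simulink model; here a closed-form unit-step response stands in for it.

```python
import torch


class DiscreteIntegrator(torch.nn.Module):
    """Hypothetical stand-in for a Simulink-style discrete-time integrator
    (forward Euler): y[n] = y[n-1] + gain * dt * u[n-1]."""

    def __init__(self, dt: float, gain: float = 1.0, initial: float = 0.0):
        super().__init__()
        self.dt = dt
        self.gain = gain
        self.state = torch.tensor(initial, dtype=torch.float64)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        y = self.state  # output is the state held before this step's update
        self.state = self.state + self.gain * self.dt * u
        return y


def max_abs_error(block, inputs, reference):
    """Step the block over the input samples and compare with a reference trace."""
    outputs = torch.stack([block(u) for u in inputs])
    return (outputs - reference).abs().max().item()


dt = 0.005  # assumed step size, chosen only for this example
inputs = torch.ones(200, dtype=torch.float64)  # unit-step input
# Stand-in reference: for a unit step, forward Euler gives y[n] = n * dt.
reference = torch.arange(200, dtype=torch.float64) * dt

err = max_abs_error(DiscreteIntegrator(dt), inputs, reference)
print(f"max abs error: {err:.3e}")
```

Repeating the same comparison for each translated block, with a tolerance chosen per block, is one way to localize a discrepancy to a single module before comparing end-to-end outputs.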
Pages: 15
Related papers
50 in total
  • [1] Open-Source Machine Learning in Computational Chemistry
    Hagg, Alexander
    Kirschner, Karl N.
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2023, 63 (15) : 4505 - 4532
  • [2] LC: A Flexible, Extensible Open-Source Toolkit for Model Compression
    Idelbayev, Yerlan
    Carreira-Perpinan, Miguel A.
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 4504 - 4514
  • [3] pyStudio: An Open-Source Machine Learning Platform
    Gomicia-Murcia, Enrique
    Bordel Sanchez, Borja
    Souissi, Riad
    AL-Qurishi, Muhammad
    [J]. PROCEEDINGS OF THE 2023 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2023, 2023, : 436 - 440
  • [4] Frouros: An open-source Python library for drift detection in machine learning systems
    Sisniega, Jaime Cespedes
    Garcia, Alvaro Lopez
    [J]. SOFTWAREX, 2024, 26
  • [5] compomics-utilities: an open-source Java library for computational proteomics
    Harald Barsnes
    Marc Vaudel
    Niklaas Colaert
    Kenny Helsens
    Albert Sickmann
    Frode S Berven
    Lennart Martens
    [J]. BMC Bioinformatics, 12
  • [6] Open-source machine learning: R meets Weka
    Hornik, Kurt
    Buchta, Christian
    Zeileis, Achim
    [J]. COMPUTATIONAL STATISTICS, 2009, 24 (02) : 225 - 232
  • [7] Open-source machine learning: R meets Weka
    Kurt Hornik
    Christian Buchta
    Achim Zeileis
    [J]. Computational Statistics, 2009, 24 : 225 - 232
  • [8] FairCORELS, an Open-Source Library for Learning Fair Rule Lists
    Aivodji, Ulrich
    Ferry, Julien
    Gambs, Sebastien
    Huguet, Marie-Jose
    Siala, Mohamed
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 4665 - 4669
  • [9] An Open-Source Software Library for Explainable Support Vector Machine Classification
    Loor, Marcelo
    Tapia-Rosero, Ana
    De Tre, Guy
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2022,
  • [10] ImJoy: an open-source computational platform for the deep learning era
    Wei Ouyang
    Florian Mueller
    Martin Hjelmare
    Emma Lundberg
    Christophe Zimmer
    [J]. Nature Methods, 2019, 16 : 1199 - 1200