MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs

Cited by: 19
Authors
Arnela, Marc [1]
Dabbaghchian, Saeed [2]
Guasch, Oriol [1]
Engwall, Olov [2]
Affiliations
[1] Univ Ramon Llull, GTM Grp Recerca Tecnol Media, Barcelona 08022, Spain
[2] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Dept Speech Mus & Hearing, SE-10044 Stockholm, Sweden
Keywords
Vocal tract acoustics; Finite Element Method; diphthongs; semi-polar grid; adaptive grid; speech synthesis; WAVE-EQUATION; GEOMETRY SIMPLIFICATIONS; PROPAGATION MODES; SIMULATION; HEAD;
DOI
10.1109/TASLP.2019.2942439
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The synthesis of diphthongs in three dimensions (3D) involves the simulation of acoustic waves propagating through a complex 3D vocal tract geometry that deforms over time. Accurate 3D vocal tract geometries can be extracted from Magnetic Resonance Imaging (MRI), but due to long acquisition times, only static sounds can currently be studied with an adequate spatial resolution. In this work, 3D dynamic vocal tract representations are built to generate diphthongs, based on a set of cross-sections extracted from MRI-based vocal tract geometries of static vowel sounds. A diphthong can then be easily generated by interpolating the location, orientation and shape of these cross-sections, thus avoiding the interpolation of full 3D geometries. Two options are explored to extract the cross-sections. The first one is based on an adaptive grid (AG), which extracts the cross-sections perpendicular to the vocal tract midline, whereas the second one resorts to a semi-polar grid (SPG) strategy, which fixes the cross-section orientations. The finite element method (FEM) has been used to solve the mixed wave equation and synthesize the diphthongs [$\alpha i$] and [$\alpha u$] in the dynamic 3D vocal tracts. The outputs from a 1D acoustic model based on the Transfer Matrix Method have also been included for comparison. The results show that the SPG and AG provide very close solutions in 3D, whereas significant differences are observed when using them in 1D. The SPG dynamic vocal tract representation is recommended for 3D simulations because it helps to prevent the collision of adjacent cross-sections.
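The cross-section interpolation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the interpolation scheme, so plain linear blending is assumed here. The function name `interpolate_cross_sections`, the array layouts, and the requirement that both vowels use the same number of cross-sections and contour points (in matching order) are all hypothetical choices made for illustration.

```python
import numpy as np

def interpolate_cross_sections(centers_a, normals_a, contours_a,
                               centers_b, normals_b, contours_b, t):
    """Blend two sets of vocal tract cross-sections (hypothetical helper).

    centers_*  : (N, 3) cross-section locations
    normals_*  : (N, 3) unit vectors giving cross-section orientations
    contours_* : (N, M, 2) in-plane contour points describing each shape
    t          : blend fraction, 0 gives vowel A and 1 gives vowel B

    Assumes linear blending and a one-to-one correspondence between the
    cross-sections and contour points of the two vowels.
    """
    # Location: straight-line blend of the cross-section centers.
    centers = (1.0 - t) * centers_a + t * centers_b
    # Orientation: blend the normals, then renormalize to unit length.
    # This assumes paired normals are never antiparallel.
    normals = (1.0 - t) * normals_a + t * normals_b
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    # Shape: point-wise blend of the in-plane contours.
    contours = (1.0 - t) * contours_a + t * contours_b
    return centers, normals, contours
```

Stepping `t` from 0 to 1 over the duration of the diphthong would then yield one intermediate set of cross-sections per time step, without interpolating full 3D geometries.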
Pages: 2173-2182
Number of pages: 10
Related papers
50 in total
  • [1] Formant Frequency Tuning of Three-Dimensional MRI-Based Vocal Tracts for the Finite Element Synthesis of Vowels
    Arnela, Marc
    Guasch, Oriol
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 2790 - 2799
  • [2] Three-dimensional measurement of the vocal tract by MRI
    Demolin, D
    Metens, T
    Soquet, A
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 272 - 275
  • [3] Finite Element Synthesis of Diphthongs Using Tuned Two-Dimensional Vocal Tracts
    Arnela, Marc
    Guasch, Oriol
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (10) : 2013 - 2023
  • [4] A parametric three-dimensional model of the vocal-tract based on MRI data
    Yehia, H
    Tiede, M
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1619 - 1622
  • [5] Human vocal tract resonances and the corresponding mode shapes investigated by three-dimensional finite-element modelling based on CT measurement
    Vampola, Tomas
    Horacek, Jaromir
    Laukkanen, Anne-Maria
    Svec, Jan G.
    LOGOPEDICS PHONIATRICS VOCOLOGY, 2015, 40 (01) : 14 - 23
  • [6] Vocal Tract Morphology in Inhaling Singing: An MRI-Based Study
    Moerman, Mieke
    Vanhecke, Francoise
    Van Assche, Lieven
    Vercruysse, Johan
    Daemers, Kristin
    Leman, Marc
    JOURNAL OF VOICE, 2016, 30 (04) : 466 - 471
  • [7] Three-dimensional dosimetry of TomoTherapy by MRI-based polymer gel technique
    Watanabe, Yoichi
    Gopishankar, N.
    JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS, 2011, 12 (01) : 14 - 27
  • [8] MRI-Based Three-Dimensional Modeling and Assessment of Epicardial Adipose Tissue
    Klingensmith, Jon D.
    Sop, Saygin
    Fernandez-del-Valle, Maria
    Mitra, Sunanda
    Naz, Mete
    Lee, H. Felix
    MEDICAL IMAGING 2018: BIOMEDICAL APPLICATIONS IN MOLECULAR, STRUCTURAL, AND FUNCTIONAL IMAGING, 2018, 10578
  • [9] Three-dimensional acoustic field in vocal-tract
    Motoki, Kunitoshi
    Acoustical Science and Technology, 2002, 23 (04) : 207 - 212
  • [10] Construction and control of a three-dimensional vocal tract model
    Birkholz, Peter
    Jackel, Dietmar
    Kroeger, Bernd J.
    2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 873 - 876