Human Vocal Tract Analysis by in Vivo 3D MRI during Phonation: A Complete System for Imaging, Quantitative Modeling, and Speech Synthesis

Cited by: 0
Authors
Wismueller, Axel [1, 2]
Behrends, Johannes [1, 2]
Hoole, Phil [3]
Leinsinger, Gerda L. [4]
Reiser, Maximilian F. [4]
Westesson, Per-Lennart [1, 2]
Affiliations
[1] Univ Rochester, Dept Imaging Sci, 601 Elmwood Ave, Box 648, Rochester, NY 14642 USA
[2] Univ Rochester, Dept Biomed Engn, Rochester, NY 14642 USA
[3] Univ Munich, Dept Phonet, D-80799 Munich, Germany
[4] Univ Munich, Dept Radiol, D-80336 Munich, Germany
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We present a complete system for image-based 3D vocal tract analysis ranging from MR image acquisition during phonation, semi-automatic image processing, and quantitative modeling including model-based speech synthesis, to quantitative model evaluation by comparison between recorded and synthesized phoneme sounds. For this purpose, six professionally trained speakers, aged 22-34 years, were examined using a standardized MRI protocol (1.5 T, T1w FLASH, ST 4 mm, 23 slices, acq. time 21 s). The volunteers performed a prolonged (>= 21 s) emission of sounds of the German phonemic inventory. Simultaneous audio tape recording was obtained to control correct utterance. Scans were made in the axial, coronal, and sagittal planes. Computer-aided quantitative 3D evaluation included (i) automated registration of the phoneme-specific data acquired in different slice orientations, (ii) semi-automated segmentation of oropharyngeal structures, (iii) computation of a curvilinear vocal tract midline in 3D by nonlinear PCA, and (iv) computation of cross-sectional areas of the vocal tract perpendicular to this midline. For the vowels, the extracted area functions were used to synthesize phoneme sounds based on an articulatory-acoustic model. For quantitative analysis, recorded and synthesized phonemes were compared, where area functions extracted from 2D midsagittal slices were used as a reference. All vowels could be identified correctly based on the synthesized phoneme sounds. The comparison between synthesized and recorded vowel phonemes revealed that the quality of phoneme sound synthesis was improved for phonemes /a/, /o/, and /y/ if 3D instead of 2D data were used, as measured by the average relative frequency shift between recorded and synthesized vowel formants (p < 0.05, one-sided Wilcoxon rank sum test). In summary, the combination of fast MRI followed by subsequent 3D segmentation and analysis is a novel approach to examine human phonation in vivo. It unveils functional anatomical findings that may be essential for realistic modelling of the human vocal tract during speech production.
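The evaluation metric named in the abstract, the average relative frequency shift between recorded and synthesized vowel formants, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition below (mean of |f_synth - f_rec| / f_rec over the formants) is an assumption, and the function name and all formant values are hypothetical and chosen only for illustration.

```python
def average_relative_formant_shift(recorded_hz, synthesized_hz):
    """Mean of |f_synth - f_rec| / f_rec over paired formants.

    Assumed definition; the abstract does not specify the exact formula.
    """
    if not recorded_hz or len(recorded_hz) != len(synthesized_hz):
        raise ValueError("need two non-empty formant lists of equal length")
    shifts = [abs(s - r) / r for r, s in zip(recorded_hz, synthesized_hz)]
    return sum(shifts) / len(shifts)


# Hypothetical formant frequencies (Hz) for one vowel; not data from the study.
recorded = [700.0, 1200.0, 2500.0]
synth_2d = [770.0, 1320.0, 2500.0]  # synthesis from a 2D midsagittal area function
synth_3d = [735.0, 1200.0, 2500.0]  # synthesis from a 3D area function

shift_2d = average_relative_formant_shift(recorded, synth_2d)
shift_3d = average_relative_formant_shift(recorded, synth_3d)
print(f"2D shift: {shift_2d:.4f}, 3D shift: {shift_3d:.4f}")
```

Under this assumed definition, a smaller value means the synthesized formants lie closer to the recorded ones. Per-speaker values for the 2D and 3D cases could then be compared with a one-sided rank sum test, as the abstract reports.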
Pages: 306-312
Page count: 7