Morphological and Acoustic Analysis of the Vocal Tract Using a Multi-Speaker Volumetric MRI Dataset

Author
Kaburagi, Tokihiko [1]
Affiliation
[1] Kyushu Univ, Minami Ku, 4-9-1 Shiobaru, Fukuoka, Japan
Keywords
magnetic resonance imaging; cross-sectional area function; formant frequency; individual difference; AREA FUNCTIONS; INVERSION; MODEL;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
The shape of the vocal tract was analyzed from both morphological and acoustic perspectives for ten male speakers of Japanese. Volumetric MRI (magnetic resonance imaging) measurements were taken while each speaker uttered each of the five Japanese vowels. The cross-sectional vocal-tract area function was computed from the MRI dataset, and the resulting 50 vocal-tract shapes were analyzed statistically to determine the principal deformation patterns. The vocal-tract shape was then perturbed for each vowel to examine the effect on the first and second formant frequencies. When the perturbation was applied by changing the coefficient values of the first and second principal modes, a local region was observed on the coefficient plane where the formant change was small. In other words, this region was acoustically insensitive to perturbation of the vocal-tract shape. When the vocal-tract shapes of the ten speakers were marked on the same plot, they were found to lie in the vicinity of the acoustically insensitive region. Based on these numerical investigations, the paper considers how individual differences in vocal-tract shape can be resolved to generate phonetically relevant speech sounds.
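The acoustic side of the analysis described in the abstract, computing the first two formants from an area function and observing how they move when the shape is perturbed, can be illustrated with a minimal sketch. It is not the paper's method: it assumes a lossless concatenated-tube model with an ideal open end at the lips, an assumed sound speed of 35000 cm/s, a uniform 17.5 cm tube as a stand-in for a measured area function, and a hypothetical cosine-shaped deformation (`perturb`) in place of a principal mode derived from the MRI data. All function names are illustrative.

```python
import math

SOUND_SPEED = 35000.0  # cm/s, an assumed round value


def chain_d(areas, length, freq):
    """D element of the lossless chain matrix from glottis to lips.

    Each section's matrix is [[a, j*b], [j*c, d]] with real a, b, c, d.
    The factor rho*c cancels in D, so it is set to 1 here.
    """
    k = 2.0 * math.pi * freq / SOUND_SPEED
    seg = length / len(areas)
    cs, sn = math.cos(k * seg), math.sin(k * seg)
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # start from the identity matrix
    for area in areas:
        b2, c2 = sn / area, sn * area
        a, b, c, d = (a * cs - b * c2, a * b2 + b * cs,
                      c * cs + d * c2, d * cs - c * b2)
    return d


def formants(areas, length, count=2, f_max=5000.0, step=5.0):
    """Lowest `count` resonances: zeros of D, assuming zero pressure at the lips."""
    found = []
    f_lo = step
    d_lo = chain_d(areas, length, f_lo)
    while f_lo < f_max and len(found) < count:
        f_hi = f_lo + step
        d_hi = chain_d(areas, length, f_hi)
        if d_lo != 0.0 and d_lo * d_hi <= 0.0:
            lo, hi, d_left = f_lo, f_hi, d_lo
            for _ in range(40):  # bisection inside the bracketing interval
                mid = 0.5 * (lo + hi)
                d_mid = chain_d(areas, length, mid)
                if d_left * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_left = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found


def perturb(areas, coef):
    """Add a hypothetical cosine-shaped mode (index 0 is the glottis end)."""
    n = len(areas)
    return [area + coef * math.cos(math.pi * (i + 0.5) / n)
            for i, area in enumerate(areas)]


base = [3.0] * 40  # uniform tube, 3 cm^2, split into 40 sections
print(formants(base, 17.5))
print(formants(perturb(base, 1.0), 17.5))
```

For the uniform tube, the first call returns values close to the quarter-wavelength resonances c/(4L) and 3c/(4L), that is, about 500 Hz and 1500 Hz under the assumed values. Sweeping `coef` (and a second coefficient for another mode) over a grid and recording the formant shifts would give a map of the kind the abstract describes, on which regions of small formant change could be located.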
Pages: 379-383
Page count: 5