Morphological and Acoustic Analysis of the Vocal Tract Using a Multi-Speaker Volumetric MRI Dataset

Cited by: 0

Authors:

Kaburagi, Tokihiko [1]

Affiliations:

[1] Kyushu Univ, Minami Ku, 4-9-1 Shiobaru, Fukuoka, Japan

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

magnetic resonance imaging; cross-sectional area function; formant frequency; individual difference; AREA FUNCTIONS; INVERSION; MODEL;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The shape of the vocal tract was analyzed from both morphological and acoustic perspectives for ten male speakers of Japanese. A volumetric MRI (magnetic resonance imaging) measurement was performed while each speaker uttered each of the five Japanese vowels. The cross-sectional vocal-tract area function was computed from the MRI dataset, and the resulting 50 vocal-tract shapes were analyzed statistically to determine the principal deformation patterns. The vocal-tract shape was then perturbed for each vowel to examine the effect on the first and second formant frequencies. When the perturbation was applied by changing the coefficient values of the first and second principal modes, a local region was observed on the coefficient plane where the formant change was small. In other words, this region was acoustically insensitive to perturbation of the vocal-tract shape. When the vocal-tract shapes of the ten speakers were marked on the same plot, the marked shapes were also found to lie in the vicinity of the acoustically insensitive region. Based on these numerical investigations, the paper considers how individual differences in vocal-tract shape can be resolved to generate phonetically relevant speech sounds.
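The perturbation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which acoustic model was used, so the code below assumes a lossless concatenated-tube model with a closed glottis and an ideal open lip end. The speed of sound, the section length, the uniform base area function, and the `mode` vector (a hypothetical stand-in for one principal deformation mode) are all illustrative assumptions, not values from the paper.

```python
import math

C_SOUND = 350.0  # assumed speed of sound in m/s


def chain_d(areas, section_len, freq):
    """D element of the glottis-to-lips chain matrix of a lossless tube.

    Each section has the chain matrix [[cos, j*sin/A], [j*A*sin, cos]];
    the constant rho*c factor cancels in D and is omitted.
    """
    k = 2.0 * math.pi * freq / C_SOUND
    ckl, skl = math.cos(k * section_len), math.sin(k * section_len)
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # running product [[a, j*b], [j*c, d]]
    for area in areas:  # areas are ordered from glottis to lips
        a2, b2, c2, d2 = ckl, skl / area, skl * area, ckl
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas, section_len, count=2, f_max=5000.0, step=5.0):
    """Resonances are the zeros of D (lip pressure assumed zero)."""
    found = []
    f_lo = step
    d_lo = chain_d(areas, section_len, f_lo)
    while f_lo < f_max and len(found) < count:
        f_hi = f_lo + step
        d_hi = chain_d(areas, section_len, f_hi)
        if d_lo != 0.0 and d_lo * d_hi <= 0.0:
            lo, hi, d_left = f_lo, f_hi, d_lo
            for _ in range(40):  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                d_mid = chain_d(areas, section_len, mid)
                if d_left * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_left = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found


# Assumed base shape: uniform tube, 35 sections of 0.5 cm, area 3 cm^2.
SECTION_LEN = 0.005
base_areas = [3.0e-4] * 35

# Hypothetical deformation mode: narrows the five sections nearest the lips.
mode = [0.0] * 30 + [-1.0e-4] * 5
coefficient = 1.0
perturbed_areas = [a + coefficient * m for a, m in zip(base_areas, mode)]

base_formants = formants(base_areas, SECTION_LEN)
perturbed_formants = formants(perturbed_areas, SECTION_LEN)
print("base F1, F2:", base_formants)
print("perturbed F1, F2:", perturbed_formants)
```

For the uniform 17.5 cm tube the first two resonances fall close to the quarter-wavelength values of 500 Hz and 1500 Hz. Sweeping `coefficient` over a grid for two such modes and recording the formant shift at each point would give a sensitivity map of the kind the abstract describes on the coefficient plane.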

Pages: 379-383

Page count: 5
