MRI Vocal Tract Sagittal Slices Estimation During Speech Production of CV

Citations: 0
Authors
Douros, Ioannis K. [1]
Kulkarni, Ajinkya [1]
Xie, Yu [2]
Dourou, Chrysanthi [3]
Felblinger, Jacques [4]
Isaieva, Karyna [5]
Vuissoz, Pierre-Andre [5]
Laprie, Yves [1]
Affiliations
[1] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[2] Wuhan Univ, Zhongnan Hosp, Dept Neurol, Wuhan 430071, Peoples R China
[3] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens 15773, Greece
[4] Univ Lorraine, INSERM 1433, CIC IT, CHRU Nancy, F-54000 Nancy, France
[5] Univ Lorraine, INSERM, IADI, U1254, F-54000 Nancy, France
Keywords
image transformation; rtMRI data; speech resources enrichment; vocal tract; REAL-TIME MRI; RESOLUTION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper we propose an algorithm for estimating parasagittal slices of the vocal tract in order to gain a better overview of the behaviour of the articulators during speech production. The first step is to align the consonant-vowel (CV) data of the different sagittal planes with one another for the training speaker. Sets of transformations that connect the midsagittal frames with the neighbouring ones are then acquired for the training speaker. Another set of transformations, which maps the midsagittal frames of the training speaker to the corresponding midsagittal frames of the test speaker, is calculated and used to adapt the previously computed sets of transformations to the test speaker domain. The newly adapted transformations are applied to the midsagittal frames of the test speaker in order to estimate the neighbouring sagittal frames. Several single-speaker models are combined to produce the final frame estimation. To evaluate the results, image cross-correlation between the original and the estimated frames was used. Results show good agreement between the original and the estimated frames.
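The evaluation step mentioned in the abstract, image cross-correlation between an original and an estimated frame, can be illustrated with a minimal sketch. The abstract does not say which variant of cross-correlation was used, so the zero-mean normalized form below is an assumption, and the function name `normalized_cross_correlation` and the synthetic frames are hypothetical stand-ins rather than the authors' code or data.

```python
import numpy as np


def normalized_cross_correlation(original, estimated):
    """Zero-mean normalized cross-correlation of two same-sized grayscale images.

    Returns a value in [-1, 1]; 1 means the images are identical up to
    a positive gain and an offset in intensity.
    """
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(estimated, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Remove the mean intensity so a constant brightness offset does not matter.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        # At least one image is constant: correlation is undefined,
        # so return 0 by convention (an assumption of this sketch).
        return 0.0
    return float((a * b).sum() / denom)


# Synthetic stand-ins for an original frame and an imperfect estimate of it.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
noisy = frame + 0.05 * rng.standard_normal((64, 64))

score_same = normalized_cross_correlation(frame, frame)
score_noisy = normalized_cross_correlation(frame, noisy)
```

A score close to 1 indicates that the estimated frame closely matches the original, which is how "good agreement" could be quantified under this assumed metric.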
Pages: 1115 - 1119
Number of pages: 5