Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator

Cited by: 1
Authors
Liu, Xiaofeng [1, 2]
Xing, Fangxu [1, 2]
Prince, Jerry L. [3]
Zhuo, Jiachen [4]
Stone, Maureen [4]
El Fakhri, Georges [1, 2]
Woo, Jonghye [1, 2]
Affiliations
[1] Massachusetts Gen Hosp, Boston, MA 02114 USA
[2] Harvard Med Sch, Boston, MA 02115 USA
[3] Johns Hopkins Univ, Baltimore, MD USA
[4] Univ Maryland, Baltimore, MD USA
Keywords
TONGUE; MOTION;
DOI
10.1007/978-3-031-16446-0_36
Chinese Library Classification (CLC) number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Understanding the underlying relationship between tongue and oropharyngeal muscle deformation seen in tagged-MRI and intelligible speech plays an important role in advancing speech motor control theories and the treatment of speech-related disorders. Because of their heterogeneous representations, however, direct mapping between the two modalities, i.e., a two-dimensional (mid-sagittal slice) plus time tagged-MRI sequence and its corresponding one-dimensional waveform, is not straightforward. Instead, we resort to two-dimensional spectrograms as an intermediate representation, which contain both pitch and resonance, and from which we develop an end-to-end deep learning framework to translate a sequence of tagged-MRI into its corresponding audio waveform with limited dataset size. Our framework is based on a novel fully convolutional asymmetry translator guided by a self residual attention strategy to specifically exploit the moving muscular structures during speech. In addition, we leverage the pairwise correlation of samples with the same utterances through a latent space representation disentanglement strategy. Furthermore, we incorporate an adversarial training approach with generative adversarial networks to improve the realism of our generated spectrograms. Our experimental results, carried out with a total of 63 tagged-MRI sequences alongside speech acoustics, showed that our framework enabled the generation of clear audio waveforms from a sequence of tagged-MRI, surpassing competing methods. Thus, our framework offers great potential to help better understand the relationship between the two modalities.
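The role of the spectrogram as a two-dimensional intermediate representation of a one-dimensional waveform can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state how the spectrograms are computed, so the plain short-time Fourier transform below, the function name `magnitude_spectrogram`, and the values of `frame_len`, `hop`, and `sample_rate` are all assumptions chosen for illustration. Only the Python standard library is used.

```python
import cmath
import math


def magnitude_spectrogram(waveform, frame_len=64, hop=32):
    """Turn a 1D waveform into a 2D list: frames x frequency-bin magnitudes.

    frame_len and hop are illustrative values, not settings from the paper.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        seg = [waveform[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Naive discrete Fourier transform of one windowed frame.
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Toy input (assumed): a 1000 Hz tone sampled at 8000 Hz, 512 samples long.
sample_rate = 8000
tone_hz = 1000
waveform = [math.sin(2 * math.pi * tone_hz * t / sample_rate)
            for t in range(512)]

spec = magnitude_spectrogram(waveform)

# The strongest bin of the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time frame and each column one frequency bin, so the time axis of the waveform is kept while frequency content becomes an explicit second axis. That makes the target image-like, which is the property the abstract relies on when it maps an image sequence to a spectrogram rather than directly to a waveform.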
Pages: 376-386
Page count: 11
Related papers
6 total
  • [1] CMRI2SPEC: CINE MRI SEQUENCE TO SPECTROGRAM SYNTHESIS VIA A PAIRWISE HETEROGENEOUS TRANSLATOR
    Liu, Xiaofeng
    Xing, Fangxu
    Stone, Maureen
    Prince, Jerry L.
    Kim, Jangwon
    El Fakhri, Georges
    Woo, Jonghye
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1481 - 1485
  • [2] Tagged-to-Cine MRI Sequence Synthesis via Light Spatial-Temporal Transformer
    Liu, Xiaofeng
    Xing, Fangxu
    Bian, Zhangxing
    Arias-Vergara, Tomas
    Perez-Toro, Paula Andrea
    Maier, Andreas
    Stone, Maureen
    Zhuo, Jiachen
    Prince, Jerry L.
    Woo, Jonghye
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT VII, 2024, 15007 : 701 - 711
  • [3] Speech Audio Synthesis from Tagged MRI and Non-negative Matrix Factorization via Plastic Transformer
    Liu, Xiaofeng
    Xing, Fangxu
    Stone, Maureen
    Zhuo, Jiachen
    Fels, Sidney
    Prince, Jerry L.
    El Fakhri, Georges
    Woo, Jonghye
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VII, 2023, 14226 : 435 - 445
  • [4] Polarimetric Thermal to Visible Face Verification via Self-Attention Guided Synthesis
    Di, Xing
    Riggan, Benjamin S.
    Hu, Shuowen
    Short, Nathaniel J.
    Patel, Vishal M.
    2019 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), 2019,
  • [5] Improved MRI-based Pseudo-CT Synthesis via Segmentation Guided Attention Networks
    Dovletov, Gurbandurdy
    Pham, Duc Duy
    Pauli, Josef
    Gratz, Marcel
    Quick, Harald
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES (BIOIMAGING), VOL 2, 2021, : 131 - 140
  • [6] A Self-Attention-Guided 3D Deep Residual Network With Big Transfer to Predict Local Failure in Brain Metastasis After Radiotherapy Using Multi-Channel MRI
    Jalalifar, Seyed Ali
    Soliman, Hany
    Sahgal, Arjun
    Sadeghi-Naini, Ali
    IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE, 2023, 11 : 13 - 22