Semantic-Electromagnetic Inversion With Pretrained Multimodal Generative Model

Citations: 0
Authors
Chen, Yanjin [1 ]
Zhang, Hongrui [1 ]
Ma, Jie [1 ]
Cui, Tie Jun [2 ,3 ]
del Hougne, Philipp [4 ]
Li, Lianlin [1 ,3 ]
Affiliations
[1] Peking Univ, Sch Elect, State Key Lab Adv Opt Commun Syst & Networks, Beijing 100871, Peoples R China
[2] Southeast Univ, State Key Lab Millimeter Waves, Nanjing 210096, Peoples R China
[3] Pazhou Lab Huangpu, Guangzhou 510555, Peoples R China
[4] Univ Rennes, CNRS, IETR, UMR 6164, F-35000 Rennes, France
Funding
National Natural Science Foundation of China;
Keywords
inverse scattering; microwave imaging; pretrained large-capacity foundation models; semantic-electromagnetic inverse problem; radar tomography;
DOI
10.1002/advs.202406793
Chinese Library Classification (CLC): O6 [Chemistry];
Discipline Code: 0703;
Abstract
Across diverse domains of science and technology, electromagnetic (EM) inversion problems benefit from the ability to account for multimodal prior information to regularize their inherent ill-posedness. Indeed, besides priors that are formulated mathematically or learned from quantitative data, valuable prior information may be available in the form of text or images. Besides handling semantic multimodality, it is furthermore important to minimize the cost of adapting to a new physical measurement operator and to limit the requirements for costly labeled data. Here, these challenges are tackled with a frugal and multimodal semantic-EM inversion technique. The key ingredient is a multimodal generator of reconstruction results that can be pretrained, being agnostic to the physical measurement operator. The generator is fed by a multimodal foundation model encoding the multimodal semantic prior and a physical adapter encoding the measured data. For a new physical setting, only the lightweight physical adapter is retrained. The authors' architecture also enables a flexible iterative step-by-step solution to the inverse problem where each step can be semantically controlled. The feasibility and benefits of this methodology are demonstrated for three EM inverse problems: a canonical two-dimensional inverse-scattering problem in numerics, as well as three-dimensional and four-dimensional compressive microwave meta-imaging experiments. This work presents a semantic-EM inversion method capable of incorporating multimodal semantic priors in a flexible and frugal manner. It offers clear advantages: it handles semantic multimodality through semantically guided step-by-step reconstruction, minimizes the cost of adapting to a new physical measurement operator, and limits the requirement for costly labeled training data.
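The core architectural idea described in the abstract — a frozen pretrained generator combined with a lightweight, retrainable physical adapter — can be illustrated with a deliberately small sketch. This is not the paper's implementation: the generator and measurement operator below are hypothetical 2x2 linear stand-ins, and the adapter is trained with a simple physics-consistency objective (make the re-simulated measurements match the observed ones), so that only the adapter's weights change when the measurement operator changes.

```python
# Toy sketch of the frozen-generator + lightweight-adapter idea.
# All operators are 2x2 linear maps chosen purely for illustration;
# GEN stands in for a frozen pretrained generator G, A_OP for a
# physical measurement operator A with data model y = A x.

def transpose(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def matvec(A, v):
    return [sum(A[r][k] * v[k] for k in range(2)) for r in range(2)]

GEN = [[2.0, 0.0], [0.0, 1.0]]    # frozen "pretrained generator" (toy)
A_OP = [[1.0, 1.0], [0.0, 1.0]]   # new physical measurement operator

def train_adapter(measure, gen, lr=0.05, iters=300):
    """Fit adapter W so that A(G(W y)) ~ y (physics consistency).
    Only W is updated; the generator stays frozen throughout."""
    M = matmul(measure, gen)        # composed forward map A o G
    W = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iters):
        # full-batch gradient of ||M W - I||_F^2: 2 M^T (M W - I)
        R = matmul(M, W)
        R[0][0] -= 1.0
        R[1][1] -= 1.0
        Gd = matmul(transpose(M), R)
        for r in range(2):
            for c in range(2):
                W[r][c] -= 2.0 * lr * Gd[r][c]
    return W

# Retrain only the adapter for the new operator, then reconstruct:
W = train_adapter(A_OP, GEN)
x_true = [1.0, 2.0]                 # unknown scene
y = matvec(A_OP, x_true)            # measured data
x_hat = matvec(GEN, matvec(W, y))   # reconstruction = G(adapter(y))
```

The design choice mirrored here is the frugality claim: adapting to a new measurement operator touches only the small adapter (`train_adapter`), never the large pretrained generator, and the adapter's loss needs no labeled scenes, only consistency with the measured data.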
Pages: 11