Large Language Models Empower Multimodal Integrated Sensing and Communication

Cited by: 0
Authors
Cheng, Lu [1 ]
Zhang, Hongliang [2 ]
Di, Boya [1 ,2 ]
Niyato, Dusit [3 ]
Song, Lingyang [1 ,2 ]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China
[2] Peking Univ, Sch Elect, Shenzhen, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
US National Science Foundation; National Research Foundation of Singapore; Beijing Natural Science Foundation;
Keywords
Integrated sensing and communication; Multimodal sensors; Robot sensing systems; Training; Feature extraction; Drones; Cameras; Tuning; Laser radar; Large language models;
DOI
10.1109/MCOM.004.2400281
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
Integrated sensing and communication (ISAC) is considered a key candidate technology for sixth-generation (6G) wireless networks. Notably, integrating multimodal sensing information within ISAC systems promises to improve communication performance. Nevertheless, traditional methods for ISAC systems are typically designed to handle unimodal data, making it challenging to effectively process and integrate semantically complex multimodal information. Moreover, they are usually customized for specific data types or tasks, leading to poor generalization ability. Multimodal large language models (MLLMs), which are trained on massive multimodal datasets and possess large parameter scales, are expected to be powerful tools for addressing these issues. In this article, we first introduce an MLLM-enabled ISAC system that achieves enhanced communication and sensing performance. We begin by introducing the fundamental principles of ISAC and MLLMs. We then present the overall system and the corresponding opportunities it opens up. Furthermore, this article provides a case study demonstrating the superior performance of MLLMs on the beam prediction task within ISAC systems. Finally, we discuss several research challenges and potential directions for future research.
Pages: 8
Related Papers (50 total)
  • [41] Multimodal Large Language Models Driven Privacy-Preserving Wireless Semantic Communication in 6G
    Cao, Daipeng; Wu, Jun; Bashir, Ali Kashif
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024: 171-176
  • [42] Harnessing large language models for coding, teaching and inclusion to empower research in ecology and evolution
    Cooper, Natalie; Clark, Adam T.; Lecomte, Nicolas; Qiao, Huijie; Ellison, Aaron M.
    METHODS IN ECOLOGY AND EVOLUTION, 2024, 15 (10): 1757-1763
  • [43] EarthMarker: A Visual Prompting Multimodal Large Language Model for Remote Sensing
    Zhang, Wei; Cai, Miaoxin; Zhang, Tong; Zhuang, Yin; Li, Jun; Mao, Xuerui
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [44] Align is not Enough: Multimodal Universal Jailbreak Attack against Multimodal Large Language Models
    Wang, Youze; Hu, Wenbo; Dong, Yinpeng; Liu, Jing; Zhang, Hanwang; Hong, Richang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
  • [45] Integrated Sensing, Communication, and Computing for Cost-effective Multimodal Federated Perception
    Chen, Ning; Cheng, Zhipeng; Fan, Xuwei; Liu, Zhang; Huang, Bangzhen; Zhao, Yifeng; Huang, Lianfen; Du, Xiaojiang; Guizani, Mohsen
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (08)
  • [46] Towards Language-Driven Video Inpainting via Multimodal Large Language Models
    Wu, Jianzong; Li, Xiangtai; Si, Chenyang; Zhou, Shangchen; Yang, Jingkang; Zhang, Jiangning; Li, Yining; Chen, Kai; Tong, Yunhai; Liu, Ziwei; Loy, Chen Change
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 12501-12511
  • [47] A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment
    Wu, Tianhe; Ma, Kede; Liang, Jie; Yang, Yujiu; Zhang, Lei
    COMPUTER VISION - ECCV 2024, PT LXXIV, 2025, 15132: 143-160
  • [48] Incorporating Molecular Knowledge in Large Language Models via Multimodal Modeling
    Yang, Zekun; Lv, Kun; Shu, Jian; Li, Zheng; Xiao, Ping
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2025
  • [49] Chat with the Environment: Interactive Multimodal Perception Using Large Language Models
    Zhao, Xufeng; Li, Mengdi; Weber, Cornelius; Hafez, Muhammad Burhan; Wermter, Stefan
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS, 2023: 3590-3596
  • [50] Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
    Ma, Chuofan; Jiang, Yi; Wu, Jiannan; Yuan, Zehuan; Qi, Xiaojuan
    COMPUTER VISION - ECCV 2024, PT VI, 2025, 15064: 417-435