Large Language Models Empower Multimodal Integrated Sensing and Communication

Cited by: 0
Authors
Cheng, Lu [1 ]
Zhang, Hongliang [2 ]
Di, Boya [1 ,2 ]
Niyato, Dusit [3 ]
Song, Lingyang [1 ,2 ]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China
[2] Peking Univ, Sch Elect, Shenzhen, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
National Research Foundation, Singapore; U.S. National Science Foundation; Beijing Natural Science Foundation;
Keywords
Integrated sensing and communication; Multimodal sensors; Robot sensing systems; Training; Feature extraction; Drones; Cameras; Tuning; Laser radar; Large language models;
DOI
10.1109/MCOM.004.2400281
Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline codes
0808; 0809;
Abstract
Integrated sensing and communication (ISAC) is considered a key candidate technology for sixth-generation (6G) wireless networks. Notably, integrating multimodal sensing information within ISAC systems promises improved communication performance. Nevertheless, traditional methods for ISAC systems are typically designed to handle unimodal data, making it challenging to effectively process and integrate semantically complex multimodal information. Moreover, they are usually customized for specific data types or tasks, leading to poor generalization ability. Multimodal large language models (MLLMs), which are trained on massive multimodal datasets and possess large parameter scales, are expected to be powerful tools for addressing these issues. In this article, we introduce an MLLM-enabled ISAC system that achieves enhanced communication and sensing performance. We begin with the fundamental principles of ISAC and MLLMs, then present the overall system and the opportunities it enables. Furthermore, the article provides a case study demonstrating the superior performance of MLLMs on the beam prediction task within ISAC systems. Finally, we discuss several research challenges and potential directions for future research.
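To make the beam prediction task mentioned in the case study concrete, the sketch below shows the conventional codebook-based formulation that an MLLM-based predictor would compete against: given a user direction estimated from multimodal sensors (e.g., a camera or LiDAR fix on a drone), select the DFT codebook beam with the highest array gain toward that direction. All parameters here (antenna count, codebook size, half-wavelength spacing) are illustrative assumptions, not values taken from the article.

```python
import math


def dft_codebook(num_antennas, num_beams):
    """DFT beamforming codebook: beam b has spatial frequency b/num_beams."""
    return [[complex(math.cos(2 * math.pi * a * b / num_beams),
                     math.sin(2 * math.pi * a * b / num_beams)) / math.sqrt(num_antennas)
             for a in range(num_antennas)]
            for b in range(num_beams)]


def steering_vector(num_antennas, angle_rad, spacing=0.5):
    """Channel direction for a half-wavelength uniform linear array."""
    return [complex(math.cos(2 * math.pi * spacing * a * math.sin(angle_rad)),
                    math.sin(2 * math.pi * spacing * a * math.sin(angle_rad)))
            for a in range(num_antennas)]


def predict_beam(angle_rad, num_antennas=16, num_beams=16):
    """Pick the codebook beam with the largest gain toward the sensed direction."""
    h = steering_vector(num_antennas, angle_rad)
    codebook = dft_codebook(num_antennas, num_beams)
    gains = [abs(sum(w.conjugate() * x for w, x in zip(beam, h)))
             for beam in codebook]
    return max(range(num_beams), key=lambda b: gains[b])
```

An MLLM-enabled predictor, as described in the article, would replace the explicit angle estimate with features extracted directly from raw multimodal inputs (images, point clouds, channel measurements) and output the beam index end to end; the exhaustive codebook search above serves as the unimodal baseline.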
Pages: 8