Large Language Models Empower Multimodal Integrated Sensing and Communication

Cited by: 0

Authors
Cheng, Lu [1 ]
Zhang, Hongliang [2 ]
Di, Boya [1 ,2 ]
Niyato, Dusit [3 ]
Song, Lingyang [1 ,2 ]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Shenzhen, Peoples R China
[2] Peking Univ, Sch Elect, Shenzhen, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
National Research Foundation of Singapore; US National Science Foundation; Beijing Natural Science Foundation;
Keywords
Integrated sensing and communication; Multimodal sensors; Robot sensing systems; Training; Feature extraction; Drones; Cameras; Tuning; Laser radar; Large language models;
DOI
10.1109/MCOM.004.2400281
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Code
0808; 0809;
Abstract
Integrated sensing and communication (ISAC) is considered a key candidate technology for sixth-generation (6G) wireless networks. Notably, integrating multimodal sensing information within ISAC systems promises improved communication performance. Nevertheless, traditional methods for ISAC systems are typically designed to handle unimodal data, making it difficult to effectively process and integrate semantically complex multimodal information. Moreover, they are usually customized for specific data types or tasks, leading to poor generalization. Multimodal large language models (MLLMs), which are trained on massive multimodal datasets and possess large parameter scales, are expected to be powerful tools for addressing these issues. In this article, we introduce an MLLM-enabled ISAC system that achieves enhanced communication and sensing performance. We begin with the fundamental principles of ISAC and MLLMs, then present the overall system and the opportunities it enables. Furthermore, this article provides a case study demonstrating the superior performance of MLLMs on the beam prediction task within ISAC systems. Finally, we discuss several research challenges and potential directions for future research.
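At a high level, the beam prediction task described in the abstract can be framed as classification: fused multimodal sensing features (e.g., camera and LiDAR) are mapped to an index in a beam codebook. The sketch below is a toy illustration only; the dimensions, function names, and the randomly initialized linear head (standing in for a fine-tuned MLLM's output layer) are assumptions, not details from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: per-modality feature dimensions and codebook size.
IMG_DIM, LIDAR_DIM, NUM_BEAMS = 16, 8, 4

def fuse(img_feat, lidar_feat):
    """Late fusion: concatenate per-modality feature vectors."""
    return np.concatenate([img_feat, lidar_feat])

def predict_beam(fused, W, b):
    """Classifier head over the beam codebook: argmax of the logits."""
    logits = W @ fused + b
    return int(np.argmax(logits))

# Randomly initialized head, standing in for a learned output layer.
W = rng.normal(size=(NUM_BEAMS, IMG_DIM + LIDAR_DIM))
b = np.zeros(NUM_BEAMS)

img = rng.normal(size=IMG_DIM)      # placeholder camera features
lidar = rng.normal(size=LIDAR_DIM)  # placeholder LiDAR features
beam = predict_beam(fuse(img, lidar), W, b)
print(beam)  # an index in [0, NUM_BEAMS)
```

Framing beam selection as codebook classification lets multimodal context narrow the beam search without exhaustive sweeping; in an MLLM-based system, the fused features and head would come from the pretrained model rather than random initialization.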
Pages: 8