Agent-Driven Generative Semantic Communication With Cross-Modality and Prediction

Citations: 0
Authors
Yang, Wanting [1 ]
Xiong, Zehui [1 ]
Yuan, Yanli [2 ]
Jiang, Wenchao [1 ]
Quek, Tony Q. S. [1 ]
Debbah, Merouane [3 ,4 ]
Affiliations
[1] Singapore Univ Technol & Design, Pillar Informat Syst Technol & Design, Singapore 487372, Singapore
[2] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
[3] Khalifa Univ Sci & Technol, KU 6G Res Ctr, Abu Dhabi, U Arab Emirates
[4] Univ Paris Saclay, CentraleSupelec, F-91192 Gif Sur Yvette, France
Funding
National Research Foundation, Singapore
Keywords
Semantics; Decoding; Surveillance; 6G mobile communication; Wireless communication; Semantic communication; Real-time systems; Layout; Training; Symbols; video streaming; diffusion model; deep reinforcement learning; semantic sampling
DOI
10.1109/TWC.2024.3519325
Chinese Library Classification
TM (Electrical engineering); TN (Electronics and communication technology)
Subject Classification Codes
0808; 0809
Abstract
In the era of 6G, with compelling visions of intelligent transportation systems and digital twins, remote surveillance is poised to become a ubiquitous practice. Substantial data volumes and frequent updates present challenges in wireless networks. To address these challenges, we propose a novel agent-driven generative semantic communication (A-GSC) framework based on reinforcement learning. In contrast to existing research on semantic communication (SemCom), which mainly focuses on either semantic extraction or semantic sampling, we seamlessly integrate both by jointly considering the intrinsic attributes of the source information and the contextual information regarding the task. Notably, the introduction of generative artificial intelligence (GAI) enables the independent design of semantic encoders and decoders. In this work, we develop an agent-assisted semantic encoder with cross-modality capability, which tracks semantic changes and channel conditions to perform adaptive semantic extraction and sampling. Accordingly, we design a semantic decoder with both predictive and generative capabilities, consisting of two tailored modules. Moreover, the effectiveness of the designed models has been verified on the UA-DETRAC dataset, demonstrating the performance gains of the overall A-GSC framework in both energy saving and reconstruction accuracy.
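The abstract describes an agent that adaptively decides how much semantic content to transmit based on semantic changes and channel conditions. The following is a minimal illustrative sketch of that idea only, not the paper's method: a tabular one-step Q-learner over a hypothetical discretized state and action space. All names, rewards, and costs here are invented for illustration.

```python
import random

# Toy stand-in for the A-GSC encoder agent: choose a semantic sampling action
# from a (semantic-change, channel-quality) state. State space, actions, and
# the reward model below are hypothetical, chosen only to illustrate the
# energy-vs-accuracy trade-off the abstract mentions.

ACTIONS = ["skip", "sample_keyframe", "send_full_semantics"]
STATES = [(change, channel) for change in ("low", "high")
          for channel in ("poor", "good")]

def reward(state, action):
    # Hypothetical reward: reconstruction accuracy minus an energy cost
    # that doubles when transmitting over a poor channel.
    change, channel = state
    cost = {"skip": 0.0, "sample_keyframe": 0.3,
            "send_full_semantics": 1.0}[action]
    if channel == "poor":
        cost *= 2.0  # retransmissions make sending costlier
    accuracy = {"skip": 1.0 if change == "low" else 0.0,
                "sample_keyframe": 0.8,
                "send_full_semantics": 1.0}[action]
    return accuracy - 0.4 * cost

def train(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)
        if rng.random() < eps:  # epsilon-greedy exploration
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])  # bandit-style update
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda act: q_table[(s, act)]) for s in STATES}
```

Under this toy reward, the learned policy skips transmission when little has changed and falls back to keyframe sampling when the scene changes, which is the qualitative behavior the abstract attributes to the agent.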
Pages: 2233-2248 (16 pages)
Related Papers (50 total)
  • [31] Chen, Qingshan; Zhang, Moyan; Quan, Zhenzhen; Zhang, Yumeng; Mozerov, Mikhail G.; Zhai, Chao; Li, Hongjuan; Li, Yujun. MSSA: Multispectral Semantic Alignment for Cross-Modality Infrared-RGB Person Reidentification. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024.
  • [32] Yang, Junlin; Li, Xiaoxiao; Pak, Daniel; Dvornek, Nicha C.; Chapiro, Julius; Lin, MingDe; Duncan, James S. Cross-Modality Segmentation by Self-supervised Semantic Alignment in Disentangled Content Space. DOMAIN ADAPTATION AND REPRESENTATION TRANSFER, AND DISTRIBUTED AND COLLABORATIVE LEARNING, DART 2020, DCL 2020, 2020, 12444: 52-61.
  • [33] Cen, Jun; Zhang, Shiwei; Pei, Yixuan; Li, Kun; Zheng, Hang; Luo, Maochun; Zhang, Yingya; Chen, Qifeng. CMDFusion: Bidirectional Fusion Network With Cross-Modality Knowledge Distillation for LiDAR Semantic Segmentation. IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (01): 771-778.
  • [34] Zhang, Litian; Zhang, Xiaoming; Guo, Ziming; Liu, Zhipeng. CISum: Learning Cross-modality Interaction to Enhance Multimodal Semantic Coverage for Multimodal Summarization. PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023: 370-378.
  • [35] Xu, Nan; Zeng, Zhixiong; Mao, Wenji. Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020: 3777-3786.
  • [36] Zhang, Zihao; Yang, Yale; Hou, Huifang; Meng, Fanman; Zhang, Fan; Xie, Kangzhan; Zhuang, Chunsheng. CCANet: Cross-Modality Comprehensive Feature Aggregation Network for Indoor Scene Semantic Segmentation. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2025, 17 (02): 366-378.
  • [37] Huang, Lian; Peng, Zongju; Chen, Fen; Dai, Shaosheng; He, Ziqiang; Liu, Kesheng. Cross-modality interaction for few-shot multispectral object detection with semantic knowledge. NEURAL NETWORKS, 2024, 173.
  • [38] Vazquez-Garcia, Cristobal; Martinez-Murcia, F. J.; Arco, Juan E.; Illan, Ignacio A.; Jimenez-Mesa, Carmen; Ramirez, Javier; Gorriz, Juan M. A Cross-Modality Latent Representation for the Prediction of Clinical Symptomatology in Parkinson's Disease. ARTIFICIAL INTELLIGENCE FOR NEUROSCIENCE AND EMOTIONAL SYSTEMS, PT I, IWINAC 2024, 2024, 14674: 78-87.
  • [39] Li, Chaobo; Li, Hongjun; Zhang, Guoan. Cross-modality integration framework with prediction, perception and discrimination for video anomaly detection. NEURAL NETWORKS, 2024, 172.
  • [40] Lv, Ya; Liu, Jin; Tian, Xu; Yang, Pei; Pan, Yi. CFINet: Cross-Modality MRI Feature Interaction Network for Pseudoprogression Prediction of Glioblastoma. JOURNAL OF COMPUTATIONAL BIOLOGY, 2025, 32 (02): 212-224.