Agent-Driven Generative Semantic Communication With Cross-Modality and Prediction

Times Cited: 0
|
Authors
Yang, Wanting [1 ]
Xiong, Zehui [1 ]
Yuan, Yanli [2 ]
Jiang, Wenchao [1 ]
Quek, Tony Q. S. [1 ]
Debbah, Merouane [3 ,4 ]
Affiliations
[1] Singapore Univ Technol & Design, Pillar Informat Syst Technol & Design, Singapore 487372, Singapore
[2] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
[3] Khalifa Univ Sci & Technol, KU 6G Res Ctr, Abu Dhabi, U Arab Emirates
[4] Univ Paris Saclay, CentraleSupelec, F-91192 Gif Sur Yvette, France
Funding
National Research Foundation of Singapore;
Keywords
Semantics; Decoding; Surveillance; 6G mobile communication; Wireless communication; Semantic communication; Real-time systems; Layout; Training; Symbols; video streaming; diffusion model; deep reinforcement learning; semantic sampling; DEEP; SYSTEMS;
DOI
10.1109/TWC.2024.3519325
CLC Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Code
0808; 0809;
Abstract
In the era of 6G, with compelling visions of intelligent transportation systems and digital twins, remote surveillance is poised to become a ubiquitous practice. Substantial data volumes and frequent updates present challenges in wireless networks. To address these challenges, we propose a novel agent-driven generative semantic communication (A-GSC) framework based on reinforcement learning. In contrast to the existing research on semantic communication (SemCom), which mainly focuses on either semantic extraction or semantic sampling, we seamlessly integrate both by jointly considering the intrinsic attributes of the source information and the contextual information regarding the task. Notably, the introduction of generative artificial intelligence (GAI) enables the independent design of semantic encoders and decoders. In this work, we develop an agent-assisted semantic encoder with cross-modality capability, which tracks semantic changes and channel conditions to perform adaptive semantic extraction and sampling. Accordingly, we design a semantic decoder with both predictive and generative capabilities, consisting of two tailored modules. Moreover, the effectiveness of the designed models has been verified on the UA-DETRAC dataset, demonstrating the performance gains of the overall A-GSC framework in both energy saving and reconstruction accuracy.
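The core idea the abstract describes, an agent that adapts its sampling rate to semantic changes and channel conditions to trade energy against reconstruction accuracy, can be illustrated with a toy sketch. Everything below (the state space, the action set of sampling intervals, the reward weights, and the epsilon-greedy learner) is an illustrative assumption for intuition only, not the paper's actual A-GSC design.

```python
import random

# Toy sketch of agent-driven adaptive semantic sampling: the agent observes a
# state (semantic-change level, channel quality) and picks a frame-sampling
# interval. Sparse sampling saves transmit energy; dense sampling reduces
# reconstruction distortion when the scene changes quickly. All values are
# illustrative assumptions, not the paper's reward or state design.

ACTIONS = [1, 2, 4, 8]  # candidate sampling intervals (frames between transmissions)
STATES = [(c, s) for c in ("low", "high") for s in ("bad", "good")]

def reward(state, interval):
    """Negative cost: energy spent transmitting plus distortion from skipped frames."""
    change, channel = state
    energy_cost = (1.0 / interval) * (2.0 if channel == "bad" else 1.0)
    distortion = interval * (1.0 if change == "high" else 0.2)
    return -(energy_cost + distortion)

def train(episodes=5000, eps=0.1, lr=0.5, seed=0):
    """Epsilon-greedy bandit-style learner over (state, action) values."""
    rng = random.Random(seed)
    q = {(st, a): 0.0 for st in STATES for a in ACTIONS}
    for _ in range(episodes):
        st = rng.choice(STATES)
        if rng.random() < eps:
            a = rng.choice(ACTIONS)               # explore
        else:
            a = max(ACTIONS, key=lambda x: q[(st, x)])  # exploit
        q[(st, a)] += lr * (reward(st, a) - q[(st, a)])
    return q

def policy(q, st):
    """Greedy sampling interval for a given state."""
    return max(ACTIONS, key=lambda a: q[(st, a)])

q = train()
# The learned policy samples densely under high semantic change and sparsely
# under low change, and backs off further when the channel is poor.
print(policy(q, ("high", "good")), policy(q, ("low", "good")), policy(q, ("low", "bad")))
```

The reward deliberately couples both signals the abstract mentions: a bad channel doubles the per-transmission energy term, so the agent widens its sampling interval when semantics are stable and the channel is poor.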
Pages: 2233-2248
Page count: 16
Related Papers
50 records in total
  • [41] Joint image and feature adaptative attention-aware networks for cross-modality semantic segmentation
    Zhong, Qihuang
    Zeng, Fanzhou
    Liao, Fei
    Liu, Juhua
    Du, Bo
    Shang, Jedi S.
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (05): 3665 - 3676
  • [42] Caption-Aware Medical VQA via Semantic Focusing and Progressive Cross-Modality Comprehension
    Cong, Fuze
    Xu, Shibiao
    Guo, Li
    Tian, Yinbing
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3569 - 3577
  • [44] Cross-Modality Semantic Consistency Learning for Visible-Infrared Person Re-Identification
    Liu, Min
    Zhang, Zhu
    Bian, Yuan
    Wang, Xueping
    Sun, Yeqing
    Zhang, Baida
    Wang, Yaonan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 568 - 580
  • [45] Bidirectional cross-modality unsupervised domain adaptation using generative adversarial networks for cardiac image segmentation
    Cui, Hengfei
    Chang, Yuwen
    Jiang, Lei
    Xia, Yong
    Zhang, Yanning
    COMPUTERS IN BIOLOGY AND MEDICINE, 2021, 136
  • [46] MCG-MNER: A Multi-Granularity Cross-Modality Generative Framework for Multimodal NER with Instruction
    Wu, Junjie
    Gong, Chen
    Cao, Ziqiang
    Fu, Guohong
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3209 - 3218
  • [47] Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing
    Tong, Haonan
    Li, Haopeng
    Du, Hongyang
    Yang, Zhaohui
    Yin, Changchuan
    Niyato, Dusit
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2025, 14 (01) : 93 - 97
  • [48] CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning
    Du, Bing-Xue
    Long, Yahui
    Li, Xiaoli
    Wu, Min
    Shi, Jian-Yu
    BIOINFORMATICS, 2023, 39 (08)
  • [49] Driver intention prediction based on multi-dimensional cross-modality information interaction
    Xue, Mengfan
    Xu, Zengkui
    Qiao, Shaohua
    Zheng, Jiannan
    Li, Tao
    Wang, Yuerong
    Peng, Dongliang
    MULTIMEDIA SYSTEMS, 2024, 30
  • [50] DEEP ACTIVE LEARNING FROM MULTISPECTRAL DATA THROUGH CROSS-MODALITY PREDICTION INCONSISTENCY
    Zhang, Heng
    Fromont, Elisa
    Lefevre, Sebastien
    Avignon, Bruno
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 449 - 453