SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding

Cited by: 0
Authors
Wen, Yihan [1, 2]
Zhang, Zheng [3]
Sun, Jiayi [1]
Li, Jinglei [4]
Chen, Chung Shue [5]
Niu, Guanchong [1]
Affiliations
[1] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510000, Peoples R China
[2] Hong Kong Polytech Univ, Dept Land Surveying & Geoinformat, Hong Kong, Peoples R China
[3] Dalian Univ Technol, Sch Software, Dalian 116024, Peoples R China
[4] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[5] Nokia Bell Labs, Dept Machine Learning & Syst, F-91300 Massy, France
Source
IEEE Internet of Things Journal | 2025, Vol. 12, No. 5
Funding
National Natural Science Foundation of China
Keywords
Computer vision; network adaptability; scalable video coding (SVC); service-aware WebRTC (SAW); video streaming; recurrent neural networks; image; performance; compression; impact
DOI
10.1109/JIOT.2024.3486725
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As video transmission systems expand into increasingly complex scenarios, real-time video coding methods are essential for maintaining low latency and high perceptual quality across varying network conditions. In this work, we propose service-aware Web real-time communication (WebRTC), a semantic-assisted WebRTC system built on scalable video coding (SVC). The system is structured in three layers: 1) L-1 extracts and down-samples semantic information at the encoder and employs a novel super-resolution (SR) method named BUS-DDIM at the decoder to improve transmission efficiency and the machine vision recognition rate; 2) L-2 adaptively compresses high-quality video by discarding frames with little motion at the encoder to address latency under poor network conditions, and uses an interpolation model, the adjacent frame-guided denoised diffusion implicit model, to restore the video; and 3) L-3 transmits high-quality video for users with high-definition video requirements and favorable network conditions. Together, these layers dynamically enhance the visual experience and keep latency low across various network environments. Experiments on diverse videos validate the effectiveness of the proposed framework. Performance evaluation under real-time scenarios indicates significant improvements in video quality and transmission efficiency and shows compatibility and versatility across various applications.
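The encoder-side step of L-2 (discarding frames with little motion) can be illustrated with a minimal sketch. The abstract does not state how motion is measured or which threshold is used, so the mean absolute pixel difference, the `motion_threshold` parameter, and the function names below are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence

# Hypothetical representation: one frame as a flat sequence of pixel intensities.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keep_indices(frames: List[Frame], motion_threshold: float) -> List[int]:
    """Return the indices of frames to transmit.

    Assumed rule (not taken from the paper): a frame is kept when it differs
    from the last kept frame by more than motion_threshold. The first frame
    is always kept so the receiver has a reference to interpolate from.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > motion_threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    demo = [
        [0, 0, 0, 0],
        [0, 0, 0, 1],      # nearly identical to frame 0
        [10, 10, 10, 10],  # large change
        [10, 10, 10, 11],  # nearly identical to frame 2
        [50, 50, 50, 50],  # large change
    ]
    print(select_keep_indices(demo, motion_threshold=1.0))
```

Comparing each frame against the last kept frame, rather than its immediate predecessor, stops slow drift from accumulating unnoticed. In the described system the dropped frames would then be restored at the receiver by the interpolation model, which this sketch does not attempt to reproduce.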
Pages: 5346-5359
Number of pages: 14