SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding

Cited by: 0

Authors:

Wen, Yihan [1,2]

Zhang, Zheng [3]

Sun, Jiayi [1]

Li, Jinglei [4]

Chen, Chung Shue [5]

Niu, Guanchong [1]

Affiliations:

[1] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510000, Peoples R China

[2] Hong Kong Polytech Univ, Dept Land Surveying & Geoinformat, Hong Kong, Peoples R China

[3] Dalian Univ Technol, Sch Software, Dalian 116024, Peoples R China

[4] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China

[5] Nokia Bell Labs, Dept Machine Learning & Syst, F-91300 Massy, France

Source:

IEEE INTERNET OF THINGS JOURNAL | 2025 / Vol. 12 / Issue 05

Funding:

National Natural Science Foundation of China

Keywords:

Computer vision; network adaptability; scalable video coding (SVC); service-aware WebRTC (SAW); video streaming; recurrent neural networks; image; performance; compression; impact

DOI:

10.1109/JIOT.2024.3486725

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

As video transmission systems expand into various complex scenarios, real-time video coding methods are essential for maintaining low latency and high perceptual quality across varying network conditions. In this work, we propose service-aware Web real-time communication (WebRTC), a semantic-assisted WebRTC system built on scalable video coding (SVC). Specifically, this system is structured with three layers: 1) L-1 extracts and down-samples semantic information at the encoder and employs a novel super-resolution (SR) method named BUS-DDIM at the decoder, to enhance the transmission efficiency and machine vision recognition rate; 2) L-2 adaptively compresses high-quality video by discarding frames with little motion at the encoder to address latency issues under poor network conditions, and utilizes an interpolation model called the adjacent frame-guided denoised diffusion implicit model to restore the video; and 3) L-3 transmits high-quality video tailored for users with high-definition video requirements and favorable network conditions. These layers dynamically enhance the visual experience and ensure low latency across various network environments. Experiments are conducted on diverse videos to validate the effectiveness of the proposed framework. The performance evaluation under real-time scenarios indicates significant enhancements in video quality and transmission efficiency, showcasing compatibility and versatility across various applications.
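The abstract gives no implementation details, so the following is only a rough sketch of two ideas it mentions: dropping low-motion frames at the encoder (as in L-2) and choosing a layer from network conditions. Everything here is an assumption made for illustration, not the authors' method: the mean-absolute-difference motion measure, the function names `select_keyframes` and `choose_layer`, and the bandwidth thresholds are all hypothetical. The diffusion-based restoration at the decoder is not modeled.

```python
from typing import List, Sequence

# A frame is modeled as a flat sequence of grayscale pixel values (assumption).
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference, used here as a crude motion score."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], motion_threshold: float) -> List[int]:
    """Return indices of frames to transmit.

    A frame is kept when it differs enough from the last kept frame.
    The first and last frames are always kept so that a decoder-side
    interpolator would have both neighbours for every dropped frame.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) >= motion_threshold:
            kept.append(i)
    if kept[-1] != len(frames) - 1:
        kept.append(len(frames) - 1)
    return kept


def choose_layer(bandwidth_kbps: float,
                 low_kbps: float = 500.0,
                 high_kbps: float = 3000.0) -> str:
    """Map an estimated bandwidth to a layer name (thresholds are hypothetical)."""
    if bandwidth_kbps < low_kbps:
        return "L-1"  # semantic, down-sampled stream
    if bandwidth_kbps < high_kbps:
        return "L-2"  # frame-dropped stream, restored at the decoder
    return "L-3"      # full high-quality stream


if __name__ == "__main__":
    clip = [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0], [10.0, 10.0]]
    print(select_keyframes(clip, motion_threshold=5.0))
    print(choose_layer(1000.0))
```

In this toy clip, the second frame is identical to the first and is dropped, while the jump at the third frame exceeds the threshold and is kept.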


Pages: 5346-5359

Page count: 14

50 entries in total

[41] Streaming and congestion control using scalable video coding based on H.264/AVC
NGUYEN Dieu Thanh
OSTERMANN Joern
Journal of Zhejiang University Science A(Science in Engineering), 2006, (05) : 749 - 754
[42] VIDEO STREAMING USING STANDARD-COMPATIBLE SCALABLE MULTIPLE DESCRIPTION CODING BASED ON SVC
Zhao, Zhijie
Ostermann, Joern
2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010, : 1293 - 1296
[43] SCTS: Instance Segmentation of Single Cells Using a Transformer-Based Semantic-Aware Model and Space-Filling Augmentation
Zhou, Yating
Li, Wenjing
Yang, Ge
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 5933 - 5942
[44] Multi-Party WebRTC Services Using Delay and Bandwidth Aware SDN-Assisted IP Multicasting of Scalable Video Over 5G Networks
Kirmizioglu, Riza Arda
Tekalp, A. Murat
IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (04) : 1005 - 1015
[45] An Error Resilient Video Transmission in Ad Hoc Network Using Error Diffusion Block Truncation Coding
Kumar, S. Sasi
Kumar, K. Siva
Jalil, M. A.
Kavikumar, J.
Ray, K.
Nagarajan, D.
APPLIED INTELLIGENCE AND INFORMATICS, AII 2021, 2021, 1435 : 295 - 305
[46] Reliable rate-adaptive video transmission over cognitive cellular networks using multiple description scalable coding
Chaoub, Abdelaali
Ennaoui, Fatim Zahra
Ibn-Elhaj, Elhassane
TELECOMMUNICATION SYSTEMS, 2019, 71 (03) : 321 - 338
[47] Reliable rate-adaptive video transmission over cognitive cellular networks using multiple description scalable coding
Abdelaali Chaoub
Fatim Zahra Ennaoui
Elhassane Ibn-Elhaj
Telecommunication Systems, 2019, 71 : 321 - 338
[48] Semantic Communication Based Video Coding Using Temporal Prediction of Deep Neural Network Parameters
Samarathunga, Prabhath
Ganearachchi, Yasith
Fernando, Thanuj
Alahapperuma, Indika
Fernando, Anil
2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024, 2024, : 25 - 30
[49] 3D Point Cloud Attribute Compression Using Diffusion-Based Texture-Aware Intra Prediction
Shao, Yiting
Yang, Xiaodong
Gao, Wei
Liu, Shan
Li, Ge
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9633 - 9646
[50] Online Optimization for Adaptive Scalable Video Transmission in SDN using Kernel-based ADP
Chen, Xiang
Chen, Shuangwu
Ullah, Amin
Yu, Chaojie
Yang, Jian
2019 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS), 2019,
