SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding

Cited by: 0

Authors:

Wen, Yihan [1,2]

Zhang, Zheng [3]

Sun, Jiayi [1]

Li, Jinglei [4]

Chen, Chung Shue [5]

Niu, Guanchong [1]

Affiliations:

[1] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510000, Peoples R China

[2] Hong Kong Polytech Univ, Dept Land Surveying & Geoinformat, Hong Kong, Peoples R China

[3] Dalian Univ Technol, Sch Software, Dalian 116024, Peoples R China

[4] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China

[5] Nokia Bell Labs, Dept Machine Learning & Syst, F-91300 Massy, France

Source:

IEEE INTERNET OF THINGS JOURNAL | 2025 / Vol. 12 / No. 05

Funding:

National Natural Science Foundation of China;

Keywords:

Computer vision; network adaptability; scalable video coding (SVC); service-aware WebRTC (SAW); video streaming; RECURRENT NEURAL-NETWORKS; IMAGE; PERFORMANCE; COMPRESSION; IMPACT;

DOI:

10.1109/JIOT.2024.3486725

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

As video transmission systems expand into various complex scenarios, real-time video coding methods are essential for maintaining low latency and high perceptual quality across varying network conditions. In this work, we propose service-aware Web real-time communication (WebRTC), a semantic-assisted WebRTC system built on scalable video coding (SVC). Specifically, the system is structured in three layers: 1) L-1 extracts and down-samples semantic information at the encoder and employs a novel super-resolution (SR) method named BUS-DDIM at the decoder to enhance transmission efficiency and the machine vision recognition rate; 2) L-2 adaptively compresses high-quality video by discarding frames with little motion at the encoder to address latency under poor network conditions, and uses an interpolation model, the adjacent frame-guided denoised diffusion implicit model, to restore the video; and 3) L-3 transmits high-quality video for users with high-definition video requirements and favorable network conditions. Together, these layers dynamically enhance the visual experience and ensure low latency across various network environments. Experiments on diverse videos validate the effectiveness of the proposed framework. The performance evaluation under real-time scenarios indicates significant improvements in video quality and transmission efficiency, demonstrating compatibility and versatility across various applications.
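
The frame-discarding step that the abstract attributes to L-2 can be illustrated with a minimal sketch. This record does not state how motion is measured or thresholded, so the mean absolute pixel difference, the names `mean_abs_diff` and `select_frames`, and the `motion_threshold` parameter are all assumptions made for illustration and are not the authors' method. The diffusion-based restoration at the decoder is not sketched here.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image given as rows of pixel intensities.
Frame = Sequence[Sequence[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def select_frames(frames: Sequence[Frame], motion_threshold: float) -> List[int]:
    """Return indices of frames to keep.

    A frame is dropped when it differs from the last kept frame by less
    than motion_threshold (a hypothetical stand-in for "little motion").
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the reference
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[i]) >= motion_threshold:
            kept.append(i)
    return kept
```

In this sketch the decoder would receive only the kept indices and frames; how the dropped frames are interpolated back is left to the restoration model described in the abstract.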


Pages: 5346-5359

Number of pages: 14

