SAW: Semantic-Aware WebRTC Transmission Using Diffusion-Based Scalable Video Coding

Cited by: 0

Authors:

Wen, Yihan ^{[1,2]}

Zhang, Zheng ^{[3]}

Sun, Jiayi ^{[1]}

Li, Jinglei ^{[4]}

Chen, Chung Shue ^{[5]}

Niu, Guanchong ^{[1]}

Affiliations:

[1] Xidian Univ, Guangzhou Inst Technol, Guangzhou 510000, Peoples R China

[2] Hong Kong Polytech Univ, Dept Land Surveying & Geoinformat, Hong Kong, Peoples R China

[3] Dalian Univ Technol, Sch Software, Dalian 116024, Peoples R China

[4] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China

[5] Nokia Bell Labs, Dept Machine Learning & Syst, F-91300 Massy, France

Source:

IEEE INTERNET OF THINGS JOURNAL | 2025, Vol. 12, No. 5

Funding:

National Natural Science Foundation of China;

Keywords:

Computer vision; network adaptability; scalable video coding (SVC); service-aware WebRTC (SAW); video streaming; RECURRENT NEURAL-NETWORKS; IMAGE; PERFORMANCE; COMPRESSION; IMPACT;

DOI:

10.1109/JIOT.2024.3486725

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

As video transmission systems expand into increasingly complex scenarios, real-time video coding methods are essential for maintaining low latency and high perceptual quality across varying network conditions. In this work, we propose service-aware Web real-time communication (WebRTC), abbreviated SAW, a semantic-assisted WebRTC system built on scalable video coding (SVC). The system is structured in three layers: 1) L-1 extracts and down-samples semantic information at the encoder and employs a novel super-resolution (SR) method named BUS-DDIM at the decoder to improve transmission efficiency and the machine vision recognition rate; 2) L-2 adaptively compresses high-quality video by discarding frames with little motion at the encoder, which addresses latency under poor network conditions, and restores the video at the receiving side with an interpolation model, the adjacent frame-guided denoised diffusion implicit model; and 3) L-3 transmits high-quality video for users with high-definition requirements and favorable network conditions. Together, these layers dynamically enhance the visual experience and keep latency low across different network environments. Experiments on diverse videos validate the effectiveness of the proposed framework. Performance evaluation under real-time scenarios indicates significant improvements in video quality and transmission efficiency, as well as compatibility and versatility across various applications.
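The abstract does not state how L-2 decides that a frame has "little motion". As a rough illustration only, the sketch below assumes a simple criterion: a frame is dropped when its mean absolute pixel difference from the last transmitted frame is at or below a threshold. The names `mean_abs_diff` and `select_frames_to_send`, the flattened-grayscale frame representation, and the threshold value are all hypothetical and are not taken from the paper; the diffusion-based restoration of dropped frames is not shown.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of grayscale pixel values.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equal-sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames_to_send(frames: List[Frame], threshold: float) -> List[int]:
    """Return the indices of frames that would be transmitted.

    The first frame is always kept. A later frame is kept only if it
    differs from the most recently kept frame by more than `threshold`
    (a hypothetical stand-in for the paper's motion criterion).
    """
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or mean_abs_diff(frames[kept[-1]], frame) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    video = [
        [0, 0, 0, 0],      # first frame, always sent
        [0, 0, 0, 1],      # nearly identical to frame 0
        [10, 10, 10, 10],  # large change
        [10, 10, 10, 11],  # nearly identical to frame 2
    ]
    print(select_frames_to_send(video, threshold=1.0))
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents slow drift from accumulating unnoticed across many dropped frames; whether the proposed system does the same is not stated in the abstract.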

Pages: 5346-5359

Number of pages: 14
