Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces

Cited by: 10

Authors:

Egilmez, Hilmi E. [1]

Singh, Ankitesh K. [1]

Coban, Muhammed [1]

Karczewicz, Marta [1]

Zhu, Yinhao [1]

Yang, Yang [1]

Said, Amir [1]

Cohen, Taco S. [2]

Affiliations:

[1] Qualcomm Technol Inc, San Diego, CA 92121 USA

[2] Qualcomm Technol Netherlands BV, NL-1098 XH Amsterdam, Netherlands

Source:

IEEE OPEN JOURNAL OF SIGNAL PROCESSING | 2021 / Vol. 2

Keywords:

Transforms; Encoding; Image coding; Standards; Image color analysis; Video coding; Quantization (signal); Deep learning; neural networks; transform network; data compression; image coding; video coding; color spaces; YUV; RGB;

DOI:

10.1109/OJSP.2021.3092257

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Most of the existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for non-subsampled RGB color format. However, in order to achieve a superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for YUV 4:2:0 format, where U and V components are subsampled by considering the human visual system. This paper investigates various DLEC designs to support YUV 4:2:0 format by comparing their performance against the main profiles of HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for RGB format and achieves about 10% average BD-rate improvement over the intra-frame coding in HEVC.
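As a rough illustration of the YUV 4:2:0 format mentioned in the abstract, the sketch below halves a chroma plane in both dimensions and counts the samples in one frame. The function names `subsample_420` and `sample_count_420` are hypothetical, and the 2x2 averaging filter is an assumption made for illustration: the abstract does not say which downsampling filter or chroma siting is used, and real pipelines often use other filters.

```python
def subsample_420(plane):
    """Downsample one chroma plane (U or V) by 2 in each dimension.

    `plane` is a list of rows with even height and width. Each output
    sample is the mean of a 2x2 block (an assumed filter, see lead-in).
    """
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("this sketch assumes even height and width")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def sample_count_420(height, width):
    """Samples in one 4:2:0 frame: full-resolution Y plus two
    chroma planes at half resolution in each dimension."""
    return height * width + 2 * (height // 2) * (width // 2)


if __name__ == "__main__":
    u = [[1, 3],
         [5, 7]]
    print(subsample_420(u))        # one averaged chroma sample per 2x2 block
    print(sample_count_420(4, 4))  # compare with 3 * 4 * 4 for 4:4:4 or RGB
```

Under these assumptions a 4:2:0 frame carries half as many samples as a non-subsampled three-channel frame of the same size, which is why the luma and chroma planes have different resolutions at the codec input.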


Pages: 441-452

Page count: 12
