Transform Network Architectures for Deep Learning Based End-to-End Image/Video Coding in Subsampled Color Spaces

Cited by: 10
Authors
Egilmez, Hilmi E. [1]
Singh, Ankitesh K. [1]
Coban, Muhammed [1]
Karczewicz, Marta [1]
Zhu, Yinhao [1]
Yang, Yang [1]
Said, Amir [1]
Cohen, Taco S. [2]
Affiliations
[1] Qualcomm Technol Inc, San Diego, CA 92121 USA
[2] Qualcomm Technol Netherlands BV, NL-1098 XH Amsterdam, Netherlands
Keywords
Transforms; Encoding; Image coding; Standards; Image color analysis; Video coding; Quantization (signal); Deep learning; neural networks; transform network; data compression; image coding; video coding; color spaces; YUV; RGB
DOI
10.1109/OJSP.2021.3092257
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Most existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for the YUV 4:2:0 format, where the U and V components are subsampled to take advantage of the characteristics of the human visual system. This paper investigates various DLEC designs that support the YUV 4:2:0 format by comparing their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for the RGB format and achieves about 10% average BD-rate improvement over intra-frame coding in HEVC.
Pages: 441-452
Page count: 12