Fcd-cnn: FPGA-based CU depth decision for HEVC intra encoder using CNN

Cited by: 2
Authors
Dehnavi, Hossein [1 ]
Dehnavi, Mohammad [1 ]
Klidbary, Sajad Haghzad [2 ]
Affiliations
[1] Kermanshah Univ Technol, Energy Fac, Dept Elect Engn, Kermanshah, Iran
[2] Univ Zanjan, Dept Elect & Comp Engn, Zanjan, Iran
Keywords
FPGA; Video compression; Hardware architecture; HEVC;
DOI
10.1007/s11554-024-01487-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video compression for storage and transmission has always been a focal point for researchers in the field of image processing. Their efforts aim to reduce the data volume required for video representation while maintaining its quality. HEVC is one of the efficient standards for video compression, receiving special attention due to the increasing demand for high-resolution videos. The main step in video compression involves dividing the coding unit (CU) blocks into smaller blocks that have a uniform texture. In traditional methods, the Discrete Cosine Transform (DCT) is applied, followed by rate-distortion optimization (RDO) to decide on partitioning. This paper presents a novel convolutional neural network (CNN) and its hardware implementation as an alternative to DCT, aimed at speeding up partitioning and reducing the hardware resources required. The proposed hardware utilizes an efficient and lightweight CNN to partition CUs with low hardware resources in real-time applications. This CNN is trained for different Quantization Parameters (QPs) and block sizes to prevent overfitting. Furthermore, the system's input size is fixed at 16×16, and other input sizes are scaled to this dimension. Loop unrolling, data reuse, and resource sharing are applied in the hardware implementation to save resources. The hardware architecture is fixed for all block sizes and QPs, and only the coefficients of the CNN are changed.
In terms of compression quality, the proposed hardware achieves a 4.42% BD-BR and -0.19 BD-PSNR compared to HM16.5. The proposed system can process a 64×64 CU in 4914 clock cycles at 150 MHz. The hardware resources utilized by the proposed system include 13,141 LUTs, 15,885 flip-flops, 51 BRAMs, and 74 DSPs.
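As a rough illustration of what the reported timing implies, the sketch below converts 4914 cycles at 150 MHz into a per-CU latency (about 32.76 µs) and a frame-rate estimate. It is not taken from the paper: the function names are hypothetical, and it assumes that 64×64 CUs are processed strictly one after another with no overlap, that the cycle count covers the full depth decision for one 64×64 CU, and that a frame is tiled into 64×64 blocks with partial blocks at the borders counted as whole ones.

```python
import math

# Figures quoted in the abstract.
CLOCK_HZ = 150e6          # 150 MHz operating frequency
CYCLES_PER_CU64 = 4914    # clock cycles per 64x64 CU


def cu64_latency_seconds(cycles=CYCLES_PER_CU64, clock_hz=CLOCK_HZ):
    """Time to process one 64x64 CU, in seconds."""
    return cycles / clock_hz


def cu64_blocks_per_frame(width, height, block=64):
    """Number of 64x64 blocks needed to cover a frame (assumption: border
    blocks that are only partly inside the frame count as full blocks)."""
    return math.ceil(width / block) * math.ceil(height / block)


def estimated_fps(width, height):
    """Frame-rate estimate assuming strictly sequential CU processing
    (an assumption of this sketch, not a claim from the paper)."""
    frame_time = cu64_blocks_per_frame(width, height) * cu64_latency_seconds()
    return 1.0 / frame_time


if __name__ == "__main__":
    print(f"latency per 64x64 CU: {cu64_latency_seconds() * 1e6:.2f} us")
    print(f"64x64 blocks in 1920x1080: {cu64_blocks_per_frame(1920, 1080)}")
    print(f"estimated 1080p frame rate: {estimated_fps(1920, 1080):.1f} fps")
```

Under these assumptions a 1920×1080 frame needs 510 blocks, which puts the CU depth decision alone at just under 60 frames per second; the rest of the encoder pipeline is not accounted for.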
Pages: 10
Related papers
50 in total
  • [21] Optimizing FPGA-Based CNN Accelerator Using Differentiable Neural Architecture Search
    Fan, Hongxiang
    Ferianc, Martin
    Liu, Shuanglong
    Que, Zhiqiang
    Niu, Xinyu
    Luk, Wayne
    2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, : 465 - 468
  • [22] Adaptive Keypoint-based CU Depth Decision for HEVC Intra Coding
    Kim, Namuk
    Jeon, Seungsu
    Shim, Hiuk Jae
    Jeon, Byeungwoo
    Lim, Sung-Chang
    Ko, Hyunsuk
    2016 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2016,
  • [23] Fast Depth Intra Coding based on Layer-classification and CNN for 3D-HEVC
    Liu, Chang
    Jia, Kebin
    Liu, Pengyu
    Sun, Zhonghua
    2020 DATA COMPRESSION CONFERENCE (DCC 2020), 2020, : 381 - 381
  • [24] SVG-CNN: A shallow CNN based on VGGNet applied to intra prediction partition block in HEVC
    Linck, Iris
    Gomez, Arthur Torgo
    Alaghband, Gita
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (30) : 73983 - 74001
  • [25] An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution
    Liu, Bing
    Zou, Danyin
    Feng, Lei
    Feng, Shou
    Fu, Ping
    Li, Junbao
    ELECTRONICS, 2019, 8 (03)
  • [26] A Collaborative Framework for FPGA-based CNN Design Modeling and Optimization
    Mu, Jiandong
    Zhang, Wei
    Liang, Hao
    Sinha, Sharad
    2018 28TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2018, : 139 - 146
  • [27] Real Time FPGA-Based CNN Training and Recognition of Signals
    Groom, Tyler
    George, Kiran
    2022 IEEE WORLD AI IOT CONGRESS (AIIOT), 2022, : 22 - 26
  • [28] CU Partition Mode Decision for HEVC Hardwired Intra Encoder Using Convolution Neural Network
    Liu, Zhenyu
    Yu, Xianyu
    Gao, Yuan
    Chen, Shaolin
    Ji, Xiangyang
    Wang, Dongsheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (11) : 5088 - 5103
  • [29] Deep CNN Co-design for HEVC CU Partition Prediction on FPGA–SoC
    Bouaafia, Soulef
    Khemiri, Randa
    Messaoud, Seifeddine
    Sayadi, Fatma Ezahra
    NEURAL PROCESSING LETTERS, 2022, 54 : 3283 - 3301
  • [30] A Cost-Efficient FPGA-Based CNN-Transformer Using Neural ODE
    Okubo, Ikumi
    Sugiura, Keisuke
    Matsutani, Hiroki
    IEEE ACCESS, 2024, 12 : 155773 - 155788