Fcd-cnn: FPGA-based CU depth decision for HEVC intra encoder using CNN

Cited by: 2

Authors:

Dehnavi, Hossein [1]

Dehnavi, Mohammad [1]

Klidbary, Sajad Haghzad [2]

Affiliations:

[1] Kermanshah Univ Technol, Energy Fac, Dept Elect Engn, Kermanshah, Iran

[2] Univ Zanjan, Dept Elect & Comp Engn, Zanjan, Iran

Source:

JOURNAL OF REAL-TIME IMAGE PROCESSING | 2024 / Volume 21 / Issue 04

Keywords:

FPGA; Video compression; Hardware architecture; HEVC;

DOI:

10.1007/s11554-024-01487-9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video compression for storage and transmission has long been a focal point for researchers in image processing, whose efforts aim to reduce the data volume required to represent video while maintaining its quality. HEVC is an efficient video compression standard that has received special attention because of the increasing demand for high-resolution video. The main step in video compression involves dividing coding unit (CU) blocks into smaller blocks that have a uniform texture. In traditional methods, the Discrete Cosine Transform (DCT) is applied, followed by rate-distortion optimization (RDO) to decide on partitioning. This paper presents a novel convolutional neural network (CNN) and its hardware implementation as an alternative to DCT, aimed at speeding up partitioning and reducing the hardware resources required. The proposed hardware uses an efficient, lightweight CNN to partition CUs with low hardware resources in real-time applications. The CNN is trained for different Quantization Parameters (QPs) and block sizes to prevent overfitting. Furthermore, the system's input size is fixed at 16×16, and other input sizes are scaled to this dimension. Loop unrolling, data reuse, and resource sharing are applied in the hardware implementation to save resources. The hardware architecture is fixed for all block sizes and QPs; only the coefficients of the CNN are changed.
In terms of compression quality, the proposed hardware achieves a 4.42% BD-BR and -0.19 BD-PSNR compared to HM16.5. The proposed system can process a 64×64 CU in 4914 clock cycles at 150 MHz. The hardware resources used by the proposed system include 13,141 LUTs, 15,885 flip-flops, 51 BRAMs, and 74 DSPs.
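
The timing figures quoted in the abstract (4914 clock cycles per 64×64 CU at 150 MHz) can be turned into a rough latency and frame-rate estimate. The sketch below is only a back-of-the-envelope calculation: the function names are illustrative, the 1920×1080 frame size is an assumed example that does not come from the abstract, and it assumes 64×64 blocks are processed strictly one after another with no overlap or other overhead, which the abstract does not state.

```python
import math

CLOCK_HZ = 150e6       # clock frequency reported in the abstract
CYCLES_PER_CU64 = 4914 # clock cycles per 64x64 CU reported in the abstract


def cu64_latency_s(cycles=CYCLES_PER_CU64, clock_hz=CLOCK_HZ):
    """Time to process one 64x64 CU, in seconds."""
    return cycles / clock_hz


def cu64_blocks_per_frame(width, height, block=64):
    """Number of 64x64 blocks covering a frame (partial blocks rounded up)."""
    return math.ceil(width / block) * math.ceil(height / block)


def estimated_fps(width, height):
    """Upper-bound frame rate, ASSUMING strictly sequential block processing."""
    return 1.0 / (cu64_latency_s() * cu64_blocks_per_frame(width, height))


if __name__ == "__main__":
    # 1920x1080 is an assumed example resolution, not a figure from the paper.
    print(f"latency per 64x64 CU: {cu64_latency_s() * 1e6:.2f} us")
    print(f"blocks per 1080p frame: {cu64_blocks_per_frame(1920, 1080)}")
    print(f"estimated frame rate: {estimated_fps(1920, 1080):.1f} fps")
```

Under these assumptions, one 64×64 CU takes about 32.76 µs, so the estimate reflects only the CU depth decision stage and not the throughput of a complete encoder.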


Pages: 10

50 items in total

[21] Optimizing FPGA-Based CNN Accelerator Using Differentiable Neural Architecture Search
Fan, Hongxiang
Ferianc, Martin
Liu, Shuanglong
Que, Zhiqiang
Niu, Xinyu
Luk, Wayne
2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, : 465 - 468
[22] Adaptive Keypoint-based CU Depth Decision for HEVC Intra Coding
Kim, Namuk
Jeon, Seungsu
Shim, Hiuk Jae
Jeon, Byeungwoo
Lim, Sung-Chang
Ko, Hyunsuk
2016 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2016,
[23] Fast Depth Intra Coding based on Layer-classification and CNN for 3D-HEVC
Liu, Chang
Jia, Kebin
Liu, Pengyu
Sun, Zhonghua
2020 DATA COMPRESSION CONFERENCE (DCC 2020), 2020, : 381 - 381
[24] SVG-CNN: A shallow CNN based on VGGNet applied to intra prediction partition block in HEVC
Linck, Iris
Gomez, Arthur Torgo
Alaghband, Gita
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (30) : 73983 - 74001
[25] An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution
Liu, Bing
Zou, Danyin
Feng, Lei
Feng, Shou
Fu, Ping
Li, Junbao
ELECTRONICS, 2019, 8 (03)
[26] A Collaborative Framework for FPGA-based CNN Design Modeling and Optimization
Mu, Jiandong
Zhang, Wei
Liang, Hao
Sinha, Sharad
2018 28TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2018, : 139 - 146
[27] Real Time FPGA-Based CNN Training and Recognition of Signals
Groom, Tyler
George, Kiran
2022 IEEE WORLD AI IOT CONGRESS (AIIOT), 2022, : 22 - 26
[28] CU Partition Mode Decision for HEVC Hardwired Intra Encoder Using Convolution Neural Network
Liu, Zhenyu
Yu, Xianyu
Gao, Yuan
Chen, Shaolin
Ji, Xiangyang
Wang, Dongsheng
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (11) : 5088 - 5103
[29] Deep CNN Co-design for HEVC CU Partition Prediction on FPGA–SoC
Soulef Bouaafia
Randa Khemiri
Seifeddine Messaoud
Fatma Ezahra Sayadi
Neural Processing Letters, 2022, 54 : 3283 - 3301
[30] A Cost-Efficient FPGA-Based CNN-Transformer Using Neural ODE
Okubo, Ikumi
Sugiura, Keisuke
Matsutani, Hiroki
IEEE ACCESS, 2024, 12 : 155773 - 155788
