Optimizing OpenCL-Based CNN Design on FPGA with Comprehensive Design Space Exploration and Collaborative Performance Modeling

Cited by: 9
Authors
Mu, Jiandong [1]
Zhang, Wei [1]
Liang, Hao [2]
Sinha, Sharad [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Indian Inst Technol IIT, Veling, Goa, India
Keywords
CNN; modeling; hardware design; design space exploration
DOI
10.1145/3397514
CLC number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent success in applying convolutional neural networks (CNNs) to object detection and classification has sparked great interest in accelerating CNNs using hardware such as field-programmable gate arrays (FPGAs). However, finding an efficient FPGA design for a given CNN model and FPGA board is not trivial, since it requires a strong background in hardware design and detailed knowledge of the target board. In this work, we address this problem through design space exploration with a collaborative framework. Our framework consists of three main parts: FPGA design generation, coarse-grained modeling, and fine-grained modeling. In FPGA design generation, we propose a novel data structure, LoopTree, to capture the details of the FPGA design for CNN applications without writing down the source code. Different LoopTrees, which represent different FPGA designs, are automatically generated in this process. A coarse-grained model evaluates LoopTrees at the operation level (e.g., add, mult) so that the most efficient LoopTrees can be selected. A fine-grained model, which is based on the source code, then refines the selected design in a cycle-accurate manner. A comprehensive set of OpenCL-based designs has been implemented on board to verify our framework. Average estimation errors of 8.87% and 4.8% have been observed for our coarse-grained model and fine-grained model, respectively. These are much lower than the error of the prevalent operation-statistics-based estimation, which is obtained according to a predefined formula for specific loop schedules.
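The abstract does not define LoopTree or the coarse-grained model, so the following Python sketch is only an illustration of the general idea: a loop nest stored as a tree, evaluated by counting operations and estimating cycles without generating source code. The names `LoopNode`, `total_ops`, `estimated_cycles`, and `build_nest`, the `unroll` field, and the cycle formula are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LoopNode:
    """One loop of a hypothetical loop-nest tree (not the paper's LoopTree)."""
    name: str
    trip_count: int
    unroll: int = 1  # assumed number of iterations executed in parallel
    ops: Dict[str, int] = field(default_factory=dict)  # ops per iteration
    children: List["LoopNode"] = field(default_factory=list)


def total_ops(node: LoopNode) -> Dict[str, int]:
    """Count the operations executed by the whole nest rooted at `node`."""
    totals = {op: n * node.trip_count for op, n in node.ops.items()}
    for child in node.children:
        for op, n in total_ops(child).items():
            totals[op] = totals.get(op, 0) + n * node.trip_count
    return totals


def estimated_cycles(node: LoopNode) -> int:
    """Crude estimate, assuming one cycle per group of `unroll` iterations
    of an innermost loop and sequential execution of child loops."""
    body = max(1, sum(estimated_cycles(c) for c in node.children))
    iterations = -(-node.trip_count // node.unroll)  # ceiling division
    return iterations * body


def build_nest(unroll_out: int, unroll_kernel: int) -> LoopNode:
    """A toy convolution nest: 64 output channels, 3 input channels, 3x3 kernel."""
    kernel = LoopNode("kernel", 9, unroll_kernel, ops={"mult": 1, "add": 1})
    in_ch = LoopNode("in_channel", 3, children=[kernel])
    return LoopNode("out_channel", 64, unroll_out, children=[in_ch])


# Compare a few candidate designs and keep the one with the fewest estimated cycles.
candidates = [build_nest(8, 1), build_nest(8, 9), build_nest(16, 9)]
best = min(candidates, key=estimated_cycles)
print(total_ops(best), estimated_cycles(best))
```

In this sketch every candidate performs the same number of operations, and only the assumed unroll factors change the cycle estimate. A real coarse-grained model would also have to account for resource limits and memory bandwidth, which this toy formula ignores.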
Pages: 28
Related papers
50 items in total
  • [41] ZIP-CNN: Design Space Exploration for CNN Implementation within a MCU
    Garbay, Thomas
    Hachicha, Khalil
    Dobias, Petr
    Pinna, Andrea
    Hocine, Karim
    Dron, Wilfried
    Lusich, Pedro
    Khalis, Imane
    Granado, Bertrand
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2025, 24 (01)
  • [42] Design Space Exploration of FPGA-Based System With Multiple DNN Accelerators
    Kedia, Rajesh
    Goel, Shikha
    Balakrishnan, M.
    Paul, Kolin
    Sen, Rijurekha
    IEEE EMBEDDED SYSTEMS LETTERS, 2021, 13 (03) : 114 - 117
  • [43] Design Space Exploration of FPGA-Based Deep Convolutional Neural Networks
    Motamedi, Mohammad
    Gysel, Philipp
    Akella, Venkatesh
    Ghiasi, Soheil
    2016 21ST ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2016 : 575 - 580
  • [44] Hierarchical Design Space Exploration for Distributed CNN Inference at the Edge
    Guo, Xiaotian
    Pimentel, Andy D.
    Stefanov, Todor
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I, 2023, 1752 : 545 - 556
  • [45] A Layer-based Structured Design of CNN on FPGA
    Huang, Chao
    Ni, Siyu
    Chen, Gengsheng
    2017 IEEE 12TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2017 : 1037 - 1040
  • [46] High Throughput CNN Accelerator Design Based on FPGA
    Xie, Liang
    Fan, Xitian
    Cao, Wei
    Wang, Lingli
    2018 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT 2018), 2018 : 277 - 280
  • [47] Parallel Design and Performance Optimization based on OpenCL Snort
    Xie, Hongying
    Xiang, Yangxia
    Chen, Caisen
    PROCEEDINGS OF THE 2017 2ND JOINT INTERNATIONAL INFORMATION TECHNOLOGY, MECHANICAL AND ELECTRONIC ENGINEERING CONFERENCE (JIMEC 2017), 2017, 62 : 644 - 647
  • [48] Roofline-Model-Based Design Space Exploration for Dataflow Techniques of CNN Accelerators
    Park, Chan
    Park, Sungkyung
    Park, Chester Sungchung
    IEEE ACCESS, 2020, 8 : 172509 - 172523
  • [49] ACDSE: A Design Space Exploration Method for CNN Accelerator based on Adaptive Compression Mechanism
    Feng, Kaijie
    Fan, Xiaoya
    An, Jianfeng
    Li, Chuxi
    Di, Kaiyue
    Li, Jiangfei
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (06)
  • [50] Comprehensive Design Space Exploration of Silicon Photonic Interconnects
    Bahadori, Meisam
    Rumley, Sebastien
    Nikolova, Dessislava
    Bergman, Keren
    JOURNAL OF LIGHTWAVE TECHNOLOGY, 2016, 34 (12) : 2975 - 2987