Performance evaluation of convolutional neural network on Tianhe-3 prototype

Cited by: 0
Authors
Weiduo Chen
Xiaoshe Dong
Heng Chen
Qiang Wang
Xingda Yu
Xingjun Zhang
Affiliations
[1] Xi’an Jiaotong University, School of Computer Science and Technology
Source
Journal of Supercomputing, 2021, 77 (11)
Keywords
Tianhe-3 prototype; Convolutional neural network; Performance evaluation; Roofline model
DOI
Not available
Abstract
Exascale supercomputers will greatly help meet the growing computational demands of convolutional neural networks (CNNs). The prototype cluster of the Tianhe-3 supercomputer, which is based on the Chinese-made many-core processors Phytium-2000+ (FTP) and Matrix-2000+ (MTP), has now gone into service. We evaluated the training performance of CNNs on the Tianhe-3 prototype. To evaluate single-node performance, we tested image convolution and matrix multiplication on the FTP and MTP; to evaluate the scalability of distributed training on the prototype cluster, we tested the Allreduce operation. We also qualitatively analyzed the performance bottlenecks of CNNs on the FTP and MTP processors using the Roofline model and provided optimization suggestions for improving CNN performance on the Tianhe-3 prototype.
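The Roofline analysis mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard Roofline bound (attainable performance is the smaller of peak compute and memory bandwidth times arithmetic intensity). The function names and all numeric values are hypothetical placeholders chosen for illustration; they are not measurements of the FTP or MTP and are not taken from the paper.

```python
def roofline_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Attainable GFLOP/s under the Roofline model:
    min(peak compute, memory bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def matmul_intensity(n: int, bytes_per_elem: int = 4) -> float:
    """Arithmetic intensity (FLOPs per byte) of an n x n matrix multiply,
    under the simplifying assumption that A, B and C each cross the
    memory bus exactly once (ideal cache reuse)."""
    flops = 2 * n ** 3                       # one multiply and one add per inner step
    bytes_moved = 3 * n * n * bytes_per_elem  # read A, read B, write C
    return flops / bytes_moved


def classify(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> str:
    """Label a kernel by comparing its intensity with the ridge point."""
    ridge = peak_gflops / bandwidth_gbs      # intensity where the two roofs meet
    return "memory-bound" if intensity < ridge else "compute-bound"


if __name__ == "__main__":
    # Placeholder machine parameters, NOT FTP/MTP specifications.
    peak, bw = 1000.0, 100.0
    for n in (6, 64, 1024):
        ai = matmul_intensity(n)
        print(n, round(ai, 2), roofline_gflops(peak, bw, ai), classify(peak, bw, ai))
```

A kernel whose intensity falls below the ridge point is limited by memory traffic, so raising data reuse helps more than adding compute; above the ridge point the reverse holds.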
Pages: 12647 - 12665
Number of pages: 18
Related papers
  • [1] Performance evaluation of convolutional neural network on Tianhe-3 prototype
    Chen, Weiduo
    Dong, Xiaoshe
    Chen, Heng
    Wang, Qiang
    Yu, Xingda
    Zhang, Xingjun
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (11) : 12647 - 12665
  • [2] Performance Evaluation and Analysis of Linear Algebra Kernels in the Prototype Tianhe-3 Cluster
    You, Xin
    Yang, Hailong
    Luan, Zhongzhi
    Liu, Yi
    Qian, Depei
    SUPERCOMPUTING FRONTIERS, SCFA 2019, 2019, 11416 : 86 - 105
  • [3] Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system
    Jia Wei
    Xingjun Zhang
    Zeyu Ji
    Jingbo Li
    Zheng Wei
    Scientific Reports, 11
  • [4] Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system
    Wei, Jia
    Zhang, Xingjun
    Ji, Zeyu
    Li, Jingbo
    Wei, Zheng
    SCIENTIFIC REPORTS, 2021, 11 (01)
  • [5] OHTMA: an optimized heuristic topology-aware mapping algorithm on the Tianhe-3 exascale supercomputer prototype
    Yi-shui Li
    Xin-hai Chen
    Jie Liu
    Bo Yang
    Chun-ye Gong
    Xin-biao Gan
    Sheng-guo Li
    Han Xu
    Frontiers of Information Technology & Electronic Engineering, 2020, 21 : 939 - 949
  • [6] OHTMA: an optimized heuristic topology-aware mapping algorithm on the Tianhe-3 exascale supercomputer prototype
    Li, Yi-shui
    Chen, Xin-hai
    Liu, Jie
    Yang, Bo
    Gong, Chun-ye
    Gan, Xin-biao
    Li, Sheng-guo
    Xu, Han
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2020, 21 (06) : 939 - 949
  • [7] Performance evaluation of Convolutional Neural Network for web security
    Jemal, Ines
    Haddar, Mohamed Amine
    Cheikhrouhou, Omar
    Mahfoudhi, Adel
    COMPUTER COMMUNICATIONS, 2021, 175 : 58 - 67
  • [8] High-performance computing of 3D blasting wave propagation in underground rock cavern by using 4D-LSM on TianHe-3 prototype E class supercomputer
    Meng Fu
    Gaofeng Zhao
    Deep Underground Science and Engineering, 2022, 1 (01) : 87 - 100
  • [9] High-performance computing of 3D blasting wave propagation in underground rock cavern by using 4D-LSM on TianHe-3 prototype E class supercomputer
    Fu M.
    Zhao G.
    Deep Underground Science and Engineering, 2022, 1 (01) : 87 - 100
  • [10] Performance Evaluation of Deep Convolutional Maxout Neural Network in Speech Recognition
    Dehghani, Arash
    Seyyedsalehi, Seyyed Ali
    2018 25TH IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING AND 2018 3RD INTERNATIONAL IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING (ICBME), 2018 : 240 - 245