A deep learning container cloud for GPU resources

Cited by: 0
Authors
Affiliations
[1] Xiao, Yi
[2] Gao, Pengdong
[3] Qi, Quan
[4] Lu, Yongquan
Source
Gao, Pengdong (pdgao@cuc.edu.cn) | Year 1600 / Vol. Universidad Central de Venezuela / No. 55
Keywords
Graphics processing unit - Topology - Deep neural networks - Distributed computer systems - Computer aided instruction
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
With the development of deep learning, deep learning frameworks have become an important tool for developing deep neural networks. A framework greatly shortens network construction and computing time, and its computing power comes from the GPU. However, how to effectively allocate and use GPU resources in a heterogeneous cluster shared by many frameworks is an important issue. In this paper, we propose a Deep Learning Container Cloud (DLC) architecture designed specifically for GPU resources. Because containers are easy to deploy and migrate, the frameworks can be deployed on a heterogeneous cluster in the form of containers, and the GPU driver can be decoupled from the container by means of an NVIDIA-docker volume. DLC provides its services as a Mesos framework. After obtaining resources through the scheduler, it quickly creates a deep learning framework that meets the requirements. DLC loads the specified GPU resources and the corresponding runtime library to rapidly create a deep learning environment of a specific version. In addition, this paper proposes an allocation algorithm based on GPU topology. DLC constructs a topo-tree by analyzing the GPU topology of the agent node and, on this basis, assigns GPUs that support P2P within the node. Our experiment shows that using P2P data transmission in containers significantly increases bandwidth, which is of great significance for promoting the development of deep learning.
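
The abstract does not spell out the allocation algorithm, so the following is only a minimal sketch of one way a topology-aware allocator over a topo-tree could look. Everything in it is an assumption made for illustration: the `TopoNode` class, the `allocate` and `build_example` functions, the two-switch layout, and the premise that GPUs under a common lower-level node (for example the same PCIe switch) are the ones most likely to support P2P. It is not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TopoNode:
    """Hypothetical topo-tree node. Internal nodes model interconnect levels
    (e.g. a host or a PCIe switch); leaves model individual GPUs."""
    name: str
    children: List["TopoNode"] = field(default_factory=list)
    gpu_id: Optional[int] = None  # set only on GPU leaves
    busy: bool = False            # meaningful only on GPU leaves

    def free_gpus(self) -> List["TopoNode"]:
        """Return the free GPU leaves under this node, left to right."""
        if self.gpu_id is not None:
            return [] if self.busy else [self]
        found: List["TopoNode"] = []
        for child in self.children:
            found.extend(child.free_gpus())
        return found


def _minimal_fit(node: TopoNode, count: int) -> Optional[TopoNode]:
    """Depth-first search for a node that has at least `count` free GPUs
    while none of its children can satisfy the request alone."""
    if len(node.free_gpus()) < count:
        return None
    for child in node.children:
        fit = _minimal_fit(child, count)
        if fit is not None:
            return fit
    return node


def allocate(root: TopoNode, count: int) -> List[int]:
    """Reserve `count` GPUs that sit as close together in the tree as this
    search can find. Returns their ids, or [] if the request cannot be met."""
    subtree = _minimal_fit(root, count)
    if subtree is None:
        return []
    chosen = subtree.free_gpus()[:count]
    for leaf in chosen:
        leaf.busy = True
    return [leaf.gpu_id for leaf in chosen]


def build_example() -> TopoNode:
    """Assumed layout: one agent node, two PCIe switches, two GPUs each."""
    sw0 = TopoNode("switch0", [TopoNode("gpu0", gpu_id=0), TopoNode("gpu1", gpu_id=1)])
    sw1 = TopoNode("switch1", [TopoNode("gpu2", gpu_id=2), TopoNode("gpu3", gpu_id=3)])
    return TopoNode("agent", [sw0, sw1])
```

In this sketch, a request for two GPUs made after GPU 0 is already taken is served from the second switch rather than by pairing GPU 1 with a GPU across switches. This is the kind of placement a topology-based allocator would presumably prefer when P2P bandwidth matters.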
Related papers
50 in total
  • [21] A Fast Deep Learning System Using GPU
    Chen, Zhilu
    Wang, Jing
    He, Haibo
    Huang, Xinming
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014, : 1552 - 1555
  • [22] Bioinformatics Tools with Deep Learning Based on GPU
    Hung, Che-Lun
    Tang, Chuan Yi
    2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2017, : 1906 - 1908
  • [23] Provisioning Computational Resources for Cloud-Based e-Learning Platforms Using Deep Learning Techniques
    Ariza, Jorge
    Jimeno, Miguel
    Villanueva-Polanco, Ricardo
    Capacho, Jose
    IEEE ACCESS, 2021, 9 : 89798 - 89811
  • [24] Deep Learning in the Enhanced Cloud
    Chung, Eric
    ISPD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, 2017, : 5 - 5
  • [25] Multi-Model Deep Learning For Cloud Resources Prediction to Support Proactive Workflow Adaptation
    El Kassabi, Hadeel T.
    Serhani, Mohamed Adel
    Bouktif, Salah
    Benharref, Abdelghani
    2019 3RD IEEE INTERNATIONAL CONFERENCE ON CLOUD AND FOG COMPUTING TECHNOLOGIES AND APPLICATIONS (IEEE CLOUD SUMMIT 2019), 2019, : 78 - 85
  • [26] Deep Reinforcement Learning with Successive Over-Relaxation and its Application in Autoscaling Cloud Resources
    John, Indu
    Bhatnagar, Shalabh
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [27] Cloud-based Resources and the Transformation of Learning
    Li, Yanyan
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON ADVANCED LEARNING TECHNOLOGIES (ICALT 2013), 2013, : 514 - 514
  • [28] Learning Microelectronics with Open Educational Resources in the Cloud
    Raleva, Katerina
    Stankovski, Mile
    Gochev, Ivan
    Nadzinski, Gorjan
    Chavdarov, Risto
    PROCEEDINGS OF 2018 IEEE GLOBAL ENGINEERING EDUCATION CONFERENCE (EDUCON) - EMERGING TRENDS AND CHALLENGES OF ENGINEERING EDUCATION, 2018, : 1860 - 1864
  • [29] Container stacking optimization based on Deep Reinforcement Learning
    Jin, Xin
    Duan, Zhentang
    Song, Wen
    Li, Qiqiang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
  • [30] Lightweight container number recognition based on deep learning
    Liu, Tao
    Wu, Xianqing
    Li, Fang
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2025, : 1058 - 1071