Distributed learning strategy based on chips for classification with large-scale dataset

Cited: 2
Authors
Yang, Bo
Su, Xiaohong
Wang, Yadong
Institutions
[1] IBM Corp, China Res Lab, Beijing 100094, Peoples R China
[2] Sch Comp Sci & Technol, Harbin Inst Technol, Harbin 150001, Peoples R China
Keywords
distributed learning strategy; artificial neural network; classification; multiagent; protein secondary structure prediction;
DOI
10.1142/S0218001407005739
CLC number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Learning with very large-scale datasets is often necessary when solving real problems with artificial neural networks. However, it remains an open question how to balance computing efficiency and learning stability, since traditional neural networks require a large amount of running time and memory to solve problems with large-scale training datasets. In this paper, we report the first evaluation of neural network distributed-learning strategies for large-scale classification of protein secondary structure. Our contributions include: (1) an architectural analysis of distributed learning; (2) the development of a scalable distributed system for large-scale dataset classification; (3) the description of a novel distributed-learning strategy based on chips; (4) a theoretical analysis of structure-distributed and data-distributed learning strategies; (5) an experimental evaluation of the chip-based distributed-learning strategy with respect to time complexity and its effect on the classification accuracy of artificial neural networks. We demonstrate that the novel strategy achieves a better balance between parallel computing efficiency and stability than previous algorithms. Its application to protein secondary structure prediction shows that the method is feasible and effective in practice.
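The abstract contrasts structure-distributed and data-distributed learning but does not detail the chip-based strategy itself. As a point of reference, the sketch below illustrates one common realization of the data-distributed approach: each worker trains on its own shard of the dataset and a coordinator averages the per-shard gradients. All function names here (`shard`, `local_gradient`, `distributed_step`) are illustrative assumptions, not the paper's API.

```python
import numpy as np

def shard(X, y, n_workers):
    """Split the dataset evenly across workers (one shard per worker)."""
    return list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

def local_gradient(w, X, y):
    """Gradient of mean squared error for a linear model on one shard."""
    err = X @ w - y
    return 2.0 * X.T @ err / len(y)

def distributed_step(w, shards, lr=0.1):
    """One synchronous update: average the per-shard gradients, then step."""
    g = np.mean([local_gradient(w, Xs, ys) for Xs, ys in shards], axis=0)
    return w - lr * g

# Toy example: recover w_true from noiseless data using 4 workers.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
w_true = np.array([1.0, 2.0])
y = X @ w_true
w = np.zeros(2)
shards = shard(X, y, n_workers=4)
for _ in range(200):
    w = distributed_step(w, shards)
print(np.round(w, 3))  # converges toward w_true
```

With equal-sized shards, the mean of the per-shard gradients equals the full-batch gradient, so this synchronous scheme matches serial gradient descent step for step; the parallelism comes from computing the shard gradients concurrently.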
Pages: 899-920 (22 pages)
Related papers (50 total)
  • [1] A large-scale hyperspectral dataset for flower classification
    Zheng, Yongrong
    Zhang, Tao
    Fu, Ying
    KNOWLEDGE-BASED SYSTEMS, 2022, 236
  • [2] DIESEL: A Dataset-Based Distributed Storage and Caching System for Large-Scale Deep Learning Training
    Wang, Lipeng
    Ye, Songgao
    Yang, Baichen
    Lu, Youyou
    Zhang, Hequan
    Yan, Shengen
    PROCEEDINGS OF THE 49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2020, 2020,
  • [3] SDFC dataset: a large-scale benchmark dataset for hyperspectral image classification
    Sun, Liwei
    Zhang, Junjie
    Li, Jia
    Wang, Yueming
    Zeng, Dan
    OPTICAL AND QUANTUM ELECTRONICS, 2023, 55 (02)
  • [5] Large-scale asynchronous distributed learning based on parameter exchanges
    Joshi, Bikash
    Iutzeler, Franck
    Amini, Massih-Reza
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2018, 5 (04) : 223 - 232
  • [6] Hierarchical Classification for Large-Scale Learning
    Wang, Boshi
    Barbu, Adrian
    ELECTRONICS, 2023, 12 (22)
  • [7] Coding for Large-Scale Distributed Machine Learning
    Xiao, Ming
    Skoglund, Mikael
    ENTROPY, 2022, 24 (09)
  • [8] Automatic Classification of Large-Scale Respiratory Sound Dataset Based on Convolutional Neural Network
    Minami, Koki
    Lu, Huimin
    Kim, Hyoungseop
    Mabu, Shingo
    Hirano, Yasushi
    Kido, Shoji
    2019 19TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2019), 2019, : 804 - 807
  • [9] A load balancing strategy for large-scale distributed computing
    Yang, Ji-Xiang
    Tan, Guo-Zhen
    Wang, Fan
    Zhou, Mei-Na
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2012, 40 (11): : 2226 - 2231
  • [10] CAS Landslide Dataset: A Large-Scale and Multisensor Dataset for Deep Learning-Based Landslide Detection
    Xu, Yulin
    Ouyang, Chaojun
    Xu, Qingsong
    Wang, Dongpo
    Zhao, Bo
    Luo, Yutao
    SCIENTIFIC DATA, 2024, 11 (01)