Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing

Cited by: 0
Authors
Alman, Josh [1]
Liang, Jiehao [2]
Song, Zhao [3]
Zhang, Ruizhe [4]
Zhuo, Danyang [5]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] Adobe Res, San Jose, CA USA
[4] Simons Inst Theory Comp, Berkeley, CA USA
[5] Duke Univ, Durham, NC 27706 USA
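Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract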
Over the last decade, deep neural networks have transformed our society, and they are already widely applied in various machine learning applications. State-of-the-art deep neural networks are becoming larger every year to deliver increasing model accuracy, and as a result, model training consumes substantial computing resources and will only consume more in the future. Using current training methods, in each iteration, to process a data point $x \in \mathbb{R}^d$ in a layer, we need to spend $\Theta(md)$ time to evaluate all $m$ neurons in the layer. This means processing the entire layer takes $\Theta(nmd)$ time for $n$ data points. Recent work [Song, Yang and Zhang, NeurIPS 2021] reduces this time per iteration to $o(nmd)$ but requires exponential time to preprocess either the data or the neural network weights, making it unlikely to be useful in practice. In this work, we present a new preprocessing method that simply stores the weight-data correlation in a tree data structure in order to quickly and dynamically detect which neurons fire at each iteration. Our method requires only $O(nmd)$ time in preprocessing and still achieves $o(nmd)$ time per iteration. We complement our new algorithm with a lower bound, proving that, assuming a popular conjecture from complexity theory, one could not substantially speed up our algorithm for dynamic detection of firing neurons.
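The abstract gives no implementation details, so the following is only a minimal sketch of one way the idea of storing weight-data correlations in a tree could look. It assumes that a neuron r "fires" on a data point x when the inner product <w_r, x> exceeds a threshold b, and that the tree is a binary max-tree over those inner products. The class name CorrelationTree, its methods, and the threshold parameter b are hypothetical and not taken from the paper.

```python
import numpy as np


class CorrelationTree:
    """Binary max-tree over the correlations <w_r, x> for one data point x.

    Hypothetical illustration: names and the threshold rule are assumptions.
    """

    def __init__(self, W, x):
        W = np.asarray(W, dtype=float)
        self.x = np.asarray(x, dtype=float)
        self.m = W.shape[0]
        # Round the number of leaves up to a power of two.
        self.size = 1
        while self.size < self.m:
            self.size *= 2
        # Unused leaves stay at -inf so they can never fire.
        self.tree = np.full(2 * self.size, -np.inf)
        # Leaves hold the m correlations; computing them costs O(md).
        self.tree[self.size:self.size + self.m] = W @ self.x
        # Each internal node stores the maximum of its two children.
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])

    def update(self, r, w_new):
        """Refresh neuron r after its weight changes: O(d + log m)."""
        i = self.size + r
        self.tree[i] = float(np.dot(w_new, self.x))
        i //= 2
        while i >= 1:
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def query(self, b):
        """Return indices of neurons with <w_r, x> > b.

        Subtrees whose maximum is at most b are skipped, so the cost is
        roughly O(k log m) when k neurons fire.
        """
        fired = []
        stack = [1]
        while stack:
            i = stack.pop()
            if self.tree[i] <= b:
                continue
            if i >= self.size:
                fired.append(i - self.size)
            else:
                stack.extend((2 * i + 1, 2 * i))
        return sorted(fired)


W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]])
x = np.array([2.0, 1.0])
tree = CorrelationTree(W, x)
fired_before = tree.query(1.5)
tree.update(2, np.array([1.0, 0.5]))
fired_after = tree.query(1.5)
```

Building one such tree per data point costs O(md), so n data points cost O(nmd) in total, which is consistent with the preprocessing bound stated in the abstract. If only a few neurons fire per iteration, each query touches far fewer than m leaves.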
Pages: 28
Related papers
50 in total
  • [21] Preprocessing-Based Fast Design of Multiple EM Structures With One Deep Neural Network
    Wang, Peng
    Li, Zhenning
    Luo, Chao
    Wei, Zhaohui
    Wu, Tong
    Jiang, Wen
    Hong, Tao
    Parchin, Naser Ojaroudi
    Pedersen, Gert Frolund
    Shen, Ming
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2024, 72 (05) : 4298 - 4310
  • [22] Radon short range forecasting through time series preprocessing and neural network modeling
    Pasini, A
    Ameli, F
    GEOPHYSICAL RESEARCH LETTERS, 2003, 30 (07)
  • [23] Improving Artificial Neural Network Based Streamflow Forecasting Models through Data Preprocessing
    Muhammad Hassan
    Ishtiaq Hassan
    KSCE Journal of Civil Engineering, 2021, 25 : 3583 - 3595
  • [24] Improving Artificial Neural Network Based Streamflow Forecasting Models through Data Preprocessing
    Hassan, Muhammad
    Hassan, Ishtiaq
    KSCE JOURNAL OF CIVIL ENGINEERING, 2021, 25 (09) : 3583 - 3595
  • [25] Simulation of UML graph classification model by using data preprocessing and convolutional neural network
    Wang, Fangli
    OPTICAL AND QUANTUM ELECTRONICS, 2024, 56 (02)
  • [26] Integration of Neural Network Preprocessing Model for OMI Aerosol Optical Depth Data Assimilation
    Ali, A.
    Amin, S. E.
    Ramadan, H. H.
    Tolba, M. F.
    ADVANCED MACHINE LEARNING TECHNOLOGIES AND APPLICATIONS, 2012, 322 : 496 - 506
  • [27] Fuzzy Time Series Prediction with Data Preprocessing and Error Compensation Based on Correlation Analysis
    Bang, Young-Keun
    Lee, Chul-Heui
    THIRD 2008 INTERNATIONAL CONFERENCE ON CONVERGENCE AND HYBRID INFORMATION TECHNOLOGY, VOL 2, PROCEEDINGS, 2008, : 714 - 721
  • [28] Extended Deep Adaptive Input Normalization for Preprocessing Time Series Data for Neural Networks
    September, Marcus A. K.
    Passino, Francesco Sanna
    Goldmann, Leonie
    Hinel, Anton
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [29] DATA PREPROCESSING FOR ARTIFICIAL NEURAL NETWORK APPLICATIONS IN PRIORITIZING RAILROAD PROJECTS - A PRACTICAL EXPERIENCE IN TAIWAN
    Cheng, Min-Yuan
    Su, Cheng-Wei
    Tsai, Ming-Hsiu
    Lin, Kuo-Shian
    JOURNAL OF CIVIL ENGINEERING AND MANAGEMENT, 2012, 18 (04) : 483 - 494
  • [30] PCA data preprocessing for neural network-based detection of parametric defects in analog IC
    Malosek, P.
    Stopjakova, V.
    PROCEEDINGS OF THE 2006 IEEE WORKSHOP ON DESIGN AND DIAGNOSTICS OF ELECTRONIC CIRCUITS AND SYSTEMS, 2006, : 131 - +