Accelerating Sparse Autoencoder Training via Layer-Wise Transfer Learning in Large Language Models

Cited by: 0
Authors
Ghilardi, Davide [1]
Belotti, Federico [1]
Molinari, Marco [2, 4]
Lim, Jaehyuk [2, 3]
Affiliations
[1] University of Milan-Bicocca, Italy
[2] LSE.AI
[3] University of Pennsylvania, United States
[4] London School of Economics, United Kingdom
Keywords
Compendex;
DOI
Not available
CLC classification number
Subject classification code
Abstract
Computational linguistics
Pages: 530-550
Related papers
50 in total
  • [1] Layer-Wise Sparse Training of Transformer via Convolutional Flood Filling
    Yoon, Bokyeong
    Han, Yoonsang
    Moon, Gordon Euhyun
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II, PAKDD 2024, 2024, 14646 : 158 - 170
  • [2] Learning sparse reparameterization with layer-wise continuous sparsification
    Wang, Xiaodong
    Huang, Yaxiang
    Zeng, Xianxian
    Guo, Jianlan
    Chen, Yuqiang
    KNOWLEDGE-BASED SYSTEMS, 2023, 276
  • [3] Multithreaded Layer-wise Training of Sparse Deep Neural Networks using Compressed Sparse Column
    Mofrad, Mohammad Hasanzadeh
    Melhem, Rami
    Ahmad, Yousuf
    Hammoud, Mohammad
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019
  • [4] Accelerating On-Device Learning with Layer-Wise Processor Selection Method on Unified Memory
    Ha, Donghee
    Kim, Mooseop
    Moon, KyeongDeok
    Jeong, Chi Yoon
    SENSORS, 2021, 21 (07)
  • [5] Guided Layer-Wise Learning for Deep Models Using Side Information
    Sulimov, Pavel
    Sukmanova, Elena
    Chereshnev, Roman
    Kertesz-Farkas, Attila
    ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS (AIST 2019), 2020, 1086 : 50 - 61
  • [6] Personalized Federated Learning with Layer-Wise Feature Transformation via Meta-Learning
    Tu, Jingke
    Huang, Jiaming
    Yang, Lei
    Lin, Wanyu
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (04)
  • [7] A Layer-wise Training and Pruning Method for Memory Efficient On-chip Learning Hardware
    Lew, Dongwoo
    Park, Jongsun
    2022 19TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC), 2022, : 97 - 98
  • [8] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [9] Potential Layer-Wise Supervised Learning for Training Multi-Layered Neural Networks
    Kamimura, Ryotaro
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 2568 - 2575
  • [10] Explaining Deep Learning Models for Tabular Data Using Layer-Wise Relevance Propagation
    Ullah, Ihsan
    Rios, Andre
    Gala, Vaibhav
    Mckeever, Susan
    APPLIED SCIENCES-BASEL, 2022, 12 (01)