Improving Pre-Trained Weights through Meta-Heuristics Fine-Tuning

Cited by: 0
Authors
de Rosa, Gustavo H. [1 ]
Roder, Mateus [1 ]
Papa, Joao Paulo [1 ]
dos Santos, Claudio F. G. [2 ]
Affiliations
[1] Sao Paulo State Univ, Dept Comp, Bauru, SP, Brazil
[2] Eldorado Res Inst, Campinas, SP, Brazil
Funding
Sao Paulo Research Foundation (FAPESP), Brazil
Keywords
Machine Learning; Meta-Heuristic Optimization; Weights; Fine-Tuning; Algorithm
DOI
10.1109/SSCI50451.2021.9659945
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Machine Learning algorithms have been extensively researched over the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization. Nonetheless, most Machine Learning algorithms are trained with derivative-based optimizers, such as Stochastic Gradient Descent, which can become trapped in local optima and prevent the models from reaching adequate performance. A bio-inspired alternative to traditional optimization techniques, known as meta-heuristics, has received significant attention due to its simplicity and ability to avoid entrapment in local optima. In this work, we propose using meta-heuristic techniques to fine-tune pre-trained weights, exploring additional regions of the search space and improving their effectiveness. The experimental evaluation comprises two classification tasks (image and text) and is assessed on four datasets from the literature. Experimental results show the capacity of nature-inspired algorithms to explore the neighborhood of pre-trained weights, achieving superior results compared to their pre-trained counterparts. Additionally, a thorough analysis of distinct architectures, such as Multi-Layer Perceptrons and Recurrent Neural Networks, attempts to visualize and provide more precise insights into which weights are most critical to fine-tune during the learning process.
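The abstract describes searching the neighborhood of gradient-trained weights with a meta-heuristic, but it does not spell out the exact optimizer or implementation. The following is only a minimal sketch of the general idea, assuming a Particle Swarm Optimization variant, a flattened weight vector, and a user-supplied loss function; the names used here (metaheuristic_finetune, radius, etc.) are hypothetical and not taken from the paper.

import numpy as np

def metaheuristic_finetune(pretrained_w, loss_fn, n_particles=20, n_iters=50,
                           radius=0.1, w_inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Particle Swarm Optimization around a pre-trained weight vector.

    Each particle is a full copy of the flattened weights, initialized in a
    small neighborhood (radius) of the pre-trained solution, so the swarm
    explores regions close to the gradient-based optimum.
    """
    rng = np.random.default_rng(seed)
    dim = pretrained_w.size

    # Initialize particles near the pre-trained weights, with zero velocities.
    positions = pretrained_w + rng.uniform(-radius, radius, (n_particles, dim))
    velocities = np.zeros((n_particles, dim))

    # Personal and global bests, seeded with the pre-trained solution itself.
    pbest = positions.copy()
    pbest_fit = np.array([loss_fn(p) for p in positions])
    gbest = pretrained_w.copy()
    gbest_fit = loss_fn(pretrained_w)

    for _ in range(n_iters):
        # Standard PSO velocity and position updates.
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        velocities = (w_inertia * velocities
                      + c1 * r1 * (pbest - positions)
                      + c2 * r2 * (gbest - positions))
        positions = positions + velocities

        # Evaluate candidates and update personal/global bests.
        fitness = np.array([loss_fn(p) for p in positions])
        improved = fitness < pbest_fit
        pbest[improved] = positions[improved]
        pbest_fit[improved] = fitness[improved]

        best_idx = pbest_fit.argmin()
        if pbest_fit[best_idx] < gbest_fit:
            gbest, gbest_fit = pbest[best_idx].copy(), pbest_fit[best_idx]

    return gbest, gbest_fit

# Toy usage: "fine-tune" a weight vector against a synthetic quadratic loss.
if __name__ == "__main__":
    target = np.linspace(-1.0, 1.0, 32)                # hypothetical optimum
    pretrained = target + 0.05 * np.random.randn(32)   # SGD got close, not exact
    loss = lambda w: float(np.mean((w - target) ** 2))
    tuned, fit = metaheuristic_finetune(pretrained, loss)
    print(f"loss before: {loss(pretrained):.6f}  after: {fit:.6f}")

In practice, loss_fn would reshape the candidate vector back into the network's layer weights and return a validation loss; the toy quadratic loss above only keeps the sketch self-contained.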
Pages: 8
Related Papers
50 records
  • [1] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [2] Span Fine-tuning for Pre-trained Language Models
    Bao, Rongzhou
    Zhang, Zhuosheng
    Zhao, Hai
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1970 - 1979
  • [3] Overcoming Catastrophic Forgetting for Fine-Tuning Pre-trained GANs
    Zhang, Zeren
    Li, Xingjian
    Hong, Tao
    Wang, Tianyang
    Ma, Jinwen
    Xiong, Haoyi
    Xu, Cheng-Zhong
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT V, 2023, 14173 : 293 - 308
  • [4] Waste Classification by Fine-Tuning Pre-trained CNN and GAN
    Alsabei, Amani
    Alsayed, Ashwaq
    Alzahrani, Manar
    Al-Shareef, Sarah
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (08): : 65 - 70
  • [5] Variational Monte Carlo on a Budget - Fine-tuning Pre-trained Neural Wavefunctions
    Scherbela, Michael
    Gerard, Leon
    Grohs, Philipp
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract
    Jin, Huan
    Li, Qinying
    [J]. Wuhan University Journal of Natural Sciences, 2023, 28 (03) : 237 - 245
  • [7] Debiasing Pre-Trained Language Models via Efficient Fine-Tuning
    Gira, Michael
    Zhang, Ruisu
    Lee, Kangwook
    [J]. PROCEEDINGS OF THE SECOND WORKSHOP ON LANGUAGE TECHNOLOGY FOR EQUALITY, DIVERSITY AND INCLUSION (LTEDI 2022), 2022, : 59 - 69
  • [8] Exploiting Syntactic Information to Boost the Fine-tuning of Pre-trained Models
    Liu, Chaoming
    Zhu, Wenhao
    Zhang, Xiaoyu
    Zhai, Qiuhong
    [J]. 2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 575 - 582
  • [9] Pathologies of Pre-trained Language Models in Few-shot Fine-tuning
    Chen, Hanjie
    Zheng, Guoqing
    Awadallah, Ahmed Hassan
    Ji, Yangfeng
    [J]. PROCEEDINGS OF THE THIRD WORKSHOP ON INSIGHTS FROM NEGATIVE RESULTS IN NLP (INSIGHTS 2022), 2022, : 144 - 153
  • [10] Fine-tuning the hyperparameters of pre-trained models for solving multiclass classification problems
    Kaibassova, D.
    Nurtay, M.
    Tau, A.
    Kissina, M.
    [J]. COMPUTER OPTICS, 2022, 46 (06) : 971 - 979