Statistical-Mechanical Analysis of Pre-training and Fine Tuning in Deep Learning

Cited by: 9
Author
Ohzeki, Masayuki [1 ]
Affiliation
[1] Kyoto Univ, Grad Sch Informat, Dept Syst Sci, Kyoto 6068501, Japan
Keywords
ALGORITHM
DOI
10.7566/JPSJ.84.034003
Chinese Library Classification
O4 [Physics]
Subject Classification Code
0702
Abstract
In this paper, we present a statistical-mechanical analysis of deep learning. We elucidate two of its essential components: pre-training by unsupervised learning and fine-tuning by supervised learning. We formulate the extraction of features from the training data as a margin criterion in a high-dimensional feature-vector space. The self-organized classifier is then supplied with a small amount of labelled data, as in deep learning. Although we employ a simple single-layer perceptron model rather than directly analyzing a multi-layer neural network, we find a nontrivial phase transition in the generalization error of the resultant classifier that depends on the number of unlabelled data. In this sense, we evaluate the efficacy of the unsupervised-learning component of deep learning. The analysis is performed with the replica method, a sophisticated tool from statistical mechanics. We validate our result in the manner of deep learning, using a simple iterative algorithm based on belief propagation to learn the weight vector.
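The setup described in the abstract (unsupervised pre-training that extracts a feature direction from many unlabelled examples, followed by fine-tuning with only a few labels) can be illustrated with a toy semi-supervised experiment. This is a minimal sketch under assumed conditions, not the paper's replica-method analysis: the clustered Gaussian data, the use of the top principal component as the "pre-trained" direction, and the sign-fixing fine-tuning step are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                                  # feature dimension
w_star = rng.standard_normal(d)         # hypothetical "teacher" direction
w_star /= np.linalg.norm(w_star)

def sample(n, margin=1.5):
    """Two Gaussian clusters separated along w_star (toy margin structure)."""
    y = rng.choice([-1, 1], size=n)
    x = rng.standard_normal((n, d)) + margin * y[:, None] * w_star
    return x, y

# Pre-training: estimate the discriminative direction from unlabelled data
# via the top principal component (unsupervised; the sign stays undetermined).
x_u, _ = sample(2000)
x_u -= x_u.mean(axis=0)
_, _, vt = np.linalg.svd(x_u, full_matrices=False)
w = vt[0]

# Fine-tuning: a handful of labelled examples only resolves the sign ambiguity.
x_l, y_l = sample(5)
if np.mean(y_l * (x_l @ w)) < 0:
    w = -w

# Generalization error = misclassification rate on fresh test data.
x_t, y_t = sample(10000)
err = np.mean(np.sign(x_t @ w) != y_t)
print(f"overlap with teacher: {abs(w @ w_star):.3f}, test error: {err:.3f}")
```

Increasing the number of unlabelled samples sharpens the estimate of the feature direction and lowers the test error, which is the qualitative effect the paper quantifies via its phase-transition analysis.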
Pages: 6