GrOD: Deep Learning with Gradients Orthogonal Decomposition for Knowledge Transfer, Distillation, and Adversarial Training

Cited by: 7
Authors
Xiong, Haoyi [1 ]
Wan, Ruosi [1 ,2 ]
Zhao, Jian [3 ]
Chen, Zeyu [1 ]
Li, Xingjian [1 ]
Zhu, Zhanxing [2 ]
Huan, Jun [1 ]
Affiliations
[1] Baidu Inc, Baidu Technol Pk, Beijing 100085, Peoples R China
[2] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[3] Inst North Elect Equipment, Beijing, Peoples R China
Funding
National Key R&D Program of China; U.S. National Science Foundation (NSF);
Keywords
Deep neural networks; regularized deep learning; gradient-based learning;
DOI
10.1145/3530836
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Regularization that takes a linear combination of the empirical loss and explicit regularization terms as the loss function has been frequently used for many machine learning tasks. The explicit regularization term is designed in different forms, depending on the application. While regularized learning often boosts performance with higher accuracy and faster convergence, the regularization can sometimes hurt empirical loss minimization and lead to poor performance. To deal with such issues, in this work we propose a novel strategy, namely Gradients Orthogonal Decomposition (GrOD), that improves the training procedure of regularized deep learning. Instead of linearly combining the gradients of the two terms, GrOD re-estimates, through orthogonal decomposition, a new iteration direction that does not hurt empirical loss minimization while preserving the regularization effects. We have performed extensive experiments applying GrOD to improve the commonly used algorithms of transfer learning [2], knowledge distillation [3], and adversarial learning [4]. The experimental results on large datasets, including Caltech 256 [5], MIT Indoor 67 [6], CIFAR-10 [7], and ImageNet [8], show significant improvements made by GrOD for all three algorithms in all cases.
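The core idea sketched in the abstract — splitting the regularizer's gradient into a component parallel to the empirical-loss gradient and an orthogonal remainder, then discarding any parallel part that opposes empirical-loss descent — can be illustrated as follows. This is a reconstruction from the abstract, not the authors' implementation; the function name and the exact combination rule are assumptions.

```python
def grod_direction(grad_emp, grad_reg):
    """Illustrative GrOD-style update direction (a sketch, not the paper's
    exact algorithm). grad_emp and grad_reg are the gradients of the
    empirical loss and the regularization term, as flat lists of floats."""
    dot = sum(e * r for e, r in zip(grad_emp, grad_reg))
    norm_sq = sum(e * e for e in grad_emp)
    if norm_sq == 0.0:
        return list(grad_reg)  # no empirical gradient to protect
    coef = dot / norm_sq                       # projection coefficient
    parallel = [coef * e for e in grad_emp]    # component along grad_emp
    orthogonal = [r - p for r, p in zip(grad_reg, parallel)]
    # Drop the parallel part when it points against empirical-loss descent,
    # so the combined step never hurts the empirical loss to first order.
    if coef < 0.0:
        parallel = [0.0] * len(parallel)
    return [e + p + o for e, p, o in zip(grad_emp, parallel, orthogonal)]
```

For example, with `grad_emp = [1, 0]` and a conflicting `grad_reg = [-2, 1]`, the parallel part is discarded and the returned direction `[1, 1]` retains a positive inner product with `grad_emp`, while the orthogonal regularization component is preserved.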
Pages: 25
Related Papers
50 results
  • [31] An active learning framework for adversarial training of deep neural networks
    Ghosh, Susmita
    Chatterjee, Abhiroop
    Fiondella, Lance
    NEURAL COMPUTING AND APPLICATIONS, 2025, 37 (9) : 6849 - 6876
  • [32] Refining Design Spaces in Knowledge Distillation for Deep Collaborative Learning
    Iwata, Sachi
    Minami, Soma
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2371 - 2377
  • [33] A hybrid adversarial training for deep learning model and denoising network resistant to adversarial examples
    Ryu, Gwonsang
    Choi, Daeseon
    APPLIED INTELLIGENCE, 2023, 53 (08) : 9174 - 9187
  • [34] Enhancing adversarial robustness for deep metric learning via neural discrete adversarial training
    Li, Chaofei
    Zhu, Ziyuan
    Niu, Ruicheng
    Zhao, Yuting
    COMPUTERS & SECURITY, 2024, 143
  • [35] Adversarial Attack for Deep Steganography Based on Surrogate Training and Knowledge Diffusion
    Tao, Fangjian
    Cao, Chunjie
    Li, Hong
    Zou, Binghui
    Wang, Longjuan
    Sun, Jingzhang
    APPLIED SCIENCES-BASEL, 2023, 13 (11):
  • [37] Selective transfer learning with adversarial training for stock movement prediction
    Li, Yang
    Dai, Hong-Ning
    Zheng, Zibin
    CONNECTION SCIENCE, 2022, 34 (01) : 492 - 510
  • [38] Adversarial Training Helps Transfer Learning via Better Representations
    Deng, Zhun
    Zhang, Linjun
    Vodrahalli, Kailas
    Kawaguchi, Kenji
    Zou, James
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [39] AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation
    Kim, Hyungmin
    Suh, Sungho
    Baek, Sunghyun
    Kim, Daehwan
    Jeong, Daun
    Cho, Hansang
    Kim, Junmo
    KNOWLEDGE-BASED SYSTEMS, 2024, 293
  • [40] Distill-DBDGAN: Knowledge Distillation and Adversarial Learning Framework for Defocus Blur Detection
    Jonna, Sankaraganesh
    Medhi, Moushumi
    Sahay, Rajiv Ranjan
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (02)