Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation

Citations: 0
Authors
Lyu, Yilin [1 ]
Wang, Liyuan [2 ]
Zhang, Xingxing [2 ]
Sun, Zicheng [1 ]
Su, Hang [2 ]
Zhu, Jun [2 ]
Jing, Liping [1 ]
Affiliations
[1] Beijing Jiaotong University, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing, People's Republic of China
[2] Tsinghua University, Tsinghua-Bosch Joint Center for ML, Department of Computer Science and Technology, Institute for AI, BNRist Center, THBI Lab, Beijing, People's Republic of China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Continual learning entails learning a sequence of tasks and balancing their knowledge appropriately. With limited access to old training samples, much of the current work in deep neural networks has focused on overcoming catastrophic forgetting of old tasks in gradient-based optimization. However, the normalization layers are an exception: they are updated interdependently by the gradient and by the statistics of currently observed training samples, and therefore require specialized strategies to mitigate recency bias. In this work, we focus on the most popular Batch Normalization (BN) and provide an in-depth theoretical analysis of its sub-optimality in continual learning. Our analysis demonstrates a dilemma between balance and adaptation of BN statistics for incremental tasks, which potentially affects training stability and generalization. To address these challenges, we propose Adaptive Balance of BN (AdaB2N), which incorporates a Bayesian-based strategy to adapt task-wise contributions and a modified momentum to balance BN statistics, corresponding to the training and testing stages. By implementing BN in a continual learning fashion, our approach achieves significant performance gains across a wide range of benchmarks, particularly for the challenging yet realistic online scenarios (e.g., up to 7.68%, 6.86% and 4.26% on Split CIFAR-10, Split CIFAR-100 and Split Mini-ImageNet, respectively). Our code is available at https://github.com/lvyilin/AdaB2N.
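The abstract's central mechanism, rebalancing BN running statistics with a modified momentum, can be illustrated with a minimal PyTorch sketch. The balanced_momentum schedule below (a 1/n cumulative average) is a hypothetical stand-in for the paper's momentum modification, chosen only to make the recency-bias effect visible; it is not the AdaB2N update, and the Bayesian adaptation of task-wise contributions is omitted.

    import torch

    def update_running_stats(running_mean, running_var, batch, momentum):
        # Standard BN exponential moving average over batch statistics:
        # with a fixed momentum, recent batches dominate, which is the
        # recency bias the paper analyzes under a sequence of tasks.
        batch_mean = batch.mean(dim=0)
        batch_var = batch.var(dim=0)
        running_mean = (1.0 - momentum) * running_mean + momentum * batch_mean
        running_var = (1.0 - momentum) * running_var + momentum * batch_var
        return running_mean, running_var

    def balanced_momentum(num_batches_seen):
        # Hypothetical schedule: with momentum 1/n the running estimate is
        # the arithmetic mean of all batches seen so far, so every task's
        # batches carry equal weight instead of favoring the newest task.
        return 1.0 / num_batches_seen

    torch.manual_seed(0)
    rm, rv = torch.zeros(8), torch.ones(8)
    n = 0
    for task_id in range(1, 4):                       # three incremental tasks
        for _ in range(100):                          # batches of the current task
            n += 1
            x = torch.randn(32, 8) + float(task_id)  # task-specific mean shift
            rm, rv = update_running_stats(rm, rv, x, balanced_momentum(n))
    print(rm)  # ~2.0, the mean over all tasks; momentum=0.1 would leave ~3.0

Under a fixed momentum the statistics track the most recent task (adaptation), while the 1/n schedule weights all tasks equally (balance); per the abstract, AdaB2N's contribution is to trade these off adaptively rather than commit to either extreme.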
Pages: 20