Memristive KDG-BNN: Memristive binary neural networks trained via knowledge distillation and generative adversarial networks

Cited by: 5
Authors
Gao, Tongtong [1 ]
Zhou, Yue [1 ]
Duan, Shukai [1 ,2 ,3 ]
Hu, Xiaofang [1 ,2 ,3 ]
Affiliations
[1] Southwest Univ, Coll Artificial Intelligence, Chongqing 400715, Peoples R China
[2] Brain-Inspired Comp & Intelligent Control Chongqing, Chongqing 400715, Peoples R China
[3] Southwest Univ, Key Lab Luminescence Anal & Mol Sensing, Minist Educ, Chongqing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Binary neural networks; Knowledge distillation; Generative adversarial networks; Wasserstein generative adversarial networks; Memristive circuit;
DOI
10.1016/j.knosys.2022.108962
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the increasing demand for combining software and hardware, network compression and hardware deployment have become hot research topics. Among network compression methods, binary neural networks (BNNs) are widely applied in artificial intelligence chips because they save memory, compute efficiently, and are hardware-friendly. However, a performance gap remains between BNNs and full-precision neural networks (FNNs). This paper proposes a BNN training framework called KDG-BNN, consisting of three modules: a full-precision network, a 1-bit binary network, and a discriminator. In this framework, the full-precision network guides the training of the 1-bit binary network through a distillation loss. Meanwhile, the 1-bit binary network acts as a generator and is trained adversarially against the discriminator. By optimizing the adversarial loss and the distillation loss simultaneously, the 1-bit binary network learns the feature distribution of the full-precision network more accurately. The generative adversarial network (GAN) is then replaced with a Wasserstein GAN with gradient penalty (WGAN-GP) to mitigate vanishing gradients, yielding KDWG-BNN. Experiments show that AdamBNN trained with KDWG-BNN achieves 85.89% accuracy on CIFAR-10 and 70.7% on ImageNet, exceeding the baseline by 0.76% and 0.2%, respectively. The memristor offers many features suited to hardware deployment, such as inherent memory, continuous input and output, and nanoscale size, making it an ideal device for deploying neural networks. Therefore, this paper further proposes a memristor-based KDG-BNN implementation scheme that leverages the merits of memristors and lightweight BNNs, in the hope of realizing and promoting edge intelligent applications. (C) 2022 Elsevier B.V. All rights reserved.
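The joint objective described in the abstract can be made concrete with a short sketch. The snippet below is a hypothetical PyTorch rendering, not the authors' code: the module names (teacher, student, disc, student.classifier), the feature-level MSE distillation term, and the loss weights lam_kd, lam_adv, lam_gp are all assumptions for illustration. It shows one training step in which the 1-bit student is optimized against both a distillation loss and a WGAN-GP adversarial loss, matching the KDWG-BNN variant.

```python
# Hypothetical sketch of the KDWG-BNN objective (assumptions noted above).
# Features are assumed to be flattened vectors of shape (batch, dim).
import torch
import torch.nn.functional as F

def gradient_penalty(disc, real_feat, fake_feat):
    """WGAN-GP penalty on points interpolated between teacher and student features."""
    alpha = torch.rand(real_feat.size(0), 1, device=real_feat.device)
    interp = (alpha * real_feat + (1 - alpha) * fake_feat).requires_grad_(True)
    score = disc(interp)
    grad = torch.autograd.grad(score.sum(), interp, create_graph=True)[0]
    return ((grad.norm(2, dim=1) - 1) ** 2).mean()

def train_step(teacher, student, disc, opt_s, opt_d, x, y,
               lam_kd=1.0, lam_adv=0.1, lam_gp=10.0):
    with torch.no_grad():
        t_feat = teacher(x)              # full-precision "real" features
    s_feat = student(x)                  # 1-bit "generated" features

    # Discriminator (critic) step: Wasserstein loss plus gradient penalty.
    d_loss = (disc(s_feat.detach()).mean() - disc(t_feat).mean()
              + lam_gp * gradient_penalty(disc, t_feat, s_feat.detach()))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Student (generator) step: task loss + distillation loss + adversarial loss.
    logits = student.classifier(s_feat)  # assumed classification head
    kd = F.mse_loss(s_feat, t_feat)      # feature-level distillation (assumed form)
    adv = -disc(s_feat).mean()           # push student features toward "real"
    s_loss = F.cross_entropy(logits, y) + lam_kd * kd + lam_adv * adv
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
    return d_loss.item(), s_loss.item()
```

The gradient penalty replaces the weight clipping of the original WGAN critic, which is what lets WGAN-GP keep informative gradients and motivates the KDG-BNN-to-KDWG-BNN change the abstract describes.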
Pages: 14
Related Papers
50 records in total
  • [1] Memristive Spiking Neural Networks Trained with Unsupervised STDP
    Zhou, Errui
    Fang, Liang
    Yang, Binbin
    [J]. ELECTRONICS, 2018, 7 (12)
  • [2] Research on Knowledge Distillation of Generative Adversarial Networks
    Wang, Wei
    Zhang, Baohua
    Cui, Tao
    Chai, Yimeng
    Li, Yue
    [J]. 2021 DATA COMPRESSION CONFERENCE (DCC 2021), 2021, : 376 - 376
  • [3] KDGAN: Knowledge Distillation with Generative Adversarial Networks
    Wang, Xiaojie
    Zhang, Rui
    Sun, Yu
    Qi, Jianzhong
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [4] Application of Knowledge Distillation in Generative Adversarial Networks
    Zhang, Xu
    [J]. 2023 3RD ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS TECHNOLOGY AND COMPUTER SCIENCE, ACCTCS, 2023, : 65 - 71
  • [5] Private Knowledge Transfer via Model Distillation with Generative Adversarial Networks
    Gao, Di
    Zhuo, Cheng
    [J]. ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 1794 - 1801
  • [6] PKDGAN: Private Knowledge Distillation With Generative Adversarial Networks
    Zhuo, Cheng
    Gao, Di
    Liu, Liangwei
    [J]. IEEE TRANSACTIONS ON BIG DATA, 2024, 10 (06): 775 - 788
  • [7] Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation
    He, Huarui
    Wang, Jie
    Zhang, Zhanqiu
    Wu, Feng
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 534 - 544
  • [8] Evolutionary Generative Adversarial Networks with Crossover Based Knowledge Distillation
    Li, Junjie
    Zhang, Junwei
    Gong, Xiaoyu
    Lu, Shuai
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [9] Online adversarial knowledge distillation for graph neural networks
    Wang, Can
    Wang, Zhe
    Chen, Defang
    Zhou, Sheng
    Feng, Yan
    Chen, Chun
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [10] Synchronization of memristive delayed neural networks via hybrid impulsive control
    Wang, Huamin
    Duan, Shukai
    Huang, Tingwen
    Tan, Jie
    [J]. NEUROCOMPUTING, 2017, 267 : 615 - 623