Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions

Cited by: 90
Authors
Jagtap, Ameya D. [1 ]
Shin, Yeonjong [1 ]
Kawaguchi, Kenji [2 ]
Karniadakis, George Em [1 ,3 ]
Affiliations
[1] Brown Univ, Div Appl Math, 182 George St, Providence, RI 02912 USA
[2] Harvard Univ, Ctr Math Sci & Applicat, Cambridge, MA 02138 USA
[3] Brown Univ, Sch Engn, Providence, RI 02912 USA
Keywords
Deep neural networks; Kronecker product; Rowdy activation functions; Gradient flow dynamics; Physics-informed neural networks; Deep learning benchmarks; Learning framework
DOI
10.1016/j.neucom.2021.10.036
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We propose a new type of neural network, the Kronecker neural network (KNN), which forms a general framework for neural networks with adaptive activation functions. KNNs employ the Kronecker product, which provides an efficient way of constructing a very wide network while keeping the number of parameters low. Our theoretical analysis reveals that, under suitable conditions, KNNs induce a faster decay of the loss than feed-forward networks; this is also verified empirically through a set of computational examples. Furthermore, under certain technical assumptions, we establish global convergence of gradient descent for KNNs. As a specific case, we propose the Rowdy activation function, which is designed to eliminate saturation regions by injecting sinusoidal fluctuations with trainable parameters. The proposed Rowdy activation function can be employed in any neural network architecture, such as feed-forward, recurrent, and convolutional neural networks. The effectiveness of KNNs with Rowdy activations is demonstrated through various computational experiments, including function approximation using feed-forward neural networks, solution inference of partial differential equations using physics-informed neural networks, and standard deep learning benchmark problems using convolutional and fully connected neural networks. (c) 2021 Elsevier B.V. All rights reserved.
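The Rowdy activation described in the abstract can be read as a standard (saturating) base activation augmented with a small number of trainable sinusoidal terms. The Python/NumPy sketch below illustrates only this reading; the function name rowdy_activation, the fixed scaling factor n, the frequencies k*n, and the zero initialization of the trainable amplitudes are illustrative assumptions, not the exact parameterization given in the paper.

    import numpy as np

    def rowdy_activation(x, alphas, n=10.0, base=np.tanh):
        """Base activation plus trainable sinusoidal perturbations (sketch).

        x      : pre-activation array
        alphas : trainable amplitudes, one per sinusoidal term
        n      : assumed fixed scaling factor controlling perturbation frequency
        base   : saturating base activation (e.g. tanh)
        """
        out = base(x)
        for k, a_k in enumerate(alphas, start=1):
            # Each term adds a trainable sinusoidal fluctuation that breaks up
            # flat (saturated) regions of the base activation.
            out = out + a_k * n * np.sin(k * n * x)
        return out

    # Toy forward pass of a single hidden layer using the adaptive activation.
    rng = np.random.default_rng(0)
    W, b = rng.standard_normal((16, 1)), np.zeros((16, 1))
    alphas = np.zeros(3)          # zero amplitudes: recovers the plain tanh network
    x = np.linspace(-1.0, 1.0, 5).reshape(1, -1)
    h = rowdy_activation(W @ x + b, alphas)
    print(h.shape)                # (16, 5)

With the amplitudes initialized to zero, the layer starts out as an ordinary tanh network, and the sinusoidal terms only take effect as the amplitudes are learned during training.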
Pages: 165-180
Number of pages: 16
Related papers
50 records in total
  • [31] Adaptive Weight Decay for Deep Neural Networks
    Nakamura, Kensuke
    Hong, Byung-Woo
    [J]. IEEE ACCESS, 2019, 7 : 118857 - 118865
  • [32] Adaptive propagation deep graph neural networks
    Chen, Wei
    Yan, Wenxu
    Wang, Wenyuan
    [J]. PATTERN RECOGNITION, 2024, 154
  • [33] On the approximation of rough functions with deep neural networks
    De Ryck T.
    Mishra S.
    Ray D.
    [J]. SeMA Journal, 2022, 79 (3) : 399 - 440
  • [34] Deep Convolutional Neural Networks on Cartoon Functions
    Grohs, Philipp
    Wiatowski, Thomas
    Bolcskei, Helmut
    [J]. 2016 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, 2016, : 1163 - 1167
  • [35] Adaptive Morphing Activation Function for Neural Networks
    Herrera-Alcantara, Oscar
    Arellano-Balderas, Salvador
    [J]. FRACTAL AND FRACTIONAL, 2024, 8 (08)
  • [36] Neural networks with adaptive spline activation function
    Campolucci, P
    Capparelli, F
    Guarnieri, S
    Piazza, F
    Uncini, A
    [J]. MELECON '96 - 8TH MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, PROCEEDINGS, VOLS I-III: INDUSTRIAL APPLICATIONS IN POWER SYSTEMS, COMPUTER SCIENCE AND TELECOMMUNICATIONS, 1996, : 1442 - 1445
  • [37] Quantum activation functions for quantum neural networks
    Marco Maronese
    Claudio Destri
    Enrico Prati
    [J]. Quantum Information Processing, 21
  • [38] A Comparison of Activation Functions in Artificial Neural Networks
    Bircanoglu, Cenk
    Arica, Nafiz
    [J]. 2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018
  • [39] Hölder Continuous Activation Functions in Neural Networks
    Tatar, Nasser-Eddine
    [J]. ADVANCES IN DIFFERENTIAL EQUATIONS AND CONTROL PROCESSES, 2015, 15 (02): : 93 - 106
  • [40] General adaptive transfer functions design for volume rendering by using neural networks
    Wang, Liansheng
    Chen, Xucan
    Li, Sikun
    Cai, Xun
    [J]. NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233 : 659 - 670