Multi-grade Deep Learning

Cited by: 0

Authors:

Xu, Yuesheng [1]

Affiliation:

[1] Old Dominion Univ, Dept Math & Stat, Norfolk, VA 23529 USA

Source:

COMMUNICATIONS ON APPLIED MATHEMATICS AND COMPUTATION | 2025

Funding:

US National Science Foundation

Keywords:

Deep learning; Deep neural network (DNN); Multi-grade deep learning (MGDL); EMPIRICAL MODE DECOMPOSITION; ONLINE GRADIENT-METHOD; DETERMINISTIC CONVERGENCE; NETWORK

DOI:

10.1007/s42967-024-00474-y

Chinese Library Classification (CLC) number:

O29 [Applied mathematics]

Discipline classification code:

070104 ;

Abstract:

Deep learning requires solving a large nonconvex optimization problem to learn a deep neural network (DNN). The current deep learning model is single-grade: it trains a DNN end-to-end by solving a single nonconvex optimization problem. When the network has many layers, it is computationally challenging to carry out this task efficiently. The complexity comes from learning all weight matrices and bias vectors from one large nonconvex optimization problem. Inspired by the human education process, which arranges learning in grades, we propose a multi-grade learning model: instead of solving one large optimization problem, we successively solve a number of small optimization problems, organized in grades, to learn a shallow neural network (a network having a few hidden layers) for each grade. Specifically, the current grade learns the leftover from the previous grade. In each grade, we learn a shallow neural network stacked on top of the neural network learned in the previous grades, whose parameters remain unchanged while the current and future grades are trained. By dividing the task of learning a DNN into learning several shallow neural networks, one can alleviate the severity of the nonconvexity of the original large optimization problem. When all grades are completed, the final neural network is a stair-shaped neural network, the superposition of the networks learned in all grades. Such a model enables us to learn a DNN much more effectively and efficiently. Moreover, multi-grade learning naturally leads to adaptive learning. We prove that, in the context of function approximation, if the neural network generated by a new grade is nontrivial, the optimal error of the new grade is strictly smaller than the optimal error of the previous grade. Furthermore, we provide numerical examples confirming that the proposed multi-grade model significantly outperforms the standard single-grade model and is much more robust to noise. They include three proof-of-concept examples, classification on two benchmark data sets, MNIST and Fashion MNIST, with two noise rates (the task being to find classifiers, which are functions of 784 variables), and numerical solutions of the one-dimensional Helmholtz equation.
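The grade-by-grade procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. It assumes that the "leftover" is the residual of the previous grades, that each grade is a single tanh hidden layer fed by the frozen hidden features of the previous grade, and that the final prediction is the sum of the grade outputs. The names `train_grade` and `multi_grade_fit`, the widths, step counts, and learning rate are all hypothetical. As a swapped-in simplification, each grade's output layer is refit by least squares after a few gradient steps, so the training error cannot increase from one grade to the next.

```python
import numpy as np


def train_grade(features, target, width=8, steps=300, lr=0.02, seed=0):
    """Fit one shallow grade: target ~ tanh(features @ W + b) @ a + c.

    Hypothetical helper, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, width))
    b = np.zeros(width)
    a = np.zeros((width, 1))
    c = np.zeros(1)
    # Full-batch gradient descent on the mean squared error.
    for _ in range(steps):
        H = np.tanh(features @ W + b)            # hidden features, (n, width)
        err = H @ a + c - target                 # current misfit, (n, 1)
        dH = (err @ a.T) * (1.0 - H ** 2)        # backprop through tanh
        a -= lr * (H.T @ err) / n
        c -= lr * err.mean(axis=0)
        W -= lr * (features.T @ dH) / n
        b -= lr * dH.mean(axis=0)
    # Assumption (simplification): refit the linear output layer exactly.
    # Zero output weights are admissible, so the error cannot get worse.
    H = np.tanh(features @ W + b)
    design = np.hstack([H, np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return H, design @ coef


def multi_grade_fit(x, y, n_grades=3, **grade_kwargs):
    """Train grades successively; earlier grades stay frozen."""
    features = x
    residual = y.copy()
    prediction = np.zeros_like(y)
    errors = []
    for grade in range(n_grades):
        hidden, output = train_grade(features, residual, seed=grade,
                                     **grade_kwargs)
        prediction = prediction + output   # superposition of all grades
        residual = residual - output       # leftover for the next grade
        errors.append(float(np.mean(residual ** 2)))
        features = hidden                  # next grade stacks on frozen features
    return prediction, errors


# Toy one-dimensional regression problem (illustrative data only).
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(4.0 * np.pi * x)
prediction, errors = multi_grade_fit(x, y, n_grades=3)
print("training MSE after each grade:", errors)
```

Each call to `train_grade` solves a small problem in the parameters of one hidden layer only, which is the point of the multi-grade idea: no optimization step ever touches the parameters of earlier grades.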

Pages: 52
