Adaptive Orthogonal Search for Network Structure Learning of ELM

Cited by: 0
Authors
Xu R. [1]
Liang X. [1]
Ma Y.-F. [2]
Qi J.-S. [3]
Affiliations
[1] School of Information, Renmin University of China, Beijing
[2] School of Information Science and Engineering, Qufu Normal University, Rizhao
[3] School of Computer Science and Technology, Huaiyin Normal University, Huaiyin
Source
Chinese Journal of Computers
Funding
National Natural Science Foundation of China
Keywords
Color constancy computation; Extreme learning machine; Orthogonal backward elimination; Orthogonal forward selection; Parsimonious network structure; Subset model selection
DOI
10.11897/SP.J.1016.2021.01888
Abstract
In the past several decades, the Single Hidden Layer Feedforward Neural Network (SLFN) has drawn a great deal of attention in the fields of machine learning, data mining, and pattern recognition, owing to two distinctive characteristics: the ability to learn from input samples and the universal approximation capability for complex nonlinear mappings. Although the SLFN has been investigated extensively from both theoretical and application perspectives, it remains challenging to automatically determine a network architecture suited to a specific task so that the resulting model performs well in both learning and generalization. The Extreme Learning Machine (ELM) is a powerful learning scheme for generalized SLFNs with fast learning speed and has been widely used for both regression and classification. The hidden-node parameters of an ELM need not be iteratively tuned during training; they are simply assigned random values, and the output weights are then determined analytically by solving a linear equation system with the generalized inverse method. However, for ELM the appropriate number of hidden nodes is usually pre-determined by trial and error, which can be tedious in some applications and does not guarantee that the selected network size will be close to optimal or will generalize well. The main objective of this paper is therefore to choose a parsimonious structure for ELM that also exhibits good generalization. By formulating the learning problem as subset model selection, we present an adaptive orthogonal search method for the architectural design of ELM (referred to as AOS-ELM) on regression problems. In AOS-ELM, hidden nodes can be deleted or recruited dynamically according to their significance to network performance, so that the network architecture is self-configuring. More precisely, we first randomly generate a large number of hidden nodes, as in the preliminary ELM, to serve as a candidate reservoir. At each step, the hidden-node output vector with the highest correlation to the target output is selected from the candidates and added to the existing network by orthogonal forward selection. Meanwhile, after a new hidden node is added to the set of selected variables, orthogonal backward elimination checks whether any of the previously selected hidden nodes can be deleted without appreciably increasing the squared error. The procedure stops when no further additions or deletions satisfy the criteria. Finally, an enhanced backward refinement is applied to correct mistakes made in earlier steps, so that as many redundant hidden nodes as possible are removed from the model and the network complexity is reduced further. In summary, the proposed method takes into account the intrinsic connections and interactions between hidden nodes, and therefore offers the potential to find parsimonious network solutions that fit the data. We demonstrate the effectiveness and superiority of the proposed method through experiments on several benchmark regression problems as well as two different color constancy tasks.
Simulation results show that our method not only attains similar or higher learning accuracy than the preliminary ELM and other well-known constructive and pruning ELMs while using a small number of hidden nodes, but also achieves better or comparable illuminant estimates on most of the test error metrics in comparison with several state-of-the-art color constancy algorithms. © 2021, Science Press. All rights reserved.
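The selection procedure summarized in the abstract can be illustrated in code. Below is a minimal, self-contained Python sketch of the orthogonal forward-selection step only: it draws a random sigmoid hidden-node reservoir as in a preliminary ELM, greedily picks the candidate whose orthogonalised output vector best correlates with the current residual, and solves the output weights by least squares (the pseudoinverse step). All names (candidate_reservoir, orthogonal_forward_selection) and the classical Gram-Schmidt scheme are illustrative assumptions, not the authors' implementation; the paper's AOS-ELM additionally interleaves orthogonal backward elimination and ends with a backward refinement pass, both omitted here.

```python
import numpy as np

def candidate_reservoir(X, n_candidates, rng):
    """Random sigmoid hidden nodes, as in basic ELM: H[:, j] = sigmoid(X @ w_j + b_j)."""
    W = rng.standard_normal((X.shape[1], n_candidates))
    b = rng.standard_normal(n_candidates)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def orthogonal_forward_selection(H, y, max_nodes, tol=1e-10):
    """Greedy orthogonal least-squares selection of hidden-node columns of H.

    Each round orthogonalises every unselected candidate against the columns
    already chosen (Gram-Schmidt) and picks the one whose projection removes
    the most residual energy. Illustrative only; AOS-ELM also interleaves
    orthogonal backward elimination.
    """
    selected, Q = [], []                      # chosen indices, orthonormal basis
    residual = y.astype(float).copy()

    def orthogonalise(v):
        v = v.copy()
        for q in Q:
            v -= (q @ v) * q
        return v

    for _ in range(max_nodes):
        scores = np.zeros(H.shape[1])
        for j in range(H.shape[1]):
            if j in selected:
                continue
            v = orthogonalise(H[:, j])
            nv2 = v @ v
            if nv2 > tol:
                # squared-error reduction if node j were added next
                scores[j] = (v @ residual) ** 2 / nv2
        best = int(np.argmax(scores))
        if scores[best] <= tol:               # no remaining candidate helps
            break
        q = orthogonalise(H[:, best])
        q /= np.linalg.norm(q)
        residual -= (q @ residual) * q        # remove the explained component
        Q.append(q)
        selected.append(best)
    return selected

# Toy regression run (synthetic data, illustrative parameters).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.standard_normal(200)

H = candidate_reservoir(X, n_candidates=100, rng=rng)
nodes = orthogonal_forward_selection(H, y, max_nodes=15)
beta, *_ = np.linalg.lstsq(H[:, nodes], y, rcond=None)  # output weights via least squares
rmse = np.sqrt(np.mean((H[:, nodes] @ beta - y) ** 2))
print(f"{len(nodes)} hidden nodes selected, train RMSE = {rmse:.4f}")
```

Scoring candidates by the squared projection of the residual onto their orthogonalised output vectors is the standard orthogonal-least-squares criterion: it measures each node's marginal error reduction independently of the nodes already chosen, which is what makes the greedy loop cheap.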
Pages: 1888-1906
Page count: 18