A Bayesian approach to artificial neural network model selection

Cited by: 0

Authors:

Kingston, G. B. [1]

Maier, H. R. [1]

Lambert, M. F. [1]

Affiliations:

[1] Univ Adelaide, Sch Civil & Environm Engn, Ctr Appl Modelling Water Engn, Adelaide, SA 5005, Australia

Source:

MODSIM 2005: INTERNATIONAL CONGRESS ON MODELLING AND SIMULATION: ADVANCES AND APPLICATIONS FOR MANAGEMENT AND DECISION MAKING | 2005

Keywords:

Artificial neural networks; model selection; Bayes factors; Markov chain Monte Carlo

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Artificial neural networks (ANNs) have proven to be extremely valuable tools in the field of water resources engineering. However, one of the most difficult tasks in developing an ANN is determining the optimum level of complexity required to model a given problem, as there is no formal systematic model selection method. The generalisability of an ANN, which is defined by its predictive performance on the universe of possible data, can be significantly impaired if there are too few or too many hidden nodes in the network. Therefore, for an ANN to be a valuable prediction tool, it is important that some effort is made to optimise the number of hidden nodes. This paper presents a Bayesian model selection (BMS) method for ANNs that provides an objective approach for comparing models of varying complexity in order to select the most appropriate ANN structure. Given a set of competing models $H_1, \ldots, H_H$, BMS is used to compare the posterior probability that each model $H_i$ is the true data generating function, given a set of observed data $y$. This probability is also known as the evidence of a model, and the ratio of two competing models' evidence values, known as the Bayes' factor, can be used to rank the competing models in terms of the relative evidence in support of each model. For ANNs (and other complex models), the evidence of a model $p(H \mid y)$ is analytically intractable and, consequently, alternative methods are required to estimate these probabilities for the competing models. One such method involves the use of Markov chain Monte Carlo (MCMC) simulations from the posterior weight distribution $p(w \mid y, H)$ to approximate the evidence.
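
For orientation, the quantities named above can be written out in standard textbook notation. This is a sketch, not the paper's own derivation: the abstract does not give these formulas, and the prior model probabilities $p(H_i)$ and the weight prior $p(w \mid H_i)$ are symbols assumed here. Under equal prior model probabilities, the posterior model probability $p(H_i \mid y)$ is proportional to the marginal likelihood $p(y \mid H_i)$, so ranking by either quantity gives the same order.

```latex
\begin{align}
  p(H_i \mid y) &= \frac{p(y \mid H_i)\, p(H_i)}{\sum_{j=1}^{H} p(y \mid H_j)\, p(H_j)}
    && \text{(posterior model probability)} \\
  p(y \mid H_i) &= \int p(y \mid w, H_i)\, p(w \mid H_i)\, \mathrm{d}w
    && \text{(marginal likelihood over the weights } w\text{)} \\
  B_{ij} &= \frac{p(y \mid H_i)}{p(y \mid H_j)}
    = \frac{p(H_i \mid y)}{p(H_j \mid y)} \quad \text{if } p(H_i) = p(H_j)
    && \text{(Bayes factor)}
\end{align}
```

The integral over $w$ is the analytically intractable term for an ANN, which is why a sample-based approximation from $p(w \mid y, H)$ is needed.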
It has already been shown that there are numerous benefits to estimating the posterior distribution of ANN weights with MCMC methods; therefore, the proposed BMS approach is based on such an approximation of $p(y \mid H)$, as this only requires a simple additional step after sampling from $p(w \mid y, H)$. Furthermore, the weight distributions obtained from the MCMC simulation provide a useful check of the accuracy of the approximated Bayes' factors. A problem associated with the use of posterior simulations to estimate a model's evidence is that the approximation may be sensitive to factors associated with the MCMC simulation. Therefore, the proposed BMS method for ANNs incorporates a further check of the accuracy of the computed Bayes' factors by inspecting the marginal posterior distributions of the hidden-to-output layer weights, which indicate whether all of the hidden nodes in the model are necessary. The fact that this check is available is one of the greatest advantages of the proposed approach over conventional model selection methods, which do not provide such a test and instead rely on the modeller's subjective choice of selection criterion. The aim of model selection is to enable generalisation to new cases. Therefore, in the case study presented in this paper, the performance of the proposed BMS method was assessed in comparison to the performance of conventional ANN selection methods on data outside the domain of the training data. This case study, which involves forecasting salinity concentrations in the River Murray at Murray Bridge, South Australia, 14 days in advance, was chosen as it had been shown previously that, if an ANN was trained on the first half of the available data, it would be required to extrapolate in some cases when applied to the second half of the available data set.
In this case study, the proposed BMS framework for ANNs was shown to be more successful than conventional model selection methods in selecting an ANN that could approximate the relationship contained in the training data and generalise to new cases outside the domain of those used for training. The Bayes' factors calculated were useful for obtaining an initial guide to the most appropriate model; however, the final step involving inspection of marginal posterior hidden-to-output weight distributions was necessary for the final selection of the optimum number of hidden nodes. The model selected using the proposed BMS approach not only had the best generalisability, but was also more parsimonious than the models selected using conventional methods and required considerably less time for training.
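
The inspection of marginal posterior hidden-to-output weight distributions described above can be illustrated with a minimal Python sketch. The abstract does not state the exact criterion used, so this assumes one plausible rule: a hidden node is flagged as possibly unnecessary when the central 95% credible interval of its hidden-to-output weight contains zero. The function name `redundant_hidden_nodes`, the `level` parameter, and the synthetic draws standing in for MCMC output are all hypothetical.

```python
import numpy as np


def redundant_hidden_nodes(weight_samples, level=0.95):
    """Flag hidden nodes whose hidden-to-output weight interval spans zero.

    weight_samples: array of shape (n_draws, n_hidden); each column holds
    posterior draws of one hidden-to-output weight (e.g. from MCMC).
    Returns a boolean array, True where the central credible interval
    contains zero (an assumed criterion, not taken from the paper).
    """
    samples = np.asarray(weight_samples, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(samples, tail, axis=0)
    upper = np.quantile(samples, 1.0 - tail, axis=0)
    return (lower <= 0.0) & (upper >= 0.0)


# Synthetic stand-in for posterior draws of three hidden-to-output weights.
rng = np.random.default_rng(0)
draws = np.column_stack([
    rng.normal(2.0, 0.3, size=5000),   # clearly positive weight
    rng.normal(0.0, 0.5, size=5000),   # weight centred on zero
    rng.normal(-1.5, 0.2, size=5000),  # clearly negative weight
])
flags = redundant_hidden_nodes(draws)
print(flags)
```

Here only the second weight is flagged, which under the assumed rule would suggest refitting a network with one fewer hidden node and comparing the result with the ranking given by the Bayes' factors.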

Pages: 190-196

Number of pages: 7
