The research presented in this paper aims to determine the appropriate complexity and appropriate training of an Artificial Neural Network (ANN). For an ANN, 'complexity' refers to the network structure and thus to the number of neurons in the network. The ANN is used for one-day-ahead forecasting of the discharge of the river Meuse (western Europe) at Borgharen (in the south of the Netherlands), based on the recorded precipitation upstream of Borgharen. The forecasting performance is measured with the Nash-Sutcliffe coefficient R2 and the Relative Mean Absolute Error (RMAE); the applied training algorithm is the Levenberg-Marquardt (LM) algorithm and the applied performance function is the Mean Square Error (MSE). All networks are trained multiple times, so that not only the means of the R2 and RMAE values are calculated but also their standard deviations, to evaluate their uncertainties. First, the numbers of input and hidden neurons are varied to determine the effect of network complexity on the forecasting performance. Secondly, the influence of weight decay on the forecasting performance is determined for different network complexities. Weight decay is a method in which an ANN is trained with a modified performance function, which normally is the MSE. For weight decay, a penalty term is added to the performance function to prevent the values of the weights and biases from becoming too large during training, which enables a smoother network response. Different degrees of weight-decay influence are introduced by varying the value of the 'decay coefficient' from 0.1 to 1, with higher values corresponding to a smaller influence of weight decay. Network complexity is here expressed in terms of the total number of neurons in the network. Thirdly, the effect of the number of training epochs (or iterations) on the forecasting performance is determined, again for different network complexities in terms of the total number of neurons in the network. The network structure (or complexity) has the largest influence on the flow forecasting performance. The influence of the number of training epochs is somewhat smaller, and weight decay has the smallest influence on the flow forecasting performance. An 8-4-1 network (8 neurons in the input layer, 4 in the hidden layer and 1 in the output layer) trained for 11 epochs with no weight decay applied was identified as an appropriate network. Networks simpler than an 8-4-1 network should be trained for more than 13 epochs. For networks more complex than an 8-4-1 network, the appropriate number of training epochs ranges between 8 and 11. For a simple network, weight decay is not a useful method to improve the network's generalization ability. For a complex network, weight decay can help to prevent overfitting by compensating for the negative influence of greater network complexity on network performance.
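The sketch below is not the authors' code; it is a minimal Python illustration of the two performance measures and of a weight-decay-modified performance function consistent with the description above (a penalty on the squared weights and biases blended with the MSE through a decay coefficient, where a coefficient of 1 reduces to plain MSE). The function names, signatures and the exact form of the penalty are assumptions made for illustration only.

```python
import numpy as np

def nash_sutcliffe_r2(observed, simulated):
    """Nash-Sutcliffe coefficient R2: 1 minus the ratio of the residual sum of
    squares to the variance of the observations (1 = perfect forecast)."""
    observed, simulated = np.asarray(observed), np.asarray(simulated)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

def rmae(observed, simulated):
    """Relative Mean Absolute Error: mean absolute error scaled by the mean
    observed discharge."""
    observed, simulated = np.asarray(observed), np.asarray(simulated)
    return np.mean(np.abs(observed - simulated)) / np.mean(observed)

def performance_with_weight_decay(errors, weights_and_biases, gamma=1.0):
    """Assumed form of the modified performance function:
    gamma * MSE + (1 - gamma) * mean squared weights.
    gamma = 1 reproduces plain MSE (no weight decay); smaller gamma gives the
    penalty term more influence, keeping the weights and biases small and the
    network response smoother."""
    mse = np.mean(np.asarray(errors) ** 2)
    msw = np.mean(np.concatenate([np.ravel(w) for w in weights_and_biases]) ** 2)
    return gamma * mse + (1.0 - gamma) * msw
```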