Forecasting traffic volume is an important task in controlling urban highways, guiding drivers' routes, and providing real-time transportation information. Previous research on traffic volume forecasting has concentrated on single forecasting models and has reported positive results that were frequently better than those of other models. In addition, many previous researchers have claimed that neural network models offer better prediction accuracy than linear statistical models. However, the forecasting power of a single model is limited to the typical cases to which that model fits best. In other words, even though many research efforts have claimed the general superiority of a single model over others in predicting future events, we believe such superiority depends on the characteristics of the data, the composition of the training data, the model architecture, and the algorithm itself. In this paper, we study the relationship between data characteristics and the forecasting accuracy of different models, particularly neural network models, in forecasting traffic volume. To compare and test the forecasting accuracy of the models, three different data sets of traffic volume were collected from interstate highways, intercity highways, and urban intersections. The data sets show very different characteristics in terms of volatility, period, and fluctuation, as measured by the Hurst exponent and the correlation dimension. The data sets were tested using a back-propagation network model, an FIR model, and a time-delayed recurrent model. The test results show that the time-delayed recurrent model outperforms the other models in forecasting very randomly moving data characterized by a low Hurst exponent. In contrast, the FIR model shows better forecasting accuracy than the time-delayed recurrent network for relatively regular periodic data characterized by a high Hurst exponent. The interpretation of these results is that the feedback of the previous error, through the temporal learning technique in the time-delayed recurrent network, naturally absorbs the dynamic changes of any underlying nonlinear movement, whereas the FIR and back-propagation models, although they also employ nonlinear learning mechanisms, may not handle randomly fluctuating events as well. (C) 1998 Elsevier Science Ltd. All rights reserved.
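
As an illustration of the data characterization mentioned above, the following is a minimal sketch of estimating the Hurst exponent by rescaled-range (R/S) analysis. It is not taken from the paper; the window-size schedule, the synthetic random-walk series, and the function name hurst_rs are illustrative assumptions.

```python
import numpy as np

def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent of a 1-D series via rescaled-range (R/S) analysis.

    H near 0.5 suggests random-walk-like (highly irregular) behaviour;
    H closer to 1 suggests persistent, relatively regular movement.
    """
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Window sizes spaced roughly logarithmically between min_window and n // 2.
    sizes = np.unique(
        np.floor(np.logspace(np.log10(min_window), np.log10(n // 2), 10)).astype(int)
    )
    log_n, log_rs = [], []
    for w in sizes:
        rs_vals = []
        for start in range(0, n - w + 1, w):
            chunk = x[start:start + w]
            dev = chunk - chunk.mean()
            z = np.cumsum(dev)            # cumulative deviation from the chunk mean
            r = z.max() - z.min()         # range of the cumulative deviations
            s = chunk.std(ddof=1)         # standard deviation of the chunk
            if s > 0:
                rs_vals.append(r / s)
        if rs_vals:
            log_n.append(np.log(w))
            log_rs.append(np.log(np.mean(rs_vals)))
    # The slope of log(R/S) against log(window size) approximates the Hurst exponent.
    h, _ = np.polyfit(log_n, log_rs, 1)
    return h

# Usage example with a synthetic stand-in for a traffic-volume series:
# a pure random walk should yield an estimate near 0.5.
rng = np.random.default_rng(0)
volumes = np.cumsum(rng.normal(size=2000))
print(f"Estimated Hurst exponent: {hurst_rs(volumes):.2f}")
```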