Using multivariate statistical methods, this study interprets a data set of 23 water quality parameters collected over 5 years at 10 monitoring stations on the Karkheh River in southwestern Iran. Cluster analysis classifies the stations into three water quality classes, and factor analysis identifies the most important factors for the full parameter set and for each class. The results indicate that natural factors, soil weathering and erosion, urban and human wastewater, and agricultural and industrial wastewater affect water quality to different degrees at different locations. Five input selection methods (a correlation model, principal component analysis, the gamma test combined with backward regression, the gamma test combined with a genetic algorithm, and the gamma test with an elimination method) are then used to choose inputs for modeling biochemical oxygen demand (BOD), and their efficiency is evaluated by simulating BOD with local linear regression, an artificial neural network, and genetic programming. Of the five input selection methods, the gamma test combined with backward regression performs best with local linear regression (RMSE of 0.27), the gamma test combined with a genetic algorithm performs best with the artificial neural network (RMSE of 0.28), and the gamma test combined with a genetic algorithm is also the most appropriate with genetic programming (RMSE of 0.1303).
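As a rough illustration of the simplest of these ideas, the sketch below ranks candidate inputs by their absolute Pearson correlation with BOD and computes the RMSE used to compare simulations. It is a minimal sketch and not the study's implementation: the function names, the parameter names (`COD`, `DO`, `pH`), and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def select_inputs(candidates, target, k):
    """Return the k candidate names with the largest |correlation| to target."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
    return ranked[:k]


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic, illustrative values only (assumption: not data from the study).
candidates = {
    "COD": [10.0, 12.0, 15.0, 20.0, 22.0],
    "DO": [8.0, 7.5, 7.0, 6.0, 5.5],
    "pH": [7.9, 8.1, 7.8, 8.0, 7.9],
}
bod = [2.0, 2.5, 3.1, 4.0, 4.4]

selected = select_inputs(candidates, bod, k=2)
print("Selected inputs:", selected)
print("RMSE of a hypothetical simulation:", rmse(bod, [2.1, 2.4, 3.0, 4.2, 4.3]))
```

Correlation ranking only captures linear dependence between each input and BOD, which is one reason a study might compare it against nonlinear criteria such as the gamma test.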