The performance of the Mantel-Haenszel odds-ratio estimator and χ² significance test was investigated using simulated data. Multiparameter logistic item response theory models were used to generate item scores on 20- and 40-item tests for 500 reference group and 500 focal group examinees. The difficulty, discrimination, and guessing parameters, and the difference between the groups' average trait levels, were varied and combined factorially. Within each cell of the design, 200 replications were completed under both differential item functioning (DIF) and no-DIF conditions. The empirical χ² Type I and Type II error rates, and the average of the odds-ratio estimates, were analyzed over the 200 replications. Under no-DIF conditions, inflated χ² Type I error rates and misestimated odds ratios were found for the 20-item test; both resulted from interactions between item parameter values and trait differences. For the 40-item test, the Type I error rate inflation disappeared, but odds ratios were still misestimated. Under DIF conditions, Type II error rates were not inflated, but odds ratios were misestimated for both test lengths because of parameter × trait-level interactions. The results demonstrate the importance of using both the odds ratio and the significance test in interpreting the presence or absence of DIF. In addition, accuracy under the DIF conditions depended on the size and uniformity of DIF.
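
For readers unfamiliar with the two statistics, the following is a minimal sketch of the textbook Mantel-Haenszel common odds-ratio estimate and continuity-corrected χ² test (1 degree of freedom), computed over 2 × 2 tables formed at each matched total-score level. It is not the study's own code: the function name `mantel_haenszel` and the `(a, b, c, d)` table layout are assumptions chosen for illustration, and the use of the continuity correction is likewise assumed.

```python
from math import erfc, sqrt


def mantel_haenszel(tables):
    """Return (odds_ratio, chi2, p_value) for a list of 2 x 2 tables.

    Each table is a tuple (a, b, c, d) for one matched score level
    (hypothetical layout, assumed for this sketch):
        a = reference group, item correct
        b = reference group, item incorrect
        c = focal group, item correct
        d = focal group, item incorrect
    """
    num = den = 0.0                  # numerator / denominator of the odds ratio
    sum_a = sum_exp = sum_var = 0.0  # pieces of the chi-square statistic

    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a level with fewer than 2 examinees carries no information

        # Common odds-ratio estimate: sum(a*d/n) / sum(b*c/n)
        num += a * d / n
        den += b * c / n

        # Expected value and variance of `a` under the no-DIF hypothesis
        n_ref, n_foc = a + b, c + d
        n_right, n_wrong = a + c, b + d
        sum_a += a
        sum_exp += n_ref * n_right / n
        sum_var += n_ref * n_foc * n_right * n_wrong / (n * n * (n - 1))

    # Note: this sketch does not guard against den == 0 or sum_var == 0.
    odds_ratio = num / den
    chi2 = (abs(sum_a - sum_exp) - 0.5) ** 2 / sum_var
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return odds_ratio, chi2, p_value


if __name__ == "__main__":
    # Made-up counts for three score levels, purely to show the call.
    example = [(30, 20, 22, 28), (40, 10, 33, 17), (45, 5, 41, 9)]
    print(mantel_haenszel(example))
```

An odds ratio near 1 together with a nonsignificant χ² is the pattern expected for an item without DIF, which is why the two quantities are read together.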