Objective: To evaluate the predictive accuracy of severity of illness scoring systems in a single institution. Design: A prospective study of data collected on consecutive patients admitted to the medical intensive care unit over 20 months. Surgical and coronary care admissions were excluded. Setting: Veterans Affairs Medical Center at Buffalo, New York. Patients and participants: Data were collected on 302 unique, consecutive patients admitted to the medical intensive care unit. Interventions: None. Measurements and results: Data required to calculate each patient's predicted mortality by the Mortality Probability Model (MPM) II, Acute Physiology and Chronic Health Evaluation (APACHE) II and Simplified Acute Physiology Score (SAPS) II scoring systems were collected. The probability of mortality for the cohort was analyzed using confidence interval analyses, receiver operating characteristic (ROC) curves, two-by-two contingency tables and the Lemeshow–Hosmer chi-square statistic. Predicted mortality for all three scoring systems lay within the 95% confidence interval for actual mortality. For MPM II, SAPS II and APACHE II, the c-index (equivalent to the area under the ROC curve) was 0.695 ± 0.0307 SE, 0.702 ± 0.063 SE and 0.672 ± 0.0306 SE, respectively; these values were not statistically different from one another but were lower than those obtained in previous studies. Conclusion: Although the overall mortality was consistent with the predicted mortality, the poor fit of the data to the model impairs the validity of the result. The observed outcome could be due to erratic quality of care or to differences between the study population and the patient populations of the original studies; the data cannot distinguish between these possibilities. To increase predictive accuracy when studying individual intensive care units and to enhance quality-of-care assessments, it may be necessary to adapt the model to the patient population.
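
The following is a minimal illustrative sketch (not the authors' analysis code) of how the evaluation metrics named above could be computed, assuming two hypothetical arrays of observed hospital deaths (`y_true`) and model-predicted mortality probabilities (`p_pred`); the simulated values are placeholders, not study data.

```python
# Sketch: discrimination (c-index / AUC), a 95% CI for observed mortality,
# and a Hosmer-Lemeshow-style calibration statistic over risk deciles.
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
p_pred = rng.uniform(0.02, 0.9, size=302)   # hypothetical predicted mortality probabilities
y_true = rng.binomial(1, p_pred)            # hypothetical observed outcomes (1 = death)

# c-index: area under the ROC curve (discrimination)
c_index = roc_auc_score(y_true, p_pred)

# 95% confidence interval for the observed mortality rate (normal approximation)
rate = y_true.mean()
half_width = 1.96 * np.sqrt(rate * (1 - rate) / y_true.size)
ci_low, ci_high = rate - half_width, rate + half_width

# Calibration: compare observed vs. expected deaths within deciles of predicted risk
edges = np.quantile(p_pred, np.linspace(0, 1, 11))
groups = np.clip(np.digitize(p_pred, edges[1:-1]), 0, 9)
hl_stat = 0.0
for g in range(10):
    mask = groups == g
    n_g = mask.sum()
    obs = y_true[mask].sum()          # observed deaths in decile
    exp = p_pred[mask].sum()          # expected deaths in decile
    p_bar = exp / n_g
    hl_stat += (obs - exp) ** 2 / (n_g * p_bar * (1 - p_bar))
p_value = stats.chi2.sf(hl_stat, df=8)  # df = groups - 2 is a common convention

print(f"c-index {c_index:.3f}, observed mortality {rate:.3f} "
      f"(95% CI {ci_low:.3f}-{ci_high:.3f}), chi-square {hl_stat:.1f} (p = {p_value:.3f})")
```

In this framing, a c-index near 0.7 (as reported above) indicates modest discrimination, while a large chi-square with a small p-value indicates poor calibration of the model to the local population.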