Purpose Adequate comorbidity risk adjustment is central to reliable outcome prediction and provider performance evaluation. The two most commonly employed risk-adjustment methods in orthopaedic surgery, the Charlson and Elixhauser measures, were not originally validated in this patient population. We sought (1) to develop a single numeric comorbidity score for predicting inpatient mortality in patients undergoing orthopaedic surgery by combining and reweighting the conditions included in the Charlson and Elixhauser measures, and to compare its predictive performance with that of each component score; and (2) to evaluate the new score separately for spine surgery, adult reconstruction, hip fracture, and musculoskeletal oncology admissions.
Methods Data were obtained from the National Hospital Discharge Survey for the years 1990 through 2007. A comorbidity score for predicting inpatient mortality was developed by combining conditions from the Charlson and Elixhauser measures. Weights were derived from a random sample of 80 % of the cohort (n = 26,454,972), and the predictive ability of the new score was internally validated on the remaining 20 % (n = 6,739,169). Score performance was assessed and compared using the area under the receiver operating characteristic curve (AUC) derived from multivariable logistic regression models.
Results The new combined comorbidity score (AUC = 0.858, 95 % CI 0.856-0.859) performed 58 % better than the Charlson score (AUC = 0.794, 95 % CI 0.792-0.796) and 12 % better than the Elixhauser score (AUC = 0.845, 95 % CI 0.844-0.847). Of the seven conditions that received the highest weights in the new combined score, only three were included in both the Charlson and the Elixhauser indices. The new combined score achieved higher discriminatory power than either component score in all orthopaedic admission subgroups.
Conclusion A single numeric comorbidity score combining conditions from the Charlson and Elixhauser models provided better discrimination of inpatient mortality than either of its constituent scores. Future research should test this score in other populations and data settings.
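The derivation and validation design described in Methods can be illustrated with a minimal sketch. It assumes a hypothetical weight table and hypothetical helper names (`comorbidity_score`, `split_80_20`, `auc`); none of these come from the study, and the study's actual weights were derived from multivariable logistic regression, which is not reproduced here. The sketch only shows how a weighted score is summed per admission, how a cohort might be split 80/20, and how the AUC can be computed as the probability that a randomly chosen death has a higher score than a randomly chosen survivor.

```python
import random


def comorbidity_score(conditions, weights):
    """Sum the weights of the conditions present in one admission.

    `weights` is a hypothetical mapping of condition name to integer weight;
    conditions without a weight contribute 0.
    """
    return sum(weights.get(c, 0) for c in conditions)


def split_80_20(records, seed=0):
    """Randomly split records into an 80 % derivation and 20 % validation set."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(scores, died):
    """Area under the ROC curve via pairwise comparison (ties count 1/2).

    Quadratic in cohort size, so suitable only for small illustrative data.
    """
    pos = [s for s, y in zip(scores, died) if y == 1]
    neg = [s for s, y in zip(scores, died) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage with made-up weights and admissions (not study data).
toy_weights = {"metastatic_cancer": 5, "heart_failure": 2, "obesity": -1}
toy_admissions = [
    ({"obesity"}, 0),
    ({"heart_failure"}, 0),
    ({"metastatic_cancer"}, 1),
    ({"metastatic_cancer", "heart_failure"}, 1),
]
toy_scores = [comorbidity_score(c, toy_weights) for c, _ in toy_admissions]
toy_auc = auc(toy_scores, [y for _, y in toy_admissions])
```

In this toy data every death scores higher than every survivor, so `toy_auc` is 1.0; real cohorts would give intermediate values such as those reported in Results.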