We evaluated the validity of DIBELS (Dynamic Indicators of Basic Early Literacy Skills) ORF (Oral Reading Fluency) for predicting performance on the Florida Comprehensive Assessment Test (FCAT-SSS) and Stanford Achievement Test (SAT-10) reading comprehension measures. The usefulness of previously established ORF risk-level cutoffs [Good, R.H., Simmons, D.C., and Kame'enui, E.J. (2001). The importance and decision-making utility of a continuum of fluency-based indicators of foundational reading skills for third-grade high-stakes outcomes. Scientific Studies of Reading, 5, 257-288.] for third-grade students was evaluated on calibration (n(S1) = 16,539) and cross-validation (n(S2) = 16,908) samples representative of Florida's Reading First population. The strongest correlations were between the third (February/March) administration of ORF and both the FCAT-SSS and SAT-10 (r(S) = .70-.71), when the three tests were administered concurrently. Recalibrated ORF risk-level cut scores derived from ROC (receiver-operating characteristic) curve analyses identified true positives more accurately than the previously established benchmarks. The recalibrated risk-level cut scores predicted performance on the FCAT-SSS equally well for students from different socio-economic, language, and race/ethnicity categories. © 2007 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
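
As an illustration of how a risk-level cut score can be derived from an ROC curve analysis, the following minimal sketch computes sensitivity and specificity at each candidate cut score and selects one. It is not the study's procedure: the selection rule (maximizing Youden's J), the function names `roc_points` and `best_cut`, and the twelve data points are all assumptions made for the example.

```python
def roc_points(scores, at_risk):
    """Return (cut, sensitivity, specificity) for each candidate cut score.

    A student is flagged as at risk when their fluency score is below the cut.
    `at_risk` is True when the student actually failed the outcome test.
    """
    positives = sum(at_risk)
    negatives = len(at_risk) - positives
    points = []
    for cut in sorted(set(scores)):
        # True positives: failed the outcome and flagged by the cut.
        tp = sum(1 for s, r in zip(scores, at_risk) if r and s < cut)
        # True negatives: passed the outcome and not flagged by the cut.
        tn = sum(1 for s, r in zip(scores, at_risk) if not r and s >= cut)
        points.append((cut, tp / positives, tn / negatives))
    return points


def best_cut(scores, at_risk):
    """Pick the cut maximizing Youden's J = sensitivity + specificity - 1.

    Assumption: the abstract does not state which selection rule was used.
    """
    return max(roc_points(scores, at_risk), key=lambda p: p[1] + p[2] - 1)


# Made-up illustrative data, not taken from the study.
orf_scores = [40, 55, 62, 70, 78, 85, 90, 96, 104, 110, 121, 135]
failed_outcome = [True, True, True, True, False, True,
                  False, False, False, False, False, False]

cut, sensitivity, specificity = best_cut(orf_scores, failed_outcome)
print(f"cut={cut} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In practice, a screening application may weight sensitivity more heavily than specificity, since missing a student who needs support is usually costlier than a false alarm. In that case a rule other than Youden's J would be appropriate.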