This paper develops and validates a reliability sampling methodology that uses simulation and re-sampling and that incorporates unit-to-unit variation when determining significant sample sizes for analytically intractable reliability cases. Determining the sample size is important because the reliability of the sampled vehicles should represent the reliability of the entire fleet. A sample that is smaller than required may misrepresent the reliability of the fleet and lead the Army to poor decisions, such as deploying a fleet that may not be reliable. These type II errors can be minimized by using the more realistic sampling methodology developed in this research. Before this methodology was available, reliability sample sizes were computed from analytical formulas with unit-to-unit variation assumed to be constant. The new methodology confirms the analytically derived solutions for fixed usage and fixed true failure rate, as well as for fixed usage and varying vehicle true failure rate. Existing reliability data show that unit-to-unit variation does exist. Vehicle-to-vehicle variation in true failure rate is modeled with a Gamma prior distribution. Recent reliability data are used to test the hypothesis that the methodology is an adequate tool for reliability sampling when unit-to-unit variation exists. The results of this validation support the hypothesis. The Army is currently using this methodology for fleet assessment.
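To make the idea of simulation-based sample sizing concrete, the following is a minimal sketch and not the procedure used in this paper. It assumes that each vehicle's true failure rate is drawn from a Gamma distribution, that failures at a fixed usage are Poisson distributed given that rate (an assumption not stated above), and that the quantity of interest is how the spread of the estimated fleet failure rate shrinks as more vehicles are sampled. All function names and parameter values (`shape`, `scale`, `usage`, the replication count) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def poisson_draw(rng, mean):
    """Draw a Poisson count by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sampled_failure_rate(rng, n_vehicles, usage, shape, scale):
    """Estimate the fleet failure rate from one simulated sample of vehicles."""
    failures = 0
    for _ in range(n_vehicles):
        # Assumed unit-to-unit variation: each vehicle has its own true rate.
        true_rate = rng.gammavariate(shape, scale)
        # Assumed Poisson failure count at fixed usage, given the true rate.
        failures += poisson_draw(rng, true_rate * usage)
    return failures / (n_vehicles * usage)


def estimate_spread(n_vehicles, usage, shape, scale, replications=2000, seed=1):
    """Return (mean, standard deviation) of the estimate over many simulated samples."""
    rng = random.Random(seed)
    estimates = [
        sampled_failure_rate(rng, n_vehicles, usage, shape, scale)
        for _ in range(replications)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # Illustrative values only: mean true rate = shape * scale = 0.01 failures per unit usage.
    for n in (5, 10, 20, 40):
        mean, spread = estimate_spread(n, usage=1000.0, shape=2.0, scale=0.005,
                                       replications=500)
        print(f"n={n:3d}  mean estimate={mean:.5f}  spread={spread:.5f}")
```

Under these assumptions, a sample size could be chosen as the smallest number of vehicles whose simulated spread falls below a tolerance set by the analyst; the tolerance itself is a decision the sketch does not make.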