Vehicle detection in high-resolution remote sensing imagery faces challenges such as varying scales, complex backgrounds, and high intra-class variability. We propose an enhanced YOLOv8 framework that incorporates three key enhancements: the Adaptive Feature Pyramid Network (AFPN), Omni-Dimensional Convolution (ODConv), and a Slim Neck with Generalized Shuffle Convolution (GSConv). These enhancements improve vehicle detection accuracy and computational efficiency, extending visual AI capabilities for applications such as computer animation and virtual worlds. Our model achieves a Mean Average Precision (mAP) of 0.7153, a 4.99% improvement over the baseline YOLOv8. Precision and recall increase to 0.9233 and 0.9329, respectively, while box loss is reduced from 1.213 to 1.054. This framework supports real-time surveillance, traffic monitoring, and urban planning. Evaluation is performed on the NEPU-OWOD V2.0 dataset, which includes high-resolution images captured across multiple regions and seasons, with diverse annotations and augmentations. Our modular design allows each enhancement to be assessed separately. The dataset and source code are available for future research and development at https://doi.org/10.5281/zenodo.13075939.
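To make the Slim Neck component concrete, the sketch below shows a minimal GSConv-style block as it is commonly described in the Slim-Neck literature: a standard convolution producing half the output channels, a cheap depth-wise convolution over that half, concatenation, and a channel shuffle. This is an illustrative assumption about the block's structure, not the authors' released implementation; the class name, arguments, and hyperparameters are placeholders.

```python
# Minimal sketch of a GSConv-style block (illustrative; names and kernel
# sizes are assumptions, not the paper's released code).
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Dense conv to half the output channels, depth-wise conv on that half,
    concatenation, then a channel shuffle mixing the two halves."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        c_half = c_out // 2
        self.dense = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )
        # Depth-wise conv: cheap second half computed from the dense half.
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )

    def forward(self, x):
        y1 = self.dense(x)
        y2 = self.cheap(y1)
        y = torch.cat((y1, y2), dim=1)
        # Channel shuffle: interleave dense and depth-wise channels.
        b, c, h, w = y.shape
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

# Example: reduce a 256-channel neck feature map to 128 channels.
x = torch.randn(1, 256, 40, 40)
print(GSConv(256, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```

The intent of such a block in a slimmed neck is to approximate a standard convolution's representational power while cutting the parameter and FLOP cost of the dense channel mixing; the released code at the Zenodo link above should be treated as the authoritative implementation.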