Rail surface condition is a critical factor influencing wheel-rail adhesion performance. Existing rail surface condition identification models pose engineering challenges: high parameter counts, significant computational delay, and difficulty of onboard deployment. To address these, a lightweight rail surface condition identification method integrating knowledge distillation and transfer learning is proposed. A rail surface image dataset is constructed that covers typical working conditions, including dry, wet, and oily surfaces. A "teacher-student" collaborative optimization framework is developed in which GoogLeNet, fine-tuned via transfer learning, serves as the teacher network and guides a MobileNet student network that is also fine-tuned via transfer learning, thereby achieving model compression. Additionally, an FP16/FP32 mixed-precision computing strategy is employed to accelerate training. Experimental results show that the optimized student model has a compact size of only 4.21 MB, achieves an accuracy of 97.38% on the test set, and attains an inference time of 0.0371 s. Integrating this model into the maximum adhesion coefficient estimation system for heavy-haul locomotives enhances estimation confidence, reduces estimation errors under varying operating conditions, and provides real-time, reliable environmental perception for optimizing adhesion control strategies. The approach has significant engineering value for improving adhesion utilization under complex wheel-rail contact conditions.
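
The teacher-student guidance described above can be illustrated with a minimal sketch of a commonly used distillation loss. It assumes the standard soft-target formulation: a temperature-scaled KL divergence between teacher and student outputs, combined with cross-entropy on the true label. The abstract does not specify the exact loss used, so the function names, the temperature of 4.0, the weight `alpha` of 0.7, and the example logits are all illustrative assumptions rather than the authors' settings.

```python
import math

# Class order is an assumption; the abstract names dry, wet, and oily surfaces.
CLASSES = ["dry", "wet", "oily"]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    temperature and alpha are hypothetical values, not taken from the paper.
    """
    # Soft targets: teacher and student distributions softened by the temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student); the T^2 factor keeps gradient scale comparable.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label cross-entropy at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Example: made-up logits for one image whose true class is "wet".
teacher = [0.5, 3.0, 1.0]
student = [0.2, 1.5, 1.2]
loss = distillation_loss(student, teacher, CLASSES.index("wet"))
```

In a real training loop the logits would come from the fine-tuned GoogLeNet and MobileNet forward passes, and the loss would be minimized with respect to the student's parameters only.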