To meet the real-time and high-precision detection requirements of actual yarn production, this paper proposes PHL-YOLO, a lightweight yarn detection method built on the YOLOv8 algorithm. First, we design the PEC2F module, which fuses PConv (partial convolution) with EMA (efficient multi-scale attention) to strengthen feature extraction while reducing both computation and parameter count. Second, we use the HWD (Haar wavelet downsampling) module for downsampling, which lowers spatial resolution without discarding the information needed for precise detection. Third, to address the excessive parameters and computational complexity of the YOLOv8 detection head, we introduce the LSCDH (lightweight shared convolutional detection head) module, which significantly simplifies the head structure. Finally, we apply a structured pruning technique, GSPMD (group-level structured pruning method with DepGraph), to compress the model further. Experimental results show that, compared with the original YOLOv8 model, PHL-YOLO improves precision by 1.0%, recall by 2.5%, and mAP (mean average precision) by 2.3%, and runs 26 FPS faster. Its computational load and parameter count are only 29.6% and 10.2% of those of the original YOLOv8 model, respectively. PHL-YOLO thus reduces model complexity while maintaining high detection performance, providing a useful reference for fast and accurate yarn detection in real-world production.
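To illustrate why Haar wavelet downsampling can halve spatial resolution without losing information, the following minimal sketch applies a single-level 2D Haar transform to one channel and then inverts it exactly. It is an illustration under stated assumptions, not the HWD module used in PHL-YOLO: it covers only the wavelet step on a plain 2D list, and the function names `haar_downsample` and `haar_upsample` are hypothetical.

```python
def haar_downsample(img):
    """Split a 2D list (even height and width) into four half-resolution Haar subbands.

    Hypothetical helper for illustration only; it is not the paper's HWD module.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, horiz, vert, diag = [], [], [], []
    for i in range(0, h, 2):
        r_a, r_h, r_v, r_d = [], [], [], []
        for j in range(0, w, 2):
            # The four pixels of one non-overlapping 2x2 block.
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_a.append((a + b + c + d) / 2)  # scaled block sum (low-pass)
            r_h.append((a - b + c - d) / 2)  # left column minus right column
            r_v.append((a + b - c - d) / 2)  # top row minus bottom row
            r_d.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(r_a)
        horiz.append(r_h)
        vert.append(r_v)
        diag.append(r_d)
    return approx, horiz, vert, diag


def haar_upsample(approx, horiz, vert, diag):
    """Invert haar_downsample, rebuilding the full-resolution 2D list."""
    out = []
    for r_a, r_h, r_v, r_d in zip(approx, horiz, vert, diag):
        top, bottom = [], []
        for s, x, y, z in zip(r_a, r_h, r_v, r_d):
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        out += [top, bottom]
    return out


# Example: a 4x4 input becomes four 2x2 subbands and is recovered exactly.
img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
bands = haar_downsample(img)
restored = haar_upsample(*bands)
```

Because the four subbands together hold as many values as the input and the transform is invertible, the resolution reduction moves spatial detail into extra channels instead of discarding it, which is the property the abstract attributes to wavelet-based downsampling.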