Surface defect recognition (SDR) with limited data is a common challenge in industrial production. Recent methods generally employ generative adversarial networks (GANs) to synthesize defect samples as additional training data for improving SDR performance. However, the instability of GAN training often yields uncontrollable, low-quality samples under severe data constraints, making it difficult for existing methods to handle SDR tasks of different granularities effectively. To address this issue, this article proposes a human-guided data augmentation method for extremely limited data. Its core idea is to introduce human feedback into a diffusion model via reinforcement learning (RL) to synthesize controllable, high-quality defect samples, thereby improving SDR tasks of various granularities, such as defect classification and segmentation. First, a conditional diffusion model (CDM) is constructed to generate controllable defect samples conditioned on semantic labels; it learns the defect distribution from a small number of annotated defect samples. Then, a reward model is designed to evaluate the outputs of the CDM from human feedback. Next, guided by the trained reward model, the CDM is further optimized with proximal policy optimization (PPO). Finally, the refined CDM is used to generate high-quality defect samples as training data for defect classification and segmentation. Extensive experiments on the NEU-Seg, magnetic-tile (MT), and self-collected Tire datasets demonstrate that our method outperforms state-of-the-art generative methods in terms of generated image quality. Moreover, defect classification and segmentation performance also improves significantly when the generated samples are used, with maximum gains of 16.90% in accuracy and 12.85% in mean intersection over union (mIoU) over results obtained without data augmentation.
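As a rough illustration of the reward-guided fine-tuning stage summarized above, the following sketch (not the authors' implementation) treats each reverse-diffusion step of a label-conditioned denoiser as a Gaussian policy and applies a PPO-style clipped update using scores from a separately trained reward model. All module names (`TinyDenoiser`, `RewardModel`), shapes, step counts, and hyperparameters are assumptions made for illustration only.

```python
# Minimal sketch of PPO-style fine-tuning of a conditional diffusion model
# with a human-feedback reward model. Toy shapes and names are assumptions.
import torch
import torch.nn as nn
from torch.distributions import Normal

T = 10            # toy number of reverse-diffusion steps
IMG = 3 * 8 * 8   # flattened toy image size
N_CLASSES = 4     # assumed number of semantic defect labels

class TinyDenoiser(nn.Module):
    """Predicts the mean of p(x_{t-1} | x_t, t, c); stands in for the CDM's denoising network."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(N_CLASSES, 16)
        self.net = nn.Sequential(nn.Linear(IMG + 16 + 1, 128), nn.ReLU(),
                                 nn.Linear(128, IMG))
    def forward(self, x_t, t, c):
        t_feat = t.float().unsqueeze(-1) / T
        return self.net(torch.cat([x_t, self.emb(c), t_feat], dim=-1))

class RewardModel(nn.Module):
    """Scores a generated sample; assumed to be pre-trained from human feedback."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(IMG, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, x0):
        return self.net(x0).squeeze(-1)

def rollout(policy, c, sigma=0.1):
    """Sample a denoising trajectory x_T -> x_0, recording states, actions, log-probs."""
    x = torch.randn(c.shape[0], IMG)
    states, actions, logps = [], [], []
    for step in reversed(range(T)):
        t = torch.full((c.shape[0],), step)
        dist = Normal(policy(x, t, c), sigma)   # each step is a Gaussian "action"
        x_next = dist.sample()
        states.append((x, t)); actions.append(x_next)
        logps.append(dist.log_prob(x_next).sum(-1))
        x = x_next
    return states, actions, torch.stack(logps, dim=1), x  # final x is x_0

def ppo_update(policy, optimizer, c, states, actions, old_logps, advantages,
               clip=0.2, sigma=0.1):
    """One PPO clipped-surrogate update over the recorded denoising steps."""
    losses = []
    for k, (x, t) in enumerate(states):
        dist = Normal(policy(x, t, c), sigma)
        new_logp = dist.log_prob(actions[k]).sum(-1)
        ratio = torch.exp(new_logp - old_logps[:, k])
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1 - clip, 1 + clip) * advantages
        losses.append(-torch.min(unclipped, clipped).mean())
    optimizer.zero_grad()
    torch.stack(losses).mean().backward()
    optimizer.step()

policy, reward_model = TinyDenoiser(), RewardModel()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
c_batch = torch.randint(0, N_CLASSES, (16,))            # semantic defect labels
with torch.no_grad():                                    # rollout with the "old" policy
    states, actions, old_logps, x0 = rollout(policy, c_batch)
    r = reward_model(x0)                                 # reward from the feedback model
advantages = (r - r.mean()) / (r.std() + 1e-8)           # baseline-normalized advantage
ppo_update(policy, optimizer, c_batch, states, actions, old_logps, advantages)
```

In practice the reward model would first be fit to human preference judgments over generated defect images, and the fine-tuned denoiser would then be sampled to produce the augmented training set; the toy networks above only indicate where those components plug into the RL loop.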