Progress in data-driven methods for sand-dust image enhancement has stalled because of dataset limitations. Paired sand-dust image sets are currently obtained mainly by synthesis, but synthesis methods are still at an early stage, and a large gap remains between synthesized and real sand-dust images, which limits applicability in real scenarios. To advance data-driven methods and improve image desanding performance, we provide a method for constructing a real-world sand-dust dataset and establish the Sand-dust Image Enhancement Benchmark (SIEB). SIEB consists of 904 real sand-dust images, 865 of which have corresponding refined reference images; the remaining 39 images, whose reference results were unsatisfactory, serve as the challenge dataset. Additionally, we propose a sand-dust enhancement network, called S-net, and use it to validate the generalization of SIEB. The network mines spatial-domain and frequency-domain features through the designed dual-domain feature residual learning module (DFRLM), and uses the proposed nonparallel efficient feature fusion (NEFF) module to make efficient use of features at different levels together with global features. The network also combines a pre-trained, semantically aware model with a designed semantic attention mechanism to guide the decoding features, thereby improving network performance. Experimental results demonstrate the generalization of the proposed benchmark dataset and show that S-net outperforms state-of-the-art methods both quantitatively and qualitatively. Specifically, comprehensive evaluations on four real-world datasets demonstrate that S-net performs well on the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), spatial-spectral entropy-based quality (SSEQ), blind image quality indices (BIQI), blind/referenceless image spatial quality evaluator (BRISQUE), and Fréchet Inception Distance (FID) metrics.
For example, S-net exceeds the second-ranked method by 0.70 dB in PSNR and by 0.0236 in SSIM.
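To make the reported PSNR margin concrete, the following is a minimal sketch of the standard PSNR definition, PSNR = 10 log10(MAX^2 / MSE). It is not the evaluation code used for S-net: the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made only for illustration.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Return PSNR in dB between two equally sized flat sequences of pixel values.

    Hypothetical helper for illustration; max_value=255 assumes 8-bit images.
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [0, 0, 0, 0]
    out = [10, 10, 10, 10]  # every pixel is off by 10, so MSE = 100
    print(round(psnr(ref, out), 4))
```

Because PSNR is logarithmic in the mean squared error, a 0.70 dB gain at a fixed peak value corresponds to an MSE that is lower by a factor of 10^(-0.07), or roughly 15%.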