Learning-based multi-view stereo methods predict depth maps at multiple scales in a coarse-to-fine manner, effectively improving both reconstruction quality and efficiency. To obtain adaptive depth refinement intervals and a lightweight network within this framework, we propose an uncertainty-aware multi-view stereo network with adaptive propagation (AP-UCSNet). When sampling depth hypotheses, we apply a convolution operation to compute, for each pixel, a set of neighboring points that lie on the same physical surface. We then take a weighted average of the uncertainty-aware results of each pixel and its neighbors to obtain spatially correlated depth hypothesis samples, which mitigates the influence of noise in weakly textured regions. Furthermore, we extend the network to a four-scale structure to improve performance: the first three scales use a 3D U-Net to regularize the cost volume, while at the final scale the probability volume is constructed directly from the feature maps, simplifying regularization. Experimental results demonstrate that the proposed method delivers superior reconstruction quality and performance. Compared with UCSNet, the completeness error and overall error are reduced by 0.051 mm and 0.021 mm, respectively. On a Quadro RTX 5000 GPU, predicting a depth map at a resolution of 1600 × 1184 requires only 0.57 s and 4398 MB of memory, decreases of approximately 19.7% and 34.2%, respectively.
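The core idea described above, propagating uncertainty-aware depth hypotheses from spatial neighbors, can be illustrated with a minimal NumPy sketch. This is a simplified, hypothetical version: the paper predicts the neighbor offsets with a learned convolution so that neighbors lie on the same physical surface, whereas here the offsets and weights are passed in as fixed inputs; the function name and interval parameterization (center ± one standard deviation) are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def propagate_depth_hypotheses(depth, variance, offsets, weights, num_samples=8):
    """Sketch of adaptive propagation for depth hypothesis sampling.

    depth:    (H, W) coarse depth estimate from the previous scale
    variance: (H, W) per-pixel uncertainty of that estimate
    offsets:  list of (dy, dx) neighbor displacements (fixed here;
              learned via convolution in the actual method)
    weights:  per-neighbor weights, same length as offsets
    Returns a (num_samples, H, W) array of depth hypotheses.
    """
    H, W = depth.shape
    acc_d = np.zeros((H, W))
    acc_v = np.zeros((H, W))
    wsum = 0.0
    for (dy, dx), w in zip(offsets, weights):
        # Gather each pixel's neighbor values, clamping at image borders.
        ys = np.clip(np.arange(H)[:, None] + dy, 0, H - 1)
        xs = np.clip(np.arange(W)[None, :] + dx, 0, W - 1)
        acc_d += w * depth[ys, xs]
        acc_v += w * variance[ys, xs]
        wsum += w
    center = acc_d / wsum        # spatially propagated depth
    std = np.sqrt(acc_v / wsum)  # propagated uncertainty
    # Uncertainty-aware interval [center - std, center + std],
    # sampled uniformly to form the next scale's depth hypotheses.
    t = np.linspace(-1.0, 1.0, num_samples)[:, None, None]
    return center[None] + t * std[None]
```

Because the averaged depth and variance come from neighbors on the same surface, isolated noisy pixels in weakly textured regions are pulled toward their neighborhood consensus before the refinement interval is set.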