Underwater image enhancement (UIE) is a key technology for improving the quality and visual clarity of underwater images. Because water absorbs and scatters light as it propagates, underwater images often exhibit blurring, color distortion, and low contrast, which severely limit the accuracy and reliability of underwater observation, navigation, and research. To address these problems, we propose a new probabilistic network (MSAP) that learns the enhancement distribution of degraded underwater images. Within this network, we design a high-performance transformer model that captures long-range pixel interactions, construct the enhancement distribution with a conditional variational autoencoder (VAE) and adaptive instance normalization, and use an attention mechanism to focus on informative regions while suppressing irrelevant information. We then predict a deterministic outcome from a set of samples drawn from the distribution, and this consensus process yields more robust and stable results. Qualitative analysis and quantitative evaluation on three different underwater datasets show that the proposed method achieves excellent performance compared with traditional UIE approaches.
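
As a rough illustration of two ingredients named above, the following NumPy sketch shows adaptive instance normalization applied to a feature map and a consensus step over several sampled outputs. It is a minimal sketch under stated assumptions, not the MSAP implementation: the function names, the `(C, H, W)` tensor layout, the random stand-in for the conditional VAE, and the choice of a pixel-wise mean as the consensus rule are all hypothetical.

```python
import numpy as np


def adain(content, style_mean, style_std, eps=1e-5):
    """Adaptive instance normalization on a (C, H, W) feature map.

    Each channel is normalized to zero mean and unit variance, then
    rescaled with the supplied per-channel statistics.
    """
    mean = content.mean(axis=(1, 2), keepdims=True)
    std = content.std(axis=(1, 2), keepdims=True)
    normalized = (content - mean) / (std + eps)
    return normalized * style_std[:, None, None] + style_mean[:, None, None]


def sample_enhancements(features, n_samples, rng):
    """Hypothetical stand-in for sampling from the learned distribution.

    A trained conditional VAE would supply the per-channel statistics;
    here they are drawn at random purely for illustration.
    """
    channels = features.shape[0]
    outputs = []
    for _ in range(n_samples):
        sampled_mean = rng.normal(0.0, 1.0, size=channels)
        sampled_std = np.abs(rng.normal(1.0, 0.1, size=channels))
        outputs.append(adain(features, sampled_mean, sampled_std))
    return np.stack(outputs)  # shape (N, C, H, W)


def consensus(samples):
    """Collapse (N, C, H, W) samples into one (C, H, W) prediction.

    The pixel-wise mean is an assumed consensus rule.
    """
    return np.mean(samples, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(3, 8, 8))
    samples = sample_enhancements(features, n_samples=5, rng=rng)
    prediction = consensus(samples)
    print(samples.shape, prediction.shape)
```

Averaging several samples is only one way to obtain a deterministic outcome; other rules, such as a pixel-wise median, would fit the same interface.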