This study addresses the parameters required for the successful deployment of intelligent control systems in flotation processes, with a focus on froth surface characteristics. Conventional 2D features extracted from single froth images carry no depth information, which limits their capacity to fully capture froth surface characteristics. To overcome this limitation, this study proposes a methodology for extracting 3D froth features using a deep learning-based binocular stereo vision approach. First, the froth images are preprocessed, including correction and segmentation. Next, a froth stereo matching model called dual-attention encoding volume stereo (DAEV-Stereo) is developed and trained on simulated and real-world datasets to determine the froth parallax. The distance between the froth surface and the camera is then computed through binocular vision from the camera intrinsics and extrinsics. The froth is subsequently reconstructed in three dimensions, and its surface area and volume are calculated using the triangular prism differential traversal technique. A mathematical model is formulated that integrates depth information and froth layer thickness to compute colour characteristics. The experimental results indicate that the DAEV-Stereo model achieves an endpoint error of 0.5 for froth stereo matching and a processing time of 0.37 s, thereby satisfying the operational criteria for industrial use. In pilot plant flotation tests, the median absolute deviations of the froth surface area, volume, and colour features were 18.42, 17.62, and 9.67, respectively. Moreover, compared with traditional features, the stereo features exhibit reduced fluctuations, improved stability, and better correlations with operational conditions. The binocular stereo vision approach and the DAEV-Stereo model for extracting 3D froth features are therefore valuable for accurately characterizing froth surface data.
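
The geometric steps summarized above (parallax to distance, then surface area and volume by traversing triangular prisms) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a rectified stereo pair with the standard relation Z = f·B/d, a froth height map already sampled on a regular grid with spacings `dx` and `dy`, and a split of each grid cell into two triangles. The function names `disparity_to_depth` and `surface_area_and_volume` and all parameter names are hypothetical.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline):
    """Depth from disparity for a rectified pair: Z = f * B / d.

    Assumption: focal_px is the focal length in pixels and baseline is the
    camera separation in metric units. Non-positive disparities give NaN.
    """
    d = np.asarray(disparity, dtype=float)
    depth = np.full(d.shape, np.nan)
    valid = d > 0
    depth[valid] = focal_px * baseline / d[valid]
    return depth


def surface_area_and_volume(height, dx, dy):
    """Sum triangle areas and triangular-prism volumes over a height grid.

    Assumption: height[i, j] is the froth height above a reference plane at
    grid node (i, j), with node spacing dx along columns and dy along rows.
    Each cell is split into two triangles; each prism's volume is its
    footprint area times the mean height of its three corners.
    """
    h = np.asarray(height, dtype=float)
    h00, h10 = h[:-1, :-1], h[:-1, 1:]  # top-left, top-right corners
    h01, h11 = h[1:, :-1], h[1:, 1:]    # bottom-left, bottom-right corners

    base = dx * dy  # z-component of the edge cross product for both triangles
    # Triangle 1: corners (h00, h10, h01); area = |u x v| / 2
    a1 = 0.5 * np.sqrt(((h10 - h00) * dy) ** 2 + ((h01 - h00) * dx) ** 2 + base ** 2)
    # Triangle 2: corners (h11, h01, h10)
    a2 = 0.5 * np.sqrt(((h01 - h11) * dy) ** 2 + ((h10 - h11) * dx) ** 2 + base ** 2)
    area = float(np.sum(a1 + a2))

    footprint = 0.5 * base  # projected area of each triangle
    v1 = footprint * (h00 + h10 + h01) / 3.0
    v2 = footprint * (h11 + h01 + h10) / 3.0
    volume = float(np.sum(v1 + v2))
    return area, volume
```

In such a sketch, the height map would typically be obtained by subtracting the measured depth from the depth of a chosen reference plane; that choice, like the regular grid spacing, is an assumption made here for illustration rather than a detail stated in the text.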