Viewpoint planning determines the accuracy, processing speed, and compactness of structure-from-motion reconstruction. Despite the importance of viewpoint planning optimization to industrial digital services, existing methods fall short in balancing reconstruction accuracy against the number of viewpoints. This paper therefore defines a new next-best-view problem for structure from motion that aims simultaneously to improve accuracy, reduce the number of viewpoints, and strike a balance between the two. To solve this problem, the paper presents a novel viewpoint planning optimization method based on Proximal Policy Optimization. The method incorporates double models, an action mask, and sim-to-real training to improve training efficiency, and it applies transfer learning and fine-tuning to improve the versatility of the optimized viewpoint plan. A case study and experiments with multiple house models illustrate the method. In the experiments, the optimized viewpoint plan reduced Chamfer Distance, Earth Mover's Distance, the number of viewpoints, the file size, and the reconstruction processing time by 12.42%, 14.87%, 16.39%, 15.58%, and 32.35%, respectively, compared to the naïve baseline. Compared to existing methods, the proposed method also showed advantages from several perspectives, particularly in the balance between reconstruction accuracy and the number of viewpoints.