Arthroscopy is a minimally invasive surgery that imposes great physical and mental challenges on surgeons. Extensive experience is required to safely navigate the camera and instruments in the narrow spaces of human joints. Robust camera localization, as well as a detailed reconstruction of the anatomy, can benefit surgeons and would be essential for future robotic assistants. Our existing simultaneous localization and mapping (SLAM) system provides robust, at-scale camera localization and a sparse map. However, a denser map is required to be of clinical relevance. In this letter, we propose a new system that combines the robust localizer with a keyframe selection strategy and batch multiview stereo (MVS) for three-dimensional reconstruction. Tissues are reconstructed at scale, accurately and densely, even under challenging arthroscopic conditions. The consistency of our system is verified in tests with synthetic noise and several keyframing strategies. Nine experiments were performed in phantom and three cadavers, including various imaging conditions, camera settings, and scope motions. Our system reconstructed surfaces of more than 12 cm² with a root mean square error of no more than 0.5 mm. In comparison, state-of-the-art monocular SLAM methods, both feature-based (ORBSLAM) and direct (LSDSLAM), commonly failed to track more than 20% of any camera motion and, in the few successful cases, yielded much larger estimation errors.