Since late 2019, coronavirus disease 2019 (COVID-19) has spread globally, posing a significant threat to human life and health and profoundly affecting worldwide economic development. Because COVID-19 is highly contagious, accurate and prompt diagnosis is paramount. Rapid identification of COVID-19 from computed tomography (CT) images has therefore attracted substantial interest, and various segmentation methods have been proposed to improve the diagnostic accuracy of CT images. Building on this work, this study introduces a multilevel threshold segmentation method based on the Reinforcement Learning-based Enhanced Sand Cat algorithm (QLSCSO). QLSCSO is a novel optimization algorithm with high convergence accuracy and the capacity to escape local optima. It incorporates reinforcement learning into the population iteration process of a heuristic technique. In the algorithm's update phase, a hybrid model and three distinct mutation strategies are employed to strengthen its ability to escape local optima. Consequently, the QLSCSO-based method produces high-quality segmentation results while reducing the risk of stagnation during the segmentation process. To establish the effectiveness of the proposed method, QLSCSO is compared with other advanced metaheuristic algorithms on the IEEE CEC 2022 benchmark functions. QLSCSO is further evaluated on COVID-19 CT images, including comprehensive comparisons with competing segmentation methods and thorough validation. The results demonstrate that the QLSCSO-based segmentation method performs strongly across a range of performance evaluation metrics.
Therefore, this approach offers an efficient segmentation procedure for COVID-19 images and, potentially, for other pathological medical images.