Alzheimer's disease (AD) is a progressive neurodegenerative disorder whose early diagnosis and treatment remain challenging. Magnetic resonance imaging (MRI) is a valuable tool for identifying the structural changes associated with AD. However, the complexity of MRI data and class imbalance pose challenges in medical research and necessitate additional data. Machine learning techniques, including deep learning and deep reinforcement learning (RL), can extract complex patterns from medical image data, such as MRI scans, and use them to augment existing data. The choice of data augmentation method depends on the characteristics of the dataset. This dependency is mitigated by using RL and incorporating feedback during data augmentation. However, designing a reward function that provides effective feedback to RL agents remains a challenge. This study proposes a novel framework for reward calculation in RL. The framework first clusters the minority-class data. It then quantifies the similarity between each generated image and the cluster centers using similarity metrics. A reward is allocated to the data augmentation method exhibiting the greatest similarity to the original data, and a reward is also assigned to the method exhibiting the least similarity to the original data. The framework was evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Australian Imaging, Biomarkers, and Lifestyle (AIBL) datasets, and the results were compared with those of existing techniques. The proposed data augmentation method achieved accuracies of 97.55% on ADNI and 96.30% on AIBL. By combining RL, deep learning architectures, and data augmentation, the proposed approach is designed to enhance the early diagnosis and prognosis of AD and to facilitate more effective clinical interventions and patient care. © 2024 Elsevier Ltd
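The reward calculation described in the abstract (cluster the minority class, score each generated image against the cluster centers, reward the most and least similar augmentation methods) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the clustering algorithm, the similarity metric, or the reward magnitudes, so the sketch assumes k-means clustering, cosine similarity on flattened feature vectors, the best match over all centers as the score, and an equal reward of 1.0 for both rewarded methods. All function names (`kmeans`, `similarity_to_centers`, `assign_rewards`) are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on minority-class vectors (assumed clustering method)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def similarity_to_centers(image, centers):
    """Score a generated image by its best match among the cluster centers."""
    return max(cosine_similarity(image, c) for c in centers)


def assign_rewards(candidates, centers, reward=1.0):
    """Reward the most similar and the least similar augmentation method.

    `candidates` maps a (hypothetical) augmentation name to the feature
    vector of the image that method generated.
    """
    scores = {name: similarity_to_centers(img, centers)
              for name, img in candidates.items()}
    most_similar = max(scores, key=scores.get)
    least_similar = min(scores, key=scores.get)
    rewards = {name: 0.0 for name in candidates}
    rewards[most_similar] = reward
    rewards[least_similar] = reward
    return rewards, scores


# Toy usage with 2-D stand-ins for image features.
minority = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centers = kmeans(minority, k=2)
candidates = {"flip": [1.0, 0.05], "rotate": [0.7, 0.7], "noise": [-1.0, -1.0]}
rewards, scores = assign_rewards(candidates, centers)
```

In this toy example, `flip` lies closest to a cluster center and `noise` lies farthest, so both receive the reward while `rotate` receives none. In practice the vectors would presumably be image features or pixel data from MRI slices, and the rewards would feed back into the RL agent that selects augmentation methods.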