Background Though significant progress in disease elimination has been made over the past decades, trachoma remains the leading infectious cause of blindness globally. Further efforts in trachoma elimination are paradoxically being limited by the relative rarity of the disease, which makes clinical training for monitoring surveys difficult. In this work, we evaluate the feasibility of using an artificial intelligence model to augment or replace human image graders in the evaluation and diagnosis of trachomatous inflammation-follicular (TF).

Methods We utilized a dataset consisting of 2300 images with a 5% positivity rate for TF. We developed classifiers by implementing two state-of-the-art convolutional neural network architectures, ResNet101 and VGG16, and applying a suite of data augmentation and oversampling techniques to the positive images. We then augmented our dataset with additional images from independent research groups and evaluated performance.

Results Models performed well in minimizing the number of false negatives, given the constraint of the low number of images in which TF was present. The best-performing models achieved a sensitivity of 95% and a positive predictive value of 50-70% while reducing the number of images requiring skilled grading by 66-75%. Basic oversampling and data augmentation techniques were most successful at improving model performance, while techniques grounded in clinical experience, such as highlighting follicles, were less successful.

Discussion The developed models perform well and significantly reduce the burden on graders by minimizing the number of false negative identifications. Further improvements in model skill would benefit from datasets containing more TF-positive images, as well as a range of image qualities and image capture techniques.
While these models approach or meet the community-accepted standard for skilled field graders (i.e., Cohen's kappa > 0.7), they are insufficient to be deployed independently or clinically at this time; rather, they can be used to significantly reduce the burden on skilled image graders.

Author summary Trachoma is an infectious disease, found primarily in the developing world, and is a leading cause of global blindness. As recent efforts to address the disease have led to a significant reduction in disease prevalence, it has become difficult to train health workers to detect trachoma because of its rarity; this is often referred to as the "last mile" problem. To address this issue, we have implemented a convolutional neural network to detect the presence of TF in images of everted eyelids. The trained network has comparable performance to trained, but non-expert, human image graders. Further, we found that misclassified images were typically characterized by poor image quality (e.g., blurry images or images in which the eyelid is not in frame), which could be addressed by standardizing the image acquisition protocol.
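
The quantities reported above (sensitivity, positive predictive value, Cohen's kappa, and the share of images that no longer need skilled grading) can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code. The function name `screening_metrics` and the example counts are hypothetical and are not taken from the study. It also assumes that "images requiring skilled grading" means only the images the model flags as positive, so that every model-negative image counts toward the workload reduction.

```python
def screening_metrics(tp, fp, fn, tn):
    """Summarize a binary TF classifier from confusion-matrix counts.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives. Assumes at least one actual
    positive, at least one predicted positive, and that chance
    agreement is below 1 (otherwise a division by zero occurs).
    """
    n = tp + fp + fn + tn

    # Sensitivity: fraction of TF-positive images the model detects.
    sensitivity = tp / (tp + fn)

    # Positive predictive value: fraction of model positives that are truly TF.
    ppv = tp / (tp + fp)

    # Assumption: only model-positive images are passed to a skilled grader,
    # so every model-negative image is removed from the grading workload.
    workload_reduction = (tn + fn) / n

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "ppv": ppv,
        "workload_reduction": workload_reduction,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only (1000 images, 50 of them TF-positive).
    for name, value in screening_metrics(tp=47, fp=30, fn=3, tn=920).items():
        print(f"{name}: {value:.3f}")
```

Because kappa corrects for chance agreement, it can stay modest at low prevalence even when raw accuracy is very high, which is one reason it is a stricter benchmark than accuracy for rare conditions such as TF.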