For decades, international agencies have designed international large-scale assessments (ILSAs) to measure student performance and compare education systems. We present a systematic review of the literature that examines the evolution and characteristics of secondary analyses of three ILSAs (PISA, TIMSS, and PIRLS) and identifies possible improvements and future lines of research. We searched three repositories (Web of Science, Scopus, and ERIC) for secondary analyses of data from these assessments conducted since 2000 whose main objective was to examine predictors of school performance and effectiveness. After applying the selection criteria, 63 of the 470 articles identified were retained. The analysis reveals growing interest in this topic in recent years, a scarcity of studies in certain regions, and the prevalence of multilevel methodology. The review also highlights limitations in the existing literature, including insufficient attention to elements inherent to the design of large-scale assessments and to the data they produce, such as the impossibility of establishing causal relationships and the absence of classroom-level data. Based on these results, we offer recommendations and identify less explored areas, such as curriculum-related and country-level variables.