Multi-objective optimization problems (MOPs) arise widely in fields such as engineering and economics, so finding better ways to solve them is of both practical and theoretical significance. The core of solving an MOP is to find the global optimal solution set efficiently and accurately. Existing MOP algorithms often suffer from premature convergence or poor population diversity, so the solution sets they obtain tend to become trapped in local optima or to cluster. In this study, a non-dominated sorting genetic algorithm based on reinforcement learning (RL-NSGA-II) is proposed. The algorithm combines prediction with the Monte Carlo method and is modeled mathematically as a Markov decision process, in which actions are based on the population's genetic information and on information obtained from interaction with the environment. The rationale is that non-dominated solutions contain valuable information that can be used to guide the evolution direction of the population and to search for the optimal solution set more accurately. The proposed RL-NSGA-II algorithm was evaluated on the ZDT and DTLZ test sets, and the experimental results verified its effectiveness in solving MOPs.
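
To make the notion of a non-dominated solution concrete, the following minimal Python sketch filters a set of objective vectors down to its non-dominated subset. It is an illustration only, not the RL-NSGA-II implementation: it assumes every objective is minimized, and the names `dominates` and `non_dominated` as well as the sample points are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))         # a is no worse in every objective
    strictly_better = any(x < y for x, y in zip(a, b))   # a is better in at least one
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (a simple O(n^2) filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values for five candidate solutions.
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = non_dominated(points)
print(front)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

Here (3.0, 4.0) is dominated by (2.0, 3.0) and (5.0, 5.0) is dominated by (1.0, 5.0), so only three points remain. Non-dominated sorting, as used in NSGA-II, applies this kind of filter repeatedly to rank the whole population into successive fronts.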