Event-related potential (ERP) signals are difficult to identify because they vary strongly between individuals, and traditional convolutional neural networks (CNNs) tend to overfit on small samples. To address both problems, a combined classifier that fuses a CNN with a support vector machine (SVM) is proposed for ERP signal classification and recognition. The method takes the filtered raw multi-channel ERP signals as input. First, a one-dimensional temporal convolution kernel convolves the signals in the time domain; spatial convolution is then performed with a one-dimensional spatial convolution kernel, so that features are learned from both the temporal and the spatial information in the signals. Training of the CNN model is completed with down-sampling and fully connected layers. The signals are then fed through the trained model again to extract the features of the down-sampling layer, and an SVM classifies these features. Classification results show that the proposed combined classifier effectively recognizes the P300 component with only a small number of repeated visual stimuli. With more than four repetitions of the visual stimuli, the average recognition accuracy of the proposed method is 94.08%, an average improvement of 4.36% over the traditional CNN method. Compared with the classical stepwise linear discriminant analysis (SWLDA) and Bayesian linear discriminant analysis (BLDA) algorithms, the combined classifier improves average recognition accuracy by 6.83% and 4.16%, respectively. The combined classifier therefore needs only a small number of repeated trials to reach higher target recognition accuracy, which improves the recognition of ERP signals. © 2021, Editorial Office of Journal of Xi'an Jiaotong University. All rights reserved.
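The feature path described in the abstract (temporal convolution, spatial convolution, down-sampling, then a separate SVM on the down-sampled features) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the array sizes, the `tanh` nonlinearity, average pooling as the down-sampling step, the reading of "spatial" as "across electrode channels", and the synthetic data. The kernels are random stand-ins rather than trained CNN weights, and the hypothetical `train_linear_svm` helper is a simple hinge-loss subgradient solver substituted for whatever SVM implementation the authors used.

```python
import numpy as np

def extract_features(x, w_time, w_space, pool=4):
    """x: (trials, channels, samples) -> flattened down-sampled feature vectors."""
    k = w_time.shape[0]
    # Temporal step: slide one 1-D kernel along the time axis of every channel
    # (cross-correlation, as is usual in CNN layers).
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=2)
    t = windows @ w_time                      # (trials, channels, samples - k + 1)
    # Spatial step (assumed to mean "across channels"): one weighted channel sum per filter.
    s = np.tanh(np.einsum("fc,nct->nft", w_space, t))
    # Down-sampling: non-overlapping average pooling along time (an assumption).
    usable = (s.shape[2] // pool) * pool
    p = s[:, :, :usable].reshape(s.shape[0], s.shape[1], -1, pool).mean(axis=3)
    return p.reshape(p.shape[0], -1)

def train_linear_svm(f, y, lam=0.01, lr=0.1, epochs=200):
    """Hypothetical helper: batch subgradient descent on the regularized hinge loss.
    Labels y must be in {-1, +1}."""
    w = np.zeros(f.shape[1])
    b = 0.0
    for _ in range(epochs):
        active = y * (f @ w + b) < 1          # samples that violate the margin
        grad_w = lam * w - (y[active, None] * f[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(f, w, b):
    return np.where(f @ w + b >= 0, 1, -1)

# Synthetic stand-in data: 40 trials, 8 channels, 64 time samples.
rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 40, 8, 64
x = rng.standard_normal((n_trials, n_channels, n_samples))
y = np.where(np.arange(n_trials) % 2 == 0, 1, -1)
x[y == 1, :, 30:40] += 1.0                    # crude bump standing in for a P300-like deflection

# Random kernels stand in for weights that the paper obtains by training the CNN.
w_time = rng.standard_normal(5) * 0.3
w_space = rng.standard_normal((4, n_channels)) * 0.3

features = extract_features(x, w_time, w_space)
w, b = train_linear_svm(features, y)
pred = predict(features, w, b)
print(features.shape, pred.shape)
```

In the method the abstract describes, the convolution weights would come from the trained CNN, and the SVM would replace the fully connected output stage only at classification time; the sketch shows the data flow and array shapes, not the reported accuracies.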