Objective Protein flexibility plays important roles in many biochemical processes in living organisms, such as enzyme catalysis, signal transduction, and substance transport and storage. Predicting intrinsic flexible motions from the tertiary structure of a protein helps to clarify the mechanisms of protein function and is an important problem in the study of protein structure-function relationships. The convolutional neural network (CNN), one of the mainstream algorithms in deep learning, has been successfully applied to the study of protein structure-function relationships.
Methods In the present work, a CNN model based on the PointNet method from computer vision was proposed to predict protein flexibility. In this model, protein structures were treated as three-dimensional point clouds: the atomic coordinates of the proteins were fed directly into the model, and the permutation invariance and global rotation invariance of the point cloud were handled by pooling operations and a spatial transformation network, respectively. Because proteins vary in size, a new mini-batch optimization strategy was proposed in which the model was trained on mini-batches of protein structures of different sizes. The Pearson correlation coefficient was used as the evaluation function for model training. To further enhance the performance of the network, an improved model was constructed from the PointNet-based CNN model, in which the max-pooling and average-pooling outputs were concatenated to better extract the global features of protein structures. The PointNet-based CNN model and the improved model were then trained and tested on the temperature factors (B-factors) of 243 non-redundant proteins.
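As an illustration of two ingredients named in the Methods, the following minimal sketch shows how concatenating max-pooled and average-pooled per-atom features gives a global descriptor that does not depend on atom order, and how the Pearson correlation coefficient between predicted and experimental B-factors can be computed. It is not the authors' implementation: the function names `global_feature` and `pearson`, the feature dimensions, and the random test data are assumptions made for illustration, and plain Python is used instead of a deep-learning framework.

```python
import math
import random

def global_feature(point_features):
    """Pool per-point features into one vector: [max-pool | average-pool].

    point_features: list of points, each a list of channel values.
    Returns a list with twice as many entries as there are channels.
    """
    columns = list(zip(*point_features))  # one tuple per feature channel
    max_pool = [max(col) for col in columns]
    # math.fsum returns a correctly rounded sum, so it is order-independent
    avg_pool = [math.fsum(col) / len(col) for col in columns]
    return max_pool + avg_pool

def pearson(pred, target):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_p = math.fsum(pred) / len(pred)
    mean_t = math.fsum(target) / len(target)
    cov = math.fsum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = math.fsum((p - mean_p) ** 2 for p in pred)
    var_t = math.fsum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)

# Hypothetical data: 50 "atoms", each with 8 learned feature channels.
random.seed(0)
features = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
shuffled = random.sample(features, len(features))  # same atoms, new order

# Both pooling operations are symmetric functions of the point set,
# so reordering the atoms leaves the global descriptor unchanged.
order_independent = global_feature(features) == global_feature(shuffled)
```

Both pooling operations are symmetric in their inputs, which is why the descriptor is unchanged when the atoms are reordered. In a trained network the per-atom features would come from shared layers applied to each atom, and a differentiable form of the correlation would be needed if it were used directly in optimization.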
Results The average Pearson correlation coefficients between the predicted and experimental temperature factors were 0.64 for the PointNet-based model and 0.65 for the improved model. The prediction accuracy of both models is better than that of the Gaussian network model (GNM), which has been widely used to investigate protein flexibility. In particular, for 74 relatively loosely packed, intrinsically disordered proteins from the Disbind website, the average Pearson correlation coefficients of the two models were 0.62 and 0.64, respectively, which were significantly better than that of GNM.
Conclusion Our study provides an effective model for predicting the intrinsic flexibility encoded in protein structures.