This paper presents a practical technique for mosaicking video frames captured by thermal infrared (IR) cameras flown on a small Unmanned Aerial Vehicle (UAV). The Scale-Invariant Feature Transform (SIFT) algorithm is used to detect feature points in each frame. A k-d tree search then finds the best match for each feature point, and RANSAC eliminates the outliers among the matches. We propose a novel method, called random M-least square, to find the optimized projective transformation parameters between frames. Finally, we warp the images and adopt a multi-resolution blending method to stitch the registered frames. The whole process is applied to real UAV IR video to validate its robustness to noise and defocus. In addition, its computational efficiency indicates that the method is a step toward a real-time UAV mosaicking system.
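To make the outlier-rejection and transformation-estimation steps concrete, the following is a minimal sketch of fitting a projective transformation to matched points with RANSAC, followed by an ordinary least-squares refit on the inliers. It is not the random M-least square method, whose details are not given here; a plain direct linear transform (DLT) fit stands in for it. The function names (`fit_homography`, `project`, `ransac_homography`) and the parameters `n_iter`, `tol`, and `seed` are hypothetical and chosen only for illustration. The matched points are assumed to be already available as arrays of (x, y) coordinates.

```python
import numpy as np


def fit_homography(src, dst):
    """Least-squares (DLT) estimate of the 3x3 projective transform src -> dst.

    Assumption: plain unnormalized DLT, used here only as a stand-in.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints on the nine entries of H per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The minimizer of ||A h|| with ||h|| = 1 is the last right singular vector.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    with np.errstate(all="ignore"):
        return H / H[2, 2]


def project(H, pts):
    """Apply the projective transform H to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    with np.errstate(all="ignore"):
        return homog[:, :2] / homog[:, 2:3]


def ransac_homography(src, dst, n_iter=500, tol=2.0, seed=0):
    """Return (H, inlier_mask); tol is the reprojection threshold in pixels."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Four correspondences are the minimum for a projective transform.
        idx = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < tol  # NaN errors from degenerate samples count as outliers
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("not enough inliers to fit a projective transform")
    # Refit on all inliers of the best hypothesis.
    return fit_homography(src[best], dst[best]), best
```

In a full pipeline, `src` and `dst` would hold the coordinates of the matched SIFT feature points in two frames, and the returned matrix would drive the warping step. For noisy real data, normalizing the coordinates before the DLT fit generally improves numerical conditioning.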