Human-Object Interaction (HOI) detection is a critical task in scene understanding that aims to detect <human, object, interaction> triplets in images or videos. Existing methods rely on the strong assumption that every triplet to be detected is available during training. In real-world scenes, however, new HOIs may be introduced continuously, so a trained model must be able to recognize new classes without forgetting old ones. Because of limits on storage and computing resources, and because of data privacy, retraining the model from scratch on old and new data each time is impractical. In this paper, we propose a new HOI detection scenario, Lifelong Learning Human-Object Interaction Detection (LL-HOI), which is more natural than the existing closed-world setting, and we address it with incremental and contrastive learning (Fig. 1). Our method consists of two stages, distinguished by whether the incremental setting applies: 1) identifying the humans, objects, and actions in HOIs using a backbone detector and contrastive learning, and 2) incrementally learning new HOI classes without forgetting previously learned ones. In addition, to address catastrophic forgetting, we propose a Feature Replay Network (FRN) based on contrastive learning that adaptively processes images conditioned on the incremental process. Extensive experiments on the HICO-DET and HOI-W datasets demonstrate the effectiveness and superiority of our method for lifelong human-object interaction detection.
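To make the idea of replaying old-class features during incremental learning concrete, the following is a minimal sketch of a generic feature replay buffer. It is not the Feature Replay Network described above, whose architecture is not specified here; it swaps in a simpler, well-known technique (a bounded per-class buffer filled by reservoir sampling). The names `FeatureReplayBuffer`, `mixed_batch`, and `replay_ratio`, as well as the example class labels, are assumptions made for illustration only.

```python
import random
from collections import defaultdict


class FeatureReplayBuffer:
    """Hypothetical bounded per-class store of feature vectors.

    Illustrative only: a generic replay buffer, not the paper's FRN.
    """

    def __init__(self, per_class_capacity, seed=0):
        self.capacity = per_class_capacity
        self.store = defaultdict(list)   # label -> stored feature vectors
        self.seen = defaultdict(int)     # label -> number of features offered
        self.rng = random.Random(seed)

    def add(self, label, feature):
        # Reservoir sampling: each feature offered for a class has an
        # equal chance of being among the `capacity` that are kept.
        self.seen[label] += 1
        bucket = self.store[label]
        if len(bucket) < self.capacity:
            bucket.append(list(feature))
        else:
            j = self.rng.randrange(self.seen[label])
            if j < self.capacity:
                bucket[j] = list(feature)

    def sample(self, k):
        # Draw up to k stored (label, feature) pairs without replacement.
        pool = [(label, f) for label, feats in self.store.items() for f in feats]
        return self.rng.sample(pool, min(k, len(pool)))


def mixed_batch(new_examples, buffer, replay_ratio=0.5):
    """Append replayed old-class features to a batch of new-class examples."""
    n_replay = int(len(new_examples) * replay_ratio)
    return list(new_examples) + buffer.sample(n_replay)


# Usage: store features from an earlier step, then train on a mixed batch.
buffer = FeatureReplayBuffer(per_class_capacity=2)
for i in range(5):
    buffer.add("ride bicycle", [float(i), 0.0])
buffer.add("hold cup", [0.0, 1.0])

new_examples = [("feed horse", [1.0, 1.0])] * 4
batch = mixed_batch(new_examples, buffer, replay_ratio=0.5)
```

Storing features rather than raw images keeps the memory footprint small and avoids retaining the original data, which is in line with the storage and privacy constraints mentioned above. The mixed batch would then be passed to whatever classification or contrastive objective the model uses.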