Video Harmonization with Triplet Spatio-Temporal Variation Patterns

Authors
Guo, Zonghui [1]
Han, Xinyu [2]
Zhang, Jie [1, 3]
Shan, Shiguang [1, 3]
Zheng, Haiyong [2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Ocean Univ China, Coll Elect Engn, Qingdao, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52733.2024.01814
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video harmonization is an important and challenging task that aims to obtain visually realistic composite videos by automatically adjusting the foreground's appearance to harmonize with the background. Inspired by the short-term and long-term gradual adjustment process of manual harmonization, we present a Video Triplet Transformer framework to model three spatio-temporal variation patterns within videos, i.e., short-term spatial, long-term global, and long-term dynamic, for video-to-video tasks such as video harmonization. Specifically, for short-term harmonization, we adjust the foreground appearance to be consistent with the background in the spatial dimension based on neighboring frames; for long-term harmonization, we not only explore global appearance variations to enhance temporal consistency but also alleviate motion offset constraints to align similar contextual appearances dynamically. Extensive experiments and ablation studies demonstrate the effectiveness of our method, which achieves state-of-the-art performance in video harmonization, video enhancement, and video demoireing tasks. We also propose a temporal consistency metric to better evaluate the harmonized videos. Code is available at https://github.com/zhenglab/VideoTripletTransformer.
Pages: 19177-19186
Page count: 10