Feature pre-inpainting enhanced transformer for video inpainting

Cited by: 6

Authors:

Li, Guanxiao [1]

Zhang, Ke [1]

Su, Yu [1]

Wang, Jingyu [1,2]

Affiliations:

[1] Northwestern Polytech Univ, Sch Astronaut, Xian 710072, Shaanxi, Peoples R China

[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Shaanxi, Peoples R China

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Vol. 123

Funding:

National Natural Science Foundation of China;

Keywords:

Video inpainting; Feature pre-inpainting; Local-global interleaving transformer;

DOI:

10.1016/j.engappai.2023.106323

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Transformer-based video inpainting methods aggregate coherent content into missing regions by learning spatial-temporal dependencies. However, existing methods suffer from inaccurate self-attention calculation and excessive quadratic computational complexity, due to uninformative representations of missing regions and inefficient global self-attention mechanisms, respectively. To mitigate these problems, we propose a Feature pre-Inpainting enhanced Transformer (FITer) video inpainting method, in which a feature pre-inpainting network (FPNet) and a local-global interleaving Transformer are designed. The FPNet pre-inpaints missing features before the Transformer by exploiting spatial context, so the representations of missing regions are enhanced with more informative content. As a result, the interleaving Transformer can calculate more accurate self-attention weights and learn more effective dependencies between missing and valid regions. Since the interleaving Transformer involves both global and window-based local self-attention mechanisms, the proposed FITer method can effectively aggregate spatial-temporal features into missing regions while improving efficiency. Experiments on the YouTube-VOS and DAVIS datasets demonstrate that the FITer method outperforms previous methods qualitatively and quantitatively.
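The efficiency argument in the abstract (global self-attention is quadratic in the number of tokens, while window-based local self-attention is cheaper) can be illustrated with a rough count of the token pairs each scheme has to score. This is only a sketch under stated assumptions: the function names, the feature-map sizes, and the choice of non-overlapping spatial windows that span all frames are hypothetical and are not taken from the paper, whose actual window design may differ.

```python
def global_attention_pairs(t, h, w):
    """Token pairs scored when every token attends to every other token.

    t, h, w: number of frames and the height/width of the token grid
    (hypothetical sizes, not from the paper).
    """
    n = t * h * w
    return n * n


def window_attention_pairs(t, h, w, win_h, win_w):
    """Token pairs scored when attention is limited to non-overlapping
    spatial windows of size win_h x win_w, each spanning all t frames.

    The window layout is an assumption made for illustration.
    """
    if h % win_h != 0 or w % win_w != 0:
        raise ValueError("window size must divide the token grid")
    num_windows = (h // win_h) * (w // win_w)
    tokens_per_window = t * win_h * win_w
    return num_windows * tokens_per_window ** 2


if __name__ == "__main__":
    # Hypothetical token grid: 5 frames of 20 x 36 tokens, 5 x 9 windows.
    g = global_attention_pairs(5, 20, 36)
    l = window_attention_pairs(5, 20, 36, 5, 9)
    print(g, l, g // l)
```

Under this layout the saving factor equals the number of windows (16 for the sizes above), which is why interleaving local layers with a smaller number of global layers can reduce cost while still letting information propagate across the whole frame.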

Pages: 12
