Temporal-spatial information mining and aggregation for video matting

Cited by: 0

Authors:

Zhiwei Ma

Guilin Yao

Affiliations:

[1] Harbin University of Commerce,

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Double decoder; Spatial continuity; Temporal coherence; Video matting;

DOI:

Not available


Abstract:

Previous video matting methods often require additional auxiliary information and lack temporal consistency. To address these problems, we propose a novel video matting framework (STMI-Net) based on temporal-spatial information mining and aggregation. The framework requires no auxiliary information and adopts a double-decoder network structure. One decoder is a recurrent network that makes full use of the temporal information in the video frames to ensure temporal coherence in the results; the other is a convolutional network that deeply restores frame-by-frame spatial features to achieve spatial continuity in the results. By aggregating these two kinds of information at the global level, our model achieves 0.0066 MSE on the VideoMatte240K dataset, surpassing the RVM baseline by 13%, and 0.0047 MSE on the PPM-100 portrait matting dataset, surpassing the MG baseline by 26.5%. We also conduct an ablation study to demonstrate the specific functions of the temporal decoder and the spatial decoder in our model.
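The abstract reports results as MSE values and as percentage gains over a baseline. The sketch below illustrates one plausible reading of those figures, under stated assumptions: MSE is taken as the mean squared difference between a predicted and a ground-truth alpha matte with values in [0, 1], and "surpasses by X%" is read as a relative reduction in error. The abstract does not spell out either definition, and the function names `alpha_mse` and `relative_reduction` as well as the toy values are hypothetical, not taken from the paper.

```python
from typing import Sequence

Matte = Sequence[Sequence[float]]  # rows of alpha values in [0, 1]


def alpha_mse(pred: Matte, truth: Matte) -> float:
    """Mean squared error between two alpha mattes of identical shape."""
    if len(pred) != len(truth):
        raise ValueError("mattes must have the same height")
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mattes must have the same width")
        for p, t in zip(pred_row, truth_row):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("mattes must not be empty")
    return total / count


def relative_reduction(baseline_error: float, model_error: float) -> float:
    """Fraction by which model_error is lower than baseline_error.

    Assumption: this is how a statement such as "surpasses the baseline
    by 13%" is interpreted here; the abstract does not confirm it.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error


# Toy 2x2 mattes (illustrative values only, not data from the paper).
predicted = [[0.0, 0.5], [1.0, 1.0]]
ground_truth = [[0.0, 1.0], [1.0, 0.5]]
toy_mse = alpha_mse(predicted, ground_truth)

# Toy error values chosen so the arithmetic is exact in floating point.
toy_gain = relative_reduction(baseline_error=0.5, model_error=0.375)
```

Under this reading, a lower MSE is better, and the percentage describes how much of the baseline's error is removed rather than an absolute difference in MSE.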


Pages: 29221-29237

Page count: 16
