Temporal-spatial information mining and aggregation for video matting

Authors
Zhiwei Ma
Guilin Yao
Affiliation
[1] Harbin University of Commerce,
Source
Multimedia Tools and Applications | 2024 / Vol. 83
Keywords
Double decoder; Spatial continuity; Temporal coherence; Video matting;
DOI
Not available
Abstract
Previous video matting methods often require additional auxiliary information and lack temporal consistency. To address these problems, we propose a novel video matting framework (STMI-Net) based on temporal-spatial information mining and aggregation. The framework requires no auxiliary information and adopts a double-decoder network structure. One decoder is a recurrent network that exploits the temporal information across video frames to ensure temporally coherent results; the other is a convolutional network that restores frame-by-frame spatial features to achieve spatially continuous results. By aggregating these two kinds of information at the global level, our model achieves an MSE of 0.0066 on the VideoMatte240K dataset, surpassing the RVM baseline by 13%, and an MSE of 0.0047 on the PPM-100 portrait matting dataset, surpassing the MG baseline by 26.5%. We also conduct an ablation study to demonstrate the specific roles of the temporal decoder and the spatial decoder in our model.
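The abstract reports results as MSE values and as relative gains over a baseline. As a rough illustration of how such figures are typically computed, here is a minimal sketch. It assumes alpha mattes are given as nested lists of floats in [0, 1], and that "surpasses by X%" means a relative reduction in error; the abstract does not state its exact definition. The function names `alpha_mse` and `relative_reduction` and all sample values are hypothetical and are not taken from the paper.

```python
def alpha_mse(pred, target):
    """Mean squared error between two alpha mattes.

    Both mattes are nested lists (rows of floats in [0, 1]) of equal shape.
    This is a generic MSE, not the paper's exact evaluation protocol.
    """
    flat_pred = [value for row in pred for value in row]
    flat_target = [value for row in target for value in row]
    if not flat_pred or len(flat_pred) != len(flat_target):
        raise ValueError("mattes must be non-empty and have the same size")
    squared_errors = [(p - t) ** 2 for p, t in zip(flat_pred, flat_target)]
    return sum(squared_errors) / len(squared_errors)


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error is lower than baseline_error (assumed definition)."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error


# Hypothetical 2x2 mattes: two of the four pixels are off by 0.5.
predicted = [[0.0, 0.5], [1.0, 1.0]]
ground_truth = [[0.0, 1.0], [1.0, 0.5]]
example_mse = alpha_mse(predicted, ground_truth)

# Hypothetical error values, chosen only to illustrate the arithmetic.
example_gain = relative_reduction(0.008, 0.006)
```

Under this assumed definition, a model error of 0.006 against a baseline error of 0.008 would be reported as a 25% improvement.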
Pages: 29221 - 29237
Number of pages: 16