XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Cited by: 118
Authors
Cheng, Ho Kei [1]
Schwing, Alexander G. [1]
Affiliation
[1] Univ Illinois, Champaign, IL 61820 USA
Source
Keywords
DOI
10.1007/978-3-031-19815-1_37
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically only uses one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply-connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact thus sustained long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (that do not work on long videos) on short-video datasets.
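The three-store idea in the abstract can be illustrated with a toy sketch. This is not XMem's implementation: the class `ToyMemory`, its parameters `working_cap` and `keep_top`, the nearest-key read, and the "promote the most-read entries" rule are all hypothetical stand-ins chosen for illustration. The abstract says only that actively used working memory elements are routinely consolidated into a compact long-term memory.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    key: tuple          # hypothetical feature key (a plain tuple of floats)
    value: str          # payload stored alongside the key
    usage: float = 0.0  # how often this entry was returned by a read


@dataclass
class ToyMemory:
    working_cap: int = 4  # assumed cap on working entries
    keep_top: int = 2     # assumed number of entries promoted per consolidation
    sensory: object = None
    working: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def observe(self, key, value):
        """Store a new frame: sensory is overwritten, working memory grows."""
        self.sensory = value
        self.working.append(Entry(key, value))
        if len(self.working) > self.working_cap:
            self.consolidate()

    def read(self, query):
        """Return the value whose key is closest to the query (toy readout)."""
        entries = self.working + self.long_term
        if not entries:
            return None
        best = min(
            entries,
            key=lambda e: sum((a - b) ** 2 for a, b in zip(e.key, query)),
        )
        best.usage += 1.0  # track usage so consolidation can rank entries
        return best.value

    def consolidate(self):
        """Promote the most-used older entries; keep only the newest in working."""
        old, self.working = self.working[:-1], self.working[-1:]
        ranked = sorted(old, key=lambda e: e.usage, reverse=True)
        self.long_term.extend(ranked[: self.keep_top])


memory = ToyMemory(working_cap=3, keep_top=1)
for i, name in enumerate(["a", "b", "c"]):
    memory.observe((float(i),), name)
memory.read((1.1,))           # marks "b" as actively used
memory.observe((3.0,), "d")   # exceeds the cap and triggers consolidation
```

In this toy version the long-term store grows by at most `keep_top` entries per consolidation while the working store is reset, which is one simple way to decouple memory size from video length.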
Pages: 640-658
Page count: 19