Efficient Semisupervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network

Citations: 0

Authors:

Zhong, Shan [1,2,3]

Li, Guoqiang [2]

Ying, Wenhao [1]

Zhao, Fuzhou [4]

Xie, Gengsheng [5]

Gong, Shengrong [1,2,3]

Affiliations:

[1] Changshu Inst Technol, Sch Comp Sci & Engn, Changshu 215500, Jiangsu, Peoples R China

[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215000, Jiangsu, Peoples R China

[3] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130000, Peoples R China

[4] Changshu Inst Technol, Sch Automot Engn, Suzhou 215000, Jiangsu, Peoples R China

[5] Jiangxi Normal Univ, Sch Software, Nanchang 330022, Jiangxi, Peoples R China

Source:

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS | 2024 / Vol. 16 / No. 05

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Feature extraction; Videos; Object recognition; Data mining; Adaptation models; Adaptive systems; Video sequences; Long-term videos; memory network; object segmentation; semisupervised learning;

DOI:

10.1109/TCDS.2024.3385849

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video object segmentation (VOS) uses the first annotated video mask to achieve consistent and precise segmentation in subsequent frames. Recently, memory-based methods have received significant attention owing to their substantial performance enhancements. However, these approaches rely on a fixed global memory strategy, which poses a challenge to segmentation accuracy and speed in the context of longer videos. To alleviate this limitation, we propose a novel semisupervised VOS model, founded on the principles of the adaptive memory network. Our proposed model adaptively extracts object features by focusing on the object area while effectively filtering out extraneous background noise. An identification mechanism is also thoughtfully applied to discern each object in multiobject scenarios. To further reduce storage consumption without compromising the saliency of object information, the outdated features residing in the memory pool are compressed into salient features through the employment of a self-attention mechanism. Furthermore, we introduce a local matching module, specifically devised to refine object features by fusing the contextual information from historical frames. We demonstrate the efficiency of our approach through experiments, substantially augmenting both the speed and precision of segmentation for long-term videos, while maintaining comparable performance for short videos.
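
The abstract's idea of keeping storage bounded by compressing outdated memory features can be illustrated with a small sketch. This is not the paper's implementation: the class `BoundedMemory`, the function `compress`, the parameters `capacity` and `recent`, and the use of a mean vector as the attention query are all hypothetical choices made for illustration, and plain attention-weighted pooling stands in for the self-attention mechanism the authors describe.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compress(features):
    """Merge several feature vectors into one by attention-weighted pooling.

    Assumption: the query is the mean of the vectors being merged.
    """
    dim = len(features[0])
    query = [sum(f[i] for f in features) / len(features) for i in range(dim)]
    weights = softmax([dot(query, f) / math.sqrt(dim) for f in features])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


class BoundedMemory:
    """Hypothetical memory pool whose size never exceeds `capacity`."""

    def __init__(self, capacity, recent):
        # `recent` newest vectors are always kept uncompressed.
        assert capacity > recent >= 1
        self.capacity = capacity
        self.recent = recent
        self.slots = []

    def write(self, feature):
        self.slots.append(list(feature))
        if len(self.slots) > self.capacity:
            # Fold everything except the newest `recent` vectors into one summary.
            old = self.slots[:-self.recent]
            self.slots = [compress(old)] + self.slots[-self.recent:]

    def read(self, query):
        """Attention readout: softmax-weighted sum of stored vectors."""
        dim = len(query)
        weights = softmax([dot(query, s) / math.sqrt(dim) for s in self.slots])
        return [sum(w * s[i] for w, s in zip(weights, self.slots)) for i in range(dim)]


memory = BoundedMemory(capacity=4, recent=2)
for t in range(20):  # simulate 20 frames of a long video
    memory.write([float(t), 1.0])
readout = memory.read([1.0, 0.0])
```

In this sketch the pool holds at most `capacity` vectors regardless of video length, which is the storage property the abstract attributes to memory compression; a fixed global memory would instead grow with every stored frame.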


Pages: 1789-1802

Page count: 14
