Composite Object Relation Modeling for Few-Shot Scene Recognition

Cited by: 0
Authors
Song, Xinhang [1 ,2 ]
Liu, Chenlong [1 ,2 ]
Zeng, Haitao [1 ,2 ]
Zhu, Yaohui [1 ,2 ]
Chen, Gongwei [1 ,2 ]
Qin, Xiaorong [1 ,2 ]
Jiang, Shuqiang [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Scene recognition; few-shot learning; graph modeling; generalization ability; NETWORK;
DOI
10.1109/TIP.2023.3321475
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of few-shot image recognition is to classify different categories with only one or a few training samples. Previous works on few-shot learning mainly focus on simple images, such as object or character images. Those works usually use a convolutional neural network (CNN) to learn global image representations from training tasks, which are then adapted to novel tasks. However, there are many more abstract and complex images in the real world, such as scene images, which consist of many object entities with flexible spatial relations among them. In such cases, global features can hardly achieve satisfactory generalization ability due to the large diversity of object relations in the scenes, which may hinder adaptability to novel scenes. This paper proposes a composite object relation modeling method for few-shot scene recognition that captures the spatial structural characteristics of scene images to enhance adaptability to novel scenes, considering that objects commonly co-occur across different scenes. In different few-shot scene recognition tasks, the objects in the same image usually play different roles. Thus, we propose a task-aware region selection module (TRSM) to further select the detected regions in different few-shot tasks. In addition to detecting object regions, we mainly focus on exploiting the relations between objects, which are more consistent with the scenes and can be used to separate different scenes. Objects and relations are used to construct a graph for each image, which is then modeled with a graph convolutional neural network. The graph modeling is jointly optimized with few-shot recognition, where the few-shot learning loss can also adjust the graph-based representations. The proposed graph-based representations can be plugged into different types of few-shot architectures, such as metric-based and meta-learning methods. Experimental results on few-shot scene recognition show the effectiveness of the proposed method.
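As a rough illustration of the pipeline the abstract outlines (object regions as graph nodes, relations as edges, a graph convolution, then a metric-based few-shot classifier), the following is a minimal NumPy sketch. It is not the authors' implementation: the function names, the mean pooling, the single-layer graph convolution, and the nearest-prototype classifier are all assumptions chosen for brevity, and the task-aware region selection module (TRSM) is not modeled here.

```python
import numpy as np

def normalized_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization (assumed scheme)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(node_feats, adj, weight):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalized_adjacency(adj) @ node_feats @ weight, 0.0)

def image_embedding(node_feats, adj, weight):
    """Mean-pool the node states into a single vector per image (assumption)."""
    return gcn_layer(node_feats, adj, weight).mean(axis=0)

def prototype_predict(support_embs, support_labels, query_emb):
    """Metric-based classification: pick the class with the nearest mean embedding."""
    classes = np.unique(support_labels)
    protos = np.stack(
        [support_embs[support_labels == c].mean(axis=0) for c in classes]
    )
    dists = np.linalg.norm(protos - query_emb, axis=1)
    return classes[np.argmin(dists)]
```

In this sketch, `node_feats` would hold one feature row per detected object region and `adj` a symmetric matrix marking which region pairs are related; in a trained system `weight` would be learned jointly with the few-shot loss rather than fixed.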
Pages: 5678-5691
Page count: 14