Batch construction and multitask learning in visual relationship recognition

Authors
Josias, Shane [1, 2]
Brink, Willie [1]
Affiliations
[1] Stellenbosch Univ, Appl Math, Stellenbosch, South Africa
[2] Stellenbosch Univ, CAIR, Stellenbosch, South Africa
Keywords
visual relationship recognition; batch construction; multitask learning
DOI
10.1109/saupec/robmech/prasa48453.2020.9041144
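Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809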
Abstract
An image can be described by the objects within it, as well as interactions between those objects. A pair of object labels together with an interaction label is known as a visual relationship, and is represented as a triplet of the form (subject, predicate, object). Recognising visual relationships in a given image is a challenging task, owing to the combinatorially large number of possible relationship triplets, which leads to an extreme classification problem, as well as the very long tail typically found in the distribution of those possible triplets. We investigate the effects of three strategies that could potentially address these issues. Firstly, instead of predicting the full triplet we opt to predict each element separately. Secondly, we investigate the use of shared network parameters to perform these separate predictions in a multitask setting. Thirdly, we consider a class-selective batch construction strategy to expose the network to more of the many rare classes during mini-batch training. Our experiments demonstrate that batch construction can improve performance on the long tail, possibly at the expense of accuracy on the small number of dominating classes. We also find that a multitask model neither improves nor impedes performance in any significant way, but that its smaller size may be beneficial.
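The abstract does not spell out how class-selective batches are built, so the following is only a minimal Python sketch of one plausible reading: each mini-batch first picks a few classes uniformly at random, then draws a fixed number of samples from each, so that rare classes appear far more often than under uniform sampling over examples. The function name `class_selective_batches`, its parameters, and the toy predicate labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def class_selective_batches(labels, classes_per_batch, samples_per_class,
                            num_batches, seed=0):
    """Yield lists of sample indices, one list per mini-batch.

    Assumption (not from the paper): classes are chosen uniformly at random,
    regardless of how many training examples each class has.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)

    for _ in range(num_batches):
        # Pick distinct classes for this batch.
        chosen = rng.sample(classes, min(classes_per_batch, len(classes)))
        batch = []
        for c in chosen:
            # Sample with replacement so rare classes can still fill their quota.
            batch.extend(rng.choices(by_class[c], k=samples_per_class))
        rng.shuffle(batch)
        yield batch


# Toy long-tailed label set: one dominating predicate and three rare ones.
labels = ["on"] * 90 + ["ride"] * 5 + ["feed"] * 3 + ["kick"] * 2
batches = list(class_selective_batches(labels, classes_per_batch=3,
                                       samples_per_class=4, num_batches=5))
```

In this toy setup every batch holds 12 indices spread over 3 distinct predicates, even though "on" accounts for 90% of the data. Oversampling rare classes in this way is also consistent with the trade-off the abstract reports, where gains on the long tail may come at the expense of accuracy on the dominating classes.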
Pages: 713-718
Number of pages: 6