Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets

Cited by: 0
Authors
Abdulganiyu Abdu Yusuf
Feng Chong
Mao Xianling
Affiliations
[1] School of Computer Science and Technology
[2] Beijing Institute of Technology
[3] National Biotechnology Development Agency
[4] South-East Information Technology Institute of Beijing Institute of Technology
[5] Beijing Engineering Research Centre of High Volume Language Information Processing and Cloud Computing Application
Keywords
VQA; GCN; Performance measure; Fine-tuned representation; Reasoning datasets
DOI
Not available
Abstract
Graph neural networks have recently been widely used for vision-to-language tasks and have achieved promising results. In particular, a graph convolutional network (GCN) can capture the spatial and semantic relationships needed for visual question answering (VQA). However, applying a GCN to VQA datasets with different subtasks can lead to varying results. The training and testing set sizes, the evaluation metrics, and the hyperparameters used also affect VQA results. Subjecting these factors to a common evaluation scheme allows a fair evaluation of GCN-based results for VQA. This study proposes a GCN framework for VQA based on fine-tuned word representations to handle reasoning-type questions. The framework's performance is evaluated using various performance measures. The results obtained on the GQA and VQA 2.0 datasets slightly outperform most existing methods.
Pages: 40361-40370
Page count: 9
Source
Multimedia Tools and Applications, 2022, 81 (28): 40361-40370