A Crowdsourcing Tool for Data Augmentation in Visual Question Answering Tasks

Authors
Silva, Ramon [1]
Fonseca, Augusto [1]
Goldschmidt, Ronaldo [2]
dos Santos, Joel [1]
Bezerra, Eduardo [1]
Affiliations
[1] CEFET RJ, Rio De Janeiro, Brazil
[2] Inst Mil Engn, Rio De Janeiro, Brazil
Keywords
Crowdsourcing; Human Computation; Data Augmentation; Image Annotation
DOI
10.1145/3243082.3267455
Chinese Library Classification number
TP39 [Computer Applications]
Discipline classification code
081203; 0835
Abstract
Visual Question Answering (VQA) is a task that connects the fields of Computer Vision and Natural Language Processing. Given an image I and a natural language question Q about I, a VQA model must produce a coherent answer R (also in natural language) to Q. A particular type of visual question is the binary question (i.e., one whose answer belongs to the set {yes, no}). Currently, deep neural networks are the state-of-the-art technique for training VQA models. Despite their success, applying neural networks to the VQA task requires a very large amount of data in order to produce models with adequate precision. Datasets currently used for training VQA models are the result of laborious manual (i.e., human) labeling processes. This context motivates the study of approaches to augment these datasets in order to train more accurate prediction models. This paper describes a crowdsourcing tool that can be used collaboratively to augment an existing VQA dataset for binary questions. Our tool actively integrates candidate items from an external data source in order to optimize the selection of queries to be presented to curators.
Pages: 137-140
Page count: 4