Improvisation of Dataset Efficiency in Visual Question Answering Domain

Cited by: 0
Authors
Mohamed, Sheerin Sitara Noor [1]
Srinivasan, Kavitha [1]
Affiliations
[1] Sri Sivasubramaniya Nadar Coll Engn, Dept Comp Sci & Engn, Kalavakkam 603110, India
Source
STATISTICS AND APPLICATIONS | 2022 / Vol. 20 / No. 02
Keywords
Medical VQA; Data augmentation; Label smoothing; Mixup; VGGNet; ResNet;
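DOI
Not available
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract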
The technological revolution is moving the world towards automation, and most activities are now performed with minimal human intervention. The medical domain is no exception: a few developments in the medical domain help both patients and physicians to some extent. As part of this advancement, Visual Question Answering (VQA) in the medical domain has evolved to help physicians and partially sighted people with clinical decision making and patient education. One of the main obstacles to this advancement is the data limitation problem. In this paper, two methods for handling the data limitation problem are explained and validated using pretrained models, namely VGGNet and ResNet. The two methods, label smoothing and mixup, are used to reduce hard samples and to augment the medical data. From the performance analysis, it is inferred that the highest accuracy and BLEU score, 0.297 and 0.313 respectively, are obtained on the improved dataset with ResNet, a significant improvement of 7.9% and 5.9% respectively.
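For readers unfamiliar with the two techniques named in the abstract, the following is a minimal sketch of their standard textbook definitions, not the authors' implementation. The abstract does not state the smoothing factor, the mixup parameter, or how the techniques are applied to image-question pairs, so `epsilon=0.1`, `alpha=0.2`, the function names, and the toy vectors below are all illustrative assumptions.

```python
import random


def smooth_labels(one_hot, epsilon=0.1):
    """Label smoothing: keep (1 - epsilon) of the mass on the given
    target and spread epsilon uniformly over all k classes.
    epsilon=0.1 is an assumed value, not taken from the paper."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Mixup: form one synthetic sample as a convex combination of two
    (feature vector, label vector) pairs with mixing weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha).
    alpha=0.2 is an assumed value, not taken from the paper."""
    return rng.betavariate(alpha, alpha)


# Toy data: 3 answer classes and 4-dimensional stand-in feature vectors.
y_first = smooth_labels([1.0, 0.0, 0.0])
y_second = smooth_labels([0.0, 1.0, 0.0])
mixed_x, mixed_y = mixup(
    [1.0, 1.0, 0.0, 0.0], y_first,
    [0.0, 0.0, 1.0, 1.0], y_second,
    lam=0.75,  # fixed here for reproducibility; normally sample_lambda()
)
```

Both operations keep the label vector a valid probability distribution (its entries still sum to 1), which is why the smoothed or mixed targets can be used directly with a cross-entropy loss. In practice the mixing weight would be drawn per batch or per pair with `sample_lambda()` rather than fixed.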
Pages: 279-289
Page count: 11