DCF-VQA: COUNTERFACTUAL STRUCTURE BASED ON MULTI-FEATURE ENHANCEMENT

Cited by: 0

Authors:

Yang, Guan [1,2]

Ji, Cheng [1,2]

Liu, Xiaoming [1,2]

Zhang, Ziming [1,2]

Wang, Chen [1,2]

Affiliations:

[1] Zhongyuan Univ Technol, Sch Comp Sci, 41 Zhongyuan Middle Rd, Zhengzhou 450007, Henan, Peoples R China

[2] Zhongyuan Univ Technol, Henan Key Lab Publ Opin Intelligent Anal, 41 Zhongyuan Middle Rd, Zhengzhou 450007, Henan, Peoples R China

Source:

INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE | 2024 / Vol. 34 / No. 3

Keywords:

visual question answering; multi-feature enhancement; counterfactual; discrete cosine transform;

DOI:

10.61822/amcs-2024-0032

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Visual question answering (VQA) is a pivotal topic at the intersection of computer vision and natural language processing. This paper addresses the challenges of linguistic bias and bias fusion within invalid regions encountered in existing VQA models due to insufficient representation of multi-modal features. To overcome these issues, we propose a multi-feature enhancement scheme. This scheme involves the fusion of one or more features with the original ones, incorporating discrete cosine transform (DCT) features into the counterfactual reasoning framework. This approach harnesses fine-grained information and spatial relationships within images and questions, enabling a more refined understanding of the indirect relationship between images and questions. Consequently, it effectively mitigates linguistic bias and bias fusion within invalid regions in the model. Extensive experiments are conducted on multiple datasets, including VQA2 and VQA-CP2, employing various baseline models and fusion techniques, resulting in promising and robust performance.
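
The abstract's idea of fusing DCT features with the original ones can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify how the DCT features are computed or fused, so the function names (`dct_1d`, `dct_2d`, `fuse_with_dct`), the `keep` parameter, the toy 4x4 feature map, and the choice of plain concatenation as the fusion step are all assumptions made for illustration.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(grid):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in grid]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [column frequency][row frequency]; transpose it back.
    return [list(r) for r in zip(*cols)]

def fuse_with_dct(grid, keep=2):
    """Hypothetical fusion: original values followed by the keep x keep
    block of lowest-frequency DCT coefficients (simple concatenation)."""
    coeffs = dct_2d(grid)
    original = [v for row in grid for v in row]
    low_freq = [coeffs[i][j] for i in range(keep) for j in range(keep)]
    return original + low_freq

# Toy 4x4 "feature map" (assumed data, not taken from the paper).
feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
    [4.0, 5.0, 6.0, 7.0],
]
fused = fuse_with_dct(feature_map, keep=2)
print(len(fused))  # 16 original values plus 4 DCT coefficients
```

Concatenation is used here only because it is the simplest way to combine two feature sets; the paper reports experiments with various fusion techniques, which this sketch does not attempt to reproduce.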


Pages: 453-466

Page count: 14
