Relevant Visual Semantic Context-Aware Attention-Based Dialog

Authors
Hong, Eugene Tan Boon [1 ]
Chong, Yung-Wey [1 ]
Wan, Tat-Chee [1 ]
Yau, Kok-Lim Alvin [2 ]
Affiliations
[1] Univ Sains Malaysia, Natl Adv IPv6 Ctr, George Town, Malaysia
[2] Univ Tunku Abdul Rahman, Lee Kong Chian Fac Engn & Sci LKCFES, Sungai Long, Selangor, Malaysia
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 76 / No. 02
Keywords
Visual dialog; context-aware; relevant history; computer vision; natural language processing; NETWORK;
DOI
10.32604/cmc.2023.038695
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The existing dataset for visual dialog comprises multiple rounds of questions and a diverse range of image contents. However, it faces challenges in overcoming visual semantic limitations, particularly in obtaining sufficient context from visual and textual aspects of images. This paper proposes a new visual dialog dataset called Diverse History-Dialog (DS-Dialog) to address the visual semantic limitations faced by the existing dataset. DS-Dialog groups relevant histories based on their respective Microsoft Common Objects in Context (MSCOCO) image categories and consolidates them for each image. Specifically, each MSCOCO image category consists of top relevant histories extracted based on their semantic relationships between the original image caption and historical context. These relevant histories are consolidated for each image, and DS-Dialog enhances the current dataset by adding new context-aware relevant history to provide more visual semantic context for each image. The new dataset is generated through several stages, including image semantic feature extraction, keyphrase extraction, relevant question extraction, and relevant history dialog generation. The DS-Dialog dataset contains about 2.6 million question-answer pairs, where 1.3 million pairs correspond to existing VisDial's question-answer pairs, and the remaining 1.3 million pairs include a maximum of 5 image features for each VisDial image, with each image comprising 10-round relevant question-answer pairs. Moreover, a novel adaptive relevant history selection is proposed to resolve missing visual semantic information for each image. DS-Dialog is used to benchmark the performance of previous visual dialog models and achieves better performance than previous models. Specifically, the proposed DS-Dialog model achieves an 8% higher mean reciprocal rank (MRR), 11% higher R@1%, 6% higher R@5%, 5% higher R@10%, and 8% higher normalized discounted cumulative gain (NDCG) compared to LF. 
DS-Dialog also achieves an improvement of approximately 1 point on R@k, mean, MRR, and NDCG compared to the original RVA, and of 2 points compared to LF and DualVD. These results demonstrate the importance of relevant semantic historical context in enhancing the visual semantic relationship between the textual and visual representations of the images and questions.
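The abstract reports MRR, R@k, mean rank, and NDCG without defining them. The sketch below shows one common way these ranking metrics are computed. It assumes each question comes with a 1-based rank of the ground-truth answer among the candidates and, for NDCG, a list of relevance scores already arranged in the model's ranked order. The function names and the choice of NDCG cutoff (the number of candidates with non-zero relevance) are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def mean_reciprocal_rank(gt_ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank is the 1-based position of the ground-truth answer."""
    return sum(1.0 / r for r in gt_ranks) / len(gt_ranks)


def recall_at_k(gt_ranks: Sequence[int], k: int) -> float:
    """Fraction of questions whose ground-truth answer is ranked in the top k."""
    return sum(1 for r in gt_ranks if r <= k) / len(gt_ranks)


def mean_rank(gt_ranks: Sequence[int]) -> float:
    """Average 1-based rank of the ground-truth answer (lower is better)."""
    return sum(gt_ranks) / len(gt_ranks)


def ndcg(relevance_in_ranked_order: Sequence[float]) -> float:
    """NDCG for one question.

    Assumption: the cutoff k is the number of candidates with non-zero
    relevance; other cutoffs are possible.
    """
    rel = list(relevance_in_ranked_order)
    k = sum(1 for r in rel if r > 0)
    if k == 0:
        return 0.0
    # Discounted gain of the model's top-k candidates.
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rel[:k]))
    # Discounted gain of the best possible ordering.
    ideal = sorted(rel, reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg


if __name__ == "__main__":
    ranks = [1, 2, 4, 10]  # hypothetical ground-truth ranks for four questions
    print("MRR :", mean_reciprocal_rank(ranks))
    print("R@1 :", recall_at_k(ranks, 1))
    print("R@5 :", recall_at_k(ranks, 5))
    print("R@10:", recall_at_k(ranks, 10))
    print("Mean:", mean_rank(ranks))
    print("NDCG:", ndcg([1.0, 0.0, 0.5]))
```

Higher MRR, R@k, and NDCG indicate better ranking, while a lower mean rank is better, so the reported gains should be read with that direction in mind.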
Pages: 2337-2354
Page count: 18