TAM of SCNU at SemEval-2023 Task 1: FCLL: A Fine-grained Contrastive Language-Image Learning Model for Cross-language Visual Word Sense Disambiguation

Cited by: 0

Authors:

Yang, Qihao [1]

Li, Yong [1]

Wang, Xuelin [2]

Li, Shunhao [1]

Hao, Tianyong [1]

Affiliations:

[1] South China Normal Univ, Sch Comp Sci, Guangzhou, Peoples R China

[2] Jinan Univ, Coll Chinese Language & Culture, Guangzhou, Peoples R China

Source:

17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual Word Sense Disambiguation (WSD), as a fine-grained image-text retrieval task, aims to identify the images that are relevant to ambiguous target words or phrases. However, limited contextual information and the need for cross-linguistic background knowledge in text processing make this task challenging. To alleviate these difficulties, we propose a Fine-grained Contrastive Language-Image Learning (FCLL) model, which learns fine-grained image-text knowledge by employing a new fine-grained contrastive learning mechanism and enriches contextual information by establishing relationships between concepts and sentences. In addition, a new multimodal-multilingual knowledge base involving ambiguous target words is constructed for visual WSD. Experimental results on the benchmark datasets from SemEval-2023 Task 1 show that our FCLL ranks first in the overall evaluation with an average H@1 of 72.56% and an average MRR of 82.22%. The results demonstrate that FCLL is effective in inference on fine-grained language-vision knowledge. The source code and the knowledge base are publicly available at https://github.com/CharlesYang030/FCLL.
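
The two metrics reported in the abstract, H@1 (hit rate at rank 1) and MRR (mean reciprocal rank), can be illustrated with a minimal sketch. It assumes the standard retrieval definitions: each instance has exactly one gold image, and the system returns a ranked list of candidate images. The function names and the toy image identifiers are hypothetical and are not taken from the FCLL repository.

```python
from typing import List, Sequence, Tuple


def hit_at_1(ranked: Sequence[str], gold: str) -> float:
    """Return 1.0 if the top-ranked candidate is the gold image, else 0.0."""
    return 1.0 if ranked and ranked[0] == gold else 0.0


def reciprocal_rank(ranked: Sequence[str], gold: str) -> float:
    """Return 1/rank of the gold image (1-based), or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def evaluate(predictions: List[Sequence[str]], golds: List[str]) -> Tuple[float, float]:
    """Average H@1 and reciprocal rank over all instances."""
    if not golds or len(predictions) != len(golds):
        raise ValueError("predictions and golds must be non-empty and aligned")
    n = len(golds)
    h1 = sum(hit_at_1(r, g) for r, g in zip(predictions, golds)) / n
    mrr = sum(reciprocal_rank(r, g) for r, g in zip(predictions, golds)) / n
    return h1, mrr


# Toy data (hypothetical): the gold image "a" is ranked 1st, 2nd, and 3rd.
toy_predictions = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
toy_golds = ["a", "a", "a"]
toy_h1, toy_mrr = evaluate(toy_predictions, toy_golds)
print(f"H@1 = {toy_h1:.4f}, MRR = {toy_mrr:.4f}")
```

In the toy data only the first instance counts as a hit, while MRR also gives partial credit to the second and third instances, which is why MRR is never lower than H@1 under these definitions.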


Pages: 506-511

Page count: 6

