Weakly Supervised Hashing with Reconstructive Cross-modal Attention

Cited by: 0

Authors:

Du, Yongchao [1]

Wang, Min [2]

Lu, Zhenbo [2]

Zhou, Wengang [1]

Li, Houqiang [1]

Affiliations:

[1] Univ Sci & Technol China, Hefei 230027, Peoples R China

[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2023 / Vol. 19 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

Weakly supervised hashing; attention; QUANTIZATION;

DOI:

10.1145/3589185

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

On many popular social websites, images are usually associated with metadata such as textual tags, which carry semantic information relevant to the image and can be used to supervise representation learning for image retrieval. However, these user-provided tags are usually noisy, so the main challenge lies in mining the potentially useful information from them. Many previous works simply treat all tags equally when generating supervision, which inevitably distracts network learning. To this end, we propose a new framework, termed Weakly Supervised Hashing with Reconstructive Cross-modal Attention (WSHRCA), to learn compact visual-semantic representations with more reliable supervision for the retrieval task. Specifically, for each image-tag pair, the weak supervision from tags is refined by cross-modal attention, which takes the image feature as the query to aggregate the most content-relevant tags. As a result, tags with relevant content become more prominent while noisy tags are suppressed, which provides more accurate supervisory information. To improve the effectiveness of hash learning, the image embedding in WSHRCA is reconstructed from the hash code and further optimized by a cross-modal constraint, which explicitly improves hash learning. Experiments on two widely used datasets demonstrate the effectiveness of the proposed method for weakly supervised image retrieval. The code is available at https://github.com/duyc168/weakly-supervised-hashing.
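
The attention step described in the abstract, where the image feature acts as the query over the tags, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes plain scaled dot-product attention with no learned projections, and the function name `cross_modal_attention`, the 4-dimensional vectors, and the two example tags are all hypothetical.

```python
import math


def cross_modal_attention(image_feat, tag_embs):
    """Aggregate tag embeddings, using the image feature as the attention query.

    Assumption: scaled dot-product scoring without learned projections;
    the abstract does not specify the exact scoring function.
    """
    dim = len(image_feat)
    # Score each tag embedding (key) against the image feature (query).
    scores = [
        sum(q * k for q, k in zip(image_feat, tag)) / math.sqrt(dim)
        for tag in tag_embs
    ]
    # Numerically stable softmax turns scores into attention weights.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of tag embeddings gives the refined tag representation.
    refined = [
        sum(w * tag[i] for w, tag in zip(weights, tag_embs))
        for i in range(dim)
    ]
    return weights, refined


# Hypothetical toy data: one image feature and two tag embeddings.
image_feat = [1.0, 0.0, 1.0, 0.0]
tag_embs = [
    [0.9, 0.1, 0.8, 0.0],  # tag whose embedding aligns with the image
    [0.0, 1.0, 0.0, 1.0],  # tag unrelated to the image content (noise)
]
weights, refined = cross_modal_attention(image_feat, tag_embs)
print(weights)  # the aligned tag receives the larger weight
```

In this toy setting the tag aligned with the image receives the larger weight, so it dominates the aggregated representation while the unrelated tag is down-weighted, which mirrors the suppression of noisy tags that the abstract describes.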


Pages: 19
