Multi-modal Information Integration for Document Retrieval

Cited by: 3

Authors:

Hassan, Ehtesham [1]

Chaudhury, Santanu [1]

Gopal, M. [2]

Affiliations:

[1] Indian Inst Technol Delhi, Dept Elect Engn, Delhi, India

[2] SNU, Sch Engn, Gautam Buddha Nagar, India

Source:

2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR) | 2013

Keywords:

Document Indexing; Multi-modal Retrieval; Multiple Kernel Learning; TEXT; SPACE;

DOI:

10.1109/ICDAR.2013.243

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The paper proposes a novel multi-modal document image retrieval framework that exploits information from text and graphics regions. The framework applies a multiple kernel learning based hashing formulation to generate composite document indexes from different modalities. Existing multimedia management methods for imaged text documents have not addressed the requirements of old and degraded documents. As a subsequent contribution, we propose a novel multi-modal document indexing framework for retrieval of old and degraded text documents that combines OCRed text and image-based representations using learning. The proposed concepts are evaluated on sampled magazine cover pages and on documents in Devanagari script.
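
The abstract names the general idea (a weighted combination of per-modality kernels feeding a hashing step) without giving the formulation. The sketch below illustrates that general idea only and is not the authors' method: the kernel weights are fixed by hand rather than learned by multiple kernel learning, the hash bits come from random projections over kernel values to a set of anchor documents, and every name (`rbf`, `composite_kernel`, `make_hasher`, the toy feature vectors, the weights 0.6 and 0.4) is a hypothetical chosen for illustration.

```python
import math
import random


def rbf(u, v, gamma=1.0):
    """RBF kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def composite_kernel(doc_a, doc_b, weights):
    """Weighted sum of per-modality RBF kernels.

    ASSUMPTION: weights are fixed here; in a multiple kernel learning
    setting they would be learned from training data.
    """
    return sum(w * rbf(doc_a[m], doc_b[m]) for m, w in weights.items())


def make_hasher(anchors, weights, n_bits=8, seed=0):
    """Build a hash function from random projections in kernel space.

    Each bit thresholds a random, zero-sum linear combination of the
    composite kernel values between a document and the anchor documents.
    """
    rng = random.Random(seed)
    projections = []
    for _ in range(n_bits):
        coeffs = [rng.gauss(0.0, 1.0) for _ in anchors]
        mean = sum(coeffs) / len(coeffs)
        projections.append([c - mean for c in coeffs])  # center the coefficients

    def hash_doc(doc):
        k = [composite_kernel(doc, a, weights) for a in anchors]
        return tuple(
            1 if sum(c * kv for c, kv in zip(proj, k)) >= 0 else 0
            for proj in projections
        )

    return hash_doc


def hamming(code_a, code_b):
    """Number of differing bits between two hash codes."""
    return sum(x != y for x, y in zip(code_a, code_b))


# Hypothetical toy documents: one feature vector per modality.
docs = [
    {"text": [0.9, 0.1], "image": [0.2, 0.8]},
    {"text": [0.8, 0.2], "image": [0.3, 0.7]},
    {"text": [0.1, 0.9], "image": [0.9, 0.1]},
]
weights = {"text": 0.6, "image": 0.4}  # hypothetical fixed modality weights

hash_doc = make_hasher(anchors=docs, weights=weights, n_bits=8, seed=0)
index = [hash_doc(d) for d in docs]  # composite index: one binary code per document
```

Retrieval in this sketch would hash a query document with `hash_doc` and rank the indexed codes by `hamming` distance. Using anchors keeps the cost of hashing one document proportional to the number of anchors rather than the collection size.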


Pages: 1200-1204

Page count: 5

