Finding captions in PDF-documents for semantic annotations of images

Cited by: 0
Authors
Maderlechner, Gerd [1]
Panyr, Jiri [1]
Suda, Peter [1]
Affiliation
[1] Siemens AG, Corp Technol, D-81730 Munich, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Portable Document Format (PDF) is widely used on the Web, but search engines index only its text content. The goal of this work is to extract and annotate the images in PDF documents so that they become searchable and can be annotated semantically. The first step extracts the images, converts them into a standard format such as JPEG, and identifies the corresponding image captions using the layout structure and geometric relationships. The second step applies linguistic-semantic analysis to the caption text in the context of the document domain. On a collection of PDF documents with about 3300 pages and 6500 images, the method achieves a precision of 95.5% and a recall of 88.8% for correct image captions.
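The abstract does not give the matching rules, so the following is only a minimal sketch of how a caption might be paired with an image from geometric relationships. It is not the authors' method. The `Box` type, the `find_caption` function, the thresholds `max_gap` and `min_overlap`, and the caption-prefix pattern are all hypothetical, and the coordinates assume a page where y grows downward.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box on a page; y grows downward (assumption)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str = ""


# Hypothetical pattern for typical caption openings such as "Fig. 3" or "Table 2".
CAPTION_PREFIX = re.compile(r"^(fig\.|figure|table|tab\.)\s*\d+", re.IGNORECASE)


def horizontal_overlap(a: Box, b: Box) -> float:
    """Shared horizontal extent as a fraction of the narrower box."""
    overlap = min(a.x1, b.x1) - max(a.x0, b.x0)
    width = min(a.x1 - a.x0, b.x1 - b.x0)
    return max(0.0, overlap) / width if width > 0 else 0.0


def find_caption(image: Box, text_blocks: List[Box],
                 max_gap: float = 40.0,
                 min_overlap: float = 0.5) -> Optional[Box]:
    """Return the text block most likely to caption `image`, or None."""
    best: Optional[Box] = None
    best_score = float("inf")
    for block in text_blocks:
        gap_below = block.y0 - image.y1   # >= 0 when the block sits below the image
        gap_above = image.y0 - block.y1   # >= 0 when the block sits above the image
        gap = gap_below if gap_below >= 0 else gap_above
        if gap < 0 or gap > max_gap:
            continue                      # vertically overlapping or too far away
        if horizontal_overlap(image, block) < min_overlap:
            continue                      # not aligned with the image column
        # Smaller vertical gap is better; a caption-like prefix earns a bonus.
        score = gap - (max_gap if CAPTION_PREFIX.match(block.text.strip()) else 0.0)
        if score < best_score:
            best, best_score = block, score
    return best
```

In this sketch a nearby block that starts with a caption-like prefix always beats a nearby block of ordinary body text, because the bonus equals the largest permitted gap. A real system would likely need more cues, such as font size or multi-column layout.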
Pages: 422-430
Number of pages: 9