Unified layout analysis and text localization framework

Cited by: 4
Authors
Vasilopoulos, Nikos [1]
Kavallieratou, Ergina [1]
Affiliations
[1] Univ Aegean, Dept Informat & Commun Syst Engn, Samos, Greece
Keywords
document images; page layout analysis; text localization; page segmentation; images; competition; extraction; identification; recognition; characters; algorithm
DOI
10.1117/1.JEI.26.1.013009
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract
A technique appropriate for extracting textual information from documents with complex layouts, such as newspapers and journals, is presented. It combines a foreground analysis method with a text localization method. The former segments the page into text and nontext blocks, whereas the latter detects text that may be embedded inside images, charts, diagrams, tables, etc. Detailed experiments on two public databases showed that mixing layout analysis and text localization techniques can lead to improved page segmentation and text extraction results. © 2017 SPIE and IS&T
Pages: 11