Font Group Identification Using Reconstructed Fonts

Cited by: 0
Authors
Cutter, Michael P. [1]
van Beusekom, Joost [1]
Shafait, Faisal [1]
Breuel, Thomas M. [1]
Affiliations
[1] Univ Kaiserslautern, D-67663 Kaiserslautern, Germany
Keywords
Font Reconstruction; Font Identification; Reconstructed Font; Token Matching
DOI
10.1117/12.873398
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ideally, digital versions of scanned documents should be represented in a format that is searchable, compressed, highly readable, and faithful to the original. These goals can theoretically be achieved through OCR and font recognition, by re-typesetting the document text in the original fonts. However, OCR and font recognition remain hard problems, and many historical documents use fonts that are not available in digital form. It is therefore desirable to reconstruct fonts with vector glyphs that approximate the shapes of the letters that form a font. In this work, we address the grouping of tokens in a token-compressed document into candidate fonts. This permits us to incorporate font information into token-compressed images even when the original fonts are unknown or unavailable in digital format. This paper extends previous work in font reconstruction by proposing and evaluating an algorithm that assigns a font to every character within a document, a necessary step in representing a scanned document image with a reconstructed font. Through our evaluation method, we have measured a 98.4% accuracy for the assignment of letters to candidate fonts in multi-font documents.
Pages: 8