Experimental Application of a Japanese Historical Document Image Synthesis Method to Text Line Segmentation

Cited by: 1
Authors
Inuzuka, Naoto [1]
Suzuki, Tetsuya [1]
Affiliations
[1] Shibaura Inst Technol, Grad Sch Syst Engn & Sci, Saitama, Japan
Source
PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM) | 2021
Keywords
Text Line Segmentation; Historical Document; Deep Learning; Data Synthesis
DOI
10.5220/0010330206280634
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We plan to use a machine-learning-based text line segmentation method in our transcription support system for handwritten Japanese historical documents written in Kana. Because manually annotating a large set of document images as training data is time-consuming, we are looking for a way to synthesize annotated document images. In this paper, we report a synthesis method for annotated document images designed for a Japanese historical document. To compare manually annotated Japanese historical document images with annotated document images synthesized by the method as training data for the object detection algorithm YOLOv3, we conducted text line segmentation experiments using that algorithm. The experimental results show that a model trained on the synthetic annotated document images is competitive with one trained on the manually annotated document images in terms of the intersection-over-union metric.
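For readers unfamiliar with the metric named in the abstract, intersection-over-union between a predicted and a reference bounding box can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Return the intersection-over-union of two axis-aligned boxes.

    Each box is a tuple (x_min, y_min, x_max, y_max). This box format is
    an assumption for illustration; it is not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

Under this definition, two identical boxes score 1.0 and two disjoint boxes score 0.0, so a higher value means a detected text line box agrees more closely with its annotation.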
Pages: 628-634
Number of pages: 7