Experimental Application of a Japanese Historical Document Image Synthesis Method to Text Line Segmentation

Cited by: 1
Authors
Inuzuka, Naoto [1]
Suzuki, Tetsuya [1]
Affiliations
[1] Shibaura Inst Technol, Grad Sch Syst Engn & Sci, Saitama, Japan
Keywords
Text Line Segmentation; Historical Document; Deep Learning; Data Synthesis
DOI
10.5220/0010330206280634
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We plan to use a machine-learning-based text line segmentation method in our transcription support system for handwritten Japanese historical documents written in Kana. Because manually annotating a large set of document images as training data is time consuming, we are looking for a method to synthesize annotated document images. In this paper, we report a synthesis method for annotated document images designed for a Japanese historical document. To compare manually annotated Japanese historical document images with annotated document images synthesized by the method as training data for the object detection algorithm YOLOv3, we conducted text line segmentation experiments using that algorithm. The experimental results show that a model trained on the synthetic annotated document images is competitive with one trained on the manually annotated document images in terms of the intersection-over-union (IoU) metric.
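The abstract compares models by intersection-over-union (IoU). As a minimal sketch of that metric for axis-aligned bounding boxes, such as a detected text line box against an annotated one, the following Python code may help. The `Box` type, the `area` and `iou` function names, and the corner-coordinate convention are assumptions made for illustration; they are not taken from the paper, which may compute the metric differently.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its corners (hypothetical convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    # A box with non-positive width or height has zero area.
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    return width * height


def iou(a: Box, b: Box) -> float:
    # The overlap region is bounded by the innermost edges of the two boxes.
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    inter_area = area(overlap)
    union_area = area(a) + area(b) - inter_area
    # Two degenerate boxes have no union; report zero overlap in that case.
    return inter_area / union_area if union_area > 0 else 0.0
```

An IoU of 1.0 means the predicted and annotated boxes coincide, and 0.0 means they do not overlap at all.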
Pages: 628-634
Number of pages: 7
Related papers
50 in total
  • [1] Experimental application of a Japanese historical document image synthesis method to text line segmentation
    Inuzuka, Naoto
    Suzuki, Tetsuya
    [J]. ICPRAM 2021 - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods, 2021, : 628 - 634
  • [2] A novel method of text line segmentation for historical document image of the uchen Tibetan
    Li, Zhenjiang
    Wang, Weilan
    Chen, Yang
    Hao, Yusheng
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 61 : 23 - 32
  • [3] An effective method for text line segmentation in historical document images
    Tien-Nam Nguyen
    Burie, Jean-Christophe
    Thi-Lan Le
    Schweyer, Anne-Valerie
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1593 - 1599
  • [4] A simple text/graphic separation method for document image segmentation
    Zirari, F.
    Ennaji, A.
    Nicolas, S.
    Mammass, D.
    [J]. 2013 ACS INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2013,
  • [5] Text segmentation in degraded historical document images
    Kavitha, A. S.
    Shivakumara, P.
    Kumar, G. H.
    Lu, Tong
    [J]. EGYPTIAN INFORMATICS JOURNAL, 2016, 17 (02) : 189 - 197
  • [6] Text Line Segmentation in Historical Newspapers
    Lenc, Ladislav
    Martinek, Jiri
    Kral, Pavel
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2022, PT II, 2023, 13589 : 35 - 48
  • [7] Text line extraction for historical document images
    Saabni, Raid
    Asi, Abedelkadir
    El-Sana, Jihad
    [J]. PATTERN RECOGNITION LETTERS, 2014, 35 : 23 - 33
  • [8] A two-step framework for text line segmentation in historical Arabic and Latin document images
    Olfa Mechi
    Maroua Mehri
    Rolf Ingold
    Najoua Essoukri Ben Amara
    [J]. International Journal on Document Analysis and Recognition (IJDAR), 2021, 24 : 197 - 218
  • [9] A two-step framework for text line segmentation in historical Arabic and Latin document images
    Mechi, Olfa
    Mehri, Maroua
    Ingold, Rolf
    Essoukri Ben Amara, Najoua
    [J]. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2021, 24 (03) : 197 - 218
  • [10] iDocChip: A Configurable Hardware Architecture for Historical Document Image Processing: Multiresolution Morphology-based Text and Image Segmentation
    Menbere Kina Tekleyohannes
    Vladimir Rybalkin
    Muhammad Mohsin Ghaffar
    Javier Alejandro Varela
    Norbert Wehn
    Andreas Dengel
    [J]. International Journal of Parallel Programming, 2021, 49 : 253 - 284