Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks

Cited by: 2
Authors
Fizaine, Florian Come [1, 2]
Bard, Patrick [1 ]
Paindavoine, Michel [1 ]
Robin, Cecile [2, 3]
Bouye, Edouard [2 ]
Lefevre, Raphael [4 ]
Vinter, Annie [1 ]
Affiliations
[1] Univ Bourgogne, LEAD CNRS, F-21000 Dijon, France
[2] Arch Dept Cote dOr, F-21000 Dijon, France
[3] Inst Natl Patrimoine, F-75002 Paris, France
[4] Soc Natl Chemins Fer Francais, F-93200 St Denis, France
Keywords
deep learning; line segmentation; instance segmentation; Mask-RCNN; U-Net; historical document analysis; DOCUMENTS
DOI
10.3390/jimaging10030065
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient but share the same limitation: they require a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted: one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN on relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that light mask processing is a simple and efficient way to improve evaluation, and that Mask-RCNN leads to better HTR performance.
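To illustrate the kind of post-processing a U-Net-style semantic mask needs before it yields separate text lines, the following is a minimal sketch that labels 4-connected components in a binary mask. It is not the procedure used in the paper or in ARU-Net, dhSegment, or Doc-UFCN; the function name `label_line_instances` and the toy mask are assumptions made purely for illustration.

```python
from collections import deque


def label_line_instances(mask):
    """Split a binary mask into instances via 4-connected components.

    mask: list of rows, each a list of 0/1 values (1 = text-line pixel).
    Returns (labels, count): labels has the same shape as mask, and each
    connected region carries its own positive integer id.
    Hypothetical helper, not taken from the paper.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and labels[r][c] == 0:
                # Unlabelled foreground pixel: start a new instance.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy semantic mask: one intact line, a blank row, and one line with a gap.
toy_mask = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
labels, count = label_line_instances(toy_mask)
print(count)   # 3: the gap splits the second line into two instances
print(labels)  # [[1, 1, 1, 1], [0, 0, 0, 0], [2, 2, 0, 3]]
```

In this toy case the second line is fragmented into two instances by a single missing pixel, which is one reason a raw semantic mask can score poorly on line-level metrics without further processing. An instance segmentation network such as Mask-RCNN outputs one mask per detected object, so this labelling step is not needed.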
Pages: 18