A novel method of text line segmentation for historical document image of the uchen Tibetan

Cited by: 11
Authors
Li, Zhenjiang [1 ]
Wang, Weilan [1 ]
Chen, Yang [2 ]
Hao, Yusheng [1 ,2 ]
Affiliations
[1] Northwest Minzu Univ, Key Lab Chinas Ethn Languages & Informat Technol, Minist Educ, Lanzhou, Gansu, Peoples R China
[2] Northwest Minzu Univ, Sch Math & Comp Sci, Lanzhou, Gansu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Tibetan historical document; Text line segmentation; Baseline; Upper edge; Connected region analysis; Dataset; Image processing; OBJECTS;
DOI
10.1016/j.jvcir.2019.01.021
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text line segmentation is a key step in Tibetan historical document recognition. A novel method for text line segmentation was proposed based on the baseline in uchen Tibetan, and a new dataset was released, which was used to evaluate the results of text line segmentation of uchen Tibetan historical documents. In this paper, there were two steps for the proposed method: baseline detection and text line segmentation using the baseline. In baseline detection, the upper edges of all characters in the document were obtained by a horizontal gradient operator, then an edge connectivity definition was proposed by which the upper edge set was divided into disjoint subsets. Eligible sets were selected from these subsets, and the edges in these sets were joined in turn to obtain the baseline. In text line segmentation, the document image was truncated at the baseline position, then the adhesion regions were segmented again. Each connected region in the image was assigned to its nearest baseline. All connected regions belonging to the same baseline formed a text line. Experiments on the proposed dataset showed that the method could effectively avoid document distortion, the accuracy of text line segmentation was high, and the text line adhesion could be handled. (C) 2019 Published by Elsevier Inc.
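The final assignment step described in the abstract (each connected region goes to its nearest baseline, and regions sharing a baseline form a text line) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the `(region_id, centroid_y)` input format, and the reduction of each baseline to a single y-coordinate are all assumptions made for illustration. In the paper the baselines are built by joining upper-edge segments, so they need not be horizontal.

```python
from collections import defaultdict


def assign_regions_to_baselines(regions, baselines):
    """Group connected regions into text lines by nearest baseline.

    regions   -- list of (region_id, centroid_y) pairs (hypothetical format)
    baselines -- list of baseline y-positions, one per text line
                 (simplifying assumption: each baseline is one y-value)
    """
    lines = defaultdict(list)
    for region_id, cy in regions:
        # Index of the baseline with the smallest vertical distance.
        nearest = min(range(len(baselines)),
                      key=lambda i: abs(baselines[i] - cy))
        lines[nearest].append(region_id)
    return dict(lines)


# Toy data: five regions and three baselines.
regions = [("a", 12), ("b", 15), ("c", 48), ("d", 55), ("e", 90)]
baselines = [10, 50, 95]
text_lines = assign_regions_to_baselines(regions, baselines)
print(text_lines)
```

With a curved or skewed baseline, the distance would presumably be measured to the baseline's y-value at the region's own x-position rather than to a constant.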
Pages: 23-32
Page count: 10
Related papers
50 in total
  • [1] Character Segmentation for Historical Uchen Tibetan Document Based on Structure Attributes
    Zhang Ce
    Wang Weilan
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (20)
  • [2] Touching text line segmentation combined local baseline and connected component for Uchen Tibetan historical documents
    Hu, Pengfei
    Wang, Weilan
    Li, Qiaoqiao
    Wang, Tiejun
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (06)
  • [3] Experimental application of a Japanese historical document image synthesis method to text line segmentation
    Inuzuka, Naoto
    Suzuki, Tetsuya
    [J]. ICPRAM 2021 - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods, 2021, : 628 - 634
  • [4] Experimental Application of a Japanese Historical Document Image Synthesis Method to Text Line Segmentation
    Inuzuka, Naoto
    Suzuki, Tetsuya
    [J]. PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021, : 628 - 634
  • [5] An effective method for text line segmentation in historical document images
    Tien-Nam Nguyen
    Burie, Jean-Christophe
    Thi-Lan Le
    Schweyer, Anne-Valerie
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1593 - 1599
  • [6] A Novel Text Line Segmentation Method Based on Contour Curve Tracking for Tibetan Historical Documents
    Zhou, Fengming
    Wang, Weilan
    Lin, Qiang
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2018, 32 (10)
  • [7] A Recognition Method of the Similarity Character for Uchen Script Tibetan Historical Document Based on DNN
    Wang, Xiaojuan
    Wang, Weilan
    Li, Zhenjiang
    Wang, Yiqun
    Han, Yuehui
    Hao, Zhanjun
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PT III, 2018, 11258 : 52 - 62
  • [8] Tibetan Historical Document Recognition of Uchen Script using Baseline Information
    Li, Zhenjiang
    Wang, Weilan
    [J]. TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069
  • [9] A Text-Line Segmentation Method for Historical Tibetan Documents Based on Baseline Detection
    Li, Yanxing
    Ma, Longlong
    Duan, Lijuan
    Wu, Jian
    [J]. COMPUTER VISION, PT I, 2017, 771 : 356 - 367
  • [10] Character Detection and Segmentation of Historical Uchen Tibetan Documents in Complex Situations
    Zhang, Ce
    Wang, Weilan
    Liu, Huaming
    Zhang, Guowei
    Lin, Qiang
    [J]. IEEE ACCESS, 2022, 10 : 25376 - 25391