Character Segmentation for Historical Uchen Tibetan Document Based on Structure Attributes

Cited by: 1
Authors
Zhang Ce [1, 2]
Wang Weilan [1]
Affiliations
[1] Northwest Minzu Univ, Minist Educ, Key Lab Chinas Ethn Languages & Informat Technol, Lanzhou 730030, Gansu, Peoples R China
[2] Chongqing Univ Educ, Sch Math & Informat Engn, Chongqing 400065, Peoples R China
Keywords
image processing; historical Tibetan document; character block; local baseline; touching strokes detection and segmentation; strokes attribution;
DOI
10.3788/LOP202158.2010020
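Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;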
Abstract
Character segmentation is an important step in the image analysis and recognition of historical Tibetan documents. Historical Uchen Tibetan documents suffer from slanted text lines, overlapping, crossing, and touching strokes between characters, broken strokes, and noise interference. To address these problems, this paper proposes a character segmentation method for historical Uchen Tibetan documents based on structure attributes. First, a character block dataset of historical Uchen Tibetan documents is established. Then, the local baseline of each character block is detected, either from syllable point position information or by combining horizontal projection with line detection, and the character block is divided horizontally into two parts, above and below the baseline. An improved template matching algorithm detects touching strokes and the touching type above the baseline. A multi-direction, multipath touching character segmentation algorithm segments crossing and touching strokes. Finally, each stroke is attributed according to the structure attributes of Tibetan. Experimental results show that the proposed method effectively handles these challenging problems in character segmentation. The recall rate, precision rate, and F-Measure of character segmentation reached 96.52%, 98.24%, and 97.37%, respectively.
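As a rough illustration of two ideas in the abstract, the sketch below estimates a local baseline as the peak of the horizontal projection of a binary character block, splits the block at that row, and checks that the reported F-Measure matches the reported precision and recall. It is a simplified sketch, not the paper's algorithm: it uses the projection peak alone, with no syllable point information or line detection, and it assumes the F-Measure is the usual harmonic mean of precision and recall, which the abstract does not state. All function names and the toy block are hypothetical.

```python
from typing import List, Tuple

Block = List[List[int]]  # binary image rows: 1 = ink, 0 = background


def horizontal_projection(block: Block) -> List[int]:
    """Count the ink pixels in each row of the block."""
    return [sum(row) for row in block]


def estimate_baseline_row(block: Block) -> int:
    """Return the row with the most ink (the first such row on ties).

    Assumption: the Uchen head line is the densest row of the block.
    """
    profile = horizontal_projection(block)
    return profile.index(max(profile))


def split_at_baseline(block: Block) -> Tuple[Block, Block]:
    """Split into the rows above the baseline and the rows from it downward."""
    row = estimate_baseline_row(block)
    return block[:row], block[row:]


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical 5x6 character block with a dense head line at row 2.
toy_block = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
upper, lower = split_at_baseline(toy_block)
print(estimate_baseline_row(toy_block))            # 2
print(round(100 * f_measure(0.9824, 0.9652), 2))   # 97.37
```

Under the harmonic-mean assumption, the reported precision of 98.24% and recall of 96.52% give 97.37%, which agrees with the F-Measure stated in the abstract.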
Pages: 16
Related papers
19 in total
  • [1] Text extraction method for historical Tibetan document images based on block projections
    Duan L.-J.
    Zhang X.-Q.
    Ma L.-L.
    Wu J.
    [J]. Optoelectronics Letters, 2017, 13 (6) : 457 - 461
  • [2] Han Y H, 2019, INT J PATTERN RECOGN, V33
  • [3] Jin X, 2020, COMPUTER ENG APPL, V56, P135
  • [4] A Text-Line Segmentation Method for Historical Tibetan Documents Based on Baseline Detection
    Li, Yanxing
    Ma, Longlong
    Duan, Lijuan
    Wu, Jian
    [J]. COMPUTER VISION, PT I, 2017, 771 : 356 - 367
  • [5] Li Z J, 2019, CVC 2019 ADV COMPUTE, V943, P614
  • [6] A novel method of text line segmentation for historical document image of the uchen Tibetan
    Li, Zhenjiang
    Wang, Weilan
    Chen, Yang
    Hao, Yusheng
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 61 : 23 - 32
  • [7] Li ZJ, 2018, LECT NOTES COMPUTER, V1258, P52
  • [8] Li J C, 2021, J LASER OPTOELECTRON, V58
  • [9] Li Z J, 2019, J P SPIE, V1069
  • [10] Qi Y M, 2019, SCI TECHNOLOGY ENG, V19, P232