Segmentation of Persian/Arabic printed text using ink spread effect

Authors
Shirali-Shahreza, Sajad [1]
Manzuri-Shalmani, M. T. [1]
Shirali-Shahreza, M. Hassan [1]
Affiliation
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
pattern recognition; page segmentation; Persian/Arabic document; OCR; image processing
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Optical character recognition (OCR) is widely used to convert written documents into digital form. One phase of OCR is page segmentation, in which the text regions of the input image must be located and text parts such as columns must be separated. This paper proposes a new method for segmenting printed Persian/Arabic text. The method is based on the Ink Spread Effect, a new idea with particular features, and its design takes the main features of the Persian/Arabic scripts into account. The method is skew-resistant and can segment text inside frames and tables, as well as text in regions with a gray background.
Pages: 3997-4000
Number of pages: 4