FARSI/ARABIC DOCUMENT IMAGE RETRIEVAL THROUGH SUB-LETTER SHAPE CODING

Cited by: 0
Authors
Bahmani, Zahra [1]
Azmi, Reza [1]
Affiliations
[1] Alzahra Univ, Dept Comp, Tehran, Iran
Keywords
Retrieval; shape code; sub-word; sub-letter; base line
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this paper, a novel method for recognition-free Farsi document retrieval is proposed. In this method, retrieval is performed by recognizing sub-letters and other letter elements, such as dots and signs like Sarkesh. First, in the preprocessing phase, lines and words are extracted using the blank space between them. In the next phase, each word is divided into its sub-words, where a sub-word is a sequence of joined letters. For each sub-word, the connectors between sub-letters are removed from its main body, and the remaining parts are recognized as sub-letters using their extracted features. The recognized sub-letters are encoded using a dictionary defined in this system. Finally, the document content is encoded, and this code can be used to retrieve words that occur in the document. Experimental results show the advantages of this method for the retrieval of printed Persian documents.
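The preprocessing and retrieval steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ink_runs`, `extract_lines`, `extract_words`, `find_word`), the `min_gap` threshold, and the example code strings are all hypothetical, and the sub-letter recognition and dictionary encoding steps are not reproduced because the abstract does not specify them. The sketch assumes a binarized page given as rows of 0/1 pixels, splits it into lines and words at blank gaps in the projection profiles, and then looks up a query code in a list of already encoded words.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumption)


def ink_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) spans of ink in a projection profile.

    A new span starts whenever at least `min_gap` consecutive entries
    are blank. `end` is exclusive.
    """
    runs = []
    start = None      # start index of the current span
    last_ink = None   # index of the most recent non-blank entry
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            elif i - last_ink - 1 >= min_gap:
                runs.append((start, last_ink + 1))
                start = i
            last_ink = i
    if start is not None:
        runs.append((start, last_ink + 1))
    return runs


def extract_lines(image: Image) -> List[Tuple[int, int]]:
    """Split the page into text lines at blank rows."""
    row_profile = [sum(row) for row in image]
    return ink_runs(row_profile, min_gap=1)


def extract_words(image: Image, line: Tuple[int, int],
                  min_gap: int) -> List[Tuple[int, int]]:
    """Split one text line into words at blank column gaps >= min_gap."""
    top, bottom = line
    width = len(image[0])
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    return ink_runs(col_profile, min_gap)


def find_word(encoded_words: List[str], query_code: str) -> List[int]:
    """Return the positions of words whose shape code equals the query code."""
    return [i for i, code in enumerate(encoded_words) if code == query_code]
```

In this sketch, retrieval reduces to exact matching of code strings, so a query word would first be encoded with the same dictionary as the document. The value of `min_gap` would need tuning per font and resolution, since gaps between sub-words inside one word are typically narrower than gaps between words.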
Pages: 661-665
Number of pages: 5