Script identification of document image analysis

Authors
Cheng, Juan [1]
Ping, Xijian [1]
Zhou, Guanwei [1]
Yang, Yang [1]
Affiliation
[1] Zhengzhou Informat Sci & Technol Inst, Zhengzhou, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Script identification prior to OCR is necessary in document image analysis. Each script has a unique spatial distribution and visual attributes that make it possible to distinguish it from other scripts. The key to a script identification algorithm is extracting effective discriminative features. By analyzing visual differences based on normalized histogram statistics, Chinese, Japanese, English, and Russian are each distinguished from the others, so that automatic identification of the four scripts is achieved.
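The abstract does not say which histogram the authors normalize or how they compare scripts, so the following is only a minimal sketch of one plausible reading. It assumes a binarized text-line image (1 = ink), uses the share of ink per horizontal band as the normalized histogram, and picks the closest reference by L1 distance. The function names, the band count, and the toy labels `uniform` and `middle` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def normalized_row_histogram(binary_img, bins=8):
    """Share of ink pixels in each of `bins` horizontal bands (assumed feature)."""
    img = np.asarray(binary_img, dtype=float)
    row_ink = img.sum(axis=1)                 # ink count per pixel row
    bands = np.array_split(row_ink, bins)     # group rows into horizontal bands
    hist = np.array([band.sum() for band in bands])
    total = hist.sum()
    # Normalize so the feature does not depend on image size or stroke count.
    return hist / total if total > 0 else hist

def nearest_script(feature, references):
    """Label of the reference histogram with the smallest L1 distance."""
    return min(references, key=lambda label: np.abs(feature - references[label]).sum())

# Toy data only: synthetic ink patterns, not real script samples.
dense = np.ones((16, 32), dtype=int)          # ink spread over every row
banded = np.zeros((16, 32), dtype=int)
banded[4:12, :] = 1                           # ink concentrated in the middle rows

references = {
    "uniform": normalized_row_histogram(dense),
    "middle": normalized_row_histogram(banded),
}

query = np.zeros((16, 32), dtype=int)
query[5:11, ::2] = 1                          # sparser ink, still in the middle rows
print(nearest_script(normalized_row_histogram(query), references))
```

Because the histogram is normalized, the sparse query still lands closest to the `middle` reference even though it contains far less ink. A real system would build the references from labeled samples of each script.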
Pages: 178 / +
Page count: 3