Clustering documents in evolving languages by image texture analysis

Cited by: 0
Authors
Darko Brodić
Alessia Amelio
Zoran N. Milivojević
Affiliations
[1] University of Belgrade, Technical Faculty in Bor
[2] DIMES, University of Calabria
[3] College of Applied Technical Sciences
Source
Applied Intelligence | 2017 / Vol. 46
Keywords
Coding; Clustering; Document analysis; Evolving languages; Image processing; Italian language; Language recognition; Statistical analysis
DOI
Not available
Abstract
This paper introduces a new method for clustering documents written in a language that has evolved across different historical periods, using the Italian language as an example. In the first phase, the text is transformed into a string over four numerical codes, which are derived from the energy profile of each letter and capture the height of the letter and its position in the text line. Each code represents a gray level, so the text is encoded as a 1-D image. In the second phase, texture features are extracted from this image to create document feature vectors. A new clustering algorithm is then applied to the feature vectors to discriminate documents from different historical periods of the language. Experiments are performed on a database of Italian documents written in Italian Vulgar and modern Italian. The results show that the proposed method perfectly identifies the historical period of the language of each document, outperforming well-known clustering algorithms commonly adopted for document categorization as well as state-of-the-art text-based language models.
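The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list which letters map to which code, which texture features are used, or how the clustering algorithm works. The letter classes below (base, ascender, descender, full) and the two co-occurrence features (energy and entropy) are assumptions chosen only to show the general idea, and all function names are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: typographic letter classes; the abstract does not give the mapping.
ASCENDERS = set("bdfhklt")   # letters rising above the x-height
DESCENDERS = set("gpqy")     # letters dropping below the baseline
FULL = set("j")              # letters spanning both zones


def letter_code(ch):
    """Map a letter to one of four codes (0..3); return None for non-letters."""
    if not ch.isalpha():
        return None
    if ch.isupper() or ch in ASCENDERS:
        return 1
    if ch in DESCENDERS:
        return 2
    if ch in FULL:
        return 3
    return 0  # base letter (fits within the x-height)


def encode(text):
    """Phase 1: turn text into a 1-D 'image' whose pixels are the four codes."""
    return [c for c in map(letter_code, text) if c is not None]


def cooccurrence(codes, levels=4):
    """Normalized co-occurrence matrix of adjacent code pairs."""
    counts = Counter(zip(codes, codes[1:]))
    total = sum(counts.values())
    return [[counts[(i, j)] / total if total else 0.0 for j in range(levels)]
            for i in range(levels)]


def texture_features(codes):
    """Phase 2 (ASSUMED features): energy and entropy of the co-occurrence matrix."""
    probs = [p for row in cooccurrence(codes) for p in row]
    energy = sum(p * p for p in probs)
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return [energy, entropy]


if __name__ == "__main__":
    sample = "Nel mezzo del cammin di nostra vita"
    print(encode(sample))
    print(texture_features(encode(sample)))
```

Each document would yield one such feature vector, and the vectors would then be passed to a clustering algorithm; the abstract only states that a new algorithm is used, so that step is left out here.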
Pages: 916-933
Number of pages: 17
Related papers
50 in total
  • [1] Clustering documents in evolving languages by image texture analysis
    Brodic, Darko
    Amelio, Alessia
    Milivojevic, Zoran N.
    APPLIED INTELLIGENCE, 2017, 46 (04) : 916 - 933
  • [2] Image Clustering using Color and Texture
    Maheshwari, Manish
    Silakari, Sanjay
    Motwani, Mahesh
    2009 1ST INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, COMMUNICATION SYSTEMS AND NETWORKS (CICSYN 2009), 2009 : 403 - +
  • [3] Document image characterization using a multiresolution analysis of the texture: application to old documents
    Journet, Nicholas
    Ramel, Jean-Yves
    Mullot, Remy
    Eglin, Veronique
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2008, 11 (01) : 9 - 18
  • [4] Document image characterization using a multiresolution analysis of the texture: application to old documents
    Nicholas Journet
    Jean-Yves Ramel
    Rémy Mullot
    Véronique Eglin
    International Journal of Document Analysis and Recognition (IJDAR), 2008, 11 : 9 - 18
  • [5] Identification of Fraktur and Latin Scripts in German Historical Documents Using Image Texture Analysis
    Brodic, Darko
    Amelio, Alessia
    Milivojevic, Zoran N.
    APPLIED ARTIFICIAL INTELLIGENCE, 2016, 30 (05) : 379 - 395
  • [6] Image texture clustering for prostate ultrasound diagnosis
    Sheppard, Mark A.
    Shih, Liwen
    2007 IEEE ULTRASONICS SYMPOSIUM PROCEEDINGS, VOLS 1-6, 2007 : 2473 - 2476
  • [7] Texture Image Segmentation Using Spectral Clustering
    Du, Hui
    Wang, Yuping
    Dong, Xiaopan
    Cheung, Yiu-ming
    HCI INTERNATIONAL 2015 - POSTERS' EXTENDED ABSTRACTS, PT I, 2015, 528 : 670 - 676
  • [8] A Modified Spatial Fuzzy Clustering Method Based on Texture Analysis for Ultrasound Image Segmentation
    Xu, Yan
    ISIE: 2009 IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS, 2009 : 741 - 746
  • [9] Image texture clustering based on locality preserving projection
    Xing R.
    Zhang Y.
    Zhang S.-Y.
    Zhu L.-Q.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2010, 44 (09) : 1654 - 1658
  • [10] IMAGE SEGMENTATION WITH TEXTURE CLUSTERING BASED JSEG
    Zhang, Jing
    Gao, Yong-Wei
    Feng, Sheng-Wei
    Chen, Zhi-Hua
    Yuan, Yu-Bo
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015 : 599 - 603