Clustering documents in evolving languages by image texture analysis

Authors
Darko Brodić
Alessia Amelio
Zoran N. Milivojević
Affiliations
[1] University of Belgrade, Technical Faculty in Bor
[2] DIMES, University of Calabria
[3] College of Applied Technical Sciences
Source
Applied Intelligence | 2017 / Vol. 46
Keywords
Coding; Clustering; Document analysis; Evolving languages; Image processing; Italian language; Language recognition; Statistical analysis
DOI
Not available
Abstract
This paper introduces a new method for clustering documents written in a language that has evolved across different historical periods, using the Italian language as an example. In the first phase, the text is transformed into a string of four numerical codes derived from the energy profile of each letter, which defines the height of the letter and its location in the text line. Each code represents a gray level, so the text is encoded as a 1-D image. In the second phase, texture features are extracted from this image to create document feature vectors. A new clustering algorithm is then applied to the feature vectors to discriminate between documents from different historical periods of the language. Experiments are performed on a database of Italian documents written in Italian Vulgar and modern Italian. The results show that the proposed method perfectly identifies the historical period of the language of each document, outperforming other well-known clustering algorithms generally adopted for document categorization as well as other state-of-the-art text-based language models.
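The first two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the letter-to-code table (`ASCENDERS`, `DESCENDERS`, `FULL`), the function names `encode` and `cooccurrence_features`, and the choice of an adjacent-pair co-occurrence histogram as the texture feature are all assumptions made here for illustration. The abstract states only that four codes are derived from letter height and position in the text line, and that texture features are extracted from the resulting 1-D image.

```python
from collections import Counter

# Hypothetical letter classes (assumed here, not taken from the paper):
# which lowercase letters rise above or drop below the body of the line.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")
FULL = set("j")  # assumed to occupy both the upper and lower zones


def encode(text):
    """Map each letter to one of four codes (gray levels 0..3)."""
    codes = []
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation
        if ch in FULL:
            codes.append(3)
        elif ch in DESCENDERS:
            codes.append(2)
        elif ch.isupper() or ch in ASCENDERS:
            codes.append(1)  # capitals treated as ascenders (assumption)
        else:
            codes.append(0)  # short letters such as a, c, e, o
    return codes


def cooccurrence_features(codes, levels=4):
    """Normalized histogram of adjacent code pairs in the 1-D image."""
    pairs = Counter(zip(codes, codes[1:]))
    total = sum(pairs.values())
    if total == 0:
        return [0.0] * (levels * levels)
    return [pairs[(i, j)] / total
            for i in range(levels) for j in range(levels)]


if __name__ == "__main__":
    image = encode("Nel mezzo del cammin di nostra vita")
    print(image)
    print(cooccurrence_features(image))
```

Each document yields a 16-element feature vector in this sketch; such vectors could then be passed to any clustering algorithm. The clustering algorithm proposed in the paper is not described in the abstract and is not reproduced here.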
Pages: 916-933
Number of pages: 17