Hierarchical clustering of text corpora using suffix trees

Authors
Maslowska, I. [1]
Slowinski, R. [1]
Affiliations
[1] Poznan Tech Univ, Inst Comp Sci, PL-60965 Poznan, Poland
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel method for hierarchical clustering of text corpora that proves especially suitable for online clustering. Information overload, a current phenomenon in electronic document repositories and on the Internet in particular, poses a continuing challenge for researchers. Clustering has been proposed as a comprehensive information access method. We describe a system that automatically builds a navigable hierarchy of meaningful document groups. We claim that our system addresses two chief needs of Web users: the need for efficient access to up-to-date information on every available topic, and the need for an organized and meaningful presentation of the desired information.
Pages: 179-188
Number of pages: 10