Similar document detection using self-organizing maps

Authors
Lensu, Anssi [1]
Koikkalainen, Pasi [1]
Affiliations
[1] Univ of Jyvaskyla, Jyvaskyla, Finland
Keywords
Algorithms - Data reduction - Data structures - Encoding (symbols)
DOI
Not available
Abstract
This paper describes how similar free-form textual documents can be matched using Self-Organizing Maps (SOMs). The analysis chain consists of three parts: first, similar words are located using an alphabet occurrence coding and a SOM; second, three-word contexts are clustered using codes obtained from the word SOM to build a context map; and third, whole documents are clustered using codes from the context SOM. Although this work is inspired by the WEBSOM method, it differs substantially because our goal was to build a fast system that tolerates the special features of different languages.
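The first stage of the chain can be illustrated with a minimal sketch. The abstract does not define the "alphabet occurrence coding", so this example assumes it is a normalized vector of per-letter counts over a 26-letter alphabet. The names `letter_counts`, `train_som`, and `best_unit`, the 4x4 grid, and the learning-rate and neighborhood schedules are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet; the paper targets several languages


def letter_counts(word):
    """Encode a word as a unit-length vector of letter counts (assumed coding)."""
    counts = [0.0] * len(ALPHABET)
    for ch in word.lower():
        idx = ALPHABET.find(ch)
        if idx >= 0:
            counts[idx] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm > 0 else counts


def best_unit(weights, x):
    """Return the index of the map unit whose weight vector is closest to x."""
    def sqdist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sqdist(weights[i]))


def train_som(data, rows=4, cols=4, epochs=50, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total
            lr = 0.5 * (1.0 - frac)  # learning rate decays linearly
            sigma = max(0.5, max(rows, cols) / 2.0 * (1.0 - frac))  # shrinking radius
            br, bc = grid[best_unit(weights, x)]
            for unit, (r, c) in enumerate(grid):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                w = weights[unit]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


words = ["document", "documents", "map", "maps", "cluster", "clusters"]
data = [letter_counts(w) for w in words]
word_som = train_som(data)
# Each word is represented by the index of its best-matching unit on the word map.
codes = {w: best_unit(word_som, letter_counts(w)) for w in words}
```

Under this assumed coding, words that share most of their letters receive nearby vectors, and anagrams receive identical ones. The second and third stages could plausibly reuse `train_som` on vectors built from the codes of three consecutive words and then on per-document summaries of context codes, but the exact construction is not given in the abstract.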
Pages: 174-177
Related papers
  • [11] The research of Self-Organizing Maps based on Document Collections
    Ding, Yi
    Fu, Xian
    FRONTIERS OF ADVANCED MATERIALS AND ENGINEERING TECHNOLOGY, PTS 1-3, 2012, 430-432 : 1232 - 1235
  • [12] Elaborate document clusters on nested self-organizing maps
    Ye, HL
    IC-AI'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS I-III, 2001, : 137 - 143
  • [13] Using self-organizing maps for anomaly detection in hyperspectral imagery
    Penn, BS
    2002 IEEE AEROSPACE CONFERENCE PROCEEDINGS, VOLS 1-7, 2002, : 1531 - 1535
  • [14] Dynamic muscle fatigue detection using self-organizing maps
    Moshou, D
    Hostens, I
    Papaioannou, G
    Ramon, H
    APPLIED SOFT COMPUTING, 2005, 5 (04) : 391 - 398
  • [15] Fuzzy optimized self-organizing maps and their application to document clustering
    Romero, Francisco P.
    Peralta, Arturo
    Soto, Andres
    Olivas, Jose A.
    Serrano-Guerrero, Jesus
    SOFT COMPUTING, 2010, 14 (08) : 857 - 867
  • [16] Self-Organizing Maps
    Matera, F
    SUBSTANCE USE & MISUSE, 1998, 33 (02) : 365 - 381
  • [17] Fuzzy optimized self-organizing maps and their application to document clustering
    Francisco P. Romero
    Arturo Peralta
    Andres Soto
    Jose A. Olivas
    Jesus Serrano-Guerrero
    Soft Computing, 2010, 14 : 857 - 867
  • [18] Detection of system changes for a pneumatic cylinder using self-organizing maps
    Zachrison, Anders
    Sethson, Magnus
    2006 IEEE CONFERENCE ON COMPUTER-AIDED CONTROL SYSTEM DESIGN, VOLS 1 AND 2, 2006, : 547 - +
  • [19] Detection of Fake Followers using Feature Ratio in Self-Organizing Maps
    Simon, Nitin T.
    Elias, Susan
    2017 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTED, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2017,
  • [20] Visualizing Syscalls using Self-organizing Maps for System Intrusion Detection
    Landauer, Max
    Skopik, Florian
    Wurzenberger, Markus
    Hotwagner, Wolfgang
    Rauber, Andreas
    ICISSP: PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS SECURITY AND PRIVACY, 2020, : 349 - 360