Topic discovery based on text mining techniques

Cited by: 50
Authors
Pons-Porrata, Aurora
Berlanga-Llavori, Rafael
Ruiz-Shulcloper, Jose
Affiliations
[1] Univ Jaume 1, E-12071 Castellon de La Plana, Spain
[2] Univ Oriente, Ctr Pattern Recognit & Data Min, Santiago De Cuba 90500, Cuba
[3] Adv Technol Applicat Ctr, Havana, Cuba
Keywords
hierarchical clustering; text summarization; topic detection
DOI
10.1016/j.ipm.2006.06.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a topic discovery system that aims to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topics and subtopics, where each topic contains the set of documents related to it and a summary extracted from these documents. The summaries built in this way are useful for browsing the generated hierarchies and selecting topics of interest. Our proposal consists of a new incremental hierarchical clustering algorithm that combines partitional and agglomerative approaches, taking the main benefits of each. In addition, a new summarization method based on Testor Theory is proposed to build the topic summaries. Experimental results on the TDT2 collection demonstrate the system's usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool. (c) 2006 Elsevier Ltd. All rights reserved.
Pages: 752-768
Number of pages: 17
Related papers
50 in total
  • [21] Improving Literature-Based Discovery with Advanced Text Mining
    Korhonen, Anna
    Guo, Yufan
    Baker, Simon
    Yetisgen-Yildiz, Meliha
    Stenius, Ulla
    Narita, Masashi
    Lio, Pietro
    COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS, CIBB 2014, 2015, 8623 : 89 - 98
  • [22] Digital Social Network Mining for Topic Discovery
    Moradianzadeh, Pooya
    Mohi, Maryam
    Moshkenani, Mohsen Sadighi
    ADVANCES IN COMPUTER SCIENCE AND ENGINEERING, 2008, 6 : 1000 - 1003
  • [23] A Survey on Text Mining Techniques
    Tandel, Sayali Sunil
    Jamadar, Abhishek
    Dudugu, Siddharth
    2019 5TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION SYSTEMS (ICACCS), 2019, : 1022 - 1026
  • [24] Procrustes techniques for text mining
    Balbi, Simona
    Misuraca, Michelangelo
    DATA ANALYSIS, CLASSIFICATION AND THE FORWARD SEARCH, 2006, : 227 - +
  • [25] On text mining techniques for personalization
    Aggarwal, CC
    Yu, PS
    NEW DIRECTIONS IN ROUGH SETS, DATA MINING, AND GRANULAR-SOFT COMPUTING, 1999, 1711 : 12 - 18
  • [26] Topic Features in Negative Customer Reviews: Evidence Based on Text Data Mining
    Li, Zhen
    Li, Fangzhou
    Xiao, Jing
    Yang, Zhi
    REVIEW OF SOCIONETWORK STRATEGIES, 2020, 14 (01): : 19 - 40
  • [27] User group based emotion detection and topic discovery over short text
    Feng, Jiachun
    Rao, Yanghui
    Xie, Haoran
    Wang, Fu Lee
    Li, Qing
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (03): : 1553 - 1587
  • [28] User group based emotion detection and topic discovery over short text
    Jiachun Feng
    Yanghui Rao
    Haoran Xie
    Fu Lee Wang
    Qing Li
    World Wide Web, 2020, 23 : 1553 - 1587
  • [29] Topic Features in Negative Customer Reviews: Evidence Based on Text Data Mining
    Zhen Li
    Fangzhou Li
    Jing Xiao
    Zhi Yang
    The Review of Socionetwork Strategies, 2020, 14 : 19 - 40
  • [30] Fuzzy topic modeling approach for text mining over short text
    Rashid, Junaid
    Shah, Syed Muhammad Adnan
    Irtaza, Aun
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (06)