Topic discovery based on text mining techniques

Cited by: 50
Authors
Pons-Porrata, Aurora
Berlanga-Llavori, Rafael
Ruiz-Shulcloper, Jose
Affiliations
[1] Univ Jaume 1, E-12071 Castellon de La Plana, Spain
[2] Univ Oriente, Ctr Pattern Recognit & Data Min, Santiago De Cuba 90500, Cuba
[3] Adv Technol Applicat Ctr, Havana, Cuba
Keywords
hierarchical clustering; text summarization; topic detection
DOI
10.1016/j.ipm.2006.06.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a topic discovery system aimed to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topic/subtopics, where each topic contains the set of documents that are related to it and a summary extracted from these documents. Summaries so built are useful to browse and select topics of interest from the generated hierarchies. Our proposal consists of a new incremental hierarchical clustering algorithm, which combines both partitional and agglomerative approaches, taking the main benefits from them. Finally, a new summarization method based on Testor Theory has been proposed to build the topic summaries. Experimental results in the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool. (c) 2006 Elsevier Ltd. All rights reserved.
Pages: 752-768
Number of pages: 17