Topic discovery based on text mining techniques

Cited by: 50
Authors
Pons-Porrata, Aurora
Berlanga-Llavori, Rafael
Ruiz-Shulcloper, Jose
Affiliations
[1] Univ Jaume 1, E-12071 Castellon de La Plana, Spain
[2] Univ Oriente, Ctr Pattern Recognit & Data Min, Santiago De Cuba 90500, Cuba
[3] Adv Technol Applicat Ctr, Havana, Cuba
Keywords
hierarchical clustering; text summarization; topic detection
DOI
10.1016/j.ipm.2006.06.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a topic discovery system aimed at revealing the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topics and subtopics, where each topic contains the set of documents related to it and a summary extracted from those documents. The summaries are useful for browsing the generated hierarchies and selecting topics of interest. Our proposal consists of a new incremental hierarchical clustering algorithm that combines partitional and agglomerative approaches, taking the main benefits of each. Finally, a new summarization method based on Testor Theory is proposed to build the topic summaries. Experimental results on the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool. © 2006 Elsevier Ltd. All rights reserved.
Pages: 752-768
Number of pages: 17