STRUCTURAL TOPIC MINING IN WEB COLLECTIONS

Times cited: 0
Authors
Garza, Sara E. [1]
Brena, Ramon F. [2]
Affiliations
[1] UANL, Sch Mech & Elect Engn, Postgrad Div Computat & Mechatron, San Nicolas De Los Garza, NL, Mexico
[2] Tecnol Monterrey, Sch Engn & Informat Technol, Campus Monterrey, NL, Mexico
Keywords
Topic Mining; Graph Clustering; Structure; Wikipedia
DOI
Not available
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper introduces structural topic mining, an approach for discovering and describing thematically related groups of documents in large collections. A collection is viewed as a directed graph in which vertices represent documents and arcs represent connections among them. Because a document is likely to have more connections to documents on the same theme, we assume that topics have the structure of a graph cluster, i.e. a group of vertices with more arcs to the inside of the group and fewer arcs to the outside. Topics can therefore be discovered by clustering the document graph; a local clustering approach is used for scalability. We also extract properties (keywords and representative documents) from the clusters. The approach was tested on Wikipedia, and the resulting clusters do correspond to topics, which shows that topic mining can be treated as a graph clustering problem. Comparative results suggest considerable quality at a low cost.
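The idea of finding a topic as a graph cluster around a starting document can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the scoring function `relative_density`, the greedy expansion in `local_cluster`, the `max_size` cap, and the toy graph are all assumptions made for illustration. The sketch only looks at vertices reachable from the current cluster, which is what makes it "local".

```python
from typing import Dict, Hashable, Set

# Directed document graph: each document maps to the set of documents it links to.
Graph = Dict[Hashable, Set[Hashable]]


def relative_density(graph: Graph, cluster: Set[Hashable]) -> float:
    """Fraction of the cluster's outgoing arcs that stay inside the cluster.

    Hypothetical score chosen for this sketch; the paper may use another one.
    """
    internal = 0
    external = 0
    for u in cluster:
        for v in graph.get(u, ()):
            if v in cluster:
                internal += 1
            else:
                external += 1
    total = internal + external
    return internal / total if total else 0.0


def local_cluster(graph: Graph, seed: Hashable, max_size: int = 50) -> Set[Hashable]:
    """Greedily grow a cluster from `seed`, inspecting only its neighborhood."""
    cluster = {seed}
    score = relative_density(graph, cluster)
    while len(cluster) < max_size:
        # Candidates are documents linked from the cluster but not yet in it.
        frontier = {v for u in cluster for v in graph.get(u, ())} - cluster
        best, best_score = None, score
        for candidate in sorted(frontier, key=str):  # sorted for determinism
            s = relative_density(graph, cluster | {candidate})
            if s > best_score:
                best, best_score = candidate, s
        if best is None:  # no candidate improves the score: stop growing
            break
        cluster.add(best)
        score = best_score
    return cluster


# Toy graph (assumed data): two densely linked triples joined by one arc c -> x.
graph: Graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "x"},
    "x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"},
}

if __name__ == "__main__":
    print(sorted(local_cluster(graph, "a")))
    print(sorted(local_cluster(graph, "x")))
```

On this toy graph, growing from "a" stops at the first triple because adding "x" would lower the share of arcs that stay inside the group. Keywords or representative documents could then be derived per cluster, which the sketch leaves out.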
Pages: 271 - 285
Page count: 15
Related papers
50 in total
  • [1] Web Service Orchestration Topic Mining
    Chu, Victor W.
    Wong, Raymond K.
    Chi, Chi-Hung
    Hung, Patrick C. K.
    [J]. 2014 IEEE 21ST INTERNATIONAL CONFERENCE ON WEB SERVICES (ICWS 2014), 2014, : 225 - 232
  • [2] Graph Local Clustering for Topic Detection in Web Collections
    Garza, Sara E.
    Brena, Ramon
    [J]. LA-WEB: 2009 LATIN AMERICAN WEB CONGRESS, 2009, : 207 - 213
  • [3] Topic mining on web-shared videos
    Liu, Lu
    Rui, Yong
    Sun, Li-Feng
    Yang, Bo
    Zhang, Jianwei
    Yang, Shi-Qiang
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 2145 - +
  • [4] Web Log Mining based on Website Topic
    Yu, Xiaobing
    Guo, Shunsheng
    Peng, Zhao
    [J]. SEVENTH WUHAN INTERNATIONAL CONFERENCE ON E-BUSINESS, VOLS I-III: UNLOCKING THE FULL POTENTIAL OF GLOBAL TECHNOLOGY, 2008, : 874 - 878
  • [5] Enhanced web information retrieval by topic tag mining
    Yong, Ding
    [J]. Journal of Convergence Information Technology, 2011, 6 (04) : 18 - 24
  • [6] Research on Web Data Mining Based on Topic Crawler
    Guo, Hongjian
    [J]. JOURNAL OF WEB ENGINEERING, 2021, 20 (04) : 1131 - 1143
  • [7] Research and improvement on topic distillation algorithm in web mining
    Wang, Bao-Yi
    Ding, Juan
    [J]. PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 1561 - +
  • [8] Common Topic Group Mining for Web Service Discovery
    Wang, Jian
    Gao, Panpan
    Ma, Yutao
    He, Keqing
    [J]. ADVANCES IN SERVICES COMPUTING, APSCC 2015, 2015, 9464 : 92 - 107
  • [9] An Information-Theoretic Approach for Unsupervised Topic Mining in Large Text Collections
    Ramirez, Eduardo H.
    Brena, Ramon F.
    [J]. 2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009, : 331 - 334
  • [10] Analyzing Web archives through Topic and Event Focused Sub-Collections
    Gossen, Gerhard
    Demidova, Elena
    Risse, Thomas
    [J]. PROCEEDINGS OF THE 2016 ACM WEB SCIENCE CONFERENCE (WEBSCI'16), 2016, : 291 - 295