Chinese Hot Topic Extraction Based on Web Log

Cited by: 1
Authors
Li, Junhua [1]
Liu, Zhen [1]
Fu, Yan [1]
She, Li [1]
Affiliation
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610054, Peoples R China
Keywords
Chinese hot topic extraction; theme extraction; web log
DOI
10.1109/WISM.2009.29
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional topic extraction methods consider only the text of documents and ignore the users' contribution to the extraction process. However, how users browse a topic is a stronger indicator of whether that topic is currently hot than the properties of the text itself. In this paper, we propose a method for extracting "Chinese hot topics" from a set of text documents downloaded from the Internet, guided by the web log. The method has three major steps. First, we collect all relevant user information and textual material from the web according to the web log. Second, we extract the hot terms of each web page and compute the hotness of each theme from its click-through rate and a forgetting factor. Finally, we form hot topics by merging correlated themes on the basis of their common hot terms. The method handles massive textual data efficiently and adds a user-side perspective to deciding whether a topic is hot. We test the method on data from several portal sites and find that it efficiently detects the topics with the highest hotness.
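The abstract does not give the formulas, so the following is only a rough sketch of how the second and third steps might look. It assumes that hotness is a sum of daily click-through rates discounted by an exponential forgetting factor, and that themes are merged when they share a minimum number of hot terms. The function names, the `decay` and `min_shared` parameters, and the exponential form are all illustrative assumptions, not the authors' definitions.

```python
import math
from collections import defaultdict


def theme_hotness(daily_ctr, today, decay=0.5):
    """Hypothetical hotness score: each day's click-through rate is
    discounted by exp(-decay * age_in_days), so older clicks count less."""
    return sum(
        ctr * math.exp(-decay * (today - day))
        for day, ctr in daily_ctr.items()
        if day <= today
    )


def merge_themes(theme_terms, min_shared=2):
    """Group themes that share at least `min_shared` hot terms.

    theme_terms maps a theme name to its set of hot terms. Grouping is
    transitive and implemented with a small union-find structure.
    """
    names = list(theme_terms)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if len(theme_terms[a] & theme_terms[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in names:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())
```

With this scoring, a theme clicked today outranks one with the same click-through rate several days ago, and two themes whose hot-term sets overlap enough end up in the same topic.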
Pages: 103-107
Number of pages: 5