Chinese Hot Topic Extraction Based on Web Log

Cited by: 1
Authors
Li, Junhua [1]
Liu, Zhen [1]
Fu, Yan [1]
She, Li [1]
Affiliations
[1] University of Electronic Science and Technology of China, School of Computer Science and Engineering, Chengdu 610054, People's Republic of China
Keywords
Chinese hot topic extraction; theme extraction; web log
DOI
10.1109/WISM.2009.29
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional topic extraction methods consider only the text documents and ignore the users' contribution to the extraction process. However, users' browsing behavior on a topic is a stronger indicator of whether that topic is currently hot than the properties of the text documents themselves. In this paper, we propose a method for extracting "Chinese hot topics" from a set of text documents downloaded from the Internet according to the web log. The method has three major steps. First, we collect all related user information and textual material from the web according to the web log. Second, we extract the hot terms of each web page and compute the hotness of each theme based on click-through rate and a forgetting factor. Finally, we form hot topics by merging correlated themes on the basis of their common hot terms. The method handles massive textual data efficiently and offers a new, user-centered perspective on determining whether a topic is hot. We test the method on data from several portal sites and find that it efficiently detects the topics with the highest hotness.
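The second and third steps of the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the exponential form of the forgetting factor, the `decay` and `min_shared` parameters, and the function names `theme_hotness` and `merge_themes` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def theme_hotness(click_ages_days, decay=0.1):
    """Hotness of one theme: each click counts less the older it is.

    Assumption: the forgetting factor is modeled as exp(-decay * age).
    The paper's actual formula is not stated in the abstract.
    """
    return sum(math.exp(-decay * age) for age in click_ages_days)


def merge_themes(theme_terms, min_shared=2):
    """Group themes that share at least `min_shared` hot terms.

    theme_terms maps a theme name to its set of hot terms.
    Grouping is transitive and implemented with a small union-find.
    """
    names = list(theme_terms)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the root, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if len(theme_terms[a] & theme_terms[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in names:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())
```

Under these assumptions, a theme clicked twice today scores higher than one clicked twice ten days ago, and two themes are merged into one topic only when their hot-term sets overlap enough. A threshold on shared terms is one plausible reading of "merging on the basis of common hot terms"; a similarity measure such as Jaccard overlap would be another.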
Pages: 103-107
Number of pages: 5