Co-occurrence word model for news media hotspot mining-text mining method design

Authors
Zhang X. [1]
Ding T. [2]
Affiliations
[1] School of Arts and Creative Technologies, The University of York, York
[2] Department of Statistical Science, University College London, London
Keywords
co-occurrence word model; hot spot discovery; text mining; theme word extraction; hot news topic
DOI
10.3934/mbe.2024238
Abstract
With the rapid growth of online media, more people are obtaining their information from it. However, traditional hotspot mining algorithms cannot capture hot topics accurately or quickly. To address the poor accuracy and timeliness of current news media hotspot mining methods, this paper proposes a hotspot mining method based on the co-occurrence word model. First, a new co-occurrence word model based on word weight is proposed. Then, for key phrase extraction, a hotspot mining algorithm based on the co-occurrence word model and an improved smooth inverse frequency rank (SIFRANK) is designed. Finally, the Spark computing framework is introduced to improve computing efficiency. The experimental results show that the new word discovery algorithm discovered 16871 and 17921 new words in the Weibo Short News and Weibo Short Text datasets, respectively. The heat weight values of the keywords obtained by the improved SIFRANK reach 0.9356, 0.9991, and 0.6117. On the Covid19 Tweets dataset, the accuracy is 0.6223, the recall is 0.7015, and the F1 value is 0.6605. On the President-elects Tweets dataset, the accuracy is 0.6418, the recall is 0.7162, and the F1 value is 0.6767. After the Spark computing framework is applied, the running speed improves significantly. The hotspot mining method proposed in this study, based on text mining and the co-occurrence word model, improves the accuracy and efficiency of mining hot topics and has great practical significance. ©2024 the Author(s), licensee AIMS Press.
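The abstract does not define the weighted co-occurrence word model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a word pair's score is the sum, over documents containing both words, of the product of the two word weights. The function name `weighted_cooccurrence`, the `default_weight` parameter, and the sample weights are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def weighted_cooccurrence(docs, weights, default_weight=1.0):
    """Score each unordered word pair by summing the product of the two
    word weights over every document in which both words appear.

    This scoring rule is an illustrative assumption, not the paper's model.
    """
    scores = Counter()
    for tokens in docs:
        # Distinct, sorted words: each pair counts once per document
        # and always appears in the same (alphabetical) order.
        for a, b in combinations(sorted(set(tokens)), 2):
            scores[(a, b)] += weights.get(a, default_weight) * weights.get(b, default_weight)
    return scores


# Toy corpus of already-tokenized short texts (made-up data).
docs = [
    ["spark", "news", "hotspot"],
    ["news", "hotspot", "mining"],
    ["spark", "mining"],
]
weights = {"hotspot": 2.0, "news": 1.5}  # illustrative word weights only

scores = weighted_cooccurrence(docs, weights)
top_pair, top_score = scores.most_common(1)[0]
print(top_pair, top_score)
```

In this toy corpus the pair that co-occurs most often and carries the highest weights ranks first. A distributed version on Spark would compute the same per-document pair scores in parallel and then sum them by key.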
Pages: 5411-5429
Number of pages: 18