Co-occurrence word model for news media hotspot mining-text mining method design

Authors
Zhang X. [1]
Ding T. [2]
Affiliations
[1] School of Arts and Creative Technologies, The University of York, York
[2] Department of Statistical Science, University College London, London
Keywords
co-occurrence word model; hot spot discovery; text mining; theme word extraction; hot news topic
DOI
10.3934/mbe.2024238
Abstract
With the rapid growth of online media, more people are obtaining their information from it. However, traditional hotspot mining algorithms cannot identify hot topics precisely or quickly. To address the poor accuracy and timeliness of current news media hotspot mining methods, this paper proposes a hotspot mining method based on the co-occurrence word model. First, a new co-occurrence word model based on word weight is proposed. Then, for key phrase extraction, a hotspot mining algorithm based on the co-occurrence word model and an improved smooth inverse frequency rank (SIFRANK) is designed. Finally, the Spark computing framework is introduced to improve computing efficiency. The experimental results show that the new word discovery algorithm discovered 16871 and 17921 new words in the Weibo Short News and Weibo Short Text datasets, respectively. The heat weight values of the keywords obtained by the improved SIFRANK reach 0.9356, 0.9991, and 0.6117. On the Covid19 Tweets dataset, the accuracy is 0.6223, the recall is 0.7015, and the F1 value is 0.6605. On the President-elects Tweets dataset, the accuracy is 0.6418, the recall is 0.7162, and the F1 value is 0.6767. After applying the Spark computing framework, the running speed improves significantly. The proposed co-occurrence-based text mining method for news media hotspots improves both the accuracy and the efficiency of mining hot topics and is of practical significance. ©2024 the Author(s), licensee AIMS Press.
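The abstract does not give the weighting scheme behind its co-occurrence word model, so the following is only a generic illustration of what a weighted co-occurrence table looks like, not the authors' method. It scores each word pair with the Dice coefficient over document-level co-occurrence. The function name `dice_cooccurrence` and the toy documents are assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations


def dice_cooccurrence(docs):
    """Score word pairs by the Dice coefficient of document-level co-occurrence.

    This is a stand-in weighting chosen for illustration; the paper's own
    word-weight formula is not stated in the abstract.
    """
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))  # sort so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return {
        (a, b): 2 * n_ab / (word_df[a] + word_df[b])
        for (a, b), n_ab in pair_df.items()
    }


# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["vaccine", "rollout", "delay"],
    ["vaccine", "rollout", "city"],
    ["election", "city"],
]
scores = dice_cooccurrence(docs)
```

Pairs that always appear together score 1.0, and pairs that never share a document are absent from the result. A hotspot miner could rank or threshold these scores to pick candidate phrases.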
Pages: 5411-5429
Page count: 18