Co-occurrence word model for news media hotspot mining-text mining method design

Cited by: 0
Authors
Zhang X. [1]
Ding T. [2]
Affiliations
[1] School of Arts and Creative Technologies, The University of York, York
[2] Department of Statistical Science, University College London, London
Keywords
co-occurrence word model; hot spot discovery; text mining; theme word extraction; hot news topic
DOI
10.3934/mbe.2024238
Abstract
With the rapid growth of online media, more people are obtaining information online. However, traditional hotspot mining algorithms cannot identify hot topics precisely and quickly. To address the poor accuracy and timeliness of current news media hotspot mining methods, this paper proposes a hotspot mining method based on the co-occurrence word model. First, a new co-occurrence word model based on word weight is proposed. Then, for key phrase extraction, a hotspot mining algorithm based on the co-occurrence word model and an improved smooth inverse frequency rank (SIFRANK) is designed. Finally, the Spark computing framework is introduced to improve computing efficiency. The experimental results show that the new word discovery algorithm discovered 16871 and 17921 new words in the Weibo Short News and Weibo Short Text datasets, respectively. The heat weight values of the keywords obtained by the improved SIFRANK reach 0.9356, 0.9991, and 0.6117. On the Covid19 Tweets dataset, the accuracy is 0.6223, the recall is 0.7015, and the F1 value is 0.6605. On the President-elects Tweets dataset, the accuracy is 0.6418, the recall is 0.7162, and the F1 value is 0.6767. After the Spark computing framework is applied, the running speed improves significantly. The proposed text mining method for news media hotspots, based on the co-occurrence word model, improves the accuracy and efficiency of mining hot topics and is of practical significance. ©2024 the Author(s), licensee AIMS Press.
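As a rough illustration of what a co-occurrence word model starts from, the sketch below counts how often two words appear near each other in tokenized texts. It is a plain sliding-window count written for this record, not the weight-based model or the improved SIFRANK described in the abstract; the function name `cooccurrence_counts`, the `window` parameter, and the sample tokens are all hypothetical.

```python
from collections import Counter


def cooccurrence_counts(docs, window=3):
    """Count unordered word pairs that fall within `window` consecutive tokens.

    `docs` is a list of already-tokenized texts (lists of strings).
    This is plain counting; no word weights are applied (assumption:
    the paper's weighting scheme is not given in the abstract).
    """
    counts = Counter()
    for tokens in docs:
        for i in range(len(tokens)):
            span = tokens[i:i + window]
            head = span[0]
            for other in span[1:]:
                if other != head:
                    # Sort the pair so (a, b) and (b, a) share one key.
                    counts[tuple(sorted((head, other)))] += 1
    return counts


# Hypothetical sample input: two short tokenized headlines.
sample_docs = [
    ["spark", "speeds", "hotspot", "mining"],
    ["hotspot", "mining", "news"],
]
pair_counts = cooccurrence_counts(sample_docs, window=3)
print(pair_counts.most_common(3))
```

Pairs with high counts are candidates for phrases or topic words; a weighted variant would scale each increment by some per-word score instead of adding 1.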
Pages: 5411-5429
Page count: 18