Co-occurrence word model for news media hotspot mining-text mining method design

Authors
Zhang X. [1]
Ding T. [2]
Affiliations
[1] School of Arts and Creative Technologies, The University of York, York
[2] Department of Statistical Science, University College London, London
Keywords
co-occurrence word model; hot spot discovery; text mining; theme word extraction; hot news topic
DOI
10.3934/mbe.2024238
Abstract
With the rapid growth of online media, more people are obtaining information from it. However, traditional hotspot mining algorithms cannot capture hot topics precisely and quickly. To address the poor accuracy and timeliness of current news media hotspot mining methods, this paper proposes a hotspot mining method based on the co-occurrence word model. First, a new co-occurrence word model based on word weight is proposed. Then, for key phrase extraction, a hotspot mining algorithm based on the co-occurrence word model and an improved smooth inverse frequency rank (SIFRANK) is designed. Finally, the Spark computing framework is introduced to improve computing efficiency. The experimental results show that the new word discovery algorithm discovered 16871 and 17921 new words in the Weibo Short News and Weibo Short Text datasets, respectively. The heat weight values of the keywords obtained by the improved SIFRANK reach 0.9356, 0.9991, and 0.6117. On the Covid19 Tweets dataset, the accuracy is 0.6223, the recall is 0.7015, and the F1 value is 0.6605. On the President-elects Tweets dataset, the accuracy is 0.6418, the recall is 0.7162, and the F1 value is 0.6767. After applying the Spark computing framework, the running speed improved significantly. The proposed text mining method for news media hotspots, based on the co-occurrence word model, improves the accuracy and efficiency of mining hot topics and has considerable practical significance. ©2024 the Author(s), licensee AIMS Press.
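As a rough illustration of the idea behind a weighted co-occurrence word model, the following minimal Python sketch counts how often word pairs appear in the same document and scales each count by per-word weights. It is not the authors' model: the function name `cooccurrence_counts`, the multiplicative weighting, and the toy documents are all assumptions made for illustration, since the abstract does not specify how the word weights are defined.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs, weights=None):
    """Sum weighted co-occurrences of word pairs over tokenized documents.

    The product-of-weights scheme is an assumption for illustration,
    not the weighting defined in the paper.
    """
    weights = weights or {}
    pairs = Counter()
    for tokens in docs:
        # Deduplicate and sort so each unordered pair is counted once
        # per document under a single canonical key.
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += weights.get(a, 1.0) * weights.get(b, 1.0)
    return pairs


# Toy, hypothetical documents (already tokenized).
docs = [
    ["vaccine", "rollout", "city"],
    ["vaccine", "city", "election"],
    ["election", "debate"],
]

counts = cooccurrence_counts(docs)
top_pair, top_score = counts.most_common(1)[0]
print(top_pair, top_score)
```

Pairs with high weighted counts are candidates for hot-topic phrases. A real pipeline would add tokenization, stop-word filtering, and a ranking step such as the improved SIFRANK mentioned in the abstract.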
Pages: 5411-5429
Number of pages: 18