Research of News Text with Word Frequency Statistics and User Information

Citations: 0
Authors
Liu, Shan [1]
Huang, Kun [1]
Chai, Jianping [1]
Affiliation
[1] Communication University of China, School of Information Engineering, Beijing, People's Republic of China
Keywords
word frequency; user information; tag model; TF-IDF algorithm
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper studies a news collection on a single topic. It analyzes the large number of keywords and phrases in the news, applies the traditional word-frequency-based TF-IDF algorithm with added tag weights to extract keywords that have practical value, and uses a tag selection formula to tag the news. In addition, it collects user comments from Internet news portals to build a library of user-comment information tags for the news tag group, so that the news tags better fit user needs. The value of a tag is measured by the recall rate and the F1 score, and the content-based tags are analyzed using the intermediate variables of the model's operation. The paper also summarizes the general content of the topic by analyzing the distribution of tags across the news collection.
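As a rough illustration of the pipeline the abstract describes, the sketch below scores terms with textbook TF-IDF, multiplies each score by an optional per-term weight, keeps the top-scoring terms as tags, and evaluates them with recall and F1. The abstract does not give the paper's tag weights or its tag selection formula, so the `weights` mapping, the top-`k` selection rule, and all function names here are assumptions made for illustration only, not the authors' method.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document (textbook TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_tags(score, weights=None, k=3):
    """Pick the k highest-scoring terms; `weights` is a hypothetical tag weight map."""
    weights = weights or {}
    weighted = {t: s * weights.get(t, 1.0) for t, s in score.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(weighted.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def recall_f1(predicted, reference):
    """Recall and F1 of predicted tags against a reference tag set."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return recall, f1


# Toy, made-up corpus: three already-tokenized news items on one topic.
docs = [
    ["flood", "city", "rescue", "flood"],
    ["city", "election", "vote"],
    ["city", "flood", "budget"],
]
scores = tf_idf(docs)
tags = top_tags(scores[0], weights={"flood": 2.0}, k=2)
print(tags, recall_f1(tags, ["flood", "storm"]))
```

A term such as "city" that appears in every document receives a score of zero, which is why raw frequency alone is not used for tagging. In a setup like the one the abstract outlines, the weight map could plausibly be derived from the user-comment tag library, but that link is an assumption here.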
Pages: 2633-2637
Page count: 5