Individualized Automatic Classification of Web Documents

Authors
Tsai, Yihjia [1 ]
Chen, Kaun-Yu [1 ]
Affiliations
[1] Tamkang Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
Naive Bayes; web documents classification
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper applies a Naive Bayes classifier to the design of a customized automatic web document classification system that systematically collects large volumes of news articles from the Internet. The proposed news classification system lets users define the information categories they need according to their own preferences. As the volume of daily news grows, this approach enables users to filter large numbers of articles effectively and to focus on the articles that interest them. The performance of the proposed approach is characterized by recall and precision. The system achieves a recall of over 66% and a precision of over 89% on a real-world Chinese test database.
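To make the approach in the abstract concrete, the following is a minimal sketch of a Naive Bayes text classifier with user-defined categories, together with the per-category precision and recall used to evaluate it. It is not the authors' implementation: the class name `NaiveBayes`, the function `precision_recall`, the multinomial model, and the add-one smoothing are all assumptions made for illustration. The sketch also assumes each article has already been split into tokens; Chinese text would first need a word segmentation step, which is not shown.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # training documents per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()                       # all tokens seen in training

    def train(self, tokens, category):
        # Categories are arbitrary labels, so each user can define their own.
        self.doc_counts[category] += 1
        self.word_counts[category].update(tokens)
        self.vocab.update(tokens)

    def classify(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat, n_docs in self.doc_counts.items():
            counts = self.word_counts[cat]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(category) + sum of log P(token | category)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best, best_score = cat, score
        return best


def precision_recall(predicted, actual, category):
    """Return (precision, recall) for one category."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == category and a == category for p, a in pairs)
    fp = sum(p == category and a != category for p, a in pairs)
    fn = sum(p != category and a == category for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this sketch, precision is the fraction of articles assigned to a category that truly belong to it, and recall is the fraction of articles truly in the category that the classifier finds. A system with high precision and lower recall, as reported in the abstract, rarely shows the user an irrelevant article but misses some relevant ones.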
Pages: 410-412
Page count: 3