Ontology-based automatic classification and ranking for web documents

Cited by: 9
Authors
Fang, Jun [1]
Guo, Lei [1]
Wang, XiaoDong [1]
Yang, Ning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Control & Networks Lab, Xian, Peoples R China
DOI
10.1109/FSKD.2007.432
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Web document classification involves calculating similarities between documents and categories by using the information extracted from them. In recent years, ontology-based web document classification methods have been introduced to address two limitations of traditional machine learning algorithms: the need for classifier training and the failure to consider semantic relations between words. However, previous work on ontology-based web document classification overlooks some important issues, namely automatic ontology construction and the ranking of classified documents. To solve these problems, this paper proposes an ontology-based web document classification and ranking method. First, weighted term sets are extracted from web documents, and an ontology is built using an effective ontology construction method that clarifies and augments an existing ontology. Then, the similarity score between documents and the ontology is computed based on WordNet using the Earth Mover's Distance (EMD) method. Finally, web documents are assigned to categories according to the similarity score, and a simple ranking method is used to sort the documents within the same category. The experimental results show that the classification algorithm achieves better precision and recall than the adaptive KNN method and is competitive with the SVM method; the ranking method also performs well.
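The final step described in the abstract (assigning each document to a category by similarity score, then sorting documents within a category) can be illustrated with a minimal sketch. The abstract does not specify the ranking rule, so sorting by descending similarity score is an assumption here. The function name `classify_and_rank`, the document and category names, and the score values are all hypothetical, and the scores are taken as given rather than computed with EMD over WordNet.

```python
from collections import defaultdict


def classify_and_rank(similarity):
    """Assign each document to its best category, then rank within categories.

    `similarity` maps a document id to a {category: score} dict.
    Assumption: a higher score means the document is more similar to the
    category. If a raw distance such as EMD is used instead, it would need
    to be converted (for example, negated) before calling this function.
    """
    by_category = defaultdict(list)
    for doc, scores in similarity.items():
        # Pick the category with the highest similarity score.
        best = max(scores, key=scores.get)
        by_category[best].append((doc, scores[best]))
    # Assumed ranking rule: descending score, ties broken by document id.
    return {
        category: sorted(docs, key=lambda pair: (-pair[1], pair[0]))
        for category, docs in by_category.items()
    }


# Hypothetical similarity scores for three documents and two categories.
similarity = {
    "doc_a": {"sports": 0.82, "finance": 0.10},
    "doc_b": {"sports": 0.35, "finance": 0.71},
    "doc_c": {"sports": 0.64, "finance": 0.22},
}
result = classify_and_rank(similarity)
for category, ranked in result.items():
    print(category, ranked)
```

In this sketch the classification and the ranking reuse the same score, which keeps the example short; the paper's ranking method may use additional signals.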
Pages: 627-631
Number of pages: 5