Applying semantic links for classifying Web pages

Cited by: 0
Authors
Choi, B [1]
Guo, Q [1]
Affiliations
[1] Louisiana Tech Univ, Coll Engn & Sci, Ruston, LA 71272 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic hypertext classification is an essential technique for organizing the vast number of Web pages and HTML documents on the Internet. One of the problems in classifying Web pages is that they are usually short and contain too little text to identify their category clearly. Text classification mechanisms that analyze only the contents of the document itself are relatively ineffective at classifying short Web pages. This paper proposes a new hypertext classification mechanism that addresses the problem by analyzing not only the Web page itself but also the linked Web pages referenced by the URLs it contains. The URLs are treated as semantic links. The hypothesis is that the linked Web pages contain related information that helps identify the category of the Web page. Experimental results show that the proposed approach could increase accuracy by 35% over the approach of analyzing only the Web page itself.
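The idea in the abstract, scoring a page on its own text plus the text of the pages it links to, can be illustrated with a minimal Python sketch. This is not the authors' classifier: the keyword-count scoring, the `link_weight` parameter, the `fetch` callback, and the toy pages and category terms are all assumptions made for illustration only.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class PageParser(HTMLParser):
    """Collects text content and <a href> targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def parse_page(html):
    """Return (text, links) for one HTML string."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def score_categories(html, fetch, category_terms, link_weight=0.5):
    """Score each category from the page's own terms plus down-weighted
    terms of its linked pages (hypothetical scoring, for illustration)."""
    text, links = parse_page(html)
    counts = Counter(tokenize(text))
    for url in links:
        linked_html = fetch(url)  # fetch is supplied by the caller
        if linked_html is None:   # skip links that cannot be retrieved
            continue
        linked_text, _ = parse_page(linked_html)
        for token, n in Counter(tokenize(linked_text)).items():
            counts[token] += link_weight * n
    return {cat: sum(counts[t] for t in terms)
            for cat, terms in category_terms.items()}


def classify(html, fetch, category_terms, link_weight=0.5):
    scores = score_categories(html, fetch, category_terms, link_weight)
    return max(scores, key=scores.get)


# Toy data (assumed): a short page whose own text carries no category signal.
pages = {"http://example.org/a": "<p>football match score goal</p>"}
page = '<h1>Welcome</h1><a href="http://example.org/a">more</a>'
category_terms = {
    "sports": {"football", "goal", "match"},
    "finance": {"stock", "market", "bank"},
}
print(classify(page, pages.get, category_terms))
```

In this toy case the page's own text matches no category term, so every category scores zero when links are ignored; once the linked page is included, the "sports" terms dominate. A real system would replace the keyword counts with a trained text classifier and fetch linked pages over HTTP.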
Pages: 148-153
Page count: 6