Ontology-based HTML to XML conversion

Cited by: 0
Authors
Li, SJ [1]
Ou, WJ
Yu, JQ
Affiliations
[1] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China
[2] Chinese Acad Sci, Lab Comp Sci, Inst Software, Beijing 100080, Peoples R China
[3] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current wrapper approaches break down when extracting data from differently structured and frequently changing Web pages. To tackle this challenge, this paper defines a domain-specific ontology, automatically captures the semantic hierarchy in Web pages by exploiting both structural information and common formatting information, and recognizes and extracts data through ontology-based semantic matching, without relying on page-specific formatting. The approach adapts to differently structured and frequently changing Web pages within a domain of interest.
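The idea of matching page content against a domain ontology, rather than against page-specific markup, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the toy ontology, the names `ONTOLOGY`, `TextCollector`, `match_concept` and `html_to_xml`, and the label-then-value heuristic are all assumptions made for illustration, and the semantic-hierarchy step described in the abstract is omitted entirely.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical toy ontology: concept name -> label synonyms seen on pages.
ONTOLOGY = {
    "title": {"title", "name"},
    "author": {"author", "authors", "by"},
    "price": {"price", "cost"},
}


class TextCollector(HTMLParser):
    """Collect visible text chunks, deliberately ignoring the wrapping tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def match_concept(label):
    """Return the ontology concept whose synonyms contain the label, or None."""
    key = label.strip().rstrip(":").lower()
    for concept, synonyms in ONTOLOGY.items():
        if key in synonyms:
            return concept
    return None


def html_to_xml(html, root_tag="record"):
    """Emit one XML element per recognized (label, value) pair."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    root = ET.Element(root_tag)
    pending = None  # concept whose value is expected in the next chunk
    for chunk in collector.chunks:
        if ":" in chunk:
            # Case 1: "Label: value" inside a single text node.
            label, _, value = chunk.partition(":")
            concept = match_concept(label)
            if concept and value.strip():
                ET.SubElement(root, concept).text = value.strip()
                pending = None
                continue
        concept = match_concept(chunk)
        if concept:
            # Case 2: a bare label; its value follows in the next chunk.
            pending = concept
        elif pending:
            ET.SubElement(root, pending).text = chunk
            pending = None
    return ET.tostring(root, encoding="unicode")


# Two differently structured pages carrying the same data.
page_a = ("<table><tr><td>Title</td><td>XML Basics</td></tr>"
          "<tr><td>Price</td><td>$20</td></tr></table>")
page_b = "<div><b>Name:</b> XML Basics<br><span>Cost: $20</span></div>"

print(html_to_xml(page_a))
print(html_to_xml(page_b))
```

Because matching depends only on the ontology's label synonyms, both sample pages produce the same XML even though one uses a table and the other uses inline formatting. A realistic system would also need the hierarchy detection and richer semantic matching that the abstract describes.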
Pages: 888-893
Page count: 6
Related papers
50 in total
  • [31] TC-GXML A Transcoder for HTML to XML Grammar
    Singh, Raghuraj
    Verma, Prabhat
    Singh, Avinash Kumar
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA STORAGE AND DATA ENGINEERING (DSDE 2010), 2010, : 34 - 38
  • [32] The SGML FAQ book: Understanding the foundation of HTML and XML
    Lunemann, RS
    TECHNICAL COMMUNICATION, 1998, 45 (03) : 408 - 409
  • [33] Automated information mediator for HTML and XML based web information delivery service
    Park, SS
    Kim, YS
    Park, GC
    Kang, BH
    Compton, P
    AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 401 - 404
  • [34] Progress in separation of structure and style: HTML, XHTML, XML and cascading style sheets
    Fugate, James K.
    Vokurka, Robert J.
    INTERNATIONAL JOURNAL OF INNOVATION AND LEARNING, 2005, 2 (04) : 425 - 433
  • [35] Integration of XML and HTML
    陈银凤
    现代计算机(专业版), 2011, (14) : 49 - 51
  • [36] Key2html: a tool for the quick conversion of dichotomous keys into HTML code
    Schmidt-Lebuhn, Alexander N.
    Kessler, Michael
    TAXON, 2007, 56 (02) : 505 - 508
  • [37] A brief analysis of HTML and XML
    王艳娟
    硅谷, 2012, (06) : 43 - 43
  • [38] Using Semantic-Level Tags in HTML/XML Documents
    Henschen, Lawrence J.
    Lee, Julia C.
    UNIVERSAL ACCESS IN HUMAN-COMPUTER INTERACTION: APPLICATIONS AND SERVICES, PT III, 2009, 5616 : 683 - 692
  • [39] A resource for transforming HTML and molfile documents to XML compliant form
    Gkoutos, GV
    Kenway, PR
    Murray-Rust, P
    Rzepa, HS
    Wright, M
    INTERNET JOURNAL OF CHEMISTRY, 2001, 4 (05):
  • [40] Developer-Friendly Annotation-Based HTML-to-XML Transformation Technology
    Tseng, Lendle Chun-Hsiung
    DOCENG 2011: PROCEEDINGS OF THE 2011 ACM SYMPOSIUM ON DOCUMENT ENGINEERING, 2011, : 73 - 76