Ontology-based HTML to XML conversion

Cited by: 0
Authors
Li, SJ [1]
Ou, WJ
Yu, JQ
Affiliations
[1] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China
[2] Chinese Acad Sci, Lab Comp Sci, Inst Software, Beijing 100080, Peoples R China
[3] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Peoples R China
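Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract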
Current wrapper approaches break down when extracting data from differently structured and frequently changing Web pages. To tackle this challenge, this paper defines a domain-specific ontology, automatically captures the semantic hierarchy in Web pages by exploiting both structural and common formatting information, and recognizes and extracts data through ontology-based semantic matching, without relying on page-specific formatting. The approach adapts to differently structured and frequently changing Web pages within a domain of interest.
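
The abstract gives no implementation details, so the following is only a rough sketch of the general idea of matching page labels against an ontology instead of against page-specific markup. Everything in it is an assumption made for illustration: the toy `ONTOLOGY` dictionary, the "Label: value" heuristic, and the names `TextCollector`, `match_concept`, and `html_to_xml` are hypothetical and do not come from the paper, and the sketch does not attempt the semantic-hierarchy capture the abstract mentions.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical toy ontology (an assumption, not from the paper):
# concept name -> label synonyms that may appear on a page.
ONTOLOGY = {
    "title": {"title", "name"},
    "author": {"author", "authors", "by"},
    "price": {"price", "cost"},
}


class TextCollector(HTMLParser):
    """Collects non-empty text chunks, ignoring which tags wrap them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def match_concept(label):
    """Return the ontology concept whose synonyms contain the label, or None."""
    key = label.strip().lower()
    for concept, synonyms in ONTOLOGY.items():
        if key in synonyms:
            return concept
    return None


def html_to_xml(html, root_tag="record"):
    """Map 'Label: value' text found anywhere in the HTML to XML elements."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()

    root = ET.Element(root_tag)
    chunks = collector.chunks
    i = 0
    while i < len(chunks):
        label, sep, value = chunks[i].partition(":")
        concept = match_concept(label) if sep else None
        value = value.strip()
        if concept and not value and i + 1 < len(chunks):
            # Label and value were split across tags, e.g. <b>Price:</b> 10,
            # so take the next text chunk as the value.
            i += 1
            value = chunks[i]
        if concept and value:
            ET.SubElement(root, concept).text = value
        i += 1
    return ET.tostring(root, encoding="unicode")
```

Under these assumptions, a table-based page such as `<table><tr><td>Title:</td><td>Dune</td></tr><tr><td>Cost:</td><td>9.99</td></tr></table>` and a paragraph-based page such as `<p><b>Name:</b> Dune</p><p>Price: 9.99</p>` both yield `<record><title>Dune</title><price>9.99</price></record>`, because the matching depends on the labels and not on the surrounding tags.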
Pages: 888-893
Page count: 6
Related papers
50 in total
  • [1] Automatic HTML to XML conversion
    Li, SJ
    Liu, MC
    Ling, TW
    Peng, ZY
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 714 - 719
  • [2] Analysis of the HTML to XML Conversion Method
    Li Busheng
    Hu Jingfang
    PROCEEDINGS OF THE 2015 INTERNATIONAL SYMPOSIUM ON COMPUTERS & INFORMATICS, 2015, 13 : 64 - 69
  • [3] HTML to XML conversion for non-programmers
    Mijic, J
    Tadic, M
    Jancec, M
    Jovanov, G
    ITI 2005: Proceedings of the 27th International Conference on Information Technology Interfaces, 2005, : 349 - 354
  • [4] A browser-based tool for conversion between Fortran NAMELIST and XML/HTML
    Naito, O.
    SOFTWAREX, 2017, 6 : 25 - 29
  • [5] Research on content reuse of HTML based on XML
    Li, QS
    Chen, P
    COMPUTER SCIENCE AND TECHNOLOGY IN NEW CENTURY, 2001, : 521 - 525
  • [6] XSLT: Working with XML and HTML
    Owens, D
    TECHNICAL COMMUNICATION, 2002, 49 (04) : 481 - 483
  • [7] A gateway from HTML to XML
    Fu, T
    Liu, MC
    INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 2004, : 205 - 214
  • [8] Wrapping HTML tables into XML
    Li, SJ
    Liu, MC
    Peng, ZY
    WEB INFORMATION SYSTEMS - WISE 2004, PROCEEDINGS, 2004, 3306 : 147 - 152
  • [9] Getting to XML from HTML
    Wood, L
    SGML EUROPE '97 - CONFERENCE PROCEEDINGS, 1997, : 189 - 192
  • [10] Template resolution in XML/HTML
    Kristensen, A
    COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7) : 239 - 249