Research on Web Information Extraction Based on XML

Citations: 0
Authors
Hu, Yan [1]
Xuan, Yanyan [1]
Affiliation
[1] Wuhan Univ Technol, Dept Comp Sci & Technol, Wuhan 430070, Peoples R China
DOI
10.1109/WGEC.2008.16
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
This paper applies standard XML technology to Web information extraction and proposes a generic XML-based extraction solution. Two key techniques are proposed and implemented to simplify the extraction work: XML-based Web data conversion and DOM-based XPath generation. XSLT is used as the description language for extraction rules, which is conducive to unifying extraction patterns.
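The abstract does not spell out how the DOM-based XPath generation works, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the page has already been converted to well-formed XML, and it walks from a selected node up to the root, recording each element's position among its same-named siblings. The function name `build_xpath` and the sample page are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_xpath(root, target):
    """Return a positional XPath from root to target, or None if target is not in the tree."""
    # ElementTree nodes have no parent pointer, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    if target is not root and target not in parent_of:
        return None
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        # XPath positions are 1-based and count only siblings with the same tag.
        same_tag = [c for c in parent if c.tag == node.tag]
        index = next(i for i, c in enumerate(same_tag, 1) if c is node)
        steps.append(f"{node.tag}[{index}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


# Hypothetical page, assumed to be already converted to well-formed XML.
PAGE = """<html><body>
<div><p>Header</p></div>
<div><p>Price: 42</p><p>Stock: 7</p></div>
</body></html>"""

root = ET.fromstring(PAGE)
target = root.findall(".//p")[2]    # the node a user would select: "Stock: 7"
xpath = build_xpath(root, target)
print(xpath)                        # /html/body[1]/div[2]/p[2]
```

A path generated this way could then be placed in the `select` or `match` attribute of an XSLT extraction rule. Purely positional paths are brittle when the page layout changes, so a practical system would likely generalize them.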
Pages: 201-204
Page count: 4