Protection techniques from information extraction

Cited by: 0
Authors
Greco, Gianluigi [1 ]
Ianni, Giovambattista [1 ]
Lio, Vincenzino [1 ]
Palopoli, Luigi [1 ]
Affiliations
[1] Univ Calabria, Calabria, Italy
DOI
10.1109/WI.2006.138
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Information extraction technologies meet the market need for automatic tools that extract semi-structured information from web pages. However, pages may change over time for various reasons, ranging from restyling to deliberate modifications intended to confuse Web wrappers. In this paper we deal with the latter scenario, studying the issue of on-purpose wrapper spoiling and its relationship to wrapping. We present an architecture and a tool implementing a wrapper spoiling system, and discuss some practical spoiling techniques, which are also experimentally tested.
Pages: 1029 / +
Page count: 2
Related papers
50 in total
  • [21] Parallel techniques for information extraction from hyperspectral imagery using heterogeneous networks of workstations
    Plaza, Antonio J.
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2008, 68 (01) : 93 - 111
  • [22] Change information extraction through image processing techniques
    Firouzabadi, PZ
    Ramachandram, S
    REMOTE SENSING FOR ENVIRONMENTAL MONITORING, GIS APPLICATIONS, AND GEOLOGY II, 2003, 4886 : 528 - 533
  • [24] Nonlinguistic Information Extraction by Semi-Supervised Techniques
    Semenkina, Maria
    Akhmedova, Shakhnaz
    Semenkin, Eugene
    ICINCO: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS - VOL 1, 2017, : 312 - 317
  • [25] INFORMATION EXTRACTION FROM SPEECH
    MURPHY, AJ
    RATCLIFFE, NW
    JOHNSON, DAH
    DEWHURST, DJ
    ANNALS OF OTOLOGY RHINOLOGY AND LARYNGOLOGY, 1987, 96 (01) : 69 - 71
  • [26] Information extraction from voicemail
    Huang, J
    Zweig, G
    Padmanabhan, M
    39TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2001, : 290 - 297
  • [27] Information Extraction from Invoices
    Hamdi, Ahmed
    Carel, Elodie
    Joseph, Aurelie
    Coustaty, Mickael
    Doucet, Antoine
    DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II, 2021, 12822 : 699 - 714
  • [28] New techniques and technologies for information retrieval and knowledge extraction from nuclear fusion massive databases
    Murari, A.
    Vega, J.
    Alonso, J. A.
    De La Luna, E.
    Farthing, J.
    Hidalgo, C.
    Ratt, G. A.
    Svensson, J.
    Vagliasindi, G.
    2007 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING, CONFERENCE PROCEEDINGS BOOK, 2007, : 943 - +
  • [29] Information extraction for search engines using fast heuristic techniques
    Hong, Jer Lang
    Siew, Eu-Gene
    Egerton, Simon
    DATA & KNOWLEDGE ENGINEERING, 2010, 69 (02) : 169 - 196
  • [30] Knowledge Obtention Combining Information Extraction Techniques with Linked Data
    Luis Garrido, Angel
    Blazquez, Pilar
    Buey, Maria G.
    Ilarri, Sergio
    WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2015, : 643 - 648