Protection techniques from information extraction

Citations: 0
Authors
Greco, Gianluigi [1]
Ianni, Giovambattista [1]
Lio, Vincenzino [1]
Palopoli, Luigi [1]
Affiliation
[1] Univ Calabria, Calabria, Italy
DOI
10.1109/WI.2006.138
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Information extraction technologies meet the market need for automatic tools that extract semi-structured information from web pages. However, pages may change over time for various reasons, ranging from page restyling to on-purpose modifications made to pages in order to confuse Web wrappers. In this paper we deal with the latter scenario by studying the issue of on-purpose wrapper spoiling and its relationship to wrapping. We present an architecture and a tool implementing a wrapper spoiling system, and discuss some practical spoiling techniques, which are also tested experimentally.
Pages: 1029 / +
Page count: 2
Related papers
50 in total
  • [31] EXTRACTION OF USEFUL INFORMATION OF A PATIENT USING DATA MINING TECHNIQUES
    Revathi, T.
    Rajesh, Sumathi
    2011 3RD INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT (ICCTD 2011), VOL 3, 2012, : 285 - 289
  • [32] Application of temporal information extraction techniques to question answering systems
    Teresa Vicente-Diez, Maria
    Martinez, Paloma
    Martinez-Gonzalez, Angel
    Luis Martinez-Fernandez, Jose
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2009, (42): : 25 - 30
  • [33] Adaptive Information Extraction of Disaster Information from Twitter
    Regalado, Ralph Vincent J.
    Chua, Jenina L.
    Co, Justin L.
    Cheng, Herman C.
    Magpantay, Angelo Bruce L.
    Kalaw, Kristine Ma. Dominique F.
    2014 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2014, : 286 - 289
  • [34] Semantic information generation from classification and information extraction
    Silva, TDS
    de Freitas, FLG
    Teske, RC
    Bittencourt, G
    WEB ENGINEERING, PROCEEDINGS, 2004, 3140 : 573 - 574
  • [35] Evaluation of Information Extraction Techniques to Label Extracted Data from e-Commerce Web Pages
    Anderson, Neil
    Hong, Jun
    WWW'14 COMPANION: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2014, : 1275 - 1278
  • [36] Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques
    Qiu, Qinjun
    Xie, Zhong
    Wu, Liang
    Tao, Liufeng
    EARTH SCIENCE INFORMATICS, 2020, 13 (04) : 1393 - 1410
  • [37] Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques
    Qinjun Qiu
    Zhong Xie
    Liang Wu
    Liufeng Tao
    Earth Science Informatics, 2020, 13 : 1393 - 1410
  • [38] Speaker Anonymization for Personal Information Protection Using Voice Conversion Techniques
    Yoo, In-Chul
    Lee, Keonnyeong
    Leem, Seonggyun
    Oh, Hyunwoo
    Ko, Bonggu
    Yook, Dongsuk
    IEEE ACCESS, 2020, 8 (08): : 198637 - 198645
  • [39] A Comparative Analysis of Information Hiding Techniques for Copyright Protection of Text Documents
    Ahvanooey, Milad Taleby
    Li, Qianmu
    Shim, Hiuk Jae
    Huang, Yanyan
    SECURITY AND COMMUNICATION NETWORKS, 2018,
  • [40] INFORMATION EXTRACTION FROM CHEMICAL PATENTS
    Bergmann, Sandra
    Romberg, Mathilde
    Klenner, Alexander
    Zimmermann, Marc
    COMPUTER SCIENCE-AGH, 2012, 13 (02): : 21 - 32