Web Information Extraction for content augmentation

Cited by: 0
Authors
Janevski, A [1]
Dimitrova, N [1]
Affiliations
[1] Philips Res USA, Briarcliff Manor, NY 10510 USA
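Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract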
Today, users must cope with an overwhelming number of TV channels and Web content sources. We introduce automatic content augmentation, a novel approach to contextual information extraction performed on behalf of the user, in which the context is provided by the primary content source (i.e., the TV channel) and tailored to the user's preferences. A key aspect of this approach is Web Information Extraction (WebIE), which automatically derives structured information from unstructured Web documents. Our system executes WebIE tasks, each of which is an instantiation of WebIE rules, our generic document processors. We present two WebIE approaches: Diffusion WebIE, which crawls a wide set of Web pages and extracts information from a subset of pertinent pages; and Laser WebIE, which accesses a select set of Web pages and extracts narrowly defined information. We describe the architecture and implementation details of the system and provide detailed Laser WebIE examples.
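The abstract gives no implementation details, so the following is only an illustrative sketch of how a rule (a generic document processor) and a task (a rule instantiated with concrete parameters) might relate, and how the Laser and Diffusion styles differ in which pages they process. All names here (`regex_rule`, `Task`, `laser_extract`, `diffusion_extract`), the regex-based extraction, and the sample HTML are assumptions made for illustration, not the authors' design.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]
# Assumption: a "rule" is modeled as a function from document text and
# parameters to a list of extracted records.
Rule = Callable[[str, Dict[str, str]], List[Record]]


def regex_rule(document: str, params: Dict[str, str]) -> List[Record]:
    """Generic rule: apply a named-group regex, one record per match."""
    pattern = re.compile(params["pattern"], re.IGNORECASE | re.DOTALL)
    return [match.groupdict() for match in pattern.finditer(document)]


@dataclass
class Task:
    """A task is a rule bound to concrete parameters."""
    rule: Rule
    params: Dict[str, str]

    def run(self, document: str) -> List[Record]:
        return self.rule(document, self.params)


def laser_extract(pages: Dict[str, str], task: Task) -> Dict[str, List[Record]]:
    """Laser style: run a narrow task on every page of a small, preselected set."""
    return {url: task.run(html) for url, html in pages.items()}


def diffusion_extract(
    pages: Dict[str, str],
    task: Task,
    is_pertinent: Callable[[str], bool],
) -> Dict[str, List[Record]]:
    """Diffusion style: scan a wide set of pages, extract only from pertinent ones."""
    return {url: task.run(html) for url, html in pages.items() if is_pertinent(html)}


# Hypothetical, already-downloaded pages (no network access in this sketch).
CAST_TASK = Task(
    rule=regex_rule,
    params={
        "pattern": r'<td class="actor">(?P<actor>[^<]+)</td>\s*'
                   r'<td class="role">(?P<role>[^<]+)</td>'
    },
)
SAMPLE_PAGES = {
    "http://example.com/cast": (
        '<tr><td class="actor">Jane Doe</td>'
        '<td class="role">Detective</td></tr>'
    ),
    "http://example.com/about": "<p>Contact us</p>",
}
```

In this sketch the only difference between the two styles is page selection: `laser_extract` trusts the caller to supply a select set of pages, while `diffusion_extract` takes a pertinence predicate and skips pages that fail it.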
Pages: A389-A392
Page count: 4