Scalable information extraction for web queries

Cited by: 0
Authors
Hsu, Meichun [1]
Xiong, Yuhong [2]
Affiliations
[1] Hewlett Packard Labs, 1501 Page Mill Rd, Palo Alto, CA 94022 USA
[2] Innovat Works, Beijing 100084, People's Republic of China
Keywords
web mining; parallel computing; classification; information extraction; focused crawling
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The dominant way to find information on the web today is through search. General search engines are very effective, but search phrases and results are unstructured, which limits a user's ability to automate further processing of the search results. In recent years, we have seen efforts to build systems that support more precise queries on the web for certain content verticals. We describe the general problems in building an extensible web query system and present one of our projects in this area: a vertical search portal for online courses.
Pages: 176-184
Page count: 9