An integrated distributed system for web news retrieval

Cited by: 1
Authors
Chan, Man-Chung [1]
Luo, Wei-Dong [2]
Liu, James N. K. [2]
Affiliations
[1] Hong Kong Polytech Univ, SPEED, Hong Kong, Hong Kong, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
DOI
10.1142/9789812819079_0022
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
This paper highlights the problem of information explosion and the inability of currently available search engines to find what users most want. In particular, these search engines do not let users specify the categories and time frames of the news they receive, and cannot deliver online news at the required frequency. To address these problems, we present the design and implementation of "Ai-Times", a distributed web news retrieval system that can accurately retrieve and organize web news. We describe the optimized crawler algorithm and the news extraction algorithm, and explain how MapReduce is used in "Ai-Times" and how it can be improved for better performance.
Pages: 147+
Number of pages: 3
Related papers
50 in total
  • [1] Design and implement a Web news retrieval system
    Liu, JNK
    Luo, WD
    Chan, EMC
    [J]. KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 3, PROCEEDINGS, 2005, 3683 : 149 - 156
  • [2] Development of an intelligent distributed news retrieval system
    Liu, James N. K.
    Choi, K. C.
    Chai, J. Y.
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2012, 16 (02) : 129 - 140
  • [3] Developing an integrated retrieval system for web databases
    Lee, JO
    Jeon, HS
    Kang, HK
    Kim, J
    [J]. PRACTICAL ASPECTS OF KNOWLEDGE MANAGEMENT, PROCEEDINGS, 2004, 3336 : 186 - 197
  • [4] Retrieval of software components using a distributed web system
    Behle, A
    Kirchhof, M
    Nagl, M
    Welter, R
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2002, 25 (03) : 197 - 222
  • [5] A Survey on Web News Retrieval and Mining
    Hassanian-esfahani, Roya
    Kargar, Mohammad-javad
    [J]. 2016 SECOND INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2016 : 90 - 101
  • [6] Content-Based News Retrieval on the Web
    Capasso, Pasquale
    Cesarano, Carmine
    Picariello, Antonio
    Sansone, Lucio
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2006, 6 (5B) : 88 - 94
  • [7] Dow Jones News/Retrieval migrates to the Web
    Unknown
    [J]. ONLINE, 1997, 21 (04) : 11 - 11
  • [8] Distributed Web Service Retrieval Method
    Czyszczon, Adam
    Zgrzywa, Aleksander
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT I, 2015, 9011 : 117 - 126
  • [9] iKRISTAL: An integrated information retrieval system using metadata on distributed environments
    Seo, Jeong-Hyeon
    Lee, JongSuk Ruth
    Nam, Young-Kwang
    Lee, Byoung-Dai
    [J]. MATHEMATICAL AND COMPUTER MODELLING, 2013, 58 (5-6) : 1351 - 1361
  • [10] A multi-agent system for distributed information retrieval on the World Wide Web
    Clark, KL
    Lazarou, VS
    [J]. SIXTH IEEE WORKSHOPS ON ENABLING TECHNOLOGIES: INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES, PROCEEDINGS, 1997 : 87 - 92