Intelligent support for information retrieval of web documents

Cited by: 0
Authors
Koval, R [1]
Návrat, P [1]
Affiliations
[1] Slovak Univ Technol Bratislava, Dept Comp Sci & Engn, Bratislava 81219, Slovakia
Keywords
intelligent information retrieval; suffix tree clustering algorithm; click-stream analysis; web tool; search agent;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The main goal of this research was to investigate means of intelligent support for the retrieval of web documents. We have proposed the architecture of a web tool system, Trillian, which discovers users' interests without requiring their interaction and uses those interests to search autonomously for related web content. Discovered pages are suggested to the user. The discovery of user interests is based on analysis of documents the users have previously visited. We have created a module for completely transparent tracking of the user's movement on the web, which logs both visited URLs and the contents of web pages. The post-analysis step is based on a variant of the suffix tree clustering algorithm. We focus primarily on the overall Trillian architecture design and the process of discovering topics of interest. We have implemented an experimental prototype of Trillian and evaluated the quality, speed and usefulness of the proposed system. We have shown that clustering is a feasible technique for extracting interests from web documents. We consider the proposed architecture to be quite promising and suitable for future extensions.
Pages: 509 - 528
Page count: 20
Related papers
50 in total
  • [41] Typed structured documents for information retrieval
    Dharap, C
    Bowman, CM
    [J]. PRINCIPLES OF DOCUMENT PROCESSING, 1997, 1293 : 135 - 151
  • [42] Information retrieval system for handwritten documents
    Srihari, S
    Ganesh, A
    Tomai, C
    Shin, YC
    Huang, C
    [J]. DOCUMENT ANALYSIS SYSTEMS VI, PROCEEDINGS, 2004, 3163 : 298 - 309
  • [43] Approaches for Information Retrieval in Legal Documents
    Giri, Rachayita
    Porwal, Yosha
    Shukla, Vaibhavi
    Chadha, Palak
    Kaushal, Rishabh
    [J]. 2017 TENTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2017, : 229 - 234
  • [44] System of information retrieval in XML documents
    Smadhi, S
    [J]. ISSUES AND TRENDS OF INFORMATION TECHNOLOGY MANAGEMENT IN CONTEMPORARY ORGANIZATIONS, VOLS 1 AND 2, 2002, : 736 - 739
  • [45] Information Retrieval from Documents: A Survey
    M. Mitra
    B.B. Chaudhuri
    [J]. Information Retrieval, 2000, 2 (2-3) : 141 - 163
  • [46] Intelligent Information Retrieval System
    Cho, Young Im
    [J]. PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 16TH '11), 2011, : 565 - 568
  • [47] Intelligent storage for information retrieval
    Du, DHC
    [J]. INTERNATIONAL CONFERENCE ON NEXT GENERATION WEB SERVICES PRACTICES, 2005, : 214 - 220
  • [48] An intelligent platform for information retrieval
    Li, Fang
    Huang, Xuanjing
    [J]. Cognitive Systems, 2007, 4429 : 45 - 57
  • [49] Segmentation of Web Documents and Retrieval of Useful Passages
    Figuerola, Carlos G.
    Berrocal, Jose L. Alonso
    Rodriguez, Angel F. Zazo
    [J]. ADVANCES IN MULTILINGUAL AND MULTIMODAL INFORMATION RETRIEVAL, 2008, 5152 : 732 - 736
  • [50] A New PSO Methodology for Web Documents Retrieval
    Ramya, C.
    Shreedhara, K. S.
    [J]. 2017 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2017, : 852 - 856