Robin: Extracting visual and textual features from web pages

Authors
Oka, M [1]
Tsukada, H [1]
Kato, K [1]
Affiliations
[1] Univ Tsukuba, Tsukuba, Ibaraki 305, Japan
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Web pages contain information in several forms. These include textual information such as words and visual information such as images, use of color, and layout. We propose a method of extracting the characteristic features from both the textual and visual information in Web pages. Our method enables seamless integration of the two types of information and automatic extraction of their characteristic features. Based on this method, we developed a proof-of-concept system called Robin, which is designed to provide users with an intuitive way of browsing search engine results. The results of an experimental evaluation of the system showed that it has the potential to be practical and effective.
Pages: 765-771
Page count: 7