Robin: Extracting visual and textual features from web pages

Authors
Oka, M [1]
Tsukada, H [1]
Kato, K [1]
Affiliations
[1] Univ Tsukuba, Tsukuba, Ibaraki 305, Japan
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Web pages contain information in several forms. These include textual information such as words and visual information such as images, use of color, and layout. We propose a method of extracting the characteristic features from both the textual and visual information in Web pages. Our method enables seamless integration of the two types of information and automatic extraction of their characteristic features. Based on this method, we developed a proof-of-concept system called Robin, which is designed to provide users with an intuitive way of browsing search engine results. The results of an experimental evaluation of the system showed that it has the potential to be practical and effective.
Pages: 765-771
Number of pages: 7