Harvesting image databases from the web

Cited by: 0
Authors
Schroff, F. [1 ]
Criminisi, A. [2 ]
Zisserman, A. [3 ]
Affiliations
[1] Univ Oxford, San Diego, CA 92093 USA
[2] Microsoft Res, Cambridge, England
[3] Univ Oxford, Dept Engn Sci, Oxford, England
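Funding
European Research Council;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code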
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The objective of this work is to automatically generate a large number of images for a specified object class (for example, penguin). A multi-modal approach employing text, metadata, and visual features is used to gather many high-quality images from the web. Candidate images are obtained by a text-based web search querying on the object identifier (the word penguin). The web pages and the images they contain are downloaded. The task is then to remove irrelevant images and re-rank the remainder. First, the images are re-ranked using a Bayes posterior estimator trained on the text surrounding the image and on metadata features (such as the image alternative tag, image title tag, and image filename). No visual information is used at this stage. Second, the top-ranked images are used as (noisy) training data and an SVM visual classifier is learnt to improve the ranking further. The principal novelty is in combining text/metadata and visual features in order to achieve a completely automatic ranking of the images. Examples are given for a selection of animals (e.g. camels, sharks, penguins), vehicles (cars, airplanes, bikes), and other classes (guitar, wristwatch), totalling 18 classes. The results are assessed by precision/recall curves on ground-truth annotated data and by comparison to previous approaches, including those of Berg et al. [5] (on an additional six classes) and Fergus et al. [9].
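The first, text-only re-ranking stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary features (whether the query word occurs in each text or metadata field), assumes the features are conditionally independent given the class (a naive Bayes simplification), and assumes a small labelled training set is available. The names `FEATURES`, `fit_naive_bayes`, `posterior`, and `rerank` are hypothetical.

```python
import math

# Hypothetical binary features: does the query word (e.g. "penguin")
# occur in the surrounding text, the alt tag, the title tag, the filename?
FEATURES = ("context", "alt_tag", "title_tag", "filename")


def fit_naive_bayes(samples, labels, smoothing=1.0):
    """Estimate the class prior and P(feature = 1 | class) with Laplace smoothing.

    samples: list of 0/1 tuples, one entry per name in FEATURES.
    labels:  list of 0/1 values (1 = image shows the object class).
    """
    counts = {0: [0] * len(FEATURES), 1: [0] * len(FEATURES)}
    totals = {0: 0, 1: 0}
    for x, y in zip(samples, labels):
        totals[y] += 1
        for i, value in enumerate(x):
            counts[y][i] += value
    likelihood = {
        c: [(counts[c][i] + smoothing) / (totals[c] + 2 * smoothing)
            for i in range(len(FEATURES))]
        for c in (0, 1)
    }
    prior = (totals[1] + smoothing) / (len(labels) + 2 * smoothing)
    return prior, likelihood


def posterior(x, prior, likelihood):
    """Return P(in-class | x) under the independence assumption."""
    log_pos = math.log(prior)
    log_neg = math.log(1.0 - prior)
    for i, value in enumerate(x):
        p1, p0 = likelihood[1][i], likelihood[0][i]
        log_pos += math.log(p1 if value else 1.0 - p1)
        log_neg += math.log(p0 if value else 1.0 - p0)
    # Normalise in log space to avoid underflow.
    m = max(log_pos, log_neg)
    pos, neg = math.exp(log_pos - m), math.exp(log_neg - m)
    return pos / (pos + neg)


def rerank(candidates, prior, likelihood):
    """Sort (name, features) pairs by descending posterior; no visual cues used."""
    return sorted(candidates,
                  key=lambda c: posterior(c[1], prior, likelihood),
                  reverse=True)
```

In a pipeline like the one described, the highest-ranked images from `rerank` would then serve as noisy positive examples for training the visual classifier; that second stage is not sketched here.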
Pages: 2120 / +
Page count: 2