Harvesting Image Databases from the Web

Cited by: 98

Authors:

Schroff, Florian [1]

Criminisi, Antonio [2]

Zisserman, Andrew [3]

Affiliations:

[1] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92093 USA

[2] Microsoft Res Cambridge, Cambridge CB3 0FB, England

[3] Univ Oxford, Dept Engn Sci, Robot Res Grp, Oxford OX1 3PJ, England

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2011 / Vol. 33 / No. 04

Keywords:

Weakly supervised; computer vision; object recognition; image retrieval;

DOI:

10.1109/TPAMI.2010.133

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The objective of this work is to automatically generate a large number of images for a specified object class. A multimodal approach employing both text, metadata, and visual features is used to gather many high-quality images from the Web. Candidate images are obtained by a text-based Web search querying on the object identifier (e.g., the word penguin). The Webpages and the images they contain are downloaded. The task is then to remove irrelevant images and rerank the remainder. First, the images are reranked based on the text surrounding the image and metadata features. A number of methods are compared for this reranking. Second, the top-ranked images are used as (noisy) training data and an SVM visual classifier is learned to improve the ranking further. We investigate the sensitivity of the cross-validation procedure to this noisy training data. The principal novelty of the overall method is in combining text/metadata and visual features in order to achieve a completely automatic ranking of the images. Examples are given for a selection of animals, vehicles, and other classes, totaling 18 classes. The results are assessed by precision/recall curves on ground-truth annotated data and by comparison to previous approaches, including those of Berg and Forsyth [5] and Fergus et al. [12].
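The abstract states that results are assessed by precision/recall curves on ground-truth annotated data. As a minimal sketch of how such a curve can be computed from a ranked image list (this is not the authors' code; the function names and the six-image ranking below are hypothetical, chosen only for illustration):

```python
def precision_recall_curve(ranked_labels):
    """Return (recall, precision) pairs after each position of a ranked list.

    ranked_labels: booleans in rank order, True where the ground-truth
    annotation marks the image as belonging to the object class.
    """
    total_positives = sum(bool(label) for label in ranked_labels)
    if total_positives == 0:
        raise ValueError("ranked list contains no positive images")
    points = []
    hits = 0
    for rank, is_positive in enumerate(ranked_labels, start=1):
        hits += bool(is_positive)
        # recall = positives found so far / all positives
        # precision = positives found so far / images inspected so far
        points.append((hits / total_positives, hits / rank))
    return points


def precision_at_recall(ranked_labels, recall_level):
    """Highest precision among curve points reaching at least recall_level."""
    return max(
        precision
        for recall, precision in precision_recall_curve(ranked_labels)
        if recall >= recall_level
    )


# Hypothetical ranking of six downloaded images (True = relevant to the class).
example_ranking = [True, True, False, True, False, False]
curve = precision_recall_curve(example_ranking)
```

A better reranking moves the relevant images toward the front of the list, which raises precision at every recall level; comparing two rankings of the same image set with these functions shows that effect directly.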

Citation

Pages: 754-766

Number of pages: 13

50 items in total

[1] Harvesting image databases from the web
Schroff, F.
Criminisi, A.
Zisserman, A.
[J]. 2007 IEEE 11TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1-6, 2007, : 2120 - +
[2] Harvesting models from web 2.0 databases
Oscar Díaz
Gorka Puente
Javier Luis Cánovas Izquierdo
Jesús García Molina
[J]. Software & Systems Modeling, 2013, 12 : 15 - 34
[3] Harvesting models from web 2.0 databases
Diaz, Oscar
Puente, Gorka
Canovas Izquierdo, Javier Luis
Garcia Molina, Jesus
[J]. SOFTWARE AND SYSTEMS MODELING, 2013, 12 (01): : 15 - 34
[4] Harvesting Large-Scale Weakly-Tagged Image Databases from the Web
Fan, Jianping
Shen, Yi
Zhou, Ning
Gao, Yuli
[J]. 2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 802 - 809
[5] Learning concept templates from web images to query personal image databases
Wu, Yi
Bouguet, Jean-Yves
Nefian, Ara
Kozintsev, Igor V.
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 1986 - 1989
[6] Automatic Categorization of Image Databases using Web Folksonomies
Capasso, Pasquale
Chianese, Angelo
Moscato, Vincenzo
Penta, Antonio
Picariello, Antonio
[J]. ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008, : 685 - 690
[7] Rank Discovery From Web Databases
Thirumuruganathan, Saravanan
Zhang, Nan
Das, Gautam
[J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (13): : 1582 - 1593
[8] Presenting interactive image databases on the Web using Java
Wertheim, SL
[J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 1998, : 1097 - 1097
[9] An architecture for streamlining the implementation of biomedical text/image databases on the Web
Bopf, M
Coleman, T
Long, LR
Antani, S
Thoma, GR
Jeronimo, J
Schiffman, M
[J]. 17TH IEEE SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2004, : 563 - 568
[10] Learning-based Incremental Creation of Web Image Databases
George, Marian
Ghanem, Nagia
Ismail, M. A.
[J]. 2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1, 2013, : 424 - 429
