Visual landmark recognition from Internet photo collections: A large-scale evaluation

Cited by: 15
Authors
Weyand, Tobias [1]
Leibe, Bastian [1]
Affiliations
[1] Rhein Westfal TH Aachen, Comp Vis Grp, Aachen, Germany
Keywords
Landmark recognition; Image clustering; Image retrieval; Semantic annotation; Compact image retrieval indices; MEAN SHIFT; IMAGE
DOI
10.1016/j.cviu.2015.02.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
The task of a visual landmark recognition system is to identify photographed buildings or objects in query photos and to provide the user with relevant information on them. With their increasing coverage of the world's landmark buildings and objects, Internet photo collections are now being used as a source for building such systems in a fully automatic fashion. This process typically consists of three steps: clustering large amounts of images by the objects they depict; determining object names from user-provided tags; and building a robust, compact, and efficient recognition index. To date, however, there is little empirical information on how well current approaches for those steps perform in a large-scale open-set mining and recognition task. Furthermore, there is little empirical information on how recognition performance varies for different types of landmark objects and where there is still potential for improvement. With this paper, we intend to fill these gaps. Using a dataset of 500k images from Paris, we analyze each component of the landmark recognition pipeline in order to answer the following questions: How many and what kinds of objects can be discovered automatically? How can we best use the resulting image clusters to recognize the object in a query? How can the object be efficiently represented in memory for recognition? How reliably can semantic information be extracted? And finally: What are the limiting factors in the resulting pipeline from query to semantics? We evaluate how different choices of methods and parameters for the individual pipeline steps affect overall system performance and examine their effects for different query categories such as buildings, paintings, or sculptures. © 2015 Elsevier Inc. All rights reserved.
Pages: 1-15
Page count: 15