Performance evaluation of large-scale object recognition system using bag-of-visual words model

Authors
Min-Uk Kim
Kyoungro Yoon
Affiliation
[1] Konkuk University, School of Computer Science and Engineering
Source
Multimedia Tools and Applications, 2015, 74 (07)
Keywords
Object recognition; Bag-of-visual words; SIFT; Vocabulary tree; CDVS; Standard
Abstract
Object recognition technology is typically used to recognize specific objects such as book covers, landmarks, and vehicles. In most cases it relies on multi-dimensional local image descriptors, which are designed to be robust to environmental changes such as illumination, viewing angle, and scale. When the database contains many target objects, recognition over a large-scale local image descriptor database is not a trivial task because of the high dimensionality of the descriptors. To obtain consistent responses from a large-scale database within a reasonable time, a proper data structure that supports indexing and querying is needed. A vocabulary tree is a data structure built from local image descriptors and is commonly used to cope with massive descriptor databases. With a vocabulary tree, a local image descriptor can be mapped to the ID of a leaf node, which serves as a visual word for object recognition. Visual words can then be exploited effectively by a traditional text retrieval engine. In this study, we built a large-scale object recognition system using a vocabulary tree with 1 million leaf nodes built from Scale-Invariant Feature Transform (SIFT) descriptors, the most promising local image descriptor in terms of precision. We implemented the proposed system using publicly available software so that it can be easily reproduced and further enhanced. We then evaluated the proposed system against the current MPEG CDVS (Compact Descriptors for Visual Search) standard using a database containing two-dimensional planar object datasets of three categories together with one million distractor images. In addition to these datasets, which are equivalent to those of CDVS, we added a new dataset designed to mimic realistic occlusion and clutter effects. Experimental results show that the performance of the proposed system is comparable to that of CDVS, achieving 90% precision at a retrieval time of 5 s. We also identified characteristics of the vocabulary tree that limit its adaptation to a specific application domain.
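The mapping from a local descriptor to a leaf node ID described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain hierarchical k-means tree, uses toy 4-dimensional vectors in place of 128-dimensional SIFT descriptors, and all function names (`build_tree`, `quantize`, `bag_of_words`) and parameters (branching factor 3, depth 2) are hypothetical choices made for illustration only.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def assign(points, centers):
    """Group points by their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        groups[nearest].append(p)
    return groups

def kmeans(points, k, rng, iters=10):
    """Plain Lloyd-style k-means; returns the centers and final groups."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = assign(points, centers)
        # Keep the old center if a cluster happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, assign(points, centers)

def build_tree(points, k, depth, rng):
    """Recursively split the descriptors into k clusters per level."""
    if depth == 0 or len(points) < k:
        return {"children": []}  # leaf node
    centers, groups = kmeans(points, k, rng)
    children = []
    for center, group in zip(centers, groups):
        child = build_tree(group, k, depth - 1, rng)
        child["center"] = center
        children.append(child)
    return {"children": children}

def number_leaves(node, next_id=0):
    """Give every leaf a consecutive ID; returns the number of leaves."""
    if not node["children"]:
        node["leaf_id"] = next_id
        return next_id + 1
    for child in node["children"]:
        next_id = number_leaves(child, next_id)
    return next_id

def quantize(node, descriptor):
    """Descend greedily to the nearest child until a leaf is reached."""
    while node["children"]:
        node = min(node["children"], key=lambda ch: sq_dist(descriptor, ch["center"]))
    return node["leaf_id"]

def bag_of_words(tree, descriptors):
    """Histogram of visual words (leaf IDs) for one image."""
    return Counter(quantize(tree, d) for d in descriptors)

# Toy data: random 4-D vectors stand in for real SIFT descriptors.
rng = random.Random(0)
train = [[rng.random() for _ in range(4)] for _ in range(200)]
tree = build_tree(train, k=3, depth=2, rng=rng)
n_leaves = number_leaves(tree)

image_descriptors = train[:20]  # pretend these came from one image
words = bag_of_words(tree, image_descriptors)
print(n_leaves, dict(words))
```

The resulting word histogram is the kind of sparse, text-like representation that an inverted-index retrieval engine can search. With branching factor k and depth d the tree has at most k^d leaves, so reaching the 1 million leaf nodes mentioned in the abstract needs a much larger tree than this toy one (for example, k = 10 and d = 6 would give 10^6 leaves; the abstract does not state which values were used).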
Pages: 2499-2517
Page count: 18
Related papers
  • [1] Performance evaluation of large-scale object recognition system using bag-of-visual words model
    Kim, Min-Uk
    Yoon, Kyoungro
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (07) : 2499 - 2517