Compact and Discriminative Approach for Encoding Spatial-Relationship of Visual Words

Cited by: 3
Authors
Pedrosa, Glauco V. [1]
Traina, Agma J. M. [1]
Affiliations
[1] Univ Sao Paulo, ICMC, Sao Carlos, SP, Brazil
Keywords
image representation; local features; bag-of-features; spatial-relationship; visual words; visual dictionaries;
DOI
10.1145/2695664.2695951
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The Bag-of-Visual-Words approach has been successfully used for video and image analysis: local features are encoded as visual words, and the final representation is a histogram of the visual words detected in the image. One limitation of this approach is its inability to encode the spatial distribution of the visual words within an image, which is important for measuring similarity between images. In this paper, we present a novel technique for incorporating spatial information, called Global Spatial Arrangement (GSA). The idea is to split the image space into quadrants using each detected point as the origin. To ensure rotation invariance, we use the gradient information of each detected point to define the quadrants. The final representation adds only two extra pieces of information to the final feature vector to encode the spatial arrangement of visual words, with the advantage of being invariant to rotation. We performed representative experimental evaluations using several public datasets. Compared to other techniques, such as the Spatial Pyramid (SP), the proposed method needs 90% less information to encode the spatial information of visual words. The results in image retrieval and classification demonstrated that our proposed approach improved the retrieval accuracy compared to other traditional techniques, while being the most compact descriptor.
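The quadrant idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy points, and the gradient angles are all hypothetical, and the abstract does not say how the per-point quadrant information is reduced to the two extra values, so that step is left out. The sketch only shows why aligning the quadrant axes with each point's gradient angle keeps the quadrant counts unchanged when the whole image is rotated.

```python
import math


def quadrant_counts(points, angles, origin_idx):
    """Count the other points in each quadrant of a frame centred on one point.

    Assumption: the frame's x-axis is aligned with the origin point's
    gradient angle (in radians). The abstract only says the gradient is
    used to define the quadrants; this alignment is one plausible reading.
    """
    ox, oy = points[origin_idx]
    theta = angles[origin_idx]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    counts = [0, 0, 0, 0]
    for i, (x, y) in enumerate(points):
        if i == origin_idx:
            continue
        dx, dy = x - ox, y - oy
        # Express the offset in the origin point's local (rotated) frame.
        u = dx * cos_t + dy * sin_t
        v = -dx * sin_t + dy * cos_t
        if u >= 0 and v >= 0:
            counts[0] += 1
        elif u < 0 and v >= 0:
            counts[1] += 1
        elif u < 0 and v < 0:
            counts[2] += 1
        else:
            counts[3] += 1
    return counts


def rotate_scene(points, angles, phi):
    """Rotate all points and their gradient angles by phi radians about (0, 0)."""
    c, s = math.cos(phi), math.sin(phi)
    new_points = [(x * c - y * s, x * s + y * c) for x, y in points]
    new_angles = [a + phi for a in angles]
    return new_points, new_angles


# Hypothetical detected points (x, y) and their gradient angles in radians.
points = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0), (-2.0, -2.0), (3.0, -1.0)]
angles = [0.3, 1.2, -0.7, 2.0, 0.0]

before = quadrant_counts(points, angles, 0)
rotated_points, rotated_angles = rotate_scene(points, angles, math.radians(50))
after = quadrant_counts(rotated_points, rotated_angles, 0)
print(before, after)
```

Because each offset is measured in the origin point's own gradient-aligned frame, rotating the scene rotates the offsets and the frame together, so `before` and `after` agree. With fixed image-aligned axes the counts would generally change under rotation.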
Pages: 92 - 95
Number of pages: 4
Related papers
50 in total
  • [1] Encoding Spatial Arrangement of Visual Words
    Penatti, Otavio A. B.
    Valle, Eduardo
    Torres, Ricardo da S.
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, 2011, 7042 : 240 - 247
  • [2] Improving visual vocabularies: A more discriminative, representative and compact bag of visual words
    Chang, Leonardo
    Pérez-Suárez, Airel
    Hernández-Palancar, José
    Arias-Estrada, Miguel
    Sucar, L. Enrique
    Informatica (Slovenia), 2017, 41 (03) : 333 - 347
  • [3] Spatial encoding of visual words for image classification
    Liu, Dong
    Wang, Shengsheng
    Porikli, Fatih
    JOURNAL OF ELECTRONIC IMAGING, 2016, 25 (03)
  • [4] An application based on spatial-relationship to basketball defensive strategies
    Chin, SL
    Huang, CH
    Tang, CT
    Hung, JC
    EMBEDDED AND UBIQUITOUS COMPUTING - EUC 2005 WORKSHOPS, PROCEEDINGS, 2005, 3823 : 180 - 188
  • [5] Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations
    Huo, Xinyue
    Xie, Lingxi
    Wei, Longhui
    Zhang, Xiaopeng
    Chen, Xin
    Li, Hao
    Yang, Zijie
    Zhou, Wengang
    Li, Houqiang
    Tian, Qi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 4224 - 4235
  • [6] Encoding Spatial Arrangements of Visual Words for Rotation-Invariant Image Classification
    Anwar, Hafeez
    Zambanini, Sebastian
    Kampel, Martin
    PATTERN RECOGNITION, GCPR 2014, 2014, 8753 : 443 - 452
  • [7] Creating Compact and Discriminative Visual Vocabularies using Visual Bits
    Kirishanthy, T.
    Ramanan, A.
    2015 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2015 : 258 - 263
  • [8] Constructing a discriminative visual vocabulary with macro and micro sense of visual words
    Chung-Ming Kuo
    Chaur-Heh Hsieh
    Nai-Chung Yang
    Chang-Ming Kuo
    Chi-Kao Chang
    Yu-Ming Chen
    Multimedia Tools and Applications, 2016, 75 : 16983 - 17017
  • [9] Constructing a discriminative visual vocabulary with macro and micro sense of visual words
    Kuo, Chung-Ming
    Hsieh, Chaur-Heh
    Yang, Nai-Chung
    Kuo, Chang-Ming
    Chang, Chi-Kao
    Chen, Yu-Ming
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (24) : 16983 - 17017
  • [10] Improving the Discriminative Power of Bag of Visual Words Model
    Ouni, Achref
    Urruty, Thierry
    Visani, Muriel
    MULTIMEDIA MODELING, MMM 2017, PT II, 2017, 10133 : 245 - 256