Building descriptive and discriminative visual codebook for large-scale image applications

Authors
Qi Tian
Shiliang Zhang
Wengang Zhou
Rongrong Ji
Bingbing Ni
Nicu Sebe
Affiliations
[1] University of Texas at San Antonio, Computer Science Department
[2] Institute of Computing Technology, Key Lab of Intelligent Information Processing
[3] Chinese Academy of Sciences, EEIS Department
[4] University of Science and Technology of China, Department of Information Engineering and Computer Science
[5] Harbin Institute of Technology
[6] National University of Singapore
[7] University of Trento
Keywords
Visual vocabulary; Large-scale image retrieval; Image search re-ranking; Feature space quantization
Abstract
Inspired by the success of textual words in large-scale textual information processing, researchers are trying to extract visual words from images that function similarly to textual words. Visual words are commonly generated by clustering a large number of local image features and taking the cluster centers as visual words. This approach is simple and scalable, but it produces noisy visual words. Many works have sought to improve the descriptive and discriminative ability of visual words. This paper surveys visual vocabulary research and details several state-of-the-art algorithms. We first review and summarize the related work on visual vocabulary. We then introduce our recent algorithms for generating descriptive and discriminative visual words: latent visual context analysis for descriptive visual word identification [74], descriptive visual word and visual phrase generation [68], a contextual visual vocabulary that combines semantic and spatial contexts [69], and visual vocabulary hierarchy optimization [18]. Additionally, we introduce two post-processing strategies that further improve the performance of a visual vocabulary: spatial coding [73], which efficiently removes mismatched visual words between images so that image similarity is computed more reasonably, and user-preference-based visual word weighting [44], which makes the image similarity computed from visual words more consistent with users' preferences or habits.
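The baseline the abstract describes, clustering local features and taking the cluster centers as visual words, can be illustrated with a minimal sketch. It assumes plain k-means on toy 2-D points standing in for real local descriptors such as SIFT; the function names (`build_codebook`, `nearest_word`, `bag_of_words`) and all parameters are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the visual word (cluster center) closest to a descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Plain k-means; the final cluster centers serve as visual words."""
    rng = random.Random(seed)
    # Initialize centers from k distinct training descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the previous center if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bag_of_words(image_descriptors, codebook):
    """Quantize an image's descriptors into a visual word histogram."""
    counts = Counter(nearest_word(d, codebook) for d in image_descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Toy 2-D "descriptors" forming two well-separated groups (illustrative data).
descriptors = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1),
               (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
codebook = build_codebook(descriptors, k=2)
histogram = bag_of_words([(0.05, 0.05), (5.0, 5.0), (4.8, 5.2)], codebook)
```

Here `histogram` counts how many of an image's descriptors fall on each visual word. Because quantization keeps only the nearest center, descriptors that differ in appearance can collapse onto the same word, which is one way the noise mentioned in the abstract can arise.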
Pages: 441-477
Number of pages: 36