Building descriptive and discriminative visual codebook for large-scale image applications

Cited by: 13
Authors
Tian, Qi [1]
Zhang, Shiliang [2]
Zhou, Wengang [3]
Ji, Rongrong [4]
Ni, Bingbing [5]
Sebe, Nicu [6]
Affiliations
[1] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA
[2] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[3] Univ Sci & Technol China, EEIS Dept, Hefei 230027, Peoples R China
[4] Harbin Inst Technol, Harbin 150001, Heilongjiang, Peoples R China
[5] Natl Univ Singapore, Singapore 117576, Singapore
[6] Univ Trent, Dept Informat Engn & Comp Sci, I-38100 Trento, Italy
Keywords
Visual vocabulary; Large-scale image retrieval; Image search re-ranking; Feature space quantization;
DOI
10.1007/s11042-010-0636-6
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Inspired by the success of textual words in large-scale textual information processing, researchers are trying to extract visual words from images that function similarly to textual words. Visual words are commonly generated by clustering a large number of local image features and taking the cluster centers as visual words. This approach is simple and scalable, but it results in noisy visual words. Many works have been reported that try to improve the descriptive and discriminative ability of visual words. This paper gives a comprehensive survey of visual vocabulary and details several state-of-the-art algorithms. A comprehensive review and summary of related work on visual vocabulary is presented first. Then, we introduce our recent algorithms for descriptive and discriminative visual word generation, i.e., latent visual context analysis for descriptive visual word identification [74], descriptive visual words and visual phrases generation [68], a contextual visual vocabulary that combines both semantic contexts and spatial contexts [69], and visual vocabulary hierarchy optimization [18]. Additionally, we introduce two post-processing strategies to further improve the performance of visual vocabulary: spatial coding [73], which efficiently removes mismatched visual words between images for more reasonable image similarity computation, and user-preference-based visual word weighting [44], which makes the image similarity computed from visual words more consistent with users' preferences or habits.
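The baseline the abstract describes (cluster local features, take the cluster centers as visual words) can be illustrated with a minimal sketch. This is not the paper's method or code: it assumes local descriptors are already available as a NumPy array, uses a plain k-means loop, and the names `build_vocabulary`, `quantize`, and `bow_histogram` are hypothetical.

```python
import numpy as np

def quantize(descriptors, centers):
    """Assign each descriptor to the index of its nearest center (visual word)."""
    # Squared Euclidean distance between every descriptor and every center.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Plain k-means; the k cluster centers play the role of visual words."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct descriptors chosen at random.
    init = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[init].astype(float)
    for _ in range(iters):
        labels = quantize(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Bag-of-visual-words histogram for one image's descriptors."""
    words = quantize(descriptors, centers)
    return np.bincount(words, minlength=len(centers))

# Illustrative usage with random stand-in descriptors (assumed 8-dimensional).
all_descriptors = np.random.default_rng(1).normal(size=(200, 8))
vocabulary = build_vocabulary(all_descriptors, k=5)
image_histogram = bow_histogram(all_descriptors[:30], vocabulary)
```

Because every descriptor is hard-assigned to its nearest center regardless of how far away it is, a vocabulary built this way tends to contain the noisy visual words that the surveyed algorithms aim to improve on.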
Pages: 441-477
Page count: 37
Related papers
50 in total
  • [1] Building descriptive and discriminative visual codebook for large-scale image applications
    Qi Tian
    Shiliang Zhang
    Wengang Zhou
    Rongrong Ji
    Bingbing Ni
    Nicu Sebe
    [J]. Multimedia Tools and Applications, 2011, 51 : 441 - 477
  • [2] Generating Descriptive Visual Words and Visual Phrases for Large-Scale Image Applications
    Zhang, Shiliang
    Tian, Qi
    Hua, Gang
    Huang, Qingming
    Gao, Wen
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2011, 20 (09) : 2664 - 2677
  • [3] A novel visual codebook model based on fuzzy geometry for large-scale image classification
    Li, Yanshan
    Huang, Qinghua
    Xie, Weixin
    Li, Xuelong
    [J]. PATTERN RECOGNITION, 2015, 48 (10) : 3125 - 3134
  • [4] FAST AND COMPACT VISUAL CODEBOOK FOR LARGE-SCALE OBJECT RETRIEVAL
    Cen, Shusheng
    Dong, Yuan
    Bai, Hongliang
    Huang, Chong
    [J]. 2013 5TH IEEE INTERNATIONAL CONFERENCE ON BROADBAND NETWORK & MULTIMEDIA TECHNOLOGY (IC-BNMT), 2013, : 35 - 38
  • [5] Building Discriminative User Profiles for Large-scale Content Recommendation
    Zhong, Erheng
    Liu, Nathan
    Shi, Yue
    Rajan, Suju
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 2277 - 2286
  • [6] A Feature Encoding based on Fuzzy Codebook for Large-Scale Image Recognition
    Shinomiya, Yuki
    Hoshino, Yukinobu
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 2908 - 2913
  • [7] Fast Learning Discriminative Dictionaries for Large-scale Visual Recognition
    Zhao, Tianyi
    Qu, Yanyun
    Fan, Jianping
    [J]. 2015 IEEE 17TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2015,
  • [8] Discriminative Learning of Relaxed Hierarchy for Large-scale Visual Recognition
    Gao, Tianshi
    Koller, Daphne
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 2072 - 2079
  • [9] Large-Scale Aerial Image Categorization Using a Multitask Topological Codebook
    Zhang, Luming
    Wang, Meng
    Hong, Richang
    Yin, Bao-Cai
    Li, Xuelong
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (02) : 535 - 545
  • [10] Large-Scale Image Annotation using Visual Synset
    Tsai, David
    Jing, Yushi
    Liu, Yi
    Rowley, Henry A.
    Ioffe, Sergey
    Rehg, James M.
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 611 - 618