Representations of Keypoint-Based Semantic Concept Detection: A Comprehensive Study

Cited by: 158
Authors
Jiang, Yu-Gang [1, 2]
Yang, Jun [3]
Ngo, Chong-Wah [1]
Hauptmann, Alexander G. [4]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[2] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[3] Google Inc, Mountain View, CA 94043 USA
[4] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
Bag-of-visual-words; representation choice; semantic concept detection; kernels
DOI
10.1109/TMM.2009.2036235
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Based on local keypoints extracted as salient image patches, an image can be described as a "bag-of-visual-words (BoW)", and this representation has appeared promising for object and scene classification. The performance of BoW features in semantic concept detection for large-scale multimedia databases is subject to various representation choices. In this paper, we conduct a comprehensive study on the representation choices of BoW, including vocabulary size, weighting scheme, stop word removal, feature selection, spatial information, and visual bi-gram. We offer practical insights into how to optimize the performance of BoW through appropriate representation choices. For the weighting scheme, we elaborate a soft-weighting method to assess the significance of a visual word to an image. We experimentally show that soft-weighting outperforms other popular weighting schemes such as TF-IDF by a large margin. Our extensive experiments on TRECVID data sets also indicate that the BoW feature alone, with appropriate representation choices, already produces highly competitive concept detection performance. Based on our empirical findings, we further apply our method to detect a large set of 374 semantic concepts. The detectors, as well as the features and detection scores on several recent benchmark data sets, are released to the multimedia community.
Pages: 42 - 53
Page count: 12
Related papers
50 in total
  • [1] A Comprehensive Study of Feature Representations for Semantic Concept Detection
    Duy-Dinh Le
    Satoh, Shin'ichi
    FIFTH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2011), 2011, : 235 - 238
  • [2] Keypoint-based contextual representations for hand pose estimation
    Li, Weiwei
    Du, Rong
    Chen, Shudong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (10) : 28357 - 28372
  • [3] Keypoint-based contextual representations for hand pose estimation
    Weiwei Li
    Rong Du
    Shudong Chen
    Multimedia Tools and Applications, 2024, 83 : 28357 - 28372
  • [4] Local Keypoint-Based Image Detector with Object Detection
    Grycuk, Rafal
    Scherer, Magdalena
    Voloshynovskiy, Sviatoslav
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2017, PT I, 2017, 10245 : 507 - 517
  • [5] A Keypoint-Based Region Duplication Forgery Detection Algorithm
    Emam, Mahmoud
    Han, Qi
    Yu, Liyang
    Zhang, Hongli
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (09) : 2413 - 2416
  • [6] A Keypoint-based Global Association Network for Lane Detection
    Wang, Jinsheng
    Ma, Yinchao
    Huang, Shaofei
    Hui, Tianrui
    Wang, Fei
    Qian, Chen
    Zhang, Tianzhu
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1382 - 1391
  • [7] A Keypoint-Based Approach Toward Scenery Character Detection
    Uchida, Seiichi
    Shigeyoshi, Yuki
    Kunishige, Yasuhiro
    Yaokai, Feng
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 819 - 823
  • [8] Gradient Corner Pooling for Keypoint-Based Object Detection
    Li, Xuyang
    Xie, Xuemei
    Yu, Mingxuan
    Luo, Jiakai
    Rao, Chengwei
    Shi, Guangming
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 1460 - 1467
  • [9] Keypoint-based passive method for image manipulation detection
    Prakash, Choudhary Shyam
    Om, Hari
    Maheshkar, Sushila
    Maheshkar, Vikas
    COGENT ENGINEERING, 2018, 5 (01) : 1 - 19
  • [10] Keypoint-Based Keyframe Selection
    Guan, Genliang
    Wang, Zhiyong
    Lu, Shiyang
    Da Deng, Jeremiah
    Feng, David Dagan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2013, 23 (04) : 729 - 734