INTERACTIVE VIDEO ANNOTATION BY MULTI-CONCEPT MULTI-MODALITY ACTIVE LEARNING

Cited by: 10
Authors
Wang, Meng [1 ]
Hua, Xian-Sheng [2 ]
Mei, Tao [2 ]
Tang, Jinhui [3 ]
Qi, Guo-Jun [3 ]
Song, Yan [3 ]
Dai, Li-Rong [3 ]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Anhui, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
Keywords
Video annotation; active learning
DOI
10.1142/S1793351X0700024X
CLC number
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Active learning has been demonstrated to be an effective approach to reducing human labeling effort in multimedia annotation tasks. However, most existing active learning methods for video annotation are studied in a relatively simple context: concepts are annotated sequentially with fixed effort, and only a single modality is applied. In practice we usually have to deal with multiple modalities, and annotating concepts sequentially without preference cannot allocate annotation effort appropriately. To address these two issues, in this paper we propose a multi-concept multi-modality active learning method for video annotation that takes multiple concepts and multiple modalities into consideration simultaneously. In each round of active learning, the method selects the concept that is expected to yield the highest performance gain, along with a batch of suitable samples to be annotated for that concept. Graph-based semi-supervised learning is then conducted on each modality for the selected concept. The proposed method makes full use of human effort by considering both the learnabilities of different concepts and the potentials of different modalities. Experimental results on the TRECVID 2005 benchmark demonstrate its effectiveness and efficiency.
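The round structure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `propagate` is standard graph-based label propagation with a symmetrically normalized affinity matrix, and the concept/batch selection uses plain prediction uncertainty as a simplified stand-in for the paper's expected-performance-gain criterion. All function names and parameters are hypothetical.

```python
# Hedged sketch of one active-learning round: estimate which concept would
# gain most from new labels, pick a batch of samples for it, and propagate
# labels over each modality's graph. Names and the gain measure are
# illustrative, not the paper's actual formulation.
import numpy as np

def propagate(W, y, alpha=0.9, n_iter=50):
    """Graph-based semi-supervised label propagation on one modality.

    W: (n, n) symmetric affinity matrix of the modality's graph.
    y: (n,) initial labels: +1 positive, -1 negative, 0 unlabeled.
    Iterates f <- alpha * S f + (1 - alpha) * y, where S is the
    symmetrically normalized affinity matrix; returns soft scores f."""
    d = W.sum(axis=1)                      # node degrees (assumed nonzero)
    S = W / np.sqrt(np.outer(d, d))        # D^{-1/2} W D^{-1/2}
    f = y.astype(float).copy()
    for _ in range(n_iter):
        f = alpha * (S @ f) + (1 - alpha) * y
    return f

def select_concept_and_batch(fused_scores, labeled, batch_size=2):
    """Pick the concept whose unlabeled samples are most uncertain (a
    proxy for expected performance gain), then the most uncertain
    unlabeled samples of that concept as the annotation batch.

    fused_scores: dict concept -> (n,) scores in [-1, 1], fused over modalities.
    labeled: dict concept -> (n,) boolean mask of already-labeled samples."""
    def mean_gain(c):
        u = 1.0 - np.abs(fused_scores[c])  # uncertainty: near 0 score = 1
        return u[~labeled[c]].mean()
    concept = max(fused_scores, key=mean_gain)
    u = 1.0 - np.abs(fused_scores[concept])
    u[labeled[concept]] = -np.inf          # never re-select labeled samples
    batch = np.argsort(-u)[:batch_size]    # most uncertain first
    return concept, batch
```

On a toy four-node ring graph labeled at two opposite nodes, `propagate` spreads the positive and negative labels to their respective neighbors, and the selection step favors the concept whose unlabeled predictions sit closest to the decision boundary.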
Pages: 459-477
Number of pages: 19
Related Papers (50 in total)
  • [1] Multi-concept multi-modality active learning for interactive video annotation
    Wang, Meng
    Hua, Xian-Sheng
    Song, Yan
    Tang, Jinhui
    Dai, Li-Rong
    [J]. ICSC 2007: INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, PROCEEDINGS, 2007, : 321 - +
  • [2] Exploring multi-modality structure for cross domain adaptation in video concept annotation
    Xu, Shaoxi
    Tang, Sheng
    Zhang, Yongdong
    Li, Jintao
    Zheng, Yan-Tao
    [J]. NEUROCOMPUTING, 2012, 95 : 11 - 21
  • [3] Learning based Multi-modality Image and Video Compression
    Lu, Guo
    Zhong, Tianxiong
    Geng, Jing
    Hu, Qiang
    Xu, Dong
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 6073 - 6082
  • [4] Concept-Driven Multi-Modality Fusion for Video Search
    Wei, Xiao-Yong
    Jiang, Yu-Gang
    Ngo, Chong-Wah
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2011, 21 (01) : 62 - 73
  • [5] Video Event Detection via Multi-modality Deep Learning
    Jhuo, I-Hong
    Lee, D. T.
    [J]. 2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 666 - 671
  • [6] Learning a Multi-Concept Video Retrieval Model with Multiple Latent Variables
    Mazaheri, Amir
    Gong, Boqing
    Shah, Mubarak
    [J]. PROCEEDINGS OF 2016 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2016, : 615 - 620
  • [7] Learning a Multi-Concept Video Retrieval Model with Multiple Latent Variables
    Mazaheri, Amir
    Gong, Boqing
    Shah, Mubarak
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2018, 14 (02)
  • [8] Unified Multi-Modality Video Object Segmentation Using Reinforcement Learning
    Sun, M.
    Xiao, J.
    Lim, E. G.
    Zhao, C.
    Zhao, Y.
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (08) : 1 - 1
  • [9] Video semantic concept detection using multi-modality subspace correlation propagation
    Liu, Yanan
    Wu, Fei
    [J]. ADVANCES IN MULTIMEDIA MODELING, PT 1, 2007, 4351 : 527 - 534
  • [10] An effective video retrieval approach based on multi-modality concept correlation graph
    Feng, Bailan
    Bao, Lei
    Cao, Juan
    Zhang, Yongdong
    Lin, Shouxun
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2010, 22 (05): : 827 - 832