A comparative study for multiple visual concepts detection in images and videos

Cited by: 0
Authors
Abdelkader Hamadi
Philippe Mulhem
Georges Quénot
Affiliations
[1] Université de Lorraine
[2] Univ. Grenoble Alpes
[3] CNRS
[4] LIG
Source
Multimedia Tools and Applications, 2016, 75 (15)
Keywords
Semantic indexing; Multimedia; Fusion; Multiple concepts; Multi-concept; Concept pairs; Triplet of concepts; Bi-concept; Tri-concept; Image; Video; Pascal VOC; TRECVid
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Automatic indexing of images and videos is an important research area in multimedia information retrieval, and its difficulty is well established. In the past, most efforts of the research community have focused on the detection of single concepts in images and videos, which is already a hard task. As information retrieval systems evolve, users’ needs become more abstract and queries contain more words. To help retrieval systems answer such complex queries, multimedia documents need to be indexed with more than just individual concepts. Few studies have specifically addressed the problem of detecting multiple concepts (multi-concept) in images and videos, and most of them concern the detection of concept pairs. These studies showed that this challenge is even greater than that of single-concept detection. In this work, we address the problem of multi-concept detection in images and videos through a detailed comparative study. Three types of approaches are considered: 1) building detectors for multi-concepts, 2) fusing single-concept detectors, and 3) exploiting detectors of a set of single concepts in a stacking scheme. We conducted our evaluations on the PASCAL VOC’12 collection for the detection of pairs and triplets of concepts, and extended the evaluation to the TRECVid 2013 dataset for the detection of infrequent concept pairs. Our results show that the three types of approaches give globally comparable results for images, but they differ for specific kinds of pairs and triplets. For videos, late fusion of detectors seems to be more effective and efficient when single-concept detectors perform well. Otherwise, directly building bi-concept detectors remains the best alternative, especially if a well-annotated dataset is available. The third approach did not bring additional gain or efficiency.
Pages: 8973-8997
Page count: 24
Related papers
50 in total
  • [1] A comparative study for multiple visual concepts detection in images and videos
    Hamadi, Abdelkader
    Mulhem, Philippe
    Quenot, Georges
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (15) : 8973 - 8997
  • [2] Braving the semantic gap: Mapping visual concepts from images and videos
    Deng, D
    [J]. ADVANCES IN DATA MINING: APPLICATIONS IN IMAGE MINING, MEDICINE AND BIOTECHNOLOGY, MANAGEMENT AND ENVIRONMENTAL CONTROL, AND TELECOMMUNICATIONS, 2004, 3275 : 50 - 59
  • [3] Annotation of still images by multiple visual concepts
    Hamadi, Abdelkader
    Mulhem, Philippe
    Quenot, Georges
    [J]. 2014 12TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2014
  • [4] Harnessing high-level concepts, visual, and auditory features for violence detection in videos
    Peixoto, Bruno M.
    Lavi, Bahram
    Dias, Zanoni
    Rocha, Anderson
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 78
  • [5] Using semantic context for multiple concepts detection in still images
    Hamadi, Abdelkader
    Lattar, Hafsa
    Khoussa, Mohamed El Bachir
    Safadi, Bahjat
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2020, 23 (01) : 27 - 44
  • [6] Using semantic context for multiple concepts detection in still images
    Abdelkader Hamadi
    Hafsa Lattar
    Mohamed El Bachir Khoussa
    Bahjat Safadi
    [J]. Pattern Analysis and Applications, 2020, 23 : 27 - 44
  • [7] Automatic Detection of Steatosis in Ultrasound Images with Comparative Visual Labeling
    Saibro, Guinther
    Diana, Michele
    Sauer, Benoit
    Marescaux, Jacques
    Hostettler, Alexandre
    Collins, Toby
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III, 2022, 13433 : 408 - 418
  • [8] Detection of deepfake technology in images and videos
    Liu, Yong
    Sun, Tianning
    Wang, Zonghui
    Zhao, Xu
    Cheng, Ruosi
    Shi, Baolan
    [J]. INTERNATIONAL JOURNAL OF AD HOC AND UBIQUITOUS COMPUTING, 2024, 45 (02) : 135 - 148
  • [9] Optimization for Detection and Recognition in Images and Videos
    Kim, Wonjun
    Jung, Chanho
    Bianco, Simone
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2017, 2017
  • [10] An experimental comparative study on slide change detection in lecture videos
    Eruvaram P.
    Ramani K.
    Bindu C.S.
    [J]. International Journal of Information Technology, 2020, 12 (2) : 429 - 436