Visual versus Textual Embedding for Video Retrieval

Cited by: 1
Authors
Francis, Danny [1 ]
Pidou, Paul [1 ]
Merialdo, Bernard [1 ]
Huet, Benoit [1 ]
Affiliation
[1] EURECOM, Campus SophiaTech, 450 Route Chappes, F-06410 Biot, France
DOI
10.1007/978-3-319-70353-4_33
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper compares several approaches to natural-language access to video databases. We present two main strategies. The first is visual and consists of comparing keyframes with images retrieved from Google Images. The second is textual and consists of generating a text-based description of the keyframes and comparing these descriptions with the query. We study the effect of several parameters and find that substantial improvement is possible by choosing the right strategy for a given topic. Finally, we investigate a method for choosing the right approach for a given topic.
Pages: 386-395
Number of pages: 10