Visual versus Textual Embedding for Video Retrieval

Cited by: 1
Authors
Francis, Danny [1]
Pidou, Paul [1]
Merialdo, Bernard [1]
Huet, Benoit [1]
Affiliations
[1] EURECOM, Campus SophiaTech, 450 Route Chappes, F-06410 Biot, France
DOI
10.1007/978-3-319-70353-4_33
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405
Abstract
This paper compares several approaches to natural-language access to video databases. We present two main strategies. The first is visual: it compares keyframes with images retrieved from Google Images. The second is textual: it generates a text description of each keyframe and compares these descriptions with the query. We study the effect of several parameters and find that substantial improvement is possible by choosing the right strategy for a given topic. Finally, we investigate a method for choosing the right approach for a given topic.
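The two strategies in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that keyframes, web images, captions, and the query have already been mapped to embedding vectors by some model, that similarity is cosine similarity, and that the visual strategy keeps the best match over the retrieved images. All names (`visual_score`, `textual_score`, `rank`) and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def visual_score(keyframe_vec: Vector, web_image_vecs: List[Vector]) -> float:
    """Visual strategy (assumed form): compare a keyframe with the images
    retrieved for the query and keep the best match."""
    return max((cosine(keyframe_vec, v) for v in web_image_vecs), default=0.0)


def textual_score(caption_vec: Vector, query_vec: Vector) -> float:
    """Textual strategy (assumed form): compare the embedding of a generated
    keyframe description with the embedding of the query."""
    return cosine(caption_vec, query_vec)


def rank(scores: Dict[str, float]) -> List[str]:
    """Return shot identifiers sorted by decreasing score."""
    return sorted(scores, key=lambda shot: scores[shot], reverse=True)


# Toy, made-up vectors: two shots, one query.
keyframe_vecs = {"shot_a": [1.0, 0.0], "shot_b": [0.0, 1.0]}
web_image_vecs = [[0.9, 0.1]]  # stand-in for images retrieved from the web
caption_vecs = {"shot_a": [0.0, 1.0], "shot_b": [1.0, 0.0]}
query_vec = [1.0, 0.0]

visual_ranking = rank(
    {s: visual_score(v, web_image_vecs) for s, v in keyframe_vecs.items()}
)
textual_ranking = rank(
    {s: textual_score(v, query_vec) for s, v in caption_vecs.items()}
)

print("visual :", visual_ranking)
print("textual:", textual_ranking)
```

In this toy setup the two strategies rank the shots in opposite orders, which is the kind of disagreement that makes a per-topic choice of strategy worthwhile.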
Pages: 386-395
Page count: 10