Visual versus Textual Embedding for Video Retrieval

Cited by: 1

Authors:

Francis, Danny [1]

Pidou, Paul [1]

Merialdo, Bernard [1]

Huet, Benoit [1]

Affiliation:

[1] EURECOM, Campus SophiaTech, 450 Route Chappes, F-06410 Biot, France

Source:

ADVANCED CONCEPTS FOR INTELLIGENT VISION SYSTEMS (ACIVS 2017) | 2017 / Volume 10617

DOI:

10.1007/978-3-319-70353-4_33

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper compares several approaches to natural-language access to video databases. We present two main strategies. The first is visual and consists of comparing keyframes with images retrieved from Google Images. The second is textual and consists of generating a text-based description of each keyframe and comparing these descriptions with the query. We study the effect of several parameters and find that substantial improvement is possible by choosing the right strategy for a given topic. Finally, we investigate a method for making that choice for a given topic.
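
The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the similarity measure, or how multiple retrieved images are combined. The sketch assumes cosine similarity, takes the best match over the retrieved images, and uses made-up two-dimensional vectors. All function and variable names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def visual_score(keyframe_vec, web_image_vecs):
    # Visual strategy: compare the keyframe with the retrieved example images.
    # Taking the maximum over images is an assumption made for this sketch.
    return max((cosine(keyframe_vec, w) for w in web_image_vecs), default=0.0)


def textual_score(caption_vec, query_vec):
    # Textual strategy: compare an embedding of the generated keyframe
    # description with an embedding of the query.
    return cosine(caption_vec, query_vec)


def rank(scores):
    """Return keyframe ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented): each keyframe has a visual feature vector and a caption embedding.
keyframes = {
    "kf_a": {"visual": [1.0, 0.0], "caption": [0.0, 1.0]},
    "kf_b": {"visual": [0.0, 1.0], "caption": [1.0, 0.0]},
}
web_images = [[0.9, 0.1]]  # features of images retrieved for the topic
query_vec = [1.0, 0.2]     # embedding of the text query

visual_rank = rank({k: visual_score(v["visual"], web_images) for k, v in keyframes.items()})
textual_rank = rank({k: textual_score(v["caption"], query_vec) for k, v in keyframes.items()})
```

In this toy data the two strategies order the keyframes differently, which is the situation in which choosing a strategy per topic, as the abstract proposes, can matter.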

Pages: 386-395

Page count: 10
