Visual versus Textual Embedding for Video Retrieval

Cited by: 1

Authors:

Francis, Danny [1]

Pidou, Paul [1]

Merialdo, Bernard [1]

Huet, Benoit [1]

Affiliations:

[1] EURECOM, Campus SophiaTech, 450 Route Chappes, F-06410 Biot, France

Source:

ADVANCED CONCEPTS FOR INTELLIGENT VISION SYSTEMS (ACIVS 2017) | 2017 / Vol. 10617

DOI:

10.1007/978-3-319-70353-4_33

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper compares several approaches to natural-language access to video databases. We present two main strategies. The first is visual and consists of comparing keyframes with images retrieved from Google Images. The second is textual and consists of generating a text-based description of the keyframes and comparing these descriptions with the query. We study the effect of several parameters and find that substantial improvement is possible by choosing the right strategy for a given topic. Finally, we investigate a method for choosing the right approach for a given topic.
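The two strategies in the abstract can be illustrated with a minimal sketch. This record does not specify the paper's features, captioning model, or similarity measure, so everything here is an assumption made for illustration: the vectors are toy embeddings, the example images are averaged into one query vector, cosine similarity is used for ranking, and the names `visual_ranking` and `textual_ranking` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rank(query_vec, candidates):
    """Return candidate ids sorted by descending cosine similarity to query_vec."""
    return sorted(candidates, key=lambda k: cosine(query_vec, candidates[k]), reverse=True)


def visual_ranking(example_image_vecs, keyframe_visual_vecs):
    # Assumed visual strategy: embed images retrieved for the query,
    # average them, and compare the result with keyframe embeddings.
    return rank(mean_vector(example_image_vecs), keyframe_visual_vecs)


def textual_ranking(query_text_vec, keyframe_caption_vecs):
    # Assumed textual strategy: embed the generated keyframe descriptions
    # and compare them with the embedded text query.
    return rank(query_text_vec, keyframe_caption_vecs)


if __name__ == "__main__":
    # Toy 2-D embeddings, not real features.
    examples = [[1.0, 0.0], [0.8, 0.2]]
    keyframes_visual = {"kf1": [1.0, 0.1], "kf2": [0.0, 1.0]}
    keyframes_caption = {"kf1": [0.2, 0.9], "kf2": [0.9, 0.1]}
    print(visual_ranking(examples, keyframes_visual))
    print(textual_ranking([1.0, 0.0], keyframes_caption))
```

The two rankings can disagree for the same topic, which is the situation in which choosing a strategy per topic, as the abstract proposes, would matter.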

Pages: 386-395

Page count: 10

