Exploiting multiple modalities for interactive video retrieval

Authors
Christel, MG [1]
Huang, C [1]
Moraveji, N [1]
Papernick, N [1]
Affiliation
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
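Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract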
Aural and visual cues can be automatically extracted from video and used to index its contents. This paper explores the relative merits of the cues extracted from the different modalities for locating relevant shots in video, specifically reporting on the indexing and interface strategies used to retrieve information from the Video TREC 2002 and 2003 data sets, and the evaluation of the interactive search runs. For the documentary and news material in these sets, automated speech recognition produces rich textual descriptions derived from the narrative, with visual descriptions and depictions offering additional browsing functionality. Through speech and visual processing, storyboard interfaces with query-based filtering provide an effective interactive retrieval interface. Examples drawn from the Video TREC 2002 and 2003 search topics and results using these topics illustrate the utility of multiple-document storyboards and other interfaces incorporating the results of multimodal processing.
Pages: 1032-1035
Page count: 4