Multi-modal extraction of highlights from TV Formula 1 programs

Cited by: 25
Authors:
Petkovic, M [1]
Mihajlovic, V [1]
Jonker, W [1]
Djordjevic-Kajan, S [1]
Affiliation:
[1] Univ Twente, Dept Comp Sci, NL-7500 AE Enschede, Netherlands
DOI:
10.1109/ICME.2002.1035907
Chinese Library Classification:
TP [Automation Technology, Computer Technology]
Discipline Code:
0812
Abstract:
As amounts of publicly available video data grow, the need to automatically infer semantics from raw video data becomes significant. In this paper, we focus on the use of Dynamic Bayesian Networks (DBNs) for that purpose, and demonstrate how they can be effectively applied for fusing the evidence obtained from different media information sources. The approach is validated in the particular domain of Formula 1 race videos. For that specific domain we introduce a robust audiovisual feature extraction scheme and a text detection and recognition method. Based on numerous experiments performed with DBNs, we give some recommendations with respect to the modeling of temporal and atemporal dependencies within the network. Finally, we present the experimental results for the detection of excited speech and the extraction of highlights, as well as the advantageous query capabilities of our system.
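The fusion idea described in the abstract can be illustrated with a minimal sketch: a two-state temporal model (highlight vs. no highlight) that, at each time step, combines binary cues from audio, visual, and text channels under a conditional-independence assumption and propagates the belief through a transition model. All modality names, probabilities, and the `forward_filter` helper below are illustrative assumptions, not the paper's actual DBN structure or parameters.

```python
# Hypothetical sketch (not the paper's model): a 2-state DBN reduced to a
# forward filter that fuses multi-modal evidence for highlight detection.

TRANSITION = [
    [0.9, 0.1],  # P(state_t | state_{t-1} = no-highlight)
    [0.3, 0.7],  # P(state_t | state_{t-1} = highlight)
]

# P(binary cue | state) per modality; rows index the state (0/1),
# columns the observed cue (0/1). All numbers are illustrative.
LIKELIHOOD = {
    "audio":  [[0.8, 0.2], [0.3, 0.7]],   # excited-speech cue
    "visual": [[0.7, 0.3], [0.4, 0.6]],   # replay/motion cue
    "text":   [[0.9, 0.1], [0.5, 0.5]],   # superimposed-text cue
}

def forward_filter(observations):
    """Return P(highlight) after each time step.

    observations: list of dicts mapping modality name to a binary cue.
    """
    belief = [0.5, 0.5]  # uniform prior over {no-highlight, highlight}
    highlight_probs = []
    for obs in observations:
        # Temporal prediction through the transition model.
        predicted = [
            sum(belief[prev] * TRANSITION[prev][s] for prev in (0, 1))
            for s in (0, 1)
        ]
        # Fuse per-modality evidence (conditional-independence assumption).
        for s in (0, 1):
            for modality, cue in obs.items():
                predicted[s] *= LIKELIHOOD[modality][s][cue]
        total = sum(predicted)
        belief = [p / total for p in predicted]
        highlight_probs.append(belief[1])
    return highlight_probs
```

Feeding a run of "excited" cues from all three modalities drives the highlight probability toward 1, while calm cues keep it low; the actual paper learns richer temporal and atemporal dependencies than this toy transition matrix.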
Pages: 817-820
Page count: 4
Related Papers (50 total):
  • [1] Learning multi-modal control programs
    Mehta, TR
    Egerstedt, M
    HYBRID SYSTEMS: COMPUTATION AND CONTROL, 2005, 3414 : 466 - 479
  • [2] Identity Extraction from Clusters of Multi-modal Observations
    Hruz, Marek
    Salajka, Petr
    Gruber, Ivan
    Hlavac, Miroslav
    SPEECH AND COMPUTER, SPECOM 2019, 2019, 11658 : 171 - 179
  • [3] Multi-modal Interaction System for Smart TV Environments
    Lee, Injae
    Cha, Jihun
    Kwon, Ohseok
    2014 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2014, : 263 - 266
  • [4] Multi-modal exercise programs for older adults
    Baker, Michael K.
    Atlantis, Evan
    Singh, Maria A. Fiatarone
    AGE AND AGEING, 2007, 36 (04) : 375 - 381
  • [5] TripleMIE: Multi-modal and Multi Architecture Information Extraction
    Xia, Boqian
    Ma, Shihan
    Li, Yadong
    Huang, Wenkang
    Shi, Qiuhui
    Huang, Zuming
    Xie, Lele
    Wang, Hongbin
    HEALTH INFORMATION PROCESSING. EVALUATION TRACK PAPERS, 2023, 1773 : 143 - 153
  • [6] LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION
    Liu, Qinghua
    Huang, Yating
    Hao, Yunzhe
    Xu, Jiaming
    Xu, Bo
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 488 - 495
  • [7] Metaknowledge Extraction Based on Multi-Modal Documents
    Liu, Shu-Kan
    Xu, Rui-Lin
    Geng, Bo-Ying
    Sun, Qiao
    Duan, Li
    Liu, Yi-Ming
    IEEE ACCESS, 2021, 9 : 50050 - 50060
  • [8] TV commercial classification by using multi-modal textual information
    Zheng, Yantao
    Duan, Lingyu
    Tian, Qi
    Jin, Jesse S.
    2006 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO - ICME 2006, VOLS 1-5, PROCEEDINGS, 2006, : 497 - 500
  • [9] MULTI-MODAL CHARACTERISTICS ANALYSIS AND FUSION FOR TV COMMERCIAL DETECTION
    Liu, Nan
    Zhao, Yao
    Zhu, Zhenfeng
    Lu, Hanqing
    2010 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2010), 2010, : 831 - 836
  • [10] Naming multi-modal clusters to identify persons in TV broadcast
    Poignant, Johann
    Fortier, Guillaume
    Besacier, Laurent
    Quénot, Georges
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 : 8999 - 9023