Detecting semantic concepts from video using temporal gradients and audio classification

Cited: 0
Authors
Rautiainen, M
Seppänen, T
Penttilä, J
Peltola, J
Affiliations
[1] Univ Oulu, MediaTeam Oulu, FIN-90014 Oulu, Finland
[2] VTT Tech Res Ctr Finland, FIN-90571 Oulu, Finland
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we describe new methods for detecting semantic concepts in digital video based on audio and visual content. The Temporal Gradient Correlogram captures temporal correlations of gradient edge directions from sampled shot frames. Power-related physical features are extracted from short audio samples in video shots. Video shots containing people, cityscape, landscape, speech or instrumental sound are detected with trained self-organizing maps and with kNN classification of the audio samples. Test runs and evaluations in the TREC 2002 Video Track show consistent performance for the Temporal Gradient Correlogram and state-of-the-art precision in audio-based instrumental-sound detection.
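The abstract describes the Temporal Gradient Correlogram only at a high level, so the sketch below is one plausible reading of it rather than the authors' exact formulation: per-pixel gradient directions of the sampled shot frames are quantized into a few orientation bins, and their co-occurrence at the same pixel position is accumulated across several temporal lags. The bin count, magnitude threshold, lag range and all names (gradient_directions, temporal_gradient_correlogram, n_bins, mag_thresh, max_lag) are illustrative assumptions, as is the use of plain NumPy.

import numpy as np

def gradient_directions(frame, n_bins=8, mag_thresh=10.0):
    # Quantize per-pixel gradient orientations of a grayscale frame into n_bins.
    # Pixels whose gradient magnitude is below mag_thresh are marked -1 and
    # treated as non-edge (ignored in the correlogram).
    gy, gx = np.gradient(frame.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # orientation in [0, pi)
    bins = np.clip(np.floor(ang / np.pi * n_bins).astype(int), 0, n_bins - 1)
    bins[mag < mag_thresh] = -1
    return bins

def temporal_gradient_correlogram(frames, n_bins=8, max_lag=3):
    # NOTE: illustrative formulation; the paper's exact definition may differ.
    # Counts co-occurrences of gradient-direction bins at the same pixel
    # position in frames separated by temporal lag d = 1..max_lag, then
    # row-normalizes so entry [d-1, i, j] estimates
    # P(direction j at t+d | direction i at t).
    quantized = [gradient_directions(f, n_bins) for f in frames]
    corr = np.zeros((max_lag, n_bins, n_bins), dtype=np.float64)
    for d in range(1, max_lag + 1):
        for t in range(len(quantized) - d):
            a, b = quantized[t], quantized[t + d]
            valid = (a >= 0) & (b >= 0)
            np.add.at(corr[d - 1], (a[valid], b[valid]), 1.0)
    row_sums = corr.sum(axis=2, keepdims=True)
    return corr / np.maximum(row_sums, 1.0)

# Example: five synthetic 64x64 grayscale frames standing in for one shot.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(64, 64)).astype(np.float64) for _ in range(5)]
descriptor = temporal_gradient_correlogram(frames).ravel()   # shot-level feature vector

In the pipeline outlined in the abstract, a descriptor of this kind would be the visual feature fed to the trained self-organizing maps, while the power-related features extracted from short audio samples would go to the kNN classifier; the sketch stops at the shot-level feature vector.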
Pages: 260-270
Page count: 11
Related papers
50 items in total
  • [31] Semantic classification of sports news video using color and motion features
    Jang, SangHyun
    Song, MiYoung
    Cho, HyungJe
    2006 INTERNATIONAL CONFERENCE ON HYBRID INFORMATION TECHNOLOGY, VOL 2, PROCEEDINGS, 2006, : 745 - 750
  • [32] Extracting semantic information from basketball video based on audio-visual features
    Kim, K
    Choi, J
    Kim, N
    Kim, P
    IMAGE AND VIDEO RETRIEVAL, 2002, 2383 : 278 - 288
  • [33] Video classification using spatial-temporal features and PCA
    Xu, LQ
    Li, YM
    2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003, : 485 - 488
  • [34] CLASSIFICATION OF ANIMATED VIDEO GENRE USING COLOR AND TEMPORAL INFORMATION
    Ionescu, Bogdan
    Lambert, Patrick
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2013, 75 (03): : 63 - 74
  • [35] DETECTING LOCAL SEMANTIC CONCEPTS IN ENVIRONMENTAL SOUNDS USING MARKOV MODEL BASED CLUSTERING
    Lee, Keansub
    Ellis, Daniel P. W.
    Loui, Alexander C.
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 2278 - 2281
  • [36] Detecting Replay Attacks Using Single-Channel Audio: The Temporal Autocorrelation of Speech
    Lee, Shih-Kuang
    Tsao, Yu
    Wang, Hsin-Min
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1984 - 1990
  • [37] Detecting intra- and inter-categorical structure in semantic concepts using HICLAS
    Ceulemans, Eva
    Storms, Gert
    ACTA PSYCHOLOGICA, 2010, 133 (03) : 296 - 304
  • [38] Music video emotion classification using slow-fast audio-video network and unsupervised feature representation
    Pandeya, Yagya Raj
    Bhattarai, Bhuwan
    Lee, Joonwhoan
    SCIENTIFIC REPORTS, 2021, 11
  • [39] Using Deep Belief Network to capture temporal information for audio event classification
    Guo, Feng
    Yang, Deshun
    Chen, Xiaoou
    2015 INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP), 2015, : 421 - 424
  • [40] Audio Tagging With Connectionist Temporal Classification Model Using Sequentially Labelled Data
    Hou, Yuanbo
    Kong, Qiuqiang
    Li, Shengchen
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, CSPS 2018, VOL II: SIGNAL PROCESSING, 2020, 516 : 955 - 964