Audio-visual sports highlights extraction using Coupled Hidden Markov Models

Cited by: 0

Authors:

Ziyou Xiong

Affiliations:

[1] University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering

Source:

Pattern Analysis and Applications | 2005 / Vol. 8

Keywords:

State Transition Matrix; Interpretable Model Structure; Sport Video; Golf Swing; Average Classification Accuracy;

DOI:

Not available

Abstract:

We present our studies on the application of Coupled Hidden Markov Models (CHMMs) to sports highlights extraction from broadcast video using both audio and video information. First, we generate audio labels using audio classification via Gaussian mixture models, and video labels using quantization of the average motion vector magnitudes. Then, we model sports highlights using discrete-observation CHMMs on audio and video labels classified from a large training set of broadcast sports highlights. Our experimental results on unseen golf and soccer content show that CHMMs outperform Hidden Markov Models (HMMs) trained on audio-only or video-only observations. Next, we study how the coupling between the two single-modality HMMs improves modelling capability by refining the states of the models. We also show that the number of states optimized in this fashion gives better classification results than other numbers of states. We conclude that CHMMs provide a promising tool for information fusion techniques in the sports domain for audio-visual event detection and analysis.
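
The pipeline in the abstract (quantized motion magnitudes as video labels, then a discrete-observation CHMM over audio and video labels) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common CHMM formulation in which each chain's next state depends on the previous states of both chains. The function names (`quantize_motion`, `chmm_log_likelihood`), the bin edges, and all parameter arrays are hypothetical placeholders.

```python
import numpy as np


def quantize_motion(magnitudes, edges):
    """Map average motion-vector magnitudes to discrete video labels.

    `edges` are hypothetical bin boundaries; the abstract does not give them.
    """
    return np.digitize(magnitudes, edges)


def chmm_log_likelihood(obs_a, obs_v, pi_a, pi_v, A_a, A_v, B_a, B_v):
    """Scaled forward pass for a two-chain, discrete-observation CHMM.

    Assumed shapes (Na/Nv hidden states, Ka/Kv label alphabets):
      pi_a: (Na,)         initial audio-state distribution
      pi_v: (Nv,)         initial video-state distribution
      A_a:  (Na, Nv, Na)  P(a_t | a_{t-1}, v_{t-1})
      A_v:  (Na, Nv, Nv)  P(v_t | a_{t-1}, v_{t-1})
      B_a:  (Na, Ka)      P(audio label | a_t)
      B_v:  (Nv, Kv)      P(video label | v_t)
    """
    # alpha[i, j] = P(observations so far, a_t = i, v_t = j), up to scaling.
    alpha = np.outer(pi_a * B_a[:, obs_a[0]], pi_v * B_v[:, obs_v[0]])
    log_lik = 0.0
    for t in range(1, len(obs_a)):
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
        # Coupled transition: both next states depend on both previous states.
        pred = np.einsum("ij,ijk,ijl->kl", alpha, A_a, A_v)
        alpha = pred * np.outer(B_a[:, obs_a[t]], B_v[:, obs_v[t]])
    return log_lik + np.log(alpha.sum())
```

One way such a score could be used, again as an assumption, is to evaluate a segment's label sequences under a CHMM trained on highlights and under one trained on non-highlights, and pick the model with the higher log-likelihood.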

Pages: 62-71

Page count: 9

