Temporal Confusion Network for Speech-based Soccer Event Retrieval

Cited by: 0

Authors:

Pham, Nhut M. [1]

Vu, Quan H. [1]

Affiliation:

[1] VNU HCM, Univ Sci, Artificial Intelligence Lab, Ho Chi Minh City, Vietnam

Source:

2013 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC) | 2013

Keywords:

temporal confusion network; soccer video; event detection; content-based multimedia retrieval; VIDEO; SYSTEM;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper introduces the temporal confusion network and its application to speech-based soccer event retrieval, where an event is marked by the announcer's spoken words. A temporal confusion network is a confusion network in which each cluster is annotated with temporal information. Because the purpose of soccer event retrieval is to retrieve only the interesting highlights rather than the whole video clip, temporal information is crucial for locating them. Expanding the indexing model from 1-best transcriptions to temporal confusion networks can improve recall for event retrieval. Experiments are conducted on databases from the first round of World Cup 2010 and the Vietnamese AFF Suzuki Cup 2008. In the best case, an average recall improvement of 7.1% is achieved.
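
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Cluster` class, the `search` function, the `min_posterior` threshold, and the toy word posteriors and timestamps are all assumptions made for illustration. It shows a confusion network whose clusters carry start and end times, and how searching all hypotheses in a cluster can return an event span that a 1-best transcription misses.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Cluster:
    """One confusion-network slot: competing words plus a time span (hypothetical layout)."""
    start: float                  # seconds from the start of the clip (assumed unit)
    end: float
    hypotheses: Dict[str, float]  # word -> posterior probability

    def best(self) -> str:
        # The 1-best transcription keeps only the most probable word per cluster.
        return max(self.hypotheses, key=self.hypotheses.get)


def search(network: List[Cluster], keyword: str,
           one_best_only: bool = False,
           min_posterior: float = 0.1) -> List[Tuple[float, float]]:
    """Return the (start, end) spans of clusters in which `keyword` is hypothesised."""
    spans = []
    for cluster in network:
        if one_best_only:
            hit = cluster.best() == keyword
        else:
            # Assumed rule: accept any hypothesis at or above the threshold.
            hit = cluster.hypotheses.get(keyword, 0.0) >= min_posterior
        if hit:
            spans.append((cluster.start, cluster.end))
    return spans


# Toy data, invented for illustration only.
network = [
    Cluster(12.0, 12.4, {"what": 0.7, "a": 0.3}),
    Cluster(12.4, 12.9, {"gold": 0.55, "goal": 0.45}),  # recogniser's top choice is wrong
    Cluster(40.1, 40.6, {"goal": 0.9, "go": 0.1}),
]

one_best_hits = search(network, "goal", one_best_only=True)
tcn_hits = search(network, "goal")
```

In this toy case the 1-best index finds only the span at 40.1 s, while the full network also returns the span at 12.4 s where "goal" was the second hypothesis. Lowering `min_posterior` trades precision for recall, which is the general trade-off such an index would have to tune.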


Pages: 549-553

Page count: 5
