Fast Caption Alignment for Automatic Indexing of Audio

Cited by: 3

Authors:

Knight, Allan [1]

Almeroth, Kevin [1]

Affiliations:

[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA

Source:

INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT | 2010 / Vol. 1 / Issue 02

Keywords:

Audio Processing; Indexing; Multimedia; Natural Language Processing; Speech Recognition

DOI:

10.4018/jmdem.2010040101

Chinese Library Classification number:

TP31 [Computer Software]

Subject classification codes:

081202; 0835

Abstract:

For large archives of audio media, just as with text archives, indexing is important for allowing quick and accurate searches. Similar to text archives, audio archives can use text for indexing. Generating this text requires using transcripts of the spoken portions of the audio. From them, an alignment can be made that allows users to search for specific content and immediately view the content at the position where the search terms were spoken. Although previous research has addressed this issue, the solutions align the transcripts only in real-time or greater. In this paper, the authors propose AutoCAp. It is capable of producing accurate audio indexes in faster than real-time for archived audio and in real-time for live audio. In most cases it takes less than one quarter the original duration for archived audio. This paper discusses the architecture and evaluation of the AutoCAp project as well as two of its applications.
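
The search step described in the abstract (look up a term, then jump to where it was spoken) can be illustrated with a small sketch. This is not AutoCAp's algorithm, which the abstract does not detail; it assumes an alignment has already produced (word, start time) pairs, and the names `build_index`, `search`, and the sample data are hypothetical.

```python
from collections import defaultdict


def build_index(aligned_words):
    """Map each lowercased word to a sorted list of start times in seconds.

    `aligned_words` is assumed to be an iterable of (word, start_seconds)
    pairs produced by some transcript-to-audio alignment step.
    """
    index = defaultdict(list)
    for word, start in aligned_words:
        index[word.lower()].append(start)
    for times in index.values():
        times.sort()
    return dict(index)


def search(index, term):
    """Return every playback position (seconds) where `term` was spoken."""
    return index.get(term.lower(), [])


# Hypothetical alignment output for a short clip.
aligned = [
    ("Indexing", 0.0),
    ("is", 0.6),
    ("important", 0.8),
    ("for", 1.4),
    ("audio", 1.6),
    ("indexing", 12.3),
]
index = build_index(aligned)
positions = search(index, "indexing")
```

A player could seek to any value in `positions` to start playback at the moment the search term occurs.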

Pages: 1 - 17

Page count: 17
