Fast Caption Alignment for Automatic Indexing of Audio

Cited by: 3
Authors
Knight, Allan [1]
Almeroth, Kevin [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Keywords
Audio Processing; Indexing; Multimedia; Natural Language Processing; Speech Recognition
DOI
10.4018/jmdem.2010040101
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
For large archives of audio media, as for text archives, indexing is essential for quick and accurate searches. Like text archives, audio archives can be indexed with text. Generating this text requires transcripts of the spoken portions of the audio. Aligning a transcript with the audio lets users search for specific content and jump immediately to the position where the search terms were spoken. Although previous research has addressed this problem, existing solutions align transcripts only in real time or slower. In this paper, the authors propose AutoCAp, which produces accurate audio indexes faster than real time for archived audio and in real time for live audio. In most cases, aligning archived audio takes less than one quarter of its original duration. This paper discusses the architecture and evaluation of the AutoCAp project, as well as two of its applications.
Pages: 1-17
Page count: 17