Domain Specific Audio Indexing Using Linguistic Information

Cited by: 0
Authors
Pandey, L. [1]
Nathwani, K. [1]
Kaur, S. [1]
Husain, I. [1]
Pathak, R. [1]
Singh, G. [1]
Tiwari, S. [1]
Hegde, Rajesh M. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Kanpur 208016, Uttar Pradesh, India
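Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification number
0812
Abstract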
This paper discusses a novel methodology for indexing domain-specific audio archives using the linguistic information present in the speech signal. The audio indexing system is phone-based and can work under limited training data conditions. A training data set that captures the linguistic information of the Hindi language at the syllable level is first developed. A reduced phone set is then derived from the super syllabic set of the Hindi language. The system is then bootstrapped at the phone level with domain-specific data. The audio indexing itself is performed using a novel sliding phone protocol technique. The performance of the audio indexing system is evaluated on Indian parliament speech and read news. The proposed bootstrapping method with sliding phone search provides reasonable improvements in phone recognition accuracy and in search retrieval efficiency compared with conventional methods.
Pages: 364-369
Number of pages: 6