Domain Specific Audio Indexing Using Linguistic Information

Cited by: 0

Authors:

Pandey, L. [1]

Nathwani, K. [1]

Kaur, S. [1]

Husain, I. [1]

Pathak, R. [1]

Singh, G. [1]

Tiwari, S. [1]

Hegde, Rajesh M. [1]

Affiliation:

[1] Indian Inst Technol, Dept Elect Engn, Kanpur 208016, Uttar Pradesh, India

Source:

2014 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT) | 2014

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper discusses a novel methodology for indexing domain-specific audio archives using the linguistic information present in the speech signal. The audio indexing system is phone-based and can work under limited training data conditions. A training data set that captures the linguistic information within the Hindi language at the syllable level is first developed. A reduced phone set is then derived from the super syllabic set of the Hindi language. The system is then bootstrapped at the phone level with domain-specific data. The audio indexing itself is performed using a novel sliding phone protocol technique. The performance of the audio indexing system is evaluated on Indian parliament speech and read news. Compared to conventional methods, the proposed bootstrapping method with sliding phone search provides reasonable improvements in phone recognition accuracy and in search retrieval efficiency.


Pages: 364-369

Page count: 6
