Spoken document representations for probabilistic retrieval

Cited by: 5
Authors
Jourlin, P
Johnson, SE
Sparck-Jones, K
Woodland, PC
Affiliations
[1] Univ Cambridge, Comp Lab, Cambridge CB2 3QG, England
[2] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
spoken document retrieval; automatic speech recognition; information retrieval
DOI
10.1016/S0167-6393(00)00021-2
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents some developments in query expansion and document representation of our spoken document retrieval system and shows how various retrieval techniques affect performance for different sets of transcriptions derived from a common speech source. Modifications of the document representation are used, which combine several techniques for query expansion, knowledge-based on the one hand and statistics-based on the other. Taken together, these techniques can improve Average Precision by over 19% relative to a system similar to that which we presented at TREC-7. These new experiments have also confirmed that the degradation of Average Precision due to a word error rate (WER) of 25% is quite small (3.7% relative) and can be reduced to almost zero (0.2% relative). The overall improvement of the retrieval system can also be observed for seven different sets of transcriptions from different recognition engines with a WER ranging from 24.8% to 61.5%. We hope to repeat these experiments when larger document collections become available, in order to evaluate the scalability of these techniques. © 2000 Elsevier Science B.V. All rights reserved.
Pages: 21-36
Page count: 16