An LDA-smoothed Relevance Model for Document Expansion: A Case Study for Spoken Document Retrieval

Authors
Ganguly, Debasis [1]
Leveling, Johannes [1]
Jones, Gareth J. F. [1]
Affiliations
[1] Dublin City University, School of Computing, Centre for Next Generation Localisation, Dublin 9, Ireland
Funding
Science Foundation Ireland
Keywords
Document Expansion; Topic Modelling
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Document expansion (DE) in information retrieval (IR) involves modifying each document in the collection by introducing additional terms into the document. It is particularly useful to improve retrieval of short and noisy documents where the additional terms can improve the description of the document content. Existing approaches to DE assume that documents to be expanded are from a single topic. In the case of multi-topic documents this can lead to a topic bias in terms selected for DE and hence may result in poor retrieval quality due to the lack of coverage of the original document topics in the expanded document. This paper proposes a new DE technique providing a more uniform selection and weighting of DE terms from all constituent topics. We show that our proposed method significantly outperforms the most recently reported relevance model based DE method on a spoken document retrieval task for both manual and automatic speech recognition transcripts.
Pages: 1057-1060
Number of pages: 4