An LDA-smoothed Relevance Model for Document Expansion: A Case Study for Spoken Document Retrieval

Cited by: 0
Authors
Ganguly, Debasis [1]
Leveling, Johannes [1]
Jones, Gareth J. F. [1]
Affiliations
[1] Dublin City Univ, Sch Comp, Ctr Next Generat Localisat, Dublin 9, Ireland
Funding
Science Foundation Ireland;
Keywords
Document Expansion; Topic Modelling;
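DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract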
Document expansion (DE) in information retrieval (IR) involves modifying each document in the collection by introducing additional terms into the document. It is particularly useful to improve retrieval of short and noisy documents where the additional terms can improve the description of the document content. Existing approaches to DE assume that documents to be expanded are from a single topic. In the case of multi-topic documents this can lead to a topic bias in terms selected for DE and hence may result in poor retrieval quality due to the lack of coverage of the original document topics in the expanded document. This paper proposes a new DE technique providing a more uniform selection and weighting of DE terms from all constituent topics. We show that our proposed method significantly outperforms the most recently reported relevance model based DE method on a spoken document retrieval task for both manual and automatic speech recognition transcripts.
Pages: 1057-1060
Number of pages: 4