Using Wikipedia and Wiktionary in Domain-Specific Information Retrieval

Cited by: 0
Authors
Mueller, Christof [1]
Gurevych, Iryna [1]
Affiliations
[1] Tech Univ Darmstadt, Dept Comp Sci, Ubiquitous Knowledge Proc Lab, D-64289 Darmstadt, Germany
Keywords
Information Retrieval; Semantic Relatedness; Collaborative Knowledge Bases; Cross-Language Information Retrieval;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The main objective of our experiments in the domain-specific track at CLEF 2008 is utilizing semantic knowledge from collaborative knowledge bases such as Wikipedia and Wiktionary to improve the effectiveness of information retrieval. While Wikipedia has already been used in IR, the application of Wiktionary in this task is new. We evaluate two retrieval models based on semantic relatedness, SR-Text and SR-Word, by comparing their performance to that of a statistical model as implemented by Lucene. We refer to Wikipedia article titles and Wiktionary word entries as concepts and map query and document terms to concept vectors, which are then used to compute the document relevance. In the bilingual task, we translate the English topics into the document language, i.e., German, by using machine translation. For SR-Text, we alternatively perform the translation process by using cross-language links in Wikipedia, whereby the terms are directly mapped to concept vectors in the target language. The evaluation shows that the latter approach especially improves the retrieval performance in cases where the machine translation system incorrectly translates query terms.
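The concept-vector idea in the abstract can be illustrated with a minimal sketch. It is not the authors' SR-Text or SR-Word model, whose weighting and scoring are not given in this record. It assumes a toy concept inventory (`CONCEPTS`, hypothetical stand-ins for Wikipedia articles or Wiktionary entries), raw term frequency as the concept weight, and cosine similarity as the relevance score; all names are illustrative.

```python
import math
from collections import Counter

# Hypothetical toy concept texts standing in for Wikipedia articles
# or Wiktionary entries; a real system would index the full resources.
CONCEPTS = {
    "Unemployment": "unemployment jobless labour market work",
    "Labour market": "labour market work employment wages",
    "Football": "football match goal team",
}


def term_vector(term):
    """Map a term to a concept vector (assumed weight: term frequency)."""
    vec = {}
    for concept, text in CONCEPTS.items():
        tf = text.split().count(term)
        if tf:
            vec[concept] = float(tf)
    return vec


def text_vector(text):
    """Sum the concept vectors of all terms in a query or document."""
    total = Counter()
    for term in text.lower().split():
        total.update(term_vector(term))  # adds weights per concept
    return dict(total)


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def relevance(query, document):
    """Score a document against a query in concept space."""
    return cosine(text_vector(query), text_vector(document))
```

Under these assumptions, the query `"jobless"` scores above zero against the document `"labour market wages"` even though the two share no word, because both map onto the `Unemployment` concept. A purely term-matching model would find no overlap between them.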
Pages: 219-226
Number of pages: 8
Related papers
50 in total
  • [31] A Method for Designing Domain-Specific Document Retrieval Systems using Semantic Indexing
    Huynh, ThanhThuong T.
    TruongAn PhamNguyen
    Do, Nhon V.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (10) : 461 - 481
  • [32] Information retrieval in domain-specific databases: An analysis to improve the user interface of the Alcohol Studies Database
    Jantz, R
    COLLEGE & RESEARCH LIBRARIES, 2003, 64 (03) : 229 - 239
  • [33] A pipeline for the retrieval and extraction of domain-specific information with application to COVID-19 immune signatures
    Adam J. H. Newton
    David Chartash
    Steven H. Kleinstein
    Robert A. McDougal
    BMC Bioinformatics, 24
  • [34] A pipeline for the retrieval and extraction of domain-specific information with application to COVID-19 immune signatures
    Newton, Adam J. H.
    Chartash, David
    Kleinstein, Steven H.
    McDougal, Robert A.
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [35] Prioritization of Domain-Specific Web Information Extraction
    Huang, Jian
    Yu, Cong
    PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1327 - 1333
  • [36] Domain-Specific Languages in a Customs Information System
    Freudenthal, Margus
    IEEE SOFTWARE, 2010, 27 (02) : 65 - 71
  • [37] Research Area Classification using Wikipedia and Information Retrieval
    Al-Ballaa, Hailah
    Al-Dossari, Hmood
    Mirza, Abdulrahman
    PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON INTERNET OF THINGS AND MACHINE LEARNING (IML'17), 2017,
  • [38] Enhancing document modeling for information retrieval using wikipedia
    Luo, Jing
    Meng, Bo
    Tu, Xinhui
    International Journal of Advancements in Computing Technology, 2012, 4 (23) : 266 - 273
  • [39] Domain specific information retrieval system
    Pohorec, Sandi
    Verlic, Mateja
    Zorman, Milan
    PROCEEDINGS OF THE 13TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, 2009, : 465 - +
  • [40] Personalized information retrieval in specific domain
    Liang, Chunyan
    ICIC Express Letters, Part B: Applications, 2011, 2 (06) : 1327 - 1332