Domain-specific readability measures to improve information retrieval in the Persian language

Cited by: 1
Author
Arastoopoor, Sholeh [1]
Affiliation
[1] Ferdowsi Univ Mashhad, Dept Informat Sci & Knowledge Studies, Mashhad, Iran
Source
ELECTRONIC LIBRARY | 2018 / Vol. 36 / Issue 03
Keywords
Information retrieval; Document cohesion; Document scope; Flesch-Dayani formula; Persian; Re-ranking search results; Readability scores
DOI
10.1108/EL-01-2017-0007
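Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract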
Purpose: The degree to which a text is considered readable depends on the capability of the reader. This assumption puts information retrieval systems at risk of retrieving documents that are relevant yet unreadable or hard to read for their users. This paper aims to examine the potential use of concept-based readability measures, along with classic measures, for re-ranking search results in information retrieval systems, specifically in the Persian language.
Design/methodology/approach: Flesch-Dayani as a classic readability measure, along with document scope (DS) and document cohesion (DC) as domain-specific measures, were applied to score documents retrieved from Google (181 documents) and the RICeST database (215 documents) in the field of computer science and information technology (IT). The re-ranked results were compared with potential users' rankings of the documents by readability.
Findings: The results show that subcategories of the computer science and IT field differ in readability and understandability. The study also shows that it is possible to develop a hybrid score based on the DS and DC measures. Among all four scores applied in re-ranking the documents, the list re-ranked by the DSDC score shows correlation with the participants' re-ranking in both groups.
Practical implications: The findings of this study would foster a new option for re-ranking search results based on their difficulty for experts and non-experts in different fields.
Originality/value: The findings and the two-mode re-ranking model proposed in this paper, along with its primary focus on domain-specific readability in the Persian language, would help Web search engines and online databases further refine search results in pursuit of retrieving useful texts for users with differing expertise.
Pages: 430 - 444
Page count: 15
Related papers
50 in total
  • [1] Domain-specific information retrieval based on improved language model
    Kang, Kai
    Lin, Kunhui
    Zhou, Changle
    Guo, Feng
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2007, : 374 - +
  • [2] A Sequential Latent Topic-Based Readability Model for Domain-Specific Information Retrieval
    Zhang, Wenya
    Song, Dawei
    Zhang, Peng
    Zhao, Xiaozhao
    Hou, Yuexian
    INFORMATION RETRIEVAL TECHNOLOGY, AIRS 2015, 2015, 9460 : 241 - 252
  • [3] Conceptual language models for domain-specific retrieval
    Meij, Edgar
    Trieschnigg, Dolf
    de Rijke, Maarten
    Kraaij, Wessel
    INFORMATION PROCESSING & MANAGEMENT, 2010, 46 (04) : 448 - 469
  • [4] Domain-Specific Information Retrieval Using Recommenders
    Li, Wei
    PROCEEDINGS OF THE 34TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR'11), 2011, : 1327 - 1327
  • [5] Learning Domain-Specific, L1-Specific Measures of Word Readability
    Bergsma, Shane
    Yarowsky, David
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2013, 54 (01) : 203 - 226
  • [6] Patent Information Retrieval An Instance of Domain-specific Search
    Lupu, Mihai
    SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012, : 1189 - 1190
  • [7] Using Wikipedia and Wiktionary in Domain-Specific Information Retrieval
    Mueller, Christof
    Gurevych, Iryna
    EVALUATING SYSTEMS FOR MULTILINGUAL AND MULTIMODAL INFORMATION ACCESS, 2009, 5706 : 219 - 226
  • [8] Medical Information Retrieval An Instance of Domain-Specific Search
    Hanbury, Allan
    SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012, : 1191 - 1192
  • [9] Information retrieval in domain-specific databases: An analysis to improve the user interface of the Alcohol Studies Database
    Jantz, R
    COLLEGE & RESEARCH LIBRARIES, 2003, 64 (03) : 229 - 239
  • [10] Domain-specific cross-language relevant question retrieval
    Bowen Xu
    Zhenchang Xing
    Xin Xia
    David Lo
    Shanping Li
    Empirical Software Engineering, 2018, 23 : 1084 - 1122