n-Gram-based indexing for Korean text retrieval

Cited by: 8
Authors
Lee, JH
Cho, HY
Park, HR
Affiliations
[1] Soongsil Univ, Sch Comp, Dongjak Gu, Seoul 156743, South Korea
[2] Korea Adv Inst Sci & Technol, Korea Res & Dev Informat Ctr, Taejon 305600, South Korea
Keywords
information retrieval; indexing method; Korean text; n-Gram;
DOI
10.1016/S0306-4573(98)00050-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Two groups of indexing methods, word-based indexing and morpheme-based indexing, have been investigated in the literature on Korean text retrieval. Word-based indexing eliminates the suffix of a word and generates the remaining stem as an index term. The index term is often a compound noun, which seriously degrades retrieval effectiveness. Morpheme-based indexing overcomes the problem of compound nouns by decomposing a compound noun into simple nouns. It requires, however, a large dictionary and complex linguistic knowledge. In this paper we propose a new indexing method based on n-grams. The n-gram-based indexing is considerably faster than morpheme-based indexing, and also provides better retrieval effectiveness. (C) 1999 Elsevier Science Ltd. All rights reserved.
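As a rough illustration of the general idea of n-gram indexing (not the authors' actual method, which this record does not describe), the sketch below builds an inverted index from overlapping character bigrams. The function names `char_ngrams` and `build_index`, the choice of n = 2, the whitespace tokenization, and the sample documents are all assumptions made for this example; no suffix removal is attempted.

```python
from collections import defaultdict


def char_ngrams(word, n=2):
    """Return the overlapping character n-grams of a word.

    Words no longer than n characters are kept whole (an assumption
    of this sketch, so that short words still produce an index term).
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def build_index(docs, n=2):
    """Map each character n-gram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():  # naive whitespace tokenization
            for gram in char_ngrams(word, n):
                index[gram].add(doc_id)
    return index


# Hypothetical sample documents: the same two nouns written as one
# compound (doc 1) and as separate words (doc 2), plus an unrelated one.
docs = {1: "정보검색", 2: "정보 검색", 3: "문서색인"}
index = build_index(docs)

print(char_ngrams("정보검색"))
print(sorted(index["검색"]))
```

Because the compound in document 1 is split into bigrams without any dictionary, a query term that appears as a standalone word in document 2 can also reach document 1. The cost is that bigrams spanning the boundary between the component nouns are indexed as well.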
Pages: 427 - 441
Page count: 15
Related papers
50 in total
  • [41] A new indexing method based on word proximity for Chinese text retrieval
    Lin Du
    Yufang Sun
    Journal of Computer Science and Technology, 2000, 15 : 280 - 286
  • [42] A new indexing method based on word proximity for Chinese text retrieval
    Du, L
    Sun, YF
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2000, 15 (03) : 280 - 286
  • [43] A New Indexing Method Based on Word Proximity for Chinese Text Retrieval
    杜林
    孙玉芳
    Journal of Computer Science and Technology, 2000, (03) : 280 - 286
  • [44] Word-based compression methods and indexing for text retrieval systems
    Dvorsky, J
    Pokorny, J
    Snásel, V
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, 1999, 1691 : 75 - 84
  • [45] Metric indexing for the vector model in Text Retrieval
    Skopal, T
    Moravec, P
    Pokorny, J
    Snásel, V
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2004, 3246 : 183 - 195
  • [46] Arabic Document Indexing for Improved Text Retrieval
    Al-Lahham, Yaser A. M.
    2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 226 - 230
  • [47] INDEXING AND RETRIEVAL OF NON-TEXT INFORMATION
    Vermeij, Hermine
    CATALOGING & CLASSIFICATION QUARTERLY, 2013, 51 (08) : 945 - 946
  • [48] Information retrieval and text categorization with semantic indexing
    Rosso, P
    Molina, A
    Pla, F
    Jiménez, D
    Vidal, V
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2004, 2945 : 596 - 600
  • [49] Performance analysis of semantic indexing in text retrieval
    Kang, BY
    Kim, HJ
    Lee, SJ
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2004, 2945 : 433 - 436
  • [50] Efficient indexing for Query By String text retrieval
    Ghosh, Suman K.
    Gomez, Lluis
    Karatzas, Dimosthenis
    Valveny, Ernest
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 1236 - 1240