Information retrieval techniques applied to the development of a thesaurus

Times cited: 2
Authors
Gil Urdiciain, Blanca [1 ]
Jimenez, Rodrigo Sanchez [1 ]
Affiliations
[1] Univ Complutense Madrid, Fac Ciencias Documentac, Madrid 28010, Spain
Source
TRANSINFORMACAO | 2014 / Vol. 26 / No. 01
Keywords
Thesaurus development; Clustering; Vector space model; Generalized vector space model; Latent semantic indexing model;
DOI
10.1590/S0103-37862014000100003
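Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work];
Discipline classification code
1205; 120501;
Abstract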
This article proposes applying a set of Information Retrieval techniques to the development of a thesaurus. The techniques were applied to three tasks: selecting the terminology, categorizing terms by creating clusters, and establishing semantic relationships between terms through semantic similarity. The result was a Foreign Trade Thesaurus of 7,790 terms. From these results, we concluded that the techniques significantly simplified the task of obtaining the terminology and can improve the quality of the final thesaurus. In addition, the techniques enabled analysis of the conditions of the collection for which the thesaurus is used and provided extra information that would be hard to obtain manually.
Pages: 19-26
Number of pages: 8
Related papers
50 in total
  • [21] Semantic thesaurus for automatic expanded query in information retrieval
    Gonzalez, M
    de Lima, VLS
    EIGHTH SYMPOSIUM ON STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2001, : 68 - 75
  • [22] Information Retrieval Techniques for Corpus Filtering Applied to External Plagiarism Detection
    Micol, Daniel
    Ferrandez, Oscar
    Munoz, Rafael
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, 2011, 6716 : 100 - 111
  • [23] Does the Traditional Thesaurus Have a Place in Modern Information Retrieval?
    Hjorland, Birger
    KNOWLEDGE ORGANIZATION, 2016, 43 (03): : 145 - 159
  • [24] FILE STOCKING FOR AN INTEGRATED NETWORK OF INFORMATION RETRIEVAL SYSTEMS AND THESAURUS
    KULIK, AN
    NAUCHNO-TEKHNICHESKAYA INFORMATSIYA SERIYA 2-INFORMATSIONNYE PROTSESSY I SISTEMY, 1972, (12): : 23 - 29
  • [25] Thesaurus and beyond: An advanced formula for linguistic engineering and information retrieval
    Schmitz-Esser, W
    KNOWLEDGE ORGANIZATION, 1999, 26 (01): : 10 - 22
  • [26] The ISO 25964 Data Model for the Structure of an Information Retrieval Thesaurus
    Will, Leonard
    CATEGORIES, CONTEXTS AND RELATIONS IN KNOWLEDGE ORGANIZATION, 2012, 13 : 284 - 290
  • [27] A thesaurus for improving information retrieval in an integrated legal expert system
    Cammelli, A
    Socci, F
    NINTH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 1998, : 619 - 624
  • [28] HISTORICALLY DOCUMENTED THESAURUS FOR IMPROVED RETROSPECTIVE INFORMATION-RETRIEVAL
    MARTINEZ, SJ
    BAILEY, JA
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1978, 175 (MAR): : 28 - 28
  • [29] A cooccurrence-based thesaurus and two applications to information retrieval
    Schutze, H
    Pedersen, JO
    INFORMATION PROCESSING & MANAGEMENT, 1997, 33 (03) : 307 - 318
  • [30] THESAURUS DEVELOPMENT FOR A DECENTRALIZED INFORMATION NETWORK
    ELLER, JL
    PANEK, RL
    AMERICAN DOCUMENTATION, 1968, 19 (03): : 213 - &