Combining Text Vector Representations for Information Retrieval

Cited by: 0
Authors
Carrillo, Maya [1 ,2 ]
Eliasmith, Chris [3 ]
Lopez-Lopez, A. [1 ]
Affiliations
[1] INAOE, Coordinac Ciencias Computac, Luis Enrique Erro 1, Puebla 72840, Mexico
[2] BUAP, Fac Ciencias Comput, Puebla 72570, Mexico
[3] Univ Waterloo, Ctr Theoret Neurosci, Dept Syst Design Engn, Dept Phil, Waterloo, ON N2L 3G1, Canada
Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper suggests a novel representation for documents that is intended to improve precision. This representation is generated by combining two central techniques: Random Indexing and Holographic Reduced Representations (HRRs). Random Indexing uses co-occurrence information among words to generate semantic context vectors that are the sum of randomly generated term identity vectors. HRRs are used to encode textual structure, which can directly capture relations between words (e.g., compound terms, subject-verb, and verb-object). Random vectors capture semantic information, HRRs capture structural relations extracted from the text, and document vectors are generated by summing all such representations in a document. In this paper, we show that these representations can be successfully used in information retrieval, can effectively incorporate relations, and can reduce the dimensionality of the traditional vector space model (VSM). The results of our experiments show that, when a representation that uses random index vectors is combined with different contexts, such as document occurrence representation (DOR), term co-occurrence representation (TCOR) and HRRs, it outperforms the VSM representation in information retrieval tasks.
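The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dimensionality, the sparse ternary index vectors, the use of whole-document co-occurrence as context, and the choice to bind two terms' index vectors directly with circular convolution are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

DIM = 256  # assumed dimensionality, far below a typical VSM vocabulary size


def index_vector(term, dim=DIM, nonzero=8):
    """Sparse random index vector with a few +1/-1 entries (seeded by the term)."""
    rng = random.Random(term)
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def add(a, b):
    """Element-wise vector sum."""
    return [x + y for x, y in zip(a, b)]


def circular_convolution(a, b):
    """HRR binding: c[k] = sum over j of a[j] * b[(k - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def context_vectors(documents, dim=DIM):
    """Context vector of a term = sum of index vectors of terms it co-occurs with.

    Assumption: two terms co-occur when they appear in the same document.
    """
    ctx = {}
    for terms in documents:
        for term in terms:
            for other in terms:
                if other != term:
                    current = ctx.get(term, [0.0] * dim)
                    ctx[term] = add(current, index_vector(other, dim))
    return ctx


def document_vector(terms, relations, ctx, dim=DIM):
    """Sum the context vectors of the terms plus one bound vector per relation."""
    doc = [0.0] * dim
    for term in terms:
        doc = add(doc, ctx.get(term, index_vector(term, dim)))
    for left, right in relations:  # e.g. a compound term or a verb-object pair
        bound = circular_convolution(index_vector(left, dim), index_vector(right, dim))
        doc = add(doc, bound)
    return doc


def cosine(a, b):
    """Cosine similarity, returning 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy usage: two tokenized documents and one relation extracted from the first.
docs = [["information", "retrieval", "vector"], ["vector", "space", "model"]]
ctx = context_vectors(docs)
doc_vec = document_vector(docs[0], [("information", "retrieval")], ctx)
query_vec = document_vector(["information", "retrieval"], [("information", "retrieval")], ctx)
score = cosine(doc_vec, query_vec)
```

Ranking documents by `score` against a query vector built the same way mirrors the usual vector space retrieval step, with the vector length fixed at `DIM` rather than growing with the vocabulary.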
Pages: 24 / +
Page count: 2
Related papers
50 in total
  • [1] COMBINING THE EVIDENCE OF MULTIPLE QUERY REPRESENTATIONS FOR INFORMATION-RETRIEVAL
    BELKIN, NJ
    KANTOR, P
    FOX, EA
    SHAW, JA
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1995, 31 (03) : 431 - 448
  • [2] TEXT PROCESSING IN INFORMATION RETRIEVAL SYSTEM USING VECTOR SPACE MODEL
    Premalatha, R.
    Srinivasan, S.
    [J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES), 2014,
  • [3] Combining text clustering and text retrieval for corpus adaptation
    He F.
    Ding X.
    [J]. Gaojishu Tongxin/Chinese High Technology Letters, 2010, 20 (12) : 1224 - 1228
  • [4] Combining image and structured text retrieval
    Iskandar, D. N. F. Awang
    Pehcevski, Jovan
    Thom, James A.
    Tahaghoghi, S. M. M.
    [J]. ADVANCES IN XML INFORMATION RETRIEVAL AND EVALUATION, 2006, 3977 : 525 - 539
  • [5] Lower dimensional representation of text data in vector space based information retrieval
    Park, H
    Jeon, M
    Rosen, JB
    [J]. COMPUTATIONAL INFORMATION RETRIEVAL, 2001, : 3 - 23
  • [6] A new similarity measure for vector space models in text classification and information retrieval
    Eminagaoglu, Mete
    [J]. JOURNAL OF INFORMATION SCIENCE, 2022, 48 (04) : 463 - 476
  • [7] Text Information Retrieval in Tetun
    de Jesus, Gabriel
    [J]. ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III, 2023, 13982 : 429 - 435
  • [8] Text databases and information retrieval
    [J]. ACM Comput Surv, 1 (133):
  • [9] Text Analysis and Information Retrieval of Text Data
    Gupta, Honey
    Kottwani, Aveena
    Gogia, Soniya
    Chaudhari, Sheetal
    [J]. PROCEEDINGS OF THE 2016 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2016, : 788 - 792
  • [10] Text mining and information retrieval
    Forest, Dominic
    Da Sylva, Lyne
    [J]. CANADIAN JOURNAL OF INFORMATION AND LIBRARY SCIENCE-REVUE CANADIENNE DES SCIENCES DE L INFORMATION ET DE BIBLIOTHECONOMIE, 2011, 35 (03) : 217 - 227