Implementing Random Indexing on GPU

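Authors
Polok, Lukas [1]
Smrz, Pavel [1]
Affiliations
[1] Brno Univ Technol, Fac Informat Technol, Bozetechova 2, Brno 61266, Czech Republic
Funding
European Union Seventh Framework Programme (FP7)
Keywords
GPGPU; term co-occurrence; word space models
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract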
Vector space models have received significant attention in recent years. They have been applied in a wide spectrum of areas, including information filtering, information retrieval, document indexing and relevancy ranking. Random indexing is one of the methods that use distributional statistics of term co-occurrences to generate vector space models from a set of documents. If the document collection is large, significant computational power is required to compute the results. This paper presents an implementation of the random indexing method on GPU that allows efficient training on large datasets. It is limited only by the amount of memory available on the GPU, and various ways to overcome this dependence on GPU memory are discussed. Speedups on the order of tens are achieved for training from random seed vectors, and much higher figures for retraining. The implementation scales well with both the term vector dimension and the seed length.
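To make the terms "term vector dimension" and "seed length" concrete, the following is a minimal CPU sketch of random indexing in plain Python. It is not the authors' GPU implementation. The function names, the sliding-window co-occurrence definition, and the balanced +1/-1 seed scheme are assumptions chosen for illustration; the abstract does not specify them.

```python
import random
from collections import defaultdict


def make_seed_vector(dim, seed_length, rng):
    """Build a sparse ternary seed vector (hypothetical helper).

    Picks `seed_length` distinct positions in a `dim`-dimensional vector and
    alternates +1 / -1 over them. Returned as (position, sign) pairs so the
    zeros are never stored.
    """
    positions = rng.sample(range(dim), seed_length)
    return [(pos, 1 if i % 2 == 0 else -1) for i, pos in enumerate(positions)]


def train_random_indexing(documents, dim=64, seed_length=4, window=2, rng_seed=0):
    """Accumulate term vectors from co-occurrences (illustrative sketch).

    `documents` is a list of token lists. Each term's vector is the sum of
    the seed vectors of the terms found within `window` positions of it.
    """
    rng = random.Random(rng_seed)
    seeds = {}
    term_vectors = defaultdict(lambda: [0] * dim)

    for tokens in documents:
        # Assign a random seed vector to every term on first sight.
        for tok in tokens:
            if tok not in seeds:
                seeds[tok] = make_seed_vector(dim, seed_length, rng)

        # Add each neighbour's seed vector to the focus term's vector.
        for i, focus in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            vec = term_vectors[focus]
            for j in range(lo, hi):
                if j == i:
                    continue
                for pos, sign in seeds[tokens[j]]:
                    vec[pos] += sign

    return dict(term_vectors), seeds


if __name__ == "__main__":
    docs = [["gpu", "memory", "limit"], ["gpu", "memory", "size"]]
    vectors, seeds = train_random_indexing(docs, dim=32, seed_length=4, window=1)
    print(vectors["gpu"])
```

In this sketch each co-occurrence costs `seed_length` additions, independent of `dim`, while storage grows with the number of terms times `dim`. Those are the two parameters the abstract names when it describes how the implementation scales.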
Pages: 134-142
Number of pages: 9