Semantic Vector Space Model for Reducing Arabic Text Dimensionality

Authors
Awajan, Arafat [1]
Affiliation
[1] Princess Sumaya Univ Technol, Dept Comp Sci, Amman, Jordan
Keywords
Semantic vector space model; word-context matrix; Arabic language processing; text dimension reduction
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we introduce an efficient method for representing Arabic texts more compactly without losing significant information. The proposed method exploits linguistic features of the Arabic language, mainly its highly productive morphology and its richness in synonyms, to reduce the dimension of the document vector and to improve its vector space model representation. We incorporate semantic information from word thesauri such as WordNet to create clusters of similar words: words derived from the same root are grouped together with their synonyms. Distributional similarity measures are applied to the word-context matrix associated with the document in order to identify similar words based on the text's context. The experimental results confirm that the proposed method significantly reduces the size of the text representation, by about 20% compared with the stem-based vector space model and by about 40% compared with the traditional bag-of-words model.
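The distributional step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the window size, the cosine measure, the greedy single-pass clustering, the threshold, and all function names are assumptions chosen for illustration, and the root-based and WordNet-based grouping is left out entirely. The sketch builds a word-context matrix from a token list, merges words whose context vectors are nearly identical, and counts terms per cluster instead of per word.

```python
from collections import Counter, defaultdict
from math import sqrt


def word_context_matrix(tokens, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    matrix = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                matrix[word][tokens[j]] += 1
    return matrix


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def merge_similar(tokens, threshold=0.9, window=2):
    """Greedy clustering (an assumption, not the paper's algorithm): each word
    joins the first cluster whose representative has a similar context vector,
    otherwise it starts a new cluster. Returns a word -> representative map."""
    matrix = word_context_matrix(tokens, window)
    representatives = []
    cluster_of = {}
    for word in sorted(matrix):
        for rep in representatives:
            if cosine(matrix[word], matrix[rep]) >= threshold:
                cluster_of[word] = rep
                break
        else:
            representatives.append(word)
            cluster_of[word] = word
    return cluster_of


def reduced_vector(tokens, cluster_of):
    """Document vector with one dimension per cluster instead of one per word."""
    return Counter(cluster_of[t] for t in tokens)


if __name__ == "__main__":
    # Toy English tokens stand in for a preprocessed Arabic document.
    tokens = "the car drives fast the automobile drives fast".split()
    clusters = merge_similar(tokens, threshold=0.99, window=1)
    print(len(set(tokens)), "distinct words ->", len(reduced_vector(tokens, clusters)), "dimensions")
```

In this toy input, "car" and "automobile" occur in identical contexts, so they share one dimension in the reduced vector. A real pipeline would first normalize and stem or root-extract the Arabic tokens, which this sketch does not attempt.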
Pages: 129-135
Number of pages: 7