Doc2Vec, SBERT, InferSent, and USE Which embedding technique for noun phrases?

Times cited: 3
Authors
Ajallouda, Lahbib [1]
Najmani, Kawtar [2]
Zellou, Ahmed [1]
Benlahmar, El Habib [2]
Affiliations
[1] Mohammed V Univ, SPM ENSIAS, Rabat, Morocco
[2] Fac Sci Ben Msik, Dept Math & Comp Sci, Casablanca, Morocco
Keywords
Phrase embedding techniques; Natural language processing; Noun phrases; Empirical study;
DOI
10.1109/IRASET52964.2022.9738300
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Phrase embedding is a technique for representing phrases in a vector space. Considerable effort has gone into developing this technique to improve performance in various natural language processing (NLP) applications. Many studies have evaluated phrase embeddings, but most of them focus on intrinsic or extrinsic evaluation without regard to the type of phrase (noun phrases, verb phrases, etc.). In the literature, there is no study that evaluates the embedding of noun phrases specifically, even though this type of phrase is used by many NLP applications, such as automatic key-phrase extraction (AKE), information retrieval, and question answering. In this article, we present an empirical study that compares the most common phrase embedding techniques in order to determine which is most suitable for representing noun phrases. The dataset used in the comparison consists of the noun phrases from the Inspec and SemEval2010 datasets, to which we have added their manually defined synonyms.
Pages: 548-552
Number of pages: 5
Related papers
50 records in total
  • [21] Using Collaborative Filtering Algorithms Combined with Doc2Vec for Movie Recommendation
    Liu, Gaojun
    Wu, Xingyu
    [J]. PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019, : 1461 - 1464
  • [22] Key word extraction for short text via word2vec, doc2vec, and textrank
    Li, Jun
    Huang, Guimin
    Fan, Chunli
    Sun, Zhenglin
    Zhu, Hongtao
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2019, 27 (03) : 1794 - 1805
  • [23] Sentiment analysis via Doc2Vec and Convolutional Neural Network hybrids
    Dhariyal, Bhaskar
    Ravi, Vadlamani
    Ravi, Kumar
    [J]. 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 666 - 671
  • [24] Sentiment Analysis on Twitter data with Semi-Supervised Doc2Vec
    Bilgin, Metin
    Senturk, Izzet Fatih
    [J]. 2017 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2017, : 661 - 666
  • [25] A doc2vec and local outlier factor approach to measuring the novelty of patents
    Jeon, Daeseong
    Ahn, Joon Mo
    Kim, Juram
    Lee, Changyong
    [J]. TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE, 2022, 174
  • [26] Recommendation method for academic journal submission based on doc2vec and XGBoost
    Huang ZhengWei
    Min JinTao
    Yang YanNi
    Huang Jin
    Tian Ye
    [J]. Scientometrics, 2022, 127 : 2381 - 2394
  • [27] Web services classification via combining Doc2Vec and LINE model
    Ye, Hongfan
    Cao, Buqing
    Geng, Jinkun
    Wen, Yiping
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2020, 23 (03) : 250 - 261
  • [28] Filtering Malicious JavaScript Code with Doc2Vec on an Imbalanced Dataset
    Mimura, Mamoru
    Suga, Yuya
    [J]. 2019 14TH ASIA JOINT CONFERENCE ON INFORMATION SECURITY (ASIAJCIS 2019), 2019, : 24 - 31
  • [29] Recommendation method for academic journal submission based on doc2vec and XGBoost
    Huang Zhengwei
    Min Jintao
    Yang Yanni
    Huang Jin
    Tian Ye
    [J]. SCIENTOMETRICS, 2022, 127 (05) : 2381 - 2394
  • [30] Identification of Cybersecurity Specific Content Using the Doc2Vec Language Model
    Mendsaikhan, Otgonpurev
    Hasegawa, Hirokazu
    Yamaguchi, Yukiko
    Shimada, Hajime
    [J]. 2019 IEEE 43RD ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2019, : 396 - 401