Comparing Word Embeddings through Visualisation

Cited by: 2
Authors
Santos, Pedro [1]
Datia, Nuno [1, 2]
Pato, Matilde [1, 3]
Sobral, Jose [1]
Affiliations
[1] ISEL Lisbon Sch Engn, Politecn Lisboa, Lisbon, Portugal
[2] NOVA Sch Sci & Technol, NOVA LINCS, Monte De Caparica, Portugal
[3] Univ Lisbon, FCUL, LASIGE, Lisbon, Portugal
Keywords
NLP; Word Embeddings; Visualisation; Asset Management; NATURAL-LANGUAGE
DOI
10.1109/IV56949.2022.00024
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Asset management is a branch of facilities management that is responsible for the operation and maintenance of assets. The most common means of managing assets and their life-cycle is through requests and work orders. A request reports an occurrence detected by a sensory device, a technician, or non-technical personnel; it points out that something is wrong with a given asset and needs appropriate attention. Depending on the problem, a request can give rise to a work order if the solution is not trivial. Work orders are technical reports that specify the asset needing intervention and detail the work to be done or, when the work is unknown from the start, the characteristics of the malfunction. Work orders contain free text that is not restricted to a fixed vocabulary, which makes them difficult to analyse automatically. In this paper, we discuss the application of modern Natural Language Processing techniques to process the work order's description, and we compare two Word Embedding models, Word2Vec and Fasttext, through semantic similarity tests between the encoded words and a visualisation of the vector space obtained by dimensionality reduction of the encoded vectors. The results show that the Fasttext approach performs better in terms of the semantics of the results.
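The semantic similarity test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors used, so cosine similarity is an assumption here. The two toy "models", their vocabulary, and their three-dimensional vectors are hypothetical values invented for illustration; they are not data or results from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_n=3):
    """Rank every other word in `embeddings` by cosine similarity to `word`."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy embeddings standing in for two trained models.
toy_model_a = {
    "pump": [0.9, 0.1, 0.0],
    "valve": [0.8, 0.2, 0.1],
    "leak": [0.1, 0.9, 0.2],
    "invoice": [0.0, 0.1, 0.9],
}
toy_model_b = {
    "pump": [0.5, 0.5, 0.0],
    "valve": [0.9, 0.0, 0.3],
    "leak": [0.4, 0.6, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

# Compare the nearest neighbours each toy model assigns to the same query word.
for name, model in (("model_a", toy_model_a), ("model_b", toy_model_b)):
    print(name, most_similar("pump", model, top_n=2))
```

Running the same query against both embeddings and reading the neighbour lists side by side is one plausible way to judge which model groups words more sensibly for a given domain.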
Pages: 91-97
Page count: 7
Related papers
50 in total
  • [1] Comparing Different Word Embeddings for Multiword Expression Identification
    Ashok, Aishwarya
    Elmasri, Ramez
    Natarajan, Ganapathy
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2019), 2019, 11608 : 295 - 302
  • [2] Comparing Pretrained Multilingual Word Embeddings on an Ontology Alignment Task
    Gromann, Dagmar
    Declerck, Thierry
    [J]. PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 230 - 236
  • [3] Measuring associational thinking through word embeddings
    Carlos Periñán-Pascual
    [J]. Artificial Intelligence Review, 2022, 55 : 2065 - 2102
  • [4] Measuring associational thinking through word embeddings
    Perinan-Pascual, Carlos
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (03) : 2065 - 2102
  • [5] Comparing Dependency-based Compositional Models with Contextualized Word Embeddings
    Gamallo, Pablo
    de Prada Corral, Manuel
    Garcia, Marcos
    [J]. ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2, 2021, : 1258 - 1265
  • [6] Comparing general and specialized word embeddings for biomedical named entity recognition
    Ramos-Vargas, Rigo E.
    Roman-Godinez, Israel
    Torres-Ramos, Sulema
    [J]. PEERJ COMPUTER SCIENCE, 2021, 7 : 1 - 22
  • [7] Prepositional Polysemy through the lens of contextualized word embeddings
    Fonteyn, Lauren
    [J]. COGNITEXTES, 2021, 21
  • [8] Explaining Financial Uncertainty through Specialized Word Embeddings
    Theil, Christoph Kilian
    Štajner, Sanja
    Stuckenschmidt, Heiner
    [J]. ACM/IMS Transactions on Data Science, 2020, 1 (01):
  • [9] Improved analysis of deep bioacoustic embeddings through dimensionality reduction and interactive visualisation
    Sanchez, Francisco J. Bravo
    English, Nathan B.
    Hossain, Md Rahat
    Moore, Steven T.
    [J]. ECOLOGICAL INFORMATICS, 2024, 81
  • [10] Comparing General and Locally-Learned Word Embeddings for Clinical Text Mining
    Thadajarassiri, Jidapa
    Sen, Cansu
    Hartvigsen, Thomas
    Kong, Xiangnan
    Rundensteiner, Elke
    [J]. 2019 IEEE EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL & HEALTH INFORMATICS (BHI), 2019,