Dissecting word embeddings and language models in natural language processing

Cited by: 8
Authors
Verma, Vivek Kumar [1]
Pandey, Mrigank [1]
Jain, Tarun [2]
Tiwari, Pradeep Kumar [3]
Affiliations
[1] Manipal Univ Jaipur, Dept Informat Technol, Jaipur 303007, Rajasthan, India
[2] Manipal Univ Jaipur, Dept Comp Sci & Engn, Jaipur 303007, Rajasthan, India
[3] Manipal Univ Jaipur, Dept Comp Applicat, Jaipur 303007, Rajasthan, India
Keywords
Natural language processing; Language models; Word embedding
DOI
10.1080/09720529.2021.1968108
Chinese Library Classification (CLC): O29 [Applied Mathematics]
Discipline code: 070104
Abstract
Natural language processing (NLP) is an area of artificial intelligence concerned with the understanding, interpretation and generation of human language by computers, enabling tasks such as sentiment analysis, text summarization, conversational agents, machine translation and speech recognition. From conversational agents (chatbots) deployed on websites to interact with consumers and understand their needs, to summarized content delivered through smartphone apps, NLP has achieved major milestones in transforming a digital world that is increasingly geared towards artificial intelligence. One area that has seen remarkable growth in recent times is language modelling, a statistical technique for computing the probability of tokens or words in a given sentence. In this paper, we attempt to present an overview of various representations used in language modelling, from neural word embeddings such as Word2Vec and GloVe to deep contextualized pre-trained embeddings such as ULMFiT, ELMo, OpenAI GPT and BERT.
Pages: 1509-1515
Page count: 7
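The abstract describes language modelling as a statistical technique for computing the probability of tokens or words in a sentence. As a rough illustration of that idea only (a minimal sketch, not code from the paper; the toy corpus and example sentences are invented), the following Python snippet builds a bigram model with add-one smoothing and scores a sentence by the chain-rule product of conditional token probabilities:

from collections import Counter

# Toy corpus for illustration; a real language model is estimated from a large text collection.
corpus = [
    "natural language processing studies human language",
    "language models compute the probability of words",
]

# Count unigrams and bigrams, adding sentence-boundary markers.
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab_size = len(unigrams)

def sentence_probability(sentence):
    """P(w1..wn) ~= product of P(w_i | w_{i-1}), with add-one (Laplace) smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, curr)] + 1) / (unigrams[prev] + vocab_size)
    return prob

print(sentence_probability("language models compute the probability of words"))
print(sentence_probability("words of probability the compute models language"))  # same words, scrambled order: lower probability

The neural models surveyed in the paper build on this basic idea by learning dense vector representations instead of relying on raw counts, with the contextual models among them (ELMo, OpenAI GPT, BERT) still trained around predicting tokens in context.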