Dissecting word embeddings and language models in natural language processing

Cited by: 8
Authors
Verma, Vivek Kumar [1]
Pandey, Mrigank [1]
Jain, Tarun [2]
Tiwari, Pradeep Kumar [3]
Affiliations
[1] Manipal Univ Jaipur, Dept Informat Technol, Jaipur 303007, Rajasthan, India
[2] Manipal Univ Jaipur, Dept Comp Sci & Engn, Jaipur 303007, Rajasthan, India
[3] Manipal Univ Jaipur, Dept Comp Applicat, Jaipur 303007, Rajasthan, India
Keywords
Natural language processing; Language models; Word embedding
DOI
10.1080/09720529.2021.1968108
Chinese Library Classification (CLC): O29 [Applied Mathematics]
Discipline code: 070104
Abstract
Natural language processing (NLP) is an area of artificial intelligence concerned with the understanding, interpretation and generation of human language by computers, enabling tasks such as sentiment analysis, text summarization, conversational agents, machine translation and speech recognition. From conversational agents (chatbots) deployed on websites to interact with consumers and understand their needs, to summarized content delivered through smartphone apps, NLP has achieved major milestones in transforming a digital world that is increasingly geared towards artificial intelligence. One area that has seen remarkable growth in recent times is language modelling, a statistical technique for computing the probability of tokens or words in a given sentence. In this paper, we attempt to present an overview of various representations used in language modelling, from neural word embeddings such as Word2Vec and GloVe to deep contextualized pre-trained embeddings such as ULMFiT, ELMo, OpenAI GPT and BERT.
Pages: 1509-1515
Page count: 7
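The abstract describes language modelling as a statistical technique for computing the probability of tokens or words in a sentence. As a rough illustration of that idea only (a minimal sketch, not code from the paper; the toy corpus and example sentences are invented), the following Python snippet builds a bigram model with add-one smoothing and scores a sentence by the chain-rule product of conditional token probabilities:

from collections import Counter

# Toy corpus for illustration; a real language model is estimated from a large text collection.
corpus = [
    "natural language processing studies human language",
    "language models compute the probability of words",
]

# Count unigrams and bigrams, adding sentence-boundary markers.
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab_size = len(unigrams)

def sentence_probability(sentence):
    """P(w1..wn) ~= product of P(w_i | w_{i-1}), with add-one (Laplace) smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, curr)] + 1) / (unigrams[prev] + vocab_size)
    return prob

print(sentence_probability("language models compute the probability of words"))
print(sentence_probability("words of probability the compute models language"))  # same words, scrambled order: lower probability

The neural models surveyed in the paper build on this basic idea by learning dense vector representations instead of relying on raw counts, with the contextual models among them (ELMo, OpenAI GPT, BERT) still trained around predicting tokens in context.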