Data science in light of natural language processing: An overview

Cited by: 13
Authors
Zeroual, Imad [1 ]
Lakhouaja, Abdelhak [1 ]
Affiliations
[1] Mohamed First Univ, Fac Sci, Av Med 6 BP 717, Oujda 60000, Morocco
Keywords
Data science; Natural language processing; Data-driven approaches; Corpora; Machine learning
DOI
10.1016/j.procs.2018.01.101
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The focus of data scientists is essentially divided into three areas: collecting data, analyzing data, and inferring information from data. Each of these tasks requires specialized personnel, takes time, and costs money. Yet the next, and most demanding, step is turning data into products. This field has therefore attracted the attention of many research groups in academia as well as industry. In recent decades, data-driven approaches have emerged and gained popularity because they require much less human effort. Natural Language Processing (NLP) is among the fields most strongly influenced by data. The growth of data is behind the performance improvement of most NLP applications, such as machine translation and automatic speech recognition. Consequently, many NLP applications are moving from rule-based systems and knowledge-based methods to data-driven approaches. However, data collected according to undefined design criteria or in technically unsuitable forms will be useless. They will also be neglected if their size is insufficient to perform the required analysis and to infer accurate information. The chief purpose of this overview is to shed some light on the vital role of data in various fields and to give a better understanding of data in light of NLP. Specifically, it describes what happens to data during its life cycle: the building, processing, analyzing, and exploring phases. © 2018 The Authors. Published by Elsevier B.V.
Pages: 82-91
Number of pages: 10