A Natural Language Normalization Approach to Enhance Social Media Text Reasoning

Cited by: 0

Authors:

Long Hoang Nguyen [1]

Salopek, Andrew [1]

Zhao, Liang [2]

Jin, Fang [1]

Affiliations:

[1] Texas Tech Univ, Dept Comp Sci, Lubbock, TX 79409 USA

[2] George Mason Univ, Informat Sci & Technol, Fairfax, VA 22030 USA

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2017

Keywords:

Language Preprocessing; Information Retrieval; Sentiment Analysis; Social Media Reasoning

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Social media has become a popular data source for tracking and analyzing societal events. Targeted domains such as elections, civil unrest, and the spread of disease all require a natural language normalization tool that can accurately extract the information pertinent to them. Because of unstructured language, short messages, casual posting styles, and homonyms, removing the barriers that may lead to inaccurate analysis is technically difficult and labor-intensive. Because typos and symbolic representations of sentiment can lower the observed frequency of a term, language preprocessing is critical to improving social media text reasoning. We propose a novel unsupervised preprocessing approach to enhance the quality of text understanding and illustrate it in one specific domain, flu shot reasoning. The proposed approach relies on a database of synonyms and antonyms and on an algorithm that transforms negative sentences into their affirmative form. In this form, features and opinions are reflected accurately by transforming parts of speech: for instance, features are presented as nouns, and opinions as verbs or adjectives. The algorithm also corrects misspelled words and normalizes them to increase their frequency of appearance. The effectiveness of our algorithm is evaluated on a dataset of tweets to answer why people are reluctant to take flu shots.
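
The abstract names three preprocessing steps: spelling correction, synonym normalization, and rewriting negated phrases in affirmative form with the help of an antonym database. The following is only a minimal sketch of how such a pipeline might be wired together. It is not the authors' implementation; the lexicons (CORRECTIONS, CONTRACTIONS, SYNONYMS, ANTONYMS) are hypothetical toy dictionaries invented for illustration, and the part-of-speech transformation mentioned in the abstract is not covered.

```python
import re

# Hypothetical toy lexicons for illustration only; the paper's actual
# synonym/antonym database is not described in the abstract.
CORRECTIONS = {"vacine": "vaccine", "dont": "don't", "isnt": "isn't"}
CONTRACTIONS = {"don't": ["do", "not"], "doesn't": ["does", "not"], "isn't": ["is", "not"]}
SYNONYMS = {"vaccine": "shot", "jab": "shot", "shots": "shot"}
ANTONYMS = {"trust": "distrust", "safe": "unsafe", "effective": "ineffective"}
NEGATORS = {"not", "never"}
DO_SUPPORT = {"do", "does", "did"}  # auxiliaries that become redundant once negation is removed


def normalize(text):
    """Correct spelling, map synonyms to one canonical term, and rewrite
    'negator + word' as the word's antonym (a simplified affirmative form)."""
    tokens = []
    for raw in re.findall(r"[a-z']+", text.lower()):
        word = CORRECTIONS.get(raw, raw)                # spelling correction
        for part in CONTRACTIONS.get(word, [word]):     # expand contractions
            tokens.append(SYNONYMS.get(part, part))     # canonical synonym

    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok in NEGATORS and nxt in ANTONYMS:
            # "do not trust" -> "distrust": drop the auxiliary, emit the antonym
            if out and out[-1] in DO_SUPPORT:
                out.pop()
            out.append(ANTONYMS[nxt])
            i += 2
        else:
            out.append(tok)
            i += 1
    return " ".join(out)


print(normalize("I dont trust the flu vacine"))  # i distrust the flu shot
print(normalize("The jab isn't safe"))           # the shot is unsafe
```

In this sketch, "vacine", "vaccine", and "jab" all collapse to the single term "shot", which illustrates how normalization can raise a term's observed frequency, and each negated opinion becomes a single affirmative token that a frequency-based analysis can count directly.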


Pages: 2019-2026

Page count: 8
