Effective Preprocessing and Normalization Techniques for COVID-19 Twitter Streams with POS Tagging via Lightweight Hidden Markov Model

Cited by: 3
Authors
Narayanasamy, Senthil Kumar [1]
Hu, Yuh-Chung [2]
Qaisar, Saeed Mian [3]
Srinivasan, Kathiravan [4]
Affiliations
[1] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
[2] Natl Ilan Univ, Dept Mech & Electromech Engn, Yilan 26047, Taiwan
[3] Effat Univ, Elect & Comp Engn Dept, Jeddah, Saudi Arabia
[4] Vellore Inst Technol, Sch Comp Sci & Engn, Vellore 632014, India
Keywords
SOCIAL MEDIA; RECOGNITION;
DOI
10.1155/2022/1222692
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809
Abstract
This work refines the basic preprocessing steps for unstructured text and retrieves potential conceptual features for downstream enhancement processes such as semantic enrichment and named entity recognition. Although preprocessing techniques such as text tokenization, normalization, and Part-of-Speech (POS) tagging work exceedingly well on formal text, they do not perform well when applied to informal text such as tweets and short messages. Hence, we present enhanced text normalization techniques that reduce the complexity that persists in Twitter streams and eliminate overfitting issues such as text anomalies and irregular boundaries while fixing the grammar of the text. A hidden Markov model (HMM) is used throughout to extract the core lexical features from the Twitter dataset, and external documents are suitably adapted to supplement the extraction techniques and complement the tweet context. In this Markov process, the POS tags are the states of the model and the words are its outputs (observations). Because this process is crucial for the next stage of entity extraction and classification, effective handling of informal text is important, and we therefore propose an effective hybrid approach to deal with these issues.
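The idea of treating POS tags as HMM states and words as emitted observations can be illustrated with a minimal Python sketch. This is not the paper's implementation: the toy corpus, the tag names, the regex-based normalization rules, the add-one smoothing, and the helper names (`normalize_tweet`, `train_hmm`, `log_prob`, `viterbi`) are all assumptions chosen for illustration only.

```python
import math
import re
from collections import defaultdict

def normalize_tweet(text):
    """Toy normalization (assumed rules): drop URLs and @mentions,
    unwrap hashtags, squeeze letters repeated 3+ times, lowercase, split."""
    text = re.sub(r"https?://\S+", "", text)
    text = re.sub(r"@\w+", "", text)
    text = re.sub(r"#(\w+)", r"\1", text)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    return text.lower().split()

def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sent in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo state
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability of `key` given a count table."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(words, trans, emit, tags, vocab):
    """Return the most likely tag sequence (hidden states) for `words`."""
    if not words:
        return []
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 slot for unknown words
    best = [{t: log_prob(trans["<s>"], t, n_tags)
                + log_prob(emit[t], words[0], n_words) for t in tags}]
    back = [{}]
    for w in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            # Pick the previous state that maximizes the path score into t.
            p = max(tags, key=lambda q: best[-1][q] + log_prob(trans[q], t, n_tags))
            scores[t] = (best[-1][p] + log_prob(trans[p], t, n_tags)
                         + log_prob(emit[t], w, n_words))
            pointers[t] = p
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best final state.
    tag = max(tags, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back[1:]):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

# Hypothetical hand-tagged corpus, far too small for real use.
TOY_CORPUS = [
    [("covid", "NOUN"), ("cases", "NOUN"), ("rise", "VERB")],
    [("cases", "NOUN"), ("rise", "VERB"), ("fast", "ADV")],
    [("vaccines", "NOUN"), ("work", "VERB")],
    [("stay", "VERB"), ("safe", "ADJ")],
]
MODEL = train_hmm(TOY_CORPUS)

tokens = normalize_tweet("@user Cases RISE fast https://t.co/abc")
print(list(zip(tokens, viterbi(tokens, *MODEL))))
```

Normalizing before tagging matters here because the emission table is keyed on surface forms: a token such as `RISE` or a trailing URL would otherwise be treated as an unknown word and tagged from transition probabilities alone.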
Pages: 14