Personality Classification from Online Text using Machine Learning Approach

Cited by: 0
Authors
Khan, Alam Sher [1]
Ahmad, Hussain [1]
Asghar, Muhammad Zubair [1]
Saddozai, Furcian Khan [1]
Arir, Areeba [1]
Khalid, Hassan Ali [1]
Affiliations
[1] Gomal Univ, Inst Comp & Informat Technol, Dera Ismail Khan, Pakistan
Keywords
Personality recognition; re-sampling; machine learning; XGBoost; class imbalanced; MBTI; social networks; social media
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Personality refers to the distinctive set of characteristics of a person that affect their habits, behaviours, attitudes and patterns of thought. Text available on social networking sites provides an opportunity to recognize an individual's personality traits automatically. In this work, a machine learning technique, the XGBoost classifier, is used to predict four personality traits based on the Myers-Briggs Type Indicator (MBTI) model, namely Introversion-Extroversion (I-E), iNtuition-Sensing (N-S), Feeling-Thinking (F-T) and Judging-Perceiving (J-P), from input text. A publicly available benchmark dataset from Kaggle is used in the experiments. The skewness of the dataset is the main issue associated with prior work; it is reduced by applying a re-sampling technique, namely random over-sampling, which results in better performance. To further explore personality from text, pre-processing techniques including tokenization, word stemming, stop-word elimination and feature selection using TF-IDF are also applied. This work provides the basis for developing a personality identification system that could assist organizations in recruiting and selecting appropriate personnel and in improving their business by knowing the personality and preferences of their customers. The results obtained by all classifiers across all personality traits are good; however, the XGBoost classifier performs best, achieving more than 99% precision and accuracy for different traits.
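The two data-preparation ideas named in the abstract, treating each MBTI dimension as its own binary label and balancing a skewed dimension by random over-sampling, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the helper names `split_mbti` and `random_oversample`, the toy posts, and the choice to balance every class up to the majority count are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_mbti(mbti_type):
    """Map a four-letter MBTI code (e.g. "INTJ") to one label per dimension.

    Hypothetical helper: the dimension names follow the abstract.
    """
    code = mbti_type.upper()
    if len(code) != 4:
        raise ValueError("expected a four-letter MBTI code, got %r" % mbti_type)
    return {"I-E": code[0], "N-S": code[1], "F-T": code[2], "J-P": code[3]}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    have as many samples as the largest class.

    Assumption: balancing to the majority count; other ratios are possible.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Toy data (invented for illustration): three introverts, one extrovert.
posts = ["quiet night reading", "coding alone again",
         "long walk by myself", "great party with friends"]
types = ["INTJ", "INFP", "INTP", "ENFP"]

# Build the binary label for the I-E dimension, then balance it.
ie_labels = [split_mbti(t)["I-E"] for t in types]
balanced_posts, balanced_labels = random_oversample(posts, ie_labels, seed=42)
print(Counter(balanced_labels))
```

Random over-sampling is commonly applied to the training split only, so that duplicated posts cannot appear in both training and test data and inflate the measured scores.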
Pages: 460-476
Number of pages: 17