Machine Learning Approach for Answer Detection in Discussion Forums: An Application of Big Data Analytics

Cited by: 12
Authors
Khan, Atif [1]
Ibrahim, Ibrahim [1]
Uddin, M. Irfan [2]
Zubair, Muhammad [1]
Ahmad, Shafiq [3]
Al Firdausi, Muhammad Dzulqarnain [3]
Zaindin, Mazen [4]
Affiliations
[1] Islamia Coll Peshawar, Dept Comp Sci, Peshawar, Pakistan
[2] Kohat Univ Sci & Technol, Inst Comp, Kohat, Pakistan
[3] King Saud Univ, Coll Engn, Dept Ind Engn, Riyadh, Saudi Arabia
[4] King Saud Univ, Coll Sci, Dept Stat & Operat Res, Riyadh, Saudi Arabia
Keywords
Data Analytics;
DOI
10.1155/2020/4621196
CLC number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Data are flooding into online web forums, and it is highly desirable to turn this enormous volume of data into actionable knowledge. Online web forums have become an integral part of the web and a main source of knowledge. People use these platforms to post questions and get answers from other forum members. An initial post (the question) usually receives more than one reply post (answers), which makes it difficult for a user to scan all of them for the most relevant, highest-quality answer. How to automatically extract the most relevant answer to a question within a thread is therefore an important problem. In this research, we treat answer extraction as a classification problem: a reply post is classified as relevant, partially relevant, or irrelevant to the initial post. To measure the relevance (similarity) of a reply to the question, both lexical and nonlexical features are used. We propose using LinearSVC, a variant of the support vector machine (SVM), for answer classification. Two feature selection techniques, chi-square and univariate selection, are employed to reduce the size of the feature space. The experimental results show that the LinearSVC classifier outperforms the other state-of-the-art classifiers in classification accuracy on both the Ubuntu and TripAdvisor (NYC) discussion forum datasets.
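As a rough illustration of the kind of pipeline the abstract describes, the sketch below chains chi-square feature selection with a LinearSVC classifier using scikit-learn. It is not the authors' implementation: the toy replies, their labels, the use of TF-IDF as the only (lexical) feature, and the value `k=10` are all assumptions made for this example, and the nonlexical and question-reply similarity features mentioned in the abstract are omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical reply posts to a question about installing an nvidia driver
# on Ubuntu. These are invented for illustration, not taken from the paper.
replies = [
    "Run sudo ubuntu-drivers autoinstall to install the nvidia driver.",
    "Open Additional Drivers and select the nvidia driver, then reboot.",
    "Check which graphics card you have before you pick a driver.",
    "Some driver versions fail on old kernels, so update Ubuntu first.",
    "I had the best pizza in New York last week.",
    "Does anyone know a good hotel near Times Square?",
]
# Hypothetical labels using the three classes named in the abstract.
labels = [
    "relevant",
    "relevant",
    "partially relevant",
    "partially relevant",
    "irrelevant",
    "irrelevant",
]

# TF-IDF yields non-negative features, which chi2 requires.
# k=10 is an arbitrary choice for this toy vocabulary.
model = make_pipeline(
    TfidfVectorizer(),
    SelectKBest(chi2, k=10),
    LinearSVC(random_state=0),
)
model.fit(replies, labels)

# Predict on the training replies only to show the call shape;
# a real evaluation would use held-out threads.
predictions = model.predict(replies)
for reply, predicted in zip(replies, predictions):
    print(f"{predicted:>18}  {reply}")
```

With real forum data, the vectorizer step would be replaced or extended by whatever lexical and nonlexical features are extracted per question-reply pair, and `k` would be tuned on validation data.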
Pages: 10