Performance evaluation of machine learning models on large dataset of android applications reviews

Cited by: 5
Authors
Qureshi, Ali Adil [1]
Ahmad, Maqsood [2]
Ullah, Saleem [1]
Yasir, Muhammad Naveed [3]
Rustam, Furqan [4]
Ashraf, Imran [5]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan 64200, Pakistan
[2] Islamia Univ Bahawalpur, Dept Informat Secur, Bahawalpur 63100, Punjab, Pakistan
[3] Univ Narowal, Dept Comp Sci, Narowal 51600, Pakistan
[4] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
[5] Yeungnam Univ, Informat & Commun Engn, Gyongsan 38541, South Korea
Keywords
Opinion mining; Sentiment analysis; Mobile apps reviews; Google Play Store; Classification
DOI
10.1007/s11042-023-14713-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With an ever-increasing number of mobile users, the development of mobile applications (apps) has become a promising market over the past decade. Billions of users download mobile apps from the Google Play Store for diverse purposes, carry out tasks, and leave comments about their experience. Such reviews contain a variety of feedback that guides the improvement of existing apps and inspires ideas for new ones. However, app reviews are broad in scope and challenging to analyze. When segregated into different classes, such reviews guide users in selecting suitable apps. This study proposes a framework for analyzing the sentiment of reviews for apps in eight different categories, such as shopping, sports, and casual. A large dataset comprising 251661 user reviews is scraped with the help of regular expressions and Beautiful Soup. The framework uses different machine learning models along with term frequency-inverse document frequency (TF-IDF) for feature extraction. Extensive experiments are performed using preprocessing steps as well as the stats feature of app reviews to evaluate the performance of the models. Results indicate that combining the stats feature with TF-IDF shows better performance and that the support vector machine obtains the highest accuracy. Experimental results can potentially be used by other researchers to select appropriate models for the analysis of app reviews. In addition, the provided dataset is large, diverse, and balanced, covering eight categories and reviews of 59 apps, and provides the opportunity to analyze reviews using state-of-the-art approaches.
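As a rough illustration of the feature construction the abstract describes (TF-IDF combined with a stats feature), the following is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not define the stats feature, so character count and word count stand in as a hypothetical choice, and the function names, the TF-IDF variant, and the sample reviews are all assumptions. The classifier step is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, rows), where each row holds TF-IDF weights for one document.

    Assumed variant: tf = count / doc_length, idf = log(N / df) + 1.
    Libraries often use smoothed or normalized variants instead.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty reviews
        rows.append([
            (counts[t] / total) * (math.log(n_docs / df[t]) + 1.0)
            for t in vocab
        ])
    return vocab, rows


def stats_features(text):
    """Hypothetical stats features: character count and word count."""
    return [float(len(text)), float(len(tokenize(text)))]


def build_features(docs):
    """Concatenate TF-IDF weights with the stats features for each review."""
    vocab, rows = tfidf_vectors(docs)
    return vocab, [row + stats_features(doc) for row, doc in zip(rows, docs)]


# Made-up sample reviews, for illustration only.
reviews = ["great app love it", "bad app crashes often", "love it"]
vocab, features = build_features(reviews)
```

Each row of `features` could then be passed, together with a sentiment label, to any classifier; the abstract reports that a support vector machine obtained the highest accuracy among the models compared.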
Pages: 37197-37219
Number of pages: 23