Novel approach for quantitative and qualitative authors research profiling using feature fusion and tree-based learning approach

Authors
Umer M. [1]
Aljrees T. [2]
Ullah S. [1]
Bashir A.K. [3]
Affiliations
[1] Department of Computer Science, Khwaja Fareed University of Engineering & IT, Punjab, Rahim Yar Khan
[2] Department of Computer Science and Engineering, University of Hafr Al-Batin, Hafar Al-Batin
[3] Department of Computing and Mathematics, The Manchester Metropolitan University, Manchester
Keywords
Authors research profiling; Citation sentiment analysis; Ensemble learning; Feature engineering; Feature fusion; Intelligent recommendation and text analysis; Self citation analysis
DOI
10.7717/PEERJ-CS.1752
Abstract
Article citation creates a link between the cited and citing articles and serves as the basis for several measures of scientific achievement, such as author and journal impact factor, H-index, and i10 index. Citations also include self-citations, in which authors cite their own articles. Self-citation is important for evaluating an author’s research profile and has recently gained attention. Although the literature offers differing criteria for what counts as appropriate self-citation, self-citation has a large impact on a researcher’s scientific profile. This study examines two cases. In case 1, the qualitative aspect of the author’s profile is analyzed using hand-crafted feature engineering techniques. The sentiments conveyed through citations are integral to assessing research quality, as they can signify appreciation, critique, or a foundation for further research. Analyzing sentiment within in-text citations remains a formidable challenge, even with automated sentiment annotation. For this purpose, this study employs machine learning models using term frequency (TF) and term frequency-inverse document frequency (TF-IDF) features. Random forest using TF with the Synthetic Minority Oversampling Technique (SMOTE) achieved an accuracy of 0.9727. Case 2 deals with quantitative analysis and investigates direct and indirect self-citation. In this study, the top 2% of researchers in 2020 are considered as a baseline. The data of the top 25 Pakistani researchers are manually retrieved from this dataset, along with their citation information from the Web of Science (WoS). Self-citation is estimated using the proposed model, and the results are compared with those obtained from WoS. Experimental results show a substantial difference between the two, as the self-citation ratio from the proposed approach is higher than that from WoS. It is observed that the citations reported by WoS for authors are overstated. For a comprehensive evaluation of a researcher's profile, both direct and indirect self-citation must be included. © 2023 Umer et al.
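The TF and TF-IDF features named in the abstract can be illustrated with a minimal sketch. The abstract does not state the tokenization or the IDF variant used, so the whitespace tokenization, the smoothed IDF formula, the function names, and the three example sentences below are all assumptions made for illustration, not the authors' pipeline.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Raw count of each term in one tokenized citation sentence."""
    return Counter(tokens)


def inverse_document_frequency(corpus):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1 (one common variant, assumed here)."""
    n_docs = len(corpus)
    # Count each term once per sentence to get its document frequency.
    doc_freq = Counter(term for tokens in corpus for term in set(tokens))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }


def tf_idf(tokens, idf):
    """Weight each term count by its IDF value."""
    return {term: count * idf[term] for term, count in term_frequency(tokens).items()}


# Hypothetical citation sentences, invented for this example only.
corpus = [
    "the proposed method is excellent".split(),
    "the method fails on noisy data".split(),
    "we extend the method".split(),
]

idf = inverse_document_frequency(corpus)
weights = tf_idf(corpus[0], idf)
```

Under TF alone, "the" and "excellent" carry the same weight in the first sentence; under TF-IDF, "excellent" is weighted higher because it occurs in only one sentence. Either representation could then be passed to a classifier such as a random forest.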