Novel approach for quantitative and qualitative authors research profiling using feature fusion and tree-based learning approach

Cited by: 0
Authors
Umer, Muhammad [1]
Aljrees, Turki [2]
Ullah, Saleem [1]
Bashir, Ali Kashif [3]
Affiliations
[1] Khwaja Fareed Univ Engn & IT, Dept Comp Sci, Rahim Yar Khan, Punjab, Pakistan
[2] Univ Hafr Al Batin, Dept Comp Sci & Engn, Hafar Al Batin, Saudi Arabia
[3] Manchester Metropolitan Univ, Dept Comp & Math, Manchester, England
Keywords
Citation sentiment analysis; Ensemble learning; Feature engineering; Feature fusion; Intelligent recommendation and text analysis; Authors research profiling; Self citation analysis; SELF-CITATION RATES; H-INDEX; IMPACT; CLASSIFICATION; PATTERNS; MACRO; SMOTE;
DOI
10.7717/peerj-cs.1752
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Article citation creates a link between the cited and citing articles and underlies several measures of scientific achievement, such as author and journal impact factor, H-index, and i10-index. Citations also include self-citation, in which authors cite their own articles. Self-citation is important for evaluating an author's research profile and has gained attention recently. Although the literature offers differing criteria for appropriate self-citation, self-citation has a substantial impact on a researcher's scientific profile. This study examines two cases in this regard. In case 1, the qualitative aspect of the author's profile is analyzed using hand-crafted feature engineering techniques. The sentiments conveyed through citations are integral to assessing research quality, as they can signify appreciation, critique, or a foundation for further research. Analyzing sentiments within in-text citations remains a formidable challenge, even with automated sentiment annotation. For this purpose, this study employs machine learning models using term frequency (TF) and term frequency-inverse document frequency (TF-IDF) features. Random forest using TF with the Synthetic Minority Oversampling Technique (SMOTE) achieved an accuracy of 0.9727. Case 2 deals with quantitative analysis and investigates direct and indirect self-citation. Here, the top 2% of researchers in 2020 are taken as a baseline. Data for the top 25 Pakistani researchers are manually retrieved from this dataset, together with citation information from the Web of Science (WoS). Self-citation is estimated using the proposed model, and the results are compared with those obtained from WoS. Experimental results show a substantial difference between the two, as the self-citation ratio from the proposed approach is higher than that from WoS. It is observed that the citations from WoS for authors are overstated. For a comprehensive evaluation of a researcher's profile, both direct and indirect self-citation must be included.
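To make the distinction between direct and indirect self-citation concrete, the following is a minimal Python sketch of how a self-citation ratio could be computed from a citation list. It is not the authors' model. The abstract does not define the two terms, so the sketch assumes that a citation is "direct" when the citing paper lists the target author, and "indirect" when the citing paper lists one of the target's co-authors but not the target. The `Paper` class, the `self_citation_ratios` function, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    authors: frozenset  # names of the authors on this paper


def self_citation_ratios(target, papers, citations):
    """Return self-citation ratios for `target`.

    papers:    dict mapping paper id -> Paper
    citations: iterable of (citing_id, cited_id) pairs
    Assumed definitions (not taken from the source):
      direct   = citing paper is authored by the target
      indirect = citing paper is authored by a co-author, not the target
    """
    # Papers written by the target author.
    own_ids = {pid for pid, p in papers.items() if target in p.authors}

    # Everyone who shares at least one paper with the target.
    coauthors = set()
    for pid in own_ids:
        coauthors |= papers[pid].authors
    coauthors.discard(target)

    counts = {"direct": 0, "indirect": 0, "external": 0}
    for citing_id, cited_id in citations:
        if cited_id not in own_ids:
            continue  # only citations received by the target count
        citing_authors = papers[citing_id].authors
        if target in citing_authors:
            counts["direct"] += 1
        elif citing_authors & coauthors:
            counts["indirect"] += 1
        else:
            counts["external"] += 1

    total = sum(counts.values())
    if total == 0:
        return {"total": 0, "direct_ratio": 0.0, "combined_ratio": 0.0}
    return {
        "total": total,
        "direct_ratio": counts["direct"] / total,
        "combined_ratio": (counts["direct"] + counts["indirect"]) / total,
    }


# Toy data: author "A" has two papers and receives four citations.
papers = {
    "p1": Paper("p1", frozenset({"A", "B"})),
    "p2": Paper("p2", frozenset({"A"})),
    "p3": Paper("p3", frozenset({"B"})),
    "p4": Paper("p4", frozenset({"D"})),
    "p5": Paper("p5", frozenset({"C"})),
}
citations = [
    ("p2", "p1"),  # direct: A cites A's own paper
    ("p3", "p1"),  # indirect: co-author B cites A's paper
    ("p4", "p1"),  # external
    ("p5", "p2"),  # external
    ("p4", "p3"),  # ignored: p3 is not one of A's papers
]
result = self_citation_ratios("A", papers, citations)
print(result["direct_ratio"], result["combined_ratio"])  # 0.25 0.5
```

On this toy data the ratio doubles once indirect self-citations are counted, which illustrates why a measure limited to direct self-citation can understate the total.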
Pages: 26