Novel approach for quantitative and qualitative authors research profiling using feature fusion and tree-based learning approach

Cited by: 0
Authors
Umer, Muhammad [1]
Aljrees, Turki [2]
Ullah, Saleem [1]
Bashir, Ali Kashif [3]
Affiliations
[1] Khwaja Fareed Univ Engn & IT, Dept Comp Sci, Rahim Yar Khan, Punjab, Pakistan
[2] Univ Hafr Al Batin, Dept Comp Sci & Engn, Hafar Al Batin, Saudi Arabia
[3] Manchester Metropolitan Univ, Dept Comp & Math, Manchester, England
Keywords
Citation sentiment analysis; Ensemble learning; Feature engineering; Feature fusion; Intelligent recommendation and text analysis; Authors research profiling; Self citation analysis; SELF-CITATION RATES; H-INDEX; IMPACT; CLASSIFICATION; PATTERNS; MACRO; SMOTE
DOI
10.7717/peerj-cs.1752
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Article citations link cited and citing articles and serve as the basis for several measures of scientific achievement, such as author and journal impact factor, H-index, and i10-index. Citations also include self-citations, in which authors cite their own articles. Self-citation is important when evaluating an author's research profile and has recently gained attention. Although the literature offers differing criteria for appropriate self-citation, self-citation has a substantial impact on a researcher's scientific profile. This study examines two cases. In case 1, the qualitative aspect of the author's profile is analyzed using hand-crafted feature engineering techniques. The sentiments conveyed through citations are integral to assessing research quality, as they can signify appreciation or critique, or serve as a foundation for further research. Analyzing sentiment within in-text citations remains a formidable challenge, even with automated sentiment annotation. For this purpose, this study employs machine learning models with term frequency (TF) and term frequency-inverse document frequency (TF-IDF) features. Random forest using TF with the Synthetic Minority Oversampling Technique (SMOTE) achieved an accuracy of 0.9727. Case 2 deals with quantitative analysis and investigates direct and indirect self-citation. In this study, the top 2% of researchers in 2020 are considered as a baseline. The data of the top 25 Pakistani researchers are manually retrieved from this dataset, together with citation information from the Web of Science (WoS). Self-citation is estimated using the proposed model, and the results are compared with those obtained from WoS. Experimental results show a substantial difference between the two, as the self-citation ratio from the proposed approach is higher than that from WoS. It is observed that the citations from the WoS for authors are overstated. For a comprehensive evaluation of a researcher's profile, both direct and indirect self-citation must be included.
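To illustrate the two text representations named in the abstract, the following is a minimal sketch of how TF and TF-IDF weights can be computed for citation sentences. It is not the authors' implementation: the sample sentences, the whitespace tokenization, the function names, and the smoothed IDF variant are all assumptions chosen for illustration, and the classifier and SMOTE steps are omitted.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Return each term's count divided by the document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(corpus):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1 (one common variant, assumed here)."""
    n_docs = len(corpus)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }


def tf_idf(corpus):
    """Weight every document's TF values by the corpus-wide IDF."""
    idf = inverse_document_frequency(corpus)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
        for doc in corpus
    ]


# Hypothetical citation sentences, not taken from the paper's dataset.
sentences = [
    "the proposed method outperforms the baseline",
    "the baseline fails on noisy data",
    "we build on the proposed method",
]
corpus = [sentence.split() for sentence in sentences]

tf_vectors = [term_frequency(doc) for doc in corpus]
tfidf_vectors = tf_idf(corpus)
```

In this sketch, a term that appears in every sentence (such as "the") keeps its plain TF weight under TF-IDF, while a term confined to one sentence (such as "outperforms") is weighted up relative to more widespread terms.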
Pages: 26
Related papers
50 in total
  • [31] Machine Learning Based Approach for Future Prediction of Authors in Research Academics
    Bhattacharya S.
    Banerjee A.
    Goswami A.
    Nandi S.
    Pradhan D.K.
    SN Computer Science, 4 (3)
  • [32] Enhancing interpretability of tree-based models for downstream salinity prediction: Decomposing feature importance using the Shapley additive explanation approach
    Zhao, Guang-yao
    Ohsu, Kenji
    Saputra, Henry Kasmanhadi
    Okada, Teruhisa
    Suzuki, Jumpei
    Kuwahara, Yuji
    Fujita, Masafumi
    RESULTS IN ENGINEERING, 2024, 23
  • [33] A novel self-learning feature selection approach based on feature attributions
    Chen, Jianting
    Yuan, Shuhan
    Lv, Dongdong
    Xiang, Yang
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 183
  • [34] A NOVEL APPROACH TO MIXING QUALITATIVE AND QUANTITATIVE METHODS IN HIV AND STI PREVENTION RESEARCH
    Penman-Aguilar, Ana
    Macaluso, Maurizio
    Peacock, Nadine
    Snead, M. Christine
    Posner, Samuel F.
    AIDS EDUCATION AND PREVENTION, 2014, 26 (02) : 95 - 108
  • [35] High Quality Inverse Halftoning Using Variance Gain-, Texture- and Decision Tree-Based Learning Approach
    Chung, Kuo-Liang
    Huang, Yong-Huai
    Wu, Kang-Chieh
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2010, 26 (06) : 2213 - 2227
  • [36] A tree-based machine learning model to approach morphologic assessment of malignant salivary gland tumors
    Lopez-Janeiro, Alvaro
    Cabanuz, Clara
    Blasco-Santana, Luis
    Ruiz-Bravo, Elena
    ANNALS OF DIAGNOSTIC PATHOLOGY, 2022, 56
  • [37] Position Sensing using an Asymmetric Carbon Nanotube Dimer and a Tree-Based Classification Approach
    Dey, Sumitra
    Hassan, Ahmed M.
    2020 IEEE INTERNATIONAL SYMPOSIUM ON ANTENNAS AND PROPAGATION AND NORTH AMERICAN RADIO SCIENCE MEETING, 2020, : 829 - 830
  • [38] Market segmentation of clearance sales outshoppers using cluster and classification tree-based approach
    Hemalatha, M.
    INTERNATIONAL JOURNAL OF INDIAN CULTURE AND BUSINESS MANAGEMENT, 2012, 5 (06) : 627 - 643
  • [39] A tree-based approach for event prediction using episode rules over event streams
    Cho, Chung-Wen
    Zheng, Ying
    Wu, Yi-Hung
    Chen, Arbee L. P.
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2008, 5181 : 225 - +
  • [40] Deciphering the adsorption mechanisms between microplastics and antibiotics: A tree-based stacking machine learning approach
    Gao, Zhiyuan
    Kong, Lingwei
    Han, Donglin
    Kuang, Meijuan
    Li, Linhua
    Song, Xiaomao
    Li, Nannan
    Shi, Qingcheng
    Qin, Xuande
    Wu, Yikang
    Wu, Dinkun
    Xu, Zhihua
    JOURNAL OF CLEANER PRODUCTION, 2025, 486