Machine Learning Data Imputation and Prediction of Foraging Group Size in a Kleptoparasitic Spider

Cited by: 5
Authors
Su, Yong-Chao [1]
Wu, Cheng-Yu [1]
Yang, Cheng-Hong [2,3,4]
Li, Bo-Sheng [2]
Moi, Sin-Hua [5]
Lin, Yu-Da [6]
Affiliations
[1] Kaohsiung Med Univ, Dept Biomed Sci & Environm Biol, Kaohsiung 80708, Taiwan
[2] Natl Kaohsiung Univ Sci & Technol, Dept Elect Engn, Kaohsiung 80778, Taiwan
[3] Kaohsiung Med Univ, PhD Program Biomed Engn, Kaohsiung 80708, Taiwan
[4] Kaohsiung Med Univ, Drug Dev & Value Creat Res Ctr, Kaohsiung 80708, Taiwan
[5] I Shou Univ, E Da Canc Hosp, Ctr Canc Program Dev, Kaohsiung 82445, Taiwan
[6] Natl Penghu Univ Sci & Technol, Dept Comp Sci & Informat Engn, Magong 880011, Penghu, Taiwan
Keywords
machine learning; data imputation; group foraging; PLS-PM; ideal free distribution; kleptoparasitism; resource allocation; CLASSIFICATION;
DOI
10.3390/math9040415
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Cost-benefit analysis is widely used to elucidate the association between foraging group size and resource size. Despite advances in theoretical frameworks, however, the empirical systems used to test them are hindered by the vagaries of field surveys and by incomplete data. This study developed three approaches to data imputation based on machine learning (ML) algorithms, with the aim of rescuing valuable field data. Using 163 host spider webs (132 with complete data and 31 with incomplete data), our results indicated that data imputation based on the random forest algorithm outperformed classification and regression trees, k-nearest neighbor, and other conventional approaches (Wilcoxon signed-rank test and correlation difference, with p-values ranging from < 0.001 to 0.030). We then used the rescued data from a natural system involving kleptoparasitic spiders from Taiwan and Vietnam (Argyrodes miniaceus, Theridiidae) to test the occurrence and group size of kleptoparasites in natural populations. Our partial least-squares path modelling (PLS-PM) results demonstrated that the size of the host web (T = 6.890, p < 0.001) is a significant feature affecting group size. Resource size (T = 2.590, p = 0.010) and microclimate (T = 3.230, p = 0.001) are significant features affecting the presence of kleptoparasites. The test of conformity of the group size distribution to the ideal free distribution (IFD) model revealed that predictions pertaining to per-capita resource size were underestimated (bootstrap resampling mean slopes < IFD predicted slopes, p < 0.001). These findings highlight the importance of applying appropriate ML methods to the handling of missing field data.
Pages: 1-16
Number of pages: 16
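The abstract names k-nearest neighbor as one of the imputation approaches compared. As a rough illustration of the general idea only (not the authors' implementation), the sketch below fills a missing value with the mean of that feature over the k most similar complete records. The function name `knn_impute`, the column layout, and the toy values are all hypothetical; a real analysis would also scale the features before computing distances.

```python
import math

def knn_impute(rows, k=3):
    """Fill None entries using the mean of the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row")
    imputed = []
    for row in rows:
        if None not in row:
            imputed.append(list(row))
            continue
        # Indices of the features that were actually observed in this row.
        observed = [i for i, v in enumerate(row) if v is not None]

        def dist(other, row=row, observed=observed):
            # Euclidean distance restricted to the observed features.
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled_row = [
            v if v is not None
            else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        imputed.append(filled_row)
    return imputed

# Hypothetical toy records: (web size, resource size, group size).
webs = [
    [10.0, 1.0, 2.0],
    [12.0, 1.2, 3.0],
    [50.0, 4.0, 9.0],
    [11.0, 1.1, None],  # group size missing from the field survey
]
filled = knn_impute(webs, k=2)
```

Here the incomplete record borrows its group size from the two small webs it most resembles, while the large web is ignored. A random forest approach, which the abstract reports as the best performer, replaces this neighbor average with predictions from trees trained on the complete records.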