Machine Learning Data Imputation and Prediction of Foraging Group Size in a Kleptoparasitic Spider

Cited by: 5
Authors
Su, Yong-Chao [1]
Wu, Cheng-Yu [1]
Yang, Cheng-Hong [2, 3, 4]
Li, Bo-Sheng [2]
Moi, Sin-Hua [5]
Lin, Yu-Da [6]
Affiliations
[1] Kaohsiung Med Univ, Dept Biomed Sci & Environm Biol, Kaohsiung 80708, Taiwan
[2] Natl Kaohsiung Univ Sci & Technol, Dept Elect Engn, Kaohsiung 80778, Taiwan
[3] Kaohsiung Med Univ, PhD Program Biomed Engn, Kaohsiung 80708, Taiwan
[4] Kaohsiung Med Univ, Drug Dev & Value Creat Res Ctr, Kaohsiung 80708, Taiwan
[5] I Shou Univ, E Da Canc Hosp, Ctr Canc Program Dev, Kaohsiung 82445, Taiwan
[6] Natl Penghu Univ Sci & Technol, Dept Comp Sci & Informat Engn, Magong 880011, Penghu, Taiwan
Keywords
machine learning; data imputation; group foraging; PLS-PM; ideal free distribution; kleptoparasitism; resource allocation; classification
DOI
10.3390/math9040415
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Cost-benefit analysis is widely used to elucidate the association between foraging group size and resource size. Despite advances in theoretical frameworks, the empirical systems used to test them are hindered by the vagaries of field surveys and by incomplete data. This study developed three approaches to data imputation based on machine learning (ML) algorithms, with the aim of rescuing valuable field data. Using 163 host spider webs (132 with complete data and 31 with incomplete data), our results indicated that data imputation based on the random forest algorithm outperformed classification and regression trees, k-nearest neighbors, and other conventional approaches (Wilcoxon signed-rank test and correlation difference, with p-values ranging from < 0.001 to 0.030). We then used the rescued data from a natural system involving kleptoparasitic spiders from Taiwan and Vietnam (Argyrodes miniaceus, Theridiidae) to test the occurrence and group size of kleptoparasites in natural populations. Our partial least-squares path modelling (PLS-PM) results demonstrated that the size of the host web (T = 6.890, p < 0.001) is a significant feature affecting group size. The resource size (T = 2.590, p = 0.010) and the microclimate (T = 3.230, p = 0.001) are significant features affecting the presence of kleptoparasites. The test of conformity of the group size distribution to the ideal free distribution (IFD) model revealed that predictions pertaining to per-capita resource size were underestimated (bootstrap resampling mean slopes < IFD predicted slopes, p < 0.001). These findings highlight the importance of applying appropriate ML methods to the handling of missing field data.
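To make the idea of imputing missing field measurements concrete, the following is a minimal sketch of k-nearest-neighbor imputation, one of the baselines named in the abstract (not the random forest approach that performed best, and not the authors' implementation). The function name `knn_impute`, the column layout, and the toy values are all hypothetical; real data would normally be standardized before distances are computed.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of the k nearest complete rows.

    Distances are Euclidean and use only the columns observed in the
    incomplete row. Hypothetical helper for illustration only.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row")

    imputed = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            imputed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(other, row=row, observed=observed):
            # Distance restricted to the columns this row actually has.
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled = list(row)
        for j in missing:
            # Replace each missing value with the neighbours' mean.
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed


# Made-up toy records: (web_size, resource_size, group_size).
webs = [
    (10.0, 1.0, 2.0),
    (12.0, 1.2, 3.0),
    (30.0, 3.0, 8.0),
    (32.0, 3.1, 9.0),
    (11.0, 1.1, None),  # group size was not recorded for this web
]
result = knn_impute(webs, k=2)
print(result[-1])
```

In this toy case the incomplete web sits between the two small webs, so its group size is filled with the mean of their group sizes. A tree-based imputer would instead learn the missing column from the observed ones, which is the kind of approach the abstract reports as performing best.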
Pages: 1-16
Number of pages: 16
Related papers
50 in total
  • [21] Group foraging in the colonial spider Parawixia bistriata (Araneidae): effect of resource levels and prey size
    Campon, Florencia Fernandez
    [J]. ANIMAL BEHAVIOUR, 2007, 74 : 1551 - 1562
  • [22] Sharpening the BLADE: Missing Data Imputation Using Supervised Machine Learning
    Suresh, Marcus
    Taib, Ronnie
    Zhao, Yanchang
    Jin, Warren
    [J]. AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 215 - 227
  • [23] Analysis of Suitable Machine Learning Imputation Techniques for Arthritis Profile Data
    Ramasamy, Uma
    Santhoshkumar, Sundar
    [J]. IETE JOURNAL OF RESEARCH, 2024, 70 (01) : 334 - 355
  • [24] Evaluation of Machine Learning Classification Algorithms & Missing Data Imputation Techniques
    Nwulu, Nnamdi I.
    [J]. 2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017,
  • [25] Graph Machine Learning for Improved Imputation of Missing Tropospheric Ozone Data
    Betancourt, Clara
    Li, Cathy W. Y.
    Kleinert, Felix
    Schultz, Martin G.
    [J]. ENVIRONMENTAL SCIENCE & TECHNOLOGY, 2023, 57 (46) : 18246 - 18258
  • [26] Prediction of Breast Cancer Survival by Machine Learning Methods: An Application of Multiple Imputation
    Lotfnezhad Afshar, Hadi
    Jabbari, Nasrollah
    Khalkhali, Hamid Reza
    Esnaashari, Omid
    [J]. IRANIAN JOURNAL OF PUBLIC HEALTH, 2021, 50 (03) : 598 - 605
  • [27] From Imputation to Prediction: A Comprehensive Machine Learning Pipeline for Stroke Risk Analysis
    Padmakala, S.
    Chandrasekar, A.
    [J]. 2024 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATION AND APPLIED INFORMATICS, ACCAI 2024, 2024,
  • [28] Sedimentary environment prediction of grain-size data based on machine learning approach
    Su, Qiao
    Zhu, Yanhui
    Hu, Fang
    Xu, Xingyong
    [J]. INTERPRETATION-A JOURNAL OF SUBSURFACE CHARACTERIZATION, 2020, 8 (03) : SL71 - SL78
  • [30] IoT-based group size prediction and recommendation system using machine learning and deep learning techniques
    Chopra, Deepti
    Kaur, Arvinder
    [J]. SN APPLIED SCIENCES, 2021, 3 (02)