Revealing the sources of arsenic in private well water using Random Forest Classification and Regression

Authors
Giri, Subhasis [1]
Kang, Yang [2]
MacDonald, Kristi [3]
Tippett, Mara [3]
Qiu, Zeyuan [4]
Lathrop, Richard G. [1]
Obropta, Christopher C. [5]
Affiliations
[1] Rutgers State Univ, Dept Ecol Evolut & Nat Resources, New Brunswick, NJ 08901 USA
[2] Columbia Univ, Dept Stat, New York, NY 10027 USA
[3] Raritan Headwaters, Gladstone, NJ 07931 USA
[4] Univ Hts, New Jersey Inst Technol, Dept Chem & Environm Sci, Newark, NJ 07102 USA
[5] Rutgers State Univ, Dept Environm Sci, New Brunswick, NJ 08901 USA
Keywords
Arsenic; Random Forest Classification; Random Forest Regression; Private well water; Bed rock; Human health;
DOI
Not available
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
Exposure to arsenic through private drinking water wells poses serious human health risks worldwide. Water testing data indicate that there is arsenic contamination in private drinking water wells across New Jersey. To reduce the adverse health risk due to exposure to arsenic in drinking water, it is necessary to identify the arsenic sources affecting private wells. Private wells are not regulated by any federal or state agency under the Safe Drinking Water Act, and therefore information is often lacking. To this end, we have developed machine learning algorithms, including Random Forest Classification and Regression, to decipher the factors contributing to higher arsenic concentration in private drinking water wells in west-central New Jersey. Arsenic concentration in private drinking water wells served as the response variable, while the explanatory variables were geological bedrock type, soil type, drainage class, land use/cover, and the presence of orchards, contaminated sites, and abandoned mines within a 152.4-meter (500 ft) radius of each well. Random Forest Classification and Regression achieved 66% and 55% prediction accuracies for arsenic concentration in private drinking water wells, respectively. Overall, both models identify bedrock, soil, land use/cover, and drainage type (in descending order) as the most important variables contributing to higher arsenic concentration in well water. These models further identify bedrock subgroups at a finer scale, including the Passaic Formation, Lockatong Formation, and Stockton Formation, as contributing significantly to arsenic concentration in well water. Identification of sources of arsenic contamination in private drinking water wells at such a fine scale facilitates the development of more targeted outreach and mitigation strategies to improve water quality and safeguard human health.
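One way to picture the proximity variables described in the abstract (presence of orchards, contaminated sites, or abandoned mines within 500 ft of a well) is the minimal sketch below. It is an illustration only, not the authors' workflow: the function names, the coordinates, and the choice of a great-circle (haversine) distance on latitude/longitude pairs are all assumptions made for this example.

```python
import math

# 500 ft expressed in meters (1 ft = 0.3048 m), matching the 152.4 m radius in the abstract.
FEET_TO_METERS = 0.3048
BUFFER_RADIUS_M = 500 * FEET_TO_METERS


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6371000.0  # mean Earth radius, an approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def has_feature_within_buffer(well, features, radius_m=BUFFER_RADIUS_M):
    """Return True if any feature point lies within radius_m of the well.

    `well` is a (lat, lon) tuple; `features` is an iterable of (lat, lon) tuples.
    Both are hypothetical inputs for this sketch.
    """
    return any(
        haversine_m(well[0], well[1], f[0], f[1]) <= radius_m for f in features
    )


# Hypothetical well and feature locations (not real data).
well = (40.7000, -74.7000)
near_orchard = (40.7010, -74.7000)   # about 111 m north of the well
far_mine = (40.7100, -74.7000)       # about 1.1 km north of the well

orchard_flag = has_feature_within_buffer(well, [near_orchard])
mine_flag = has_feature_within_buffer(well, [far_mine])
```

Binary flags like `orchard_flag` and `mine_flag` could then sit alongside the categorical variables (bedrock, soil, drainage class, land use/cover) as inputs to a classifier or regressor. A real analysis would more likely use projected coordinates and polygon geometries in a GIS rather than point-to-point distances.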
Pages: 12