Revealing the sources of arsenic in private well water using Random Forest Classification and Regression

Authors
Giri, Subhasis [1]
Kang, Yang [2]
MacDonald, Kristi [3]
Tippett, Mara [3]
Qiu, Zeyuan [4]
Lathrop, Richard G. [1]
Obropta, Christopher C. [5]
Affiliations
[1] Rutgers State Univ, Dept Ecol Evolut & Nat Resources, New Brunswick, NJ 08901 USA
[2] Columbia Univ, Dept Stat, New York, NY 10027 USA
[3] Raritan Headwaters, Gladstone, NJ 07931 USA
[4] Univ Hts, New Jersey Inst Technol, Dept Chem & Environm Sci, Newark, NJ 07102 USA
[5] Rutgers State Univ, Dept Environm Sci, New Brunswick, NJ 08901 USA
Keywords
Arsenic; Random Forest Classification; Random Forest Regression; Private well water; Bed rock; Human health;
DOI
Not available
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Exposure to arsenic through private drinking water wells poses serious human health risks throughout the globe. Water testing data indicate arsenic contamination in private drinking water wells across New Jersey. To reduce the adverse health risk due to exposure to arsenic in drinking water, it is necessary to identify the arsenic sources affecting private wells. Private wells are not regulated by any federal or state agency under the Safe Drinking Water Act, and therefore information is often lacking. To this end, we have developed machine learning algorithms, including Random Forest Classification and Regression, to decipher the factors contributing to higher arsenic concentration in private drinking water wells in west-central New Jersey. Arsenic concentration in private drinking water wells served as the response variable, while the explanatory variables were geological bedrock type, soil type, drainage class, land use/cover, and the presence of orchards, contaminated sites, and abandoned mines within a 152.4-meter (500 ft) radius of each well. Random Forest Classification and Regression achieved 66% and 55% prediction accuracies for arsenic concentration in private drinking water wells, respectively. Overall, both models identify bedrock, soil, land use/cover, and drainage type (in descending order) as the most important variables contributing to higher arsenic concentration in well water. These models further identify bedrock subgroups at a finer scale, including the Passaic Formation, Lockatong Formation, and Stockton Formation, as contributing significantly to arsenic concentration in well water. Identifying sources of arsenic contamination in private drinking water wells at such a fine scale facilitates the development of more targeted outreach and mitigation strategies to improve water quality and safeguard human health.
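As an illustration of the buffer-based explanatory variables described in the abstract, the following minimal Python sketch derives 0/1 presence flags for features lying within 152.4 m (500 ft) of a well. It is not the authors' workflow: the function names, the layer names, the sample coordinates, and the assumption that all points are in a projected coordinate system measured in metres are hypothetical and chosen only for illustration.

```python
import math

FEET_TO_METERS = 0.3048                  # one international foot in metres
BUFFER_RADIUS_M = 500 * FEET_TO_METERS   # 500 ft, i.e. about 152.4 m


def within_buffer(well, sites, radius_m=BUFFER_RADIUS_M):
    """Return True if any site lies within radius_m of the well.

    Points are (x, y) pairs in metres in a projected coordinate system
    (an assumption; the abstract does not state the coordinate system).
    """
    wx, wy = well
    return any(math.hypot(sx - wx, sy - wy) <= radius_m for sx, sy in sites)


def presence_features(well, layers):
    """Map each layer name to a 0/1 presence flag for a single well."""
    return {name: int(within_buffer(well, points)) for name, points in layers.items()}


# Hypothetical coordinates, invented for illustration only.
layers = {
    "orchard": [(100.0, 50.0)],            # about 111.8 m from the well
    "contaminated_site": [(400.0, 300.0)], # 500 m from the well
    "abandoned_mine": [],                  # no mapped features
}
features = presence_features((0.0, 0.0), layers)
```

Flags of this kind could then sit alongside categorical variables such as bedrock type, soil type, drainage class, and land use/cover as inputs to a classifier or regressor.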
Pages: 12
Source
SCIENCE OF THE TOTAL ENVIRONMENT, 2023, 857