Eight years of sub-micrometre organic aerosol composition data from the boreal forest characterized using a machine-learning approach

Cited by: 18
Authors
Heikkinen, Liine [1 ,4 ,5 ]
Aijala, Mikko [1 ]
Daellenbach, Kaspar R. [1 ,2 ]
Chen, Gang [2 ]
Garmash, Olga [1 ]
Aliaga, Diego [1 ]
Graeffe, Frans [1 ]
Raty, Meri [1 ]
Luoma, Krista [1 ]
Aalto, Pasi [1 ]
Kulmala, Markku [1 ]
Petaja, Tuukka [1 ]
Worsnop, Douglas [1 ,3 ]
Ehn, Mikael [1 ]
Affiliations
[1] Univ Helsinki, Fac Sci, Inst Atmospher & Earth Syst Res Phys, Helsinki 00014, Finland
[2] Paul Scherrer Inst, Lab Atmospher Chem, Villigen, Switzerland
[3] Aerodyne Res Inc, Billerica, MA 01821 USA
[4] Stockholm Univ, Dept Environm Sci, Stockholm, Sweden
[5] Stockholm Univ, Bolin Ctr Climate Res, Stockholm, Sweden
Funding
Academy of Finland; European Research Council; Swiss National Science Foundation;
Keywords
POSITIVE MATRIX FACTORIZATION; SOURCE APPORTIONMENT; DATA SETS; SECONDARY; MODEL; POLLUTION; UNCERTAINTY; COMPONENTS; EMISSIONS; CHEMISTRY;
DOI
10.5194/acp-21-10081-2021
Chinese Library Classification number
X [Environmental science, safety science];
Discipline classification code
08 ; 0830 ;
Abstract
The Station for Measuring Ecosystem-Atmosphere Relations (SMEAR) II, located within the boreal forest of Finland, is a unique station in the world due to the wide range of long-term measurements tracking the Earth-atmosphere interface. In this study, we characterize the composition of organic aerosol (OA) at SMEAR II by quantifying its driving constituents. We utilize a multi-year data set of OA mass spectra measured in situ with an Aerosol Chemical Speciation Monitor (ACSM) at the station. To our knowledge, this mass spectral time series is the longest of its kind published to date. Similarly to other previously reported efforts in OA source apportionment from multi-seasonal or multi-annual data sets, we approached the OA characterization challenge through positive matrix factorization (PMF) using a rolling window approach. However, the existing methods for extracting minor OA components were found to be insufficient for our rather remote site. To overcome this issue, we tested a new statistical analysis framework. This included unsupervised feature extraction and classification stages to explore a large number of unconstrained PMF runs conducted on the measured OA mass spectra. Anchored by these results, we finally constructed a relaxed chemical mass balance (CMB) run that resolved different OA components from our observations. The presented combination of statistical tools provided a data-driven analysis methodology, which in our case achieved robust solutions with minimal subjectivity. Following the extensive statistical analyses, we were able to divide the 2012-2019 SMEAR II OA data (mass concentration interquartile range (IQR): 0.7, 1.3, and 2.6 µg m⁻³) into three sub-categories - low-volatility oxygenated OA (LV-OOA), semi-volatile oxygenated OA (SV-OOA), and primary OA (POA) - proving that the tested methodology was able to provide results consistent with literature. LV-OOA was the most dominant OA type (organic mass fraction IQR: 49 %, 62 %, and 73 %).
The seasonal cycle of LV-OOA was bimodal, with peaks both in summer and in February. We associated the wintertime LV-OOA with anthropogenic sources and assumed biogenic influence in LV-OOA formation in summer. Through a brief trajectory analysis, we estimated summertime natural LV-OOA formation of tens of ng m⁻³ h⁻¹ over the boreal forest. SV-OOA was the second highest contributor to OA mass (organic mass fraction IQR: 19 %, 31 %, and 43 %). Due to SV-OOA's clear peak in summer, we estimate biogenic processes as the main drivers in its formation. Unlike for LV-OOA, the highest SV-OOA concentrations were detected in stable summertime nocturnal surface layers. Two nearby sawmills also played a significant role in SV-OOA production as also exemplified by previous studies at SMEAR II. POA, taken as a mix of two different OA types reported previously, hydrocarbon-like OA (HOA) and biomass burning OA (BBOA), made up a minimal OA mass fraction (IQR: 2 %, 6 %, and 13 %). Notably, the quantification of POA at SMEAR II using ACSM data was not possible following existing rolling PMF methodologies. Both POA organic mass fraction and mass concentration peaked in winter. Its appearance at SMEAR II was linked to strong southerly winds. Similar wind direction and speed dependence was not observed among other OA types. The high wind speeds probably enabled the POA transport to SMEAR II from faraway sources in a relatively fresh state. In the event of slower wind speeds, POA likely evaporated and/or aged into oxidized organic aerosol before detection. The POA organic mass fraction was significantly lower than reported by aerosol mass spectrometer (AMS) measurements 2 to 4 years prior to the ACSM measurements.
While the co-located long-term measurements of black carbon supported the hypothesis of higher POA loadings prior to year 2012, it is also possible that short-term (POA) pollution plumes were averaged out due to the slow time resolution of the ACSM combined with the further 3 h data averaging needed to ensure good signal-to-noise ratios (SNRs). Despite the length of the ACSM data set, we did not focus on quantifying long-term trends of POA (nor other components) due to the high sensitivity of OA composition to meteorological anomalies, the occurrence of which is likely not normally distributed over the 8-year measurement period. Due to the unique and realistic seasonal cycles and meteorology dependences of the independent OA subtypes complemented by the reasonably low degree of unexplained OA variability, we believe that the presented data analysis approach performs well. Therefore, we hope that these results also encourage other researchers possessing several-year-long time series of similar data to tackle the data analysis via similar semi- or unsupervised machine-learning approaches. This way the presented method could be further optimized and its usability explored and evaluated also in other environments.
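The abstract reports each "IQR" as three numbers. A minimal sketch of how such a triplet could be computed is given below, assuming the three values are the 25th, 50th, and 75th percentiles (the abstract does not define them explicitly). The concentrations are made up for illustration and are not SMEAR II data; the helper names `quartiles` and `mass_fractions` are hypothetical.

```python
from statistics import quantiles


def quartiles(values):
    """Return the (25th, 50th, 75th) percentiles of a sequence."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def mass_fractions(component, total):
    """Fraction of total OA mass carried by one component at each time step."""
    return [c / t for c, t in zip(component, total) if t > 0]


# Made-up concentrations in µg m^-3, one value per time step (assumption).
lv_ooa = [0.5, 0.9, 1.2, 0.4, 2.0]
sv_ooa = [0.3, 0.4, 0.6, 0.2, 1.5]
poa = [0.2, 0.1, 0.2, 0.4, 0.5]

# Total OA is taken here as the sum of the three resolved components.
total = [a + b + c for a, b, c in zip(lv_ooa, sv_ooa, poa)]

fractions = mass_fractions(lv_ooa, total)
q1, median, q3 = quartiles(fractions)
print(f"LV-OOA mass fraction quartiles: {q1:.0%}, {median:.0%}, {q3:.0%}")
```

Summarizing by quartiles rather than mean and standard deviation is a common choice for skewed concentration data, which may be why the abstract reports its values this way.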
Pages: 10081-10109
Number of pages: 29