Constructing a data-driven receptor model for organic and inorganic aerosol - a synthesis analysis of eight mass spectrometric data sets from a boreal forest site


The interactions between organic and inorganic aerosol chemical components are integral to understanding and modelling climate- and health-relevant aerosol physicochemical properties, such as volatility, hygroscopicity, light scattering and toxicity. This study presents a synthesis analysis of eight data sets of non-refractory aerosol composition measured at a boreal forest site. The measurements, performed with an aerosol mass spectrometer, cover around 9 months in total over the course of 3 years. In our statistical analysis, we use the complete organic and inorganic unit-resolution mass spectra, as opposed to the more common approach of including only the organic fraction. The analysis is based on the iterative, combined use of (1) data reduction, (2) classification and (3) scaling tools, producing a data-driven, chemical mass balance type of model capable of describing site-specific aerosol composition. The receptor model we constructed was able to explain 83 ± 8% of the variation in the data, which increased to 96 ± 3% when signals from low signal-to-noise variables were not considered. The resulting interpretation of an extensive set of aerosol mass spectrometric data yields seven distinct aerosol chemical components for a rural boreal forest site: ammonium sulfate (35 ± 7% of mass), low-volatility and semi-volatile oxidised organic aerosols (27 ± 8% and 12 ± 7%), biomass burning organic aerosol (11 ± 7%), a nitrate-containing organic aerosol type (7 ± 2%), ammonium nitrate (5 ± 2%), and hydrocarbon-like organic aerosol (3 ± 1%). Some of the additionally observed, rare outlier aerosol types likely arise from surface ionisation effects and probably represent amine compounds from an unknown source and alkaline metals from emissions of a nearby district heating plant.
Compared to traditional, ion-balance-based inorganics apportionment schemes for aerosol mass spectrometer data, our statistics-based method provides an improved, more robust approach, yielding readily useful information for modelling the physical and chemical properties of submicron atmospheric aerosols. The results also shed light on the division between organic and inorganic aerosol types and on the dynamics of salt formation in aerosol. Equally importantly, the combined methodology exemplifies an iterative analysis in which consecutive analysis steps are carried out with a combination of statistical methods. Such an approach offers new ways to home in on physicochemically sensible solutions with minimal need for a priori information or analyst interference. We therefore suggest that similar statistics-based approaches offer significant potential for un- or semi-supervised machine-learning applications in future analyses of aerosol mass spectrometric data.
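
To make the idea of a chemical mass balance type of model concrete, the following is a minimal sketch and not the procedure used in this study. It assumes that the component mass spectra (profiles) are already known and fits non-negative contributions to one observed spectrum with multiplicative updates, a simple non-negative least-squares solver chosen here only for illustration. The profile values, component names and the function names `fit_contributions` and `explained_variation` are hypothetical, and the explained variation is computed as one minus the ratio of residual to total sum of squares, which may differ from the definition used in the paper.

```python
# Toy chemical mass balance (CMB) fit. All numbers are invented for illustration
# and do not come from aerosol mass spectrometer measurements.

def fit_contributions(profiles, spectrum, n_iter=500):
    """Fit non-negative contributions x so that sum_i x[i] * profiles[i] ~ spectrum.

    Uses multiplicative updates, x <- x * (A^T b) / (A^T A x), which keep x
    non-negative as long as profiles and spectrum are non-negative.
    """
    k, n = len(profiles), len(spectrum)
    atb = [sum(p[j] * spectrum[j] for j in range(n)) for p in profiles]
    ata = [[sum(p[j] * q[j] for j in range(n)) for q in profiles] for p in profiles]
    x = [1.0] * k
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        x = [
            x[i] * atb[i] / (sum(ata[i][m] * x[m] for m in range(k)) + eps)
            for i in range(k)
        ]
    return x


def explained_variation(profiles, contributions, spectrum):
    """Return 1 - SS_residual / SS_total for the reconstructed spectrum."""
    n = len(spectrum)
    model = [sum(c * p[j] for c, p in zip(contributions, profiles)) for j in range(n)]
    mean = sum(spectrum) / n
    ss_res = sum((spectrum[j] - model[j]) ** 2 for j in range(n))
    ss_tot = sum((s - mean) ** 2 for s in spectrum)
    return 1.0 - ss_res / ss_tot


# Hypothetical unit-normalised profiles over five m/z channels.
profiles = [
    [0.6, 0.4, 0.0, 0.0, 0.0],  # "sulfate-like" (assumed)
    [0.0, 0.1, 0.5, 0.4, 0.0],  # "organic-like" (assumed)
    [0.0, 0.0, 0.0, 0.3, 0.7],  # "nitrate-like" (assumed)
]
true_contributions = [3.5, 2.7, 0.5]

# Synthetic, noise-free observation built from the profiles above.
spectrum = [
    sum(c * p[j] for c, p in zip(true_contributions, profiles)) for j in range(5)
]

contributions = fit_contributions(profiles, spectrum)
explained = explained_variation(profiles, contributions, spectrum)
mass_fractions = [c / sum(contributions) for c in contributions]
```

In this toy setting the profiles are fixed in advance; in the analysis described above they are instead derived from the data through the combined data reduction, classification and scaling steps.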