Final Report: Improving Particulate Matter Source Apportionment for Health Studies: A Trained Receptor Modeling Approach with Sensitivity, Uncertainty and Spatial Analyses

EPA Grant Number: R833866
Title: Improving Particulate Matter Source Apportionment for Health Studies: A Trained Receptor Modeling Approach with Sensitivity, Uncertainty and Spatial Analyses
Investigators: Russell, Armistead G., Klein, Mitchel, Marmur, Amit, Mulholland, James, Sarnat, Stefanie Ebelt, Sarnat, Jeremy, Tolbert, Paige
Institution: Georgia Institute of Technology, Emory University
EPA Project Officer: Ilacqua, Vito
Project Period: December 1, 2008 through November 30, 2012 (Extended to November 30, 2013)
Project Amount: $899,956
RFA: Innovative Approaches to Particulate Matter Health, Composition, and Source Questions (2007)
Research Category: Health Effects, Particulate Matter, Air

Objective:

As laid out in the proposal, the main objectives of this research are to test four hypotheses derived from ongoing source apportionment (SA)-based epidemiologic and air quality modeling studies:

1) A receptor-based approach, trained using an ensemble of model results (including receptor and emissions-based models), can be developed that neither introduces excessive day-to-day variability nor inhibits an appropriate level of it.

2) The method can be applied to long-term data sets for use in acute health effect studies.

3) The method can be used to temporally interpolate between observations (e.g., for data available every third day) and spatially interpolate between urban and rural monitors, and

4) Uncertainties can be propagated from SA model inputs to health analysis outputs, with outputs most sensitive to source profile inputs.

To test the hypotheses, a three-step chemical mass balance (CMB) approach has been developed for particulate matter (PM) source apportionment (SA) that utilizes an ensemble of both source and receptor-based approaches to train a CMB method for use in longer term applications. These three steps include:

1) Averaging SA results, using weights based on method uncertainty, from four receptor models and one chemical transport model, the community multiscale air quality (CMAQ) model, to develop ensemble-based source impacts.

2) Using the weighted source impacts (from Step 1) in an application of CMB with the Lipschitz Global Optimizer (CMB-LGO) to calculate nine ensemble-based source profiles (EBSPs): The source profiles developed include gasoline vehicles (GV), diesel vehicles (DV), dust (DUST), biomass burning (BURN), coal combustion (COAL), secondary organic carbon (SOC), SULFATE, NITRATE, and AMMONIUM.

3) Using the EBSPs on a longer term data set of observations to develop improved source impacts.
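
For reference, the standard chemical mass balance relation that these steps build on can be written as below. This is a sketch in notation introduced here for illustration (it is not taken from the report): c_ik is the measured concentration of species i on day k, f_ij is the mass fraction of species i in the profile of source j, S_jk is the impact of source j on day k, and e_ik is the residual. The second expression is the commonly used effective-variance form of the reduced chi-squared fit statistic, with I fitting species and J sources. In this notation, Step 2 treats S as known and solves for f, and Step 3 treats f as known and solves for S.

```latex
c_{ik} = \sum_{j=1}^{J} f_{ij}\, S_{jk} + e_{ik},
\qquad
\chi^2_k = \frac{1}{I - J} \sum_{i=1}^{I}
  \frac{\left( c_{ik} - \sum_{j=1}^{J} f_{ij} S_{jk} \right)^2}
       {\sigma_{c_{ik}}^2 + \sum_{j=1}^{J} \sigma_{f_{ij}}^2\, S_{jk}^2}
```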

Summary/Accomplishments (Outputs/Outcomes):

As detailed in the 2010, 2011 and 2012 progress reports, we have focused on using Step 1 of the ensemble method to gain new insights into the uncertainties of ensemble results and of source apportionment methods. Uncertainties in daily source impacts and overall method uncertainties are among the least well characterized aspects of source apportionment. Furthermore, each SA method typically estimates uncertainty with its own intrinsic approach, which makes inter-comparison of SA method uncertainties difficult. We published the work on ensemble averaging of four methods (CMB-LGO, PMF, CMB-MM and CMAQ) using the methods refined in 2011. Since most CMB analyses do not use CMB-LGO, we performed a sensitivity analysis using CMB-RG in lieu of CMB-LGO. We also validated the ensemble results by comparing them with SOC estimates from an independent method and determined that mixed weighting is the most appropriate way to conduct the ensemble. In 2012, we developed a Bayesian method of ensemble averaging. Our results show that Bayesian-based ensemble averaging yields biomass burning source impacts that correlate more strongly with levoglucosan, a tracer of biomass burning. This work was presented at the 2012 American Association for Aerosol Research conference. We are currently in the process of submitting this work to Environmental Science and Technology. It should also be noted that instead of using the Excel-based CMB-LGO, we developed a Matlab-based program that incorporates gas-based constraints; we refer to this method as CMB-GC (gas constraints). The input into the initial ensemble, however, still uses CMB-LGO results.

We have performed the ensemble method for July 2001, to represent summer, and January 2002, to represent winter, in a manner similar to Lee et al. (2009). Three features of this work distinguish it from that of Lee et al. (2009). First, we performed source apportionment using CMB-RG, CMB-LGO, and PMF using a data set for the Jefferson St. (JST) SEARCH site in Atlanta, GA from January 1, 1999 through December 31, 2004. Missing data were treated in the same manner as Marmur et al. (2005). We excluded several fitting species (Al, As, Ba, Sb, Sn, and Ti) because they were below the detection limit on the vast majority of days. Second, we focused on ensemble averaging with weights of the form 1/σ^N, where σ is the daily source impact uncertainty, using no weights (N=0) and weights based on the squared uncertainty (N=2); Lee et al. (2009) focused on weights of 1/σ. Finally, ensemble average uncertainties take into account the covariance of source impacts from the five SA methods.
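
The 1/σ^N weighting described above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the project's code: the function name `ensemble_average`, the array layout (methods by days) and the use of NaN for missing results are all choices made here for illustration, and the covariance-aware uncertainty of the ensemble average is not reproduced.

```python
import numpy as np

def ensemble_average(impacts, sigmas, power=2):
    """Weighted mean across SA methods with weights 1/sigma**power.

    impacts, sigmas: arrays of shape (n_methods, n_days) holding one source's
    daily impacts and their uncertainties; NaN marks a missing method result.
    power=0 gives the unweighted mean, power=2 inverse-variance weights.
    (Hypothetical helper, not from the report.)
    """
    impacts = np.asarray(impacts, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    weights = 1.0 / sigmas ** power
    # Methods with no result on a given day get zero weight.
    weights = np.where(np.isnan(impacts), 0.0, weights)
    filled = np.nan_to_num(impacts)
    return (weights * filled).sum(axis=0) / weights.sum(axis=0)

# Two methods, two days; the second method has no result on day 2.
impacts = [[2.0, 4.0], [4.0, np.nan]]
sigmas = [[1.0, 1.0], [2.0, 1.0]]
unweighted = ensemble_average(impacts, sigmas, power=0)
inverse_variance = ensemble_average(impacts, sigmas, power=2)
```

With N=2 the method with the smaller uncertainty dominates the average, whereas N=0 treats all available methods equally.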

In the Bayesian-based ensemble averaging method, a posterior distribution of uncertainties is determined using subjective prior information, with the root mean square error (RMSE) between each method and the ensemble average as the updating information. In the previous approach, the uncertainty of each method, the RMSE, was constant. In the Bayesian approach, we treat each method's uncertainty as itself having uncertainty. That is, the RMSE represents the average uncertainty, which can take on values from a distribution formulated in a Bayesian context. Using this approach, method uncertainties can be sampled L times from the formulated distribution. This gives L sets of uncertainties that can be used for L realizations of ensemble results for any given day. Thus, if there are K days of method results, there can be K*L ensemble results. There are two major consequences of this. First, ensemble results are more variable than ensemble averaging using the RMSEs, because the Bayesian formulation results in different weights for each ensemble average. Second, having K*L ensemble averages results in a distribution of K*L source profiles. For each day in the long-term time series, M source profiles can be chosen from the distribution of K*L source profiles, resulting in a distribution of M source impacts for each day. The Bayesian formulation of SA method uncertainties, with subsequent random sampling from distributions, obviates the need for propagation of errors in estimating uncertainties. Further, random sampling and multiple realizations of ensemble averages, source profiles and final source impacts result in distributions that automatically propagate uncertainties.
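
The sampling idea in this paragraph can be sketched as follows. The report does not state the form of the prior or posterior, so this example assumes, purely for illustration, zero-mean normal method errors with an inverse-gamma prior on each method's error variance; the function names and the hyperparameters `a0` and `b0` are hypothetical.

```python
import numpy as np

def sample_method_sigmas(residuals, n_draws, a0=1.0, b0=1.0, rng=None):
    """Draw n_draws uncertainties per method from an assumed posterior.

    residuals: (n_methods, n_days) differences between each method and the
    ensemble average. ASSUMPTION: normal errors with an inverse-gamma(a0, b0)
    prior on each method's variance; the report does not specify this form.
    """
    rng = np.random.default_rng(rng)
    residuals = np.asarray(residuals, dtype=float)
    n_days = residuals.shape[1]
    shape = a0 + n_days / 2.0
    rate = b0 + 0.5 * (residuals ** 2).sum(axis=1)          # (n_methods,)
    # If X ~ Gamma(shape, scale=1/rate), then 1/X ~ InvGamma(shape, rate).
    precision = rng.gamma(shape, 1.0 / rate, size=(n_draws, rate.size))
    return np.sqrt(1.0 / precision)                         # (n_draws, n_methods)

def bayesian_ensembles(impacts, sigma_draws):
    """Return (n_draws, n_days) ensemble averages, one row per sigma draw."""
    impacts = np.asarray(impacts, dtype=float)              # (n_methods, n_days)
    weights = 1.0 / sigma_draws ** 2                        # (n_draws, n_methods)
    return weights @ impacts / weights.sum(axis=1, keepdims=True)

# Synthetic example: 4 methods, K = 30 days, L = 50 draws.
rng = np.random.default_rng(0)
impacts = rng.uniform(0.5, 3.0, size=(4, 30))
residuals = impacts - impacts.mean(axis=0)
sigma_draws = sample_method_sigmas(residuals, n_draws=50, rng=1)
ensembles = bayesian_ensembles(impacts, sigma_draws)        # K*L values in total
```

Each row of `ensembles` uses a different set of weights, which is why the Bayesian results vary more than an average built from fixed RMSE weights.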

Both non-informative and informative priors were tested. For each day of the short-term application of the four SA methods, source impact uncertainties are sampled from the Bayesian-based posterior distribution. These uncertainties are used as weights to determine an ensemble average. A Monte Carlo technique is used to estimate a distribution of Bayesian ensemble-based source impacts for each day in the ensemble. These distributions of source impacts are then used to determine distributions of two seasonally based source profiles. For each day in a long-term PM2.5 data set, 10 source profiles are sampled from these distributions and used in a CMB application, resulting in 10 SA results for each day. This formulation results in a distribution of daily source impacts rather than a single value with an estimated uncertainty. The average and standard deviation of the distribution are used as the final estimate of source impact and uncertainty, respectively.
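
The final step, sampling source profiles and summarizing the resulting source impacts for one day, might look like the sketch below. It is a simplified stand-in rather than the CMB implementation used in the project: ordinary least squares with negative impacts clipped to zero replaces the constrained CMB solver, and the function names and array shapes are assumptions made here.

```python
import numpy as np

def cmb_solve(profile, conc):
    """Least-squares source impacts for one day, negatives clipped to zero.

    profile: (n_species, n_sources) mass fractions; conc: (n_species,).
    Crude stand-in for a constrained CMB solver (assumption).
    """
    impacts, *_ = np.linalg.lstsq(profile, conc, rcond=None)
    return np.clip(impacts, 0.0, None)

def daily_impact_distribution(profile_pool, conc, n_samples=10, rng=None):
    """Mean and standard deviation of impacts over randomly sampled profiles.

    profile_pool: (n_profiles, n_species, n_sources) candidate profiles.
    """
    rng = np.random.default_rng(rng)
    picks = rng.integers(0, len(profile_pool), size=n_samples)
    results = np.array([cmb_solve(profile_pool[i], conc) for i in picks])
    return results.mean(axis=0), results.std(axis=0, ddof=1)

# Synthetic example: 3 species, 2 sources, 200 slightly perturbed profiles.
rng = np.random.default_rng(0)
base = np.array([[0.5, 0.1], [0.2, 0.6], [0.3, 0.3]])
pool = base * rng.normal(1.0, 0.05, size=(200, 3, 2))
conc = base @ np.array([2.0, 3.0])
mean_impact, impact_sd = daily_impact_distribution(pool, conc, n_samples=10, rng=1)
```

The mean plays the role of the reported source impact and the standard deviation the role of its uncertainty, as described in the paragraph above.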

Conclusions:

Ensemble averaging reduces the number of zero impact days, provides results for every day of the data set, and reduces variability by averaging out excessively high and low source impact days. The Bayesian-based ensemble averages and their overall uncertainties are consistent with the ensemble averages found in Balachandran et al. [2012]. This is expected since the mean of the posterior distribution is approximately equal to the RMSE.

We used the ensemble results to determine new source profiles, one representative of summer and one representative of winter. To determine the source profiles, we ran CMB-LGO in "reverse", where the source impacts (i.e., the ensemble averages) were treated as the known quantity and the source profiles were treated as the unknown. We then ran CMB-LGO for a data set at JST from 8/31/98 through 12/31/07 using these new Bayesian-based source profiles (BBSPs) and using measurement-based source profiles (MBSPs) (Marmur et al. 2005). The long-term source impacts obtained with these two sets of source profiles are highly correlated for all sources except biomass burning, coal combustion and, to a lesser extent, SOC. Using BBSPs resulted in similar values of the chi-squared statistic. Zero impact days for diesel vehicles were also reduced using BBSPs. However, zero impact days for SOC increased using EBSPs, although the majority of these zero impact days were in winter (October through March). The Bayesian-based biomass burning source impacts using profiles derived from non-informative priors correlated better with observed levoglucosan (R2=0.66) and water-soluble potassium (R2=0.63) than source impacts estimated using measurement-based source profiles (R2=0.21 and 0.5, respectively) and positive matrix factorization (R2=0.016 and 0.26, respectively). The Bayesian approach led to closer agreement with total mass (predicted to observed PM2.5 ratio of 0.93) than other methods. The Bayesian approach also corrects for expected seasonal variation of biomass burning and secondary impacts.
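
Running the mass balance in "reverse" can be illustrated with a small sketch in which the source impacts are known and the profiles are the unknowns. This is not CMB-LGO, which uses a global optimizer: plain least squares with negatives clipped to zero is substituted here, and the function names and array layout are assumptions. A helper for the squared correlation used in the tracer comparisons is included.

```python
import numpy as np

def fit_source_profiles(impacts, conc):
    """Estimate profiles F from C ~= S @ F by least squares.

    impacts S: (n_days, n_sources) ensemble-averaged source impacts (known).
    conc C: (n_days, n_species) measured species concentrations.
    Returns F with shape (n_sources, n_species), negatives clipped to zero.
    Least squares is a stand-in for the CMB-LGO optimizer (assumption).
    """
    s = np.asarray(impacts, dtype=float)
    c = np.asarray(conc, dtype=float)
    profiles, *_ = np.linalg.lstsq(s, c, rcond=None)
    return np.clip(profiles, 0.0, None)

def r_squared(x, y):
    """Squared Pearson correlation between two series."""
    return np.corrcoef(x, y)[0, 1] ** 2

# Synthetic check: recover known profiles from noise-free data.
rng = np.random.default_rng(0)
true_profiles = rng.uniform(0.05, 0.5, size=(3, 4))   # 3 sources, 4 species
impacts = rng.uniform(0.5, 5.0, size=(20, 3))         # 20 days
conc = impacts @ true_profiles
fitted = fit_source_profiles(impacts, conc)
```

With real data, `r_squared` could be applied to a fitted biomass burning impact series and a measured tracer such as levoglucosan to reproduce the kind of comparison reported above.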

We have also applied the ensemble method to data sets from monitoring stations located in Yorkville, GA and South DeKalb (Atlanta, GA) to assess regional differences and demonstrate applicability to other locations. We have compared SA results between the three sites to explore variability using time-series analysis. We have implemented the periodogram method developed by Lomb and Scargle, which can be applied to data sets with missing data. Because traditional Fourier methods require a continuous data set, this has obviated the need for using interpolated data.
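
A minimal sketch of the classical Lomb-Scargle periodogram shows how a periodic signal can be recovered from unevenly sampled data without interpolation. The implementation below is written for illustration and is not the project's code; the function name, the synthetic weekly signal and the frequency grid are assumptions.

```python
import numpy as np

def lomb_scargle(times, values, freqs):
    """Classical Lomb-Scargle power at each frequency (cycles per time unit).

    times need not be evenly spaced; values are mean-centered internally.
    """
    t = np.asarray(times, dtype=float)
    y = np.asarray(values, dtype=float)
    y = y - y.mean()
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # The time offset tau makes the sine and cosine terms orthogonal.
        tau = np.arctan2(np.sin(2 * w * t).sum(), np.cos(2 * w * t).sum()) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# Synthetic example: a weekly cycle observed on 120 randomly chosen days of a year.
rng = np.random.default_rng(0)
days = np.sort(rng.choice(365, size=120, replace=False))
signal = np.sin(2.0 * np.pi * days / 7.0)
freqs = np.linspace(0.02, 0.30, 281)          # cycles per day
power = lomb_scargle(days, signal, freqs)
dominant_period = 1.0 / freqs[np.argmax(power)]
```

The dominant period is read from the frequency with the highest power; gaps in the sampling days are handled directly by the sums over the available observations.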

The ensemble method has been applied to PM2.5 data sets in St. Louis, MO, Dallas, TX and Birmingham, AL. Our health research partners at Emory University have started using these source apportionment results in their health models. 


Journal Articles on this Report: 18 Displayed

  • Balachandran S, Pachon JE, Hu Y, Lee D, Mulholland JA, Russell AG. Ensemble-trained source apportionment of fine particulate matter and method uncertainty analysis. Atmospheric Environment 2012;61:387-394.
  • Balachandran S, Chang HH, Pachon JE, Holmes HA, Mulholland JA, Russell AG. Bayesian-based ensemble source apportionment of PM2.5. Environmental Science & Technology 2013;47(23):13511-13518.
  • Balachandran S, Pachon JE, Lee S, Oakes MM, Rastogi N, Shi W, Tagaris E, Yan B, Davis A, Zhang X, Weber RJ, Mulholland JA, Bergin MH, Zheng M, Russell AG. Particulate and gas sampling of prescribed fires in South Georgia, USA. Atmospheric Environment 2013;81:125-135.
  • Capps SL, Henze DK, Hakami A, Russell AG, Nenes A. ANISORROPIA: the adjoint of the aerosol thermodynamic model ISORROPIA. Atmospheric Chemistry and Physics 2012;12(1):527-543.
  • Gass K, Balachandran S, Chang HH, Russell AG, Strickland MJ. Ensemble-based source apportionment of fine particulate matter and emergency department visits for pediatric asthma. American Journal of Epidemiology 2015;181(7):504-512.
  • Goldman GT, Mulholland JA, Russell AG, Srivastava A, Strickland MJ, Klein M, Waller LA, Tolbert PE, Edgerton ES. Ambient air pollutant measurement error: characterization and impacts in a time-series epidemiologic study in Atlanta. Environmental Science & Technology 2010;44(19):7692-7698.
  • Goldman GT, Mulholland JA, Russell AG, Strickland MJ, Klein M, Waller LA, Tolbert PE. Impact of exposure measurement error in air pollution epidemiology: effect of error type in time-series studies. Environmental Health 2011;10:61 (11 pp.).
  • Goldman GT, Mulholland JA, Russell AG, Gass K, Strickland MJ, Tolbert PE. Characterization of ambient air pollution measurement error in a time-series health study using a geostatistical simulation approach. Atmospheric Environment 2012;57:101-108.
  • Hu Y, Odman MT, Chang ME, Russell AG. Operational forecasting of source impacts for dynamic air quality management. Atmospheric Environment 2015;116:320-322.
  • Ivey C, Holmes H, Shi G, Balachandran S, Hu Y, Russell AG. Development of PM2.5 source profiles using a hybrid chemical transport-receptor modeling approach. Environmental Science & Technology 2017;51(23):13788-13796.
  • Krall JR, Mulholland JA, Russell AG, Balachandran S, Winquist A, Tolbert PE, Waller LA, Sarnat SE. Associations between source-specific fine particulate matter and emergency department visits for respiratory disease in four U.S. cities. Environmental Health Perspectives 2017;125(1):97-103.
  • Lee D, Balachandran S, Pachon J, Shankaran R, Lee S, Mulholland JA, Russell AG. Ensemble-trained PM2.5 source apportionment approach for health studies. Environmental Science & Technology 2009;43(18):7023-7031.
  • Lv B, Hu Y, Chang HH, Russell AG, Bai Y. Improving the accuracy of daily PM2.5 distributions derived from the fusion of ground-level measurements with aerosol optical depth observations, a case study in North China. Environmental Science & Technology 2016;50(9):4752-4759.
  • Maier ML, Balachandran S, Sarnat SE, Turner JR, Mulholland JA, Russell AG. Application of an ensemble-trained source apportionment approach at a site impacted by multiple point sources. Environmental Science & Technology 2013;47(8):3743-3751.
  • Pachon JE, Balachandran S, Hu Y, Weber RJ, Mulholland JA, Russell AG. Comparison of SOC estimates and uncertainties from aerosol chemical composition and gas phase data in Atlanta. Atmospheric Environment 2010;44(32):3907-3914.
  • Pachon JE, Balachandran S, Hu Y, Mulholland JA, Darrow LA, Sarnat JA, Tolbert PE, Russell AG. Development of outcome-based, multipollutant mobile source indicators. Journal of the Air & Waste Management Association 2012;62(4):431-442.
  • Pachon JE, Weber RJ, Zhang X, Mulholland JA, Russell AG. Revising the use of potassium (K) in the source apportionment of PM2.5. Atmospheric Pollution Research 2013;4(1):14-21.
  • Redman JD, Holmes HA, Balachandran S, Maier ML, Zhai X, Ivey C, Digby K, Mulholland JA, Russell AG. Development and evaluation of a daily temporal interpolation model for fine particulate matter species concentrations and source apportionment. Atmospheric Environment 2016;140:529-538.

Supplemental Keywords:

ensemble, ensemble-trained CMB, source apportionment, health study
