Grantee Research Project Results
2006 Progress Report: Addressing Temporal Correlation, Incomplete Source Profile Information, and Varying Source Profiles in the Source Apportionment of Particulate Matter
EPA Grant Number: R832160
Title: Addressing Temporal Correlation, Incomplete Source Profile Information, and Varying Source Profiles in the Source Apportionment of Particulate Matter
Investigators: Christensen, William F., Reese, C. Shane
Institution: Brigham Young University
EPA Project Officer: Chung, Serena
Project Period: December 1, 2004 through November 30, 2007
Project Period Covered by this Report: December 1, 2005 through November 30, 2006
Project Amount: $238,721
RFA: Source Apportionment of Particulate Matter (2004)
Research Category: Air Quality and Air Toxics, Particulate Matter, Air
Objective:
Most pollution source apportionment studies use ambient measurements that are gathered consecutively in time, so the measurements are temporally correlated. Nevertheless, most source apportionment (SA) approaches neither account for the impact of this correlation on statistical estimation and inference nor exploit the additional information available in correlated data. Additional complications in SA studies occur when only partial source profile information is available, and when the source profiles evolve or vary over the measurement period. The proposed research has three objectives in addressing these issues.
- Address both the challenges and advantages presented by temporally correlated ambient data, and address the opportunity for improved source contribution estimates when the temporal resolution of ambient measures is improved.
- Develop the iterated confirmatory factor analysis (ICFA) approach, which can utilize partial source profile information and take on aspects of CMB analysis, confirmatory factor analysis (CFA), and exploratory factor analysis (EFA) by assigning varying degrees of constraint to each element of the estimated source profile matrix during the estimation process.
- Develop a Bayesian hierarchical model for source apportionment, and present an approach for evaluating not only the change in source contributions over time, but also the change in source profiles.
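To illustrate why temporal correlation matters for inference (the concern behind the first objective), the following is a minimal sketch, not part of the project's methodology. It assumes a first-order autoregressive (AR(1)) series and uses the standard large-sample approximation for the effective sample size of a sample mean. The function names are hypothetical.

```python
import numpy as np

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    return float(np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered))

def effective_sample_size_ar1(n, rho):
    """Approximate effective sample size for the mean of an AR(1) series.

    Assumes the large-sample result Var(mean) ~ (sigma^2 / n) * (1 + rho) / (1 - rho).
    """
    return n * (1.0 - rho) / (1.0 + rho)

# Hypothetical example: 100 consecutive measurements with lag-1 correlation 0.5
# carry roughly as much information about the mean as 33 independent ones.
n_eff = effective_sample_size_ar1(100, 0.5)
```

Under this assumption, treating positively correlated consecutive measurements as independent would understate the uncertainty of estimated means.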
Progress Summary:
Progress this year has occurred in three main areas of development: Bayesian multivariate receptor modeling, pollution source identification tools, and optimal utilization of source information.
Bayesian Multivariate Receptor Modeling
In the first area of development, we have continued developing a Bayesian approach for pollution source apportionment. In Lingwall, Christensen, and Reese (2007), we propose a simple, fully Bayesian approach for multivariate receptor modeling that allows for flexible and consistent incorporation of a priori information. The model uses a generalization of the Dirichlet distribution as the prior distribution on source profiles, which allows great flexibility in the specification of prior information. Heavy-tailed lognormal distributions are used as priors on source contributions to match the nature of particulate concentrations. A simulation study based on the Washington, DC airshed shows that the model compares favorably to Positive Matrix Factorization (PMF), a standard analysis approach used for pollution source apportionment. A significant advantage of the proposed approach over most commonly used methods is that the Bayesian framework yields complete distributional results for each parameter of interest (including distributions for each element of the source profile and source contribution matrices). These distributions offer a great deal of power and versatility when addressing complex questions of interest to the researcher.
The proposed Bayesian approach provides a useful alternative to other methods used in multivariate receptor modeling. The fully Bayesian approach is attractive because it easily incorporates a wide range of a priori information into analysis and gives full distributional results rather than just point estimates for source profiles and contributions. The novel use of heavy-tailed lognormal distributions for the source contributions and for the distribution of the particulate measurements is scientifically satisfying. The use of a Generalized Dirichlet distribution for source profiles allows for great flexibility in multivariate specification of prior information about emission sources while constraining the solution to be physically meaningful.
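The generative structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: a plain Dirichlet stands in for the Generalized Dirichlet prior, the hyperparameter values are arbitrary, the measurement error is taken to be multiplicative lognormal, and the function name `simulate_receptor_data` is hypothetical. It is not the model specification from Lingwall, Christensen, and Reese (2007).

```python
import numpy as np

def simulate_receptor_data(n_days=100, n_species=8, n_sources=3, seed=0):
    """Simulate ambient data as contributions @ profiles with lognormal error."""
    rng = np.random.default_rng(seed)
    # Source profiles: each row is a composition over species and sums to 1.
    # ASSUMPTION: a plain Dirichlet replaces the generalized Dirichlet prior.
    profiles = rng.dirichlet(np.ones(n_species), size=n_sources)
    # Source contributions: lognormal, so they are positive and right-skewed.
    contributions = rng.lognormal(mean=0.0, sigma=1.0, size=(n_days, n_sources))
    # ASSUMPTION: multiplicative lognormal error keeps measurements positive.
    error = rng.lognormal(mean=0.0, sigma=0.1, size=(n_days, n_species))
    ambient = (contributions @ profiles) * error
    return ambient, contributions, profiles

ambient, contributions, profiles = simulate_receptor_data()
```

Because each profile row sums to one and every quantity is positive, any simulated solution is physically interpretable, which is the constraint the Dirichlet-type prior is meant to enforce.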
The Bayesian approach allows us to consistently incorporate the a priori information into an analysis rather than adjusting results after a model has been fit or introducing target transformations a posteriori. In simulation, the approach has been found to compete favorably with PMF. The full distributional results obtained from the Bayesian approach give the researcher a great deal of flexibility in addressing questions associated with potentially complex functions of estimated parameters. For example, Figure 1 shows the complete distribution for each day’s estimate of the secondary sulfate source. We can also answer complex questions of interest that are not easily addressed in a traditional pollution source apportionment (PSA) framework. For example, one might be interested in the number of exceedance days for a specific source. Reducing the exceedance days for auto/diesel emissions may be a sub-goal related to the larger aim of reducing the number of PM2.5 threshold exceedance days. Let κ be the number of days (out of the total of 100) on which the auto/diesel source exceeds 10 μg/m3. Figure 2 gives the probability distribution for κ given the data. If we consider the posterior median as a point estimate for the auto/diesel source contribution, only three of the 100 study days have point estimates in excess of 10 μg/m3. But Figure 2 gives a more complete understanding of this variable. For example, the expected number of auto/diesel exceedance days is roughly 3.3, and the probability that the number of auto/diesel exceedance days exceeds 4 is roughly 16%.
Figure 1. Posterior Distributions for Daily Secondary Sulfate Formation. Posterior medians are shown in black.
Figure 2. Probability Distribution for the Number of Days (Out of the Total of 100) in Which the Auto/Diesel Source Exceeds 10 μg/m3.
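A distribution such as the one in Figure 2 can be approximated directly from posterior draws of the daily source contributions. The following is a minimal sketch, assuming the draws are stored as an array with one row per posterior draw and one column per day. The function name and the placeholder draws are hypothetical and do not reproduce the study's results.

```python
import numpy as np

def exceedance_count_distribution(draws, threshold):
    """Approximate P(kappa = k | data) for k = 0..n_days.

    draws: array of shape (n_draws, n_days) holding posterior samples of one
    source's daily contribution.
    """
    draws = np.asarray(draws, dtype=float)
    n_draws, n_days = draws.shape
    # For each posterior draw, count the days above the threshold.
    counts = (draws > threshold).sum(axis=1)
    # The relative frequency of each count approximates the posterior of kappa.
    return np.bincount(counts, minlength=n_days + 1) / n_draws

# Placeholder draws standing in for real MCMC output (NOT the study's data).
rng = np.random.default_rng(1)
fake_draws = rng.lognormal(mean=1.0, sigma=0.7, size=(4000, 100))
pmf = exceedance_count_distribution(fake_draws, threshold=10.0)

expected_days = float(np.sum(np.arange(pmf.size) * pmf))  # posterior mean of kappa
prob_more_than_4 = float(pmf[5:].sum())                   # P(kappa > 4 | data)
```

Because κ is a nonlinear function of the daily contributions, its posterior mean need not equal the number of days whose point estimates exceed the threshold, which is consistent with the contrast between 3 and roughly 3.3 reported above.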
Source Identification Tools
An important precursor to conducting a pollution source apportionment analysis is the identification of potential sources. Traditional approaches for synthesizing meteorological data with ambient pollutant measurements are being reconsidered and expanded. These include the “weighted rose,” diagnostic techniques based on dispersion modeling software, and estimation of point source directions via Bayesian regression. For the latter approach, we are exploring the use of Reversible Jump Markov Chain Monte Carlo (RJMCMC) methods to identify the number of important point sources affecting levels of any pollutant. These locations can then be used to form prior distributions for point sources as described in the section on Bayesian multivariate receptor modeling. Figure 3 illustrates a 95% credible interval for the RJMCMC estimate of the predominant source direction associated with zinc at the St. Louis Supersite. In the plot, zinc concentration is plotted against wind direction. Vertical green lines indicate the 95% credible interval for the MCMC estimate of the predominant source direction. The vertical blue line represents the direction of the largest zinc source according to the Toxic Release Inventory (a local zinc smelter).
Figure 3. Zinc Concentration Plotted Against Wind Direction. Vertical green lines indicate the 95% credible interval for the MCMC estimate of the predominant source direction. Vertical blue line represents the direction of the largest zinc source according to the Toxic Release Inventory (a local zinc smelter).
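As a simple point of reference for direction-based diagnostics of the kind mentioned above, the sketch below computes the mean pollutant concentration within each wind-direction sector. This is one plain reading of a concentration-weighted rose, offered as an assumption rather than the project's definition. It is not the Bayesian regression or RJMCMC approach, and the function name is hypothetical.

```python
import numpy as np

def sector_mean_concentration(wind_dir_deg, concentration, n_sectors=16):
    """Mean concentration per wind-direction sector (NaN where no data)."""
    wind_dir_deg = np.asarray(wind_dir_deg, dtype=float) % 360.0
    concentration = np.asarray(concentration, dtype=float)
    width = 360.0 / n_sectors
    # Sector 0 covers [0, width), sector 1 covers [width, 2 * width), and so on.
    sector = np.floor(wind_dir_deg / width).astype(int) % n_sectors
    totals = np.bincount(sector, weights=concentration, minlength=n_sectors)
    counts = np.bincount(sector, minlength=n_sectors)
    means = np.full(n_sectors, np.nan)
    has_data = counts > 0
    means[has_data] = totals[has_data] / counts[has_data]
    return means

# Hypothetical readings: directions in degrees, concentrations in arbitrary units.
means = sector_mean_concentration([10, 20, 100, 350], [1.0, 3.0, 5.0, 7.0], n_sectors=4)
```

A sector with a markedly higher mean than its neighbors points toward a candidate source direction, which could then be compared against inventory data such as the Toxic Release Inventory.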
Another source of information about pollution sources can be found in particle size distributions. Dillner, Schauer, Christensen, and Cass (2005) use cluster analysis to group particle size distribution vectors and then use these clusters to identify pollution sources. In work begun during 2005, we extend the work of Dillner et al. (2005) to incorporate profile uncertainty vectors in a cluster analysis. This approach (Christensen, Dillner, Schauer, and Reese, 2007) was refined in 2006, submitted, and recently accepted for publication.
Optimal Utilization of Source Information
Finally, we have continued to investigate the optimal utilization of a priori information in PMF. A manuscript currently under review evaluates approaches for using a priori information about pollution sources and illustrates the use of source profile targeting in an exploratory analysis of the St. Louis Supersite data.
Future Activities:
During the final year of the project, we will continue to focus on the Bayesian hierarchical model as our ultimate goal. Our primary near-term aim is to begin incorporating flexibility in the nature of the a priori pollution source profile information, including partially or completely unspecified profiles. Additional issues we plan to address in the next year include the following.
- Temporally varying profiles. This is currently under development. In the coming months, we will apply these approaches to the St. Louis Supersite data.
- Informative prior distributions on source contribution processes. Thus far, we have assumed little about the nature of the source contributions. However, for many pollution sources, the general shape of the contribution process is known. For example, secondary sulfate processes peak in the summertime, and diesel contributions drop more precipitously on weekends than do gasoline vehicle contributions. Further, using AERMOD and Bayesian regression, we have been able to formulate the beginnings of a system for constructing prior distributions based on meteorological data. We intend to use this information to improve the Bayesian estimation of contributions and profiles. To date, researchers have predominantly used such knowledge to identify sources in only a post hoc fashion, but such information can be directly incorporated into the PSA.
Journal Articles on this Report: 1 Displayed
Christensen WF, Schauer JJ, Lingwall JW. Iterated confirmatory factor analysis for pollution source apportionment. Environmetrics 2006;17(6):663-681.
Supplemental Keywords:
receptor model, chemical mass balance model, Bayesian analysis, statistics, modeling, decision making, air quality models, Air, Ecosystem Protection/Environmental Exposure & Risk, RFA, Scientific Discipline, Air Quality, Atmospheric Sciences, Environmental Chemistry, Environmental Engineering, Environmental Monitoring, Monitoring/Modeling, particulate matter, Bayesian hierarchical model, aerosol analyzers, air quality model, air sampling, airborne particulate matter, analytical chemistry, area of influence analysis, atmospheric chemistry, atmospheric dispersion models, atmospheric measurements, chemical characteristics, chemical speciation sampling, emissions monitoring, environmental measurement, iterated confirmatory factor analysis, model-based analysis, modeling studies, particle size measurement, particulate matter mass, particulate organic carbon, real-time monitoring, source apportionment, source receptor based methods, speciation
Progress and Final Reports:
Original Abstract
The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.