2004 Progress Report: Statistical Approaches to Detection and Downscaling of Climate Variability and Change

EPA Grant Number: R829402C006
Subproject: this is subproject number 006, established and managed by the Center Director under grant R829402
(EPA does not fund or establish subprojects; EPA awards and manages the overall grant for this center).

Center: Center for Integrating Statistical and Environmental Science
Center Director: Stein, Michael
Title: Statistical Approaches to Detection and Downscaling of Climate Variability and Change
Investigators: Wuebbles, Donald J., Hayhoe, Katharine, Stein, Michael, Tiao, George
Current Investigators: Wuebbles, Donald J., Cai, Airong, Hayhoe, Katharine, Hertel, Anne, Stein, Michael, Tiao, George, Vrac, Mathieu
Institution: University of Illinois at Urbana-Champaign, University of Chicago
Current Institution: University of Illinois at Urbana-Champaign, Texas Tech University, University of Chicago
EPA Project Officer: Packard, Benjamin H
Project Period: March 12, 2002 through March 11, 2007
Project Period Covered by this Report: March 12, 2003 through March 11, 2004
RFA: Environmental Statistics Center (2001)
Research Category: Ecological Indicators/Assessment/Restoration, Environmental Statistics, Health, Ecosystems


Climate change is a global problem whose impacts will be seen and felt most strongly at the regional and local level. To understand the regional impacts of climate change, we must understand how global climate change has already interacted, and is likely to continue to interact, with the current environment and local conditions. This project therefore has a dual focus: assessing the degree to which climate change has already affected the Midwest, and exploring novel statistical approaches to downscaling future global-scale projections of climate change to the regional scale.

Progress Summary:

Statistical Approaches to Data Assimilation and Downscaling

At the global scale, the Intergovernmental Panel on Climate Change has concluded that the world is warming, and that “the balance of evidence suggests a discernible human influence on global climate.” However, it is not clear whether the same statement can also be made for climate at the regional level, and in the Midwest in particular. Comparisons among historical observed climate records; an ensemble of historical simulated model runs, including natural variability and greenhouse forcing; and future projections of change corresponding to a range of plausible socio-economic scenarios are being used to identify correlations between historical and model data. This allows us to assess the degree to which the climate signal is currently detectable in the Midwest region and to evaluate the potential for large-scale climate simulations to improve on short-term climate forecasts on the scale of years to decades. Multiple time series modeling, trend analysis, and prediction become increasingly complex as the number of variables increases. They require interplay between scientific considerations (the choice of variables and the dynamic relations among the included variables) and statistical analyses using existing and newly developed methods.

Clustering Methods for Downscaling Climate Change

The second part of this project has been revised to focus specifically on clustering methods for downscaling global-scale climate projections. Temperature and large-scale precipitation over the United States are generally driven by atmospheric circulation patterns. Consistency between observed and model-simulated large-scale circulation suggests that a statistical weather typing approach to classifying projected changes in these patterns may provide useful information regarding projected changes in surface temperature and rainfall patterns. A statistical mixture model is being applied to cluster vertical atmospheric profiles into air mass types, using both cumulative distribution functions and numerical data as input and employing copula functions to model joint distributions and dependencies between variables. The goal of this project is to produce fine-scale projections of future climate for the U.S. Midwest region in order to assess the degree to which local and regional-scale climate, circulation, and topography affect climate impacts throughout the region.

Results to Date

Statistical Approaches to Data Assimilation and Downscaling. Statistical relationships between monthly temperature records over the last 100 years for a station in Illinois (Aurora) and model simulations were evaluated based on an ensemble of historical simulations from the Parallel Climate Model (PCM). The traditional approach of using Empirical Orthogonal Functions (EOF) to determine linear combinations of gridded values that explain most of the variability in the PCM yields explanatory variables that are linked to atmospheric patterns. Coefficients were estimated over the period 1900–1990, and predictions based on these coefficients were then compared with observations over the period 1991–2000.
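The fit-then-verify procedure described above can be illustrated with a minimal sketch. It is not the analysis code used in this project: the function name `eof_regression`, the number of retained modes, and the synthetic data are all assumptions made for illustration, and the EOFs are obtained from a plain singular value decomposition of the training-period anomalies.

```python
import numpy as np

def eof_regression(model_fields, station_obs, n_train, n_modes=3):
    """Regress a station series on the leading EOF loadings of gridded model output.

    model_fields: (n_times, n_gridpoints) array of model anomalies.
    station_obs:  (n_times,) array of observed station anomalies.
    n_train:      number of leading time steps used to estimate coefficients.
    """
    # Center on the training-period mean so no held-out data leaks into the fit.
    mean_field = model_fields[:n_train].mean(axis=0)
    anomalies = model_fields - mean_field

    # EOFs are the right singular vectors of the training anomaly matrix.
    _, _, vt = np.linalg.svd(anomalies[:n_train], full_matrices=False)
    eofs = vt[:n_modes]                  # (n_modes, n_gridpoints)
    loadings = anomalies @ eofs.T        # loadings for every time step

    # Ordinary least squares with an intercept, fitted on the training period only.
    design = np.column_stack([np.ones(len(loadings)), loadings])
    coef, *_ = np.linalg.lstsq(design[:n_train], station_obs[:n_train], rcond=None)

    # Predict the held-out period and score it against the observations.
    predicted = design[n_train:] @ coef
    rmse = float(np.sqrt(np.mean((predicted - station_obs[n_train:]) ** 2)))
    return coef, predicted, rmse

# Synthetic stand-in data (hypothetical): 101 years of monthly values, 20 grid points.
rng = np.random.default_rng(0)
n_times, n_grid = 1212, 20
pattern = rng.normal(size=n_grid)
signal = rng.normal(size=n_times)
fields = np.outer(signal, pattern) + 0.1 * rng.normal(size=(n_times, n_grid))
obs = 2.0 * signal + 0.1 * rng.normal(size=n_times)

# 91 training years (1092 months) and 10 verification years (120 months).
coef, predicted, rmse = eof_regression(fields, obs, n_train=1092, n_modes=2)
```

Estimating the EOFs and the regression coefficients from the training months alone keeps the verification decade fully independent of the fit, which mirrors the 1900–1990 and 1991–2000 split described above.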

Initial multiple linear regressions of de-seasonalized monthly temperature anomalies reveal little correlation between model projections and observed temperatures because the signal-to-noise ratio of climate warming relative to natural variability is low (Figure 1a). However, as the source of much of the noise in the climate record is not random but rather due to a combination of explicit physical processes, we can use our understanding of the climate system to remove key components of the noise. The first step is to include multiple model simulations, which has the effect of reducing the mean squared error (MSE) and F value by a factor of 4 (Figure 1b; note the change in value on the x-axis). Averaging over 1 year reduces root mean square error (RMSE) by a factor of 3.5, as a trend toward increasing temperatures begins to become evident (Figure 1c).

Figure 1. Correlation of Model and Observed Temperatures for the Period 1900–2000 for (a) Monthly Values for One Model Simulation, (b) Four-Member Ensemble Average Monthly Values, and (c) Four-Member Ensemble Averaged Over 1 Year.
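As a hedged aside on why averaging helps, consider an idealization that the report itself does not state: each simulated anomaly is a common signal plus independent noise of equal variance. The symbols below (noise terms \varepsilon_i, noise standard deviation \sigma, ensemble size n, and averaging length m in months) are our own notation.

```latex
\[
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i\right) = \frac{\sigma^{2}}{n},
\qquad
\text{so the noise RMSE scales as } \frac{\sigma}{\sqrt{n}} .
\]
\[
n = 4:\ \text{noise variance reduced by a factor of } 4;
\qquad
m = 12:\ \text{noise RMSE reduced by a factor of } \sqrt{12} \approx 3.46 .
\]
```

Under these assumptions the factors are of the same order as those reported for the four-member ensemble (MSE) and for annual averaging (RMSE). Real monthly anomalies are autocorrelated and ensemble members are not perfectly independent, so the agreement should be read as indicative only.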

Physical understanding of the teleconnections that are known to affect climate over the Midwest non-linearly, and that are currently mismatched between historical observed and simulated data, can be used to further reduce the noise and improve inter-simulation as well as model-observation correlations. Using the El Niño-Southern Oscillation (ENSO) 1.2 and 3.4 indices, we identified El Niño, Neutral, and La Niña years in the historical record and in four ensemble simulations. Taking into account these teleconnections reduced RMSE, improving the predictive power of the technique. Three predictor variables have been chosen for the regressions (surface temperature, 850 mb temperature, and 850 mb geopotential height), and efforts are currently underway to identify the best combination of principal components and direct regression on temperature.
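The classification of years by ENSO phase can be sketched as follows. This is an illustration only: the report does not state the threshold or averaging convention that was used, so the 0.5 cutoff, the function names, and the index and temperature values are all hypothetical.

```python
from statistics import mean

def classify_enso(index_by_year, threshold=0.5):
    """Label each year by ENSO phase from an annual index anomaly.

    The threshold is an assumed value, not one taken from the report.
    """
    labels = {}
    for year, value in index_by_year.items():
        if value >= threshold:
            labels[year] = "El Niño"
        elif value <= -threshold:
            labels[year] = "La Niña"
        else:
            labels[year] = "Neutral"
    return labels

def phase_means(values_by_year, labels):
    """Average a yearly quantity separately within each ENSO phase."""
    groups = {}
    for year, label in labels.items():
        groups.setdefault(label, []).append(values_by_year[year])
    return {label: mean(vals) for label, vals in groups.items()}

# Hypothetical index anomalies and temperature anomalies, for illustration only.
index = {1997: 1.8, 1998: -0.9, 1999: -1.2, 2000: 0.1}
temps = {1997: 1.0, 1998: -1.0, 1999: -3.0, 2000: 0.5}

labels = classify_enso(index)
means = phase_means(temps, labels)

# Removing the phase mean leaves the part of each anomaly not tied to ENSO phase.
residuals = {year: temps[year] - means[labels[year]] for year in temps}
```

Applying the same labeling to observations and to each ensemble member allows like phases to be compared with like, which is one simple way to account for the mismatch in ENSO timing between observed and simulated records.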

Clustering Methods for Downscaling Climate Change. One of the new postdoctoral researchers, Mathieu Vrac, has begun discussions with Michael Stein on how clustering methods could be used in statistical downscaling of climate change. We intend to incorporate this work with the previously proposed work on the use of regional simulations in downscaling, but we will also consider approaches that do not require regional simulations. This work has just begun, so there are no significant results to date. However, we have begun working out the details of methods for downscaling based on clustering of weather regimes, and we are currently testing parts of the procedure.


Cook BI, Mann ME, D'Odorico P, Smith TM. Statistical simulation of the influence of the NAO on European winter surface temperatures: applications to phenological modeling. Journal of Geophysical Research 2004;109(D16):D16106, doi:10.1029/2003JD004305.

Covey C, AchutaRao KM, Cubasch U, Jones P, Lambert ST, Mann ME, Phillips TJ, Taylor KE. An overview of results from the Coupled Model Intercomparison Project (CMIP). Global and Planetary Change 2003;37:103-133.

Dai A, Meehl GA, Washington WM, Wigley TML, Arblaster JM. Ensemble simulation of Twenty-First century climate changes: Business-as-usual versus CO2 stabilization. Bulletin of the American Meteorological Society 2001;82:2377-2388.

Delworth TL, Dixon KW. Implications of the recent trend in the Arctic/North Atlantic Oscillation for the North Atlantic Thermohaline Circulation. Journal of Climate 2000;13(21):3721-3727.

Meehl GA, Tebaldi C. More intense, more frequent, and longer lasting heat waves in the 21st Century. Science 2004;305:994-997.

Meehl GA, Arblaster JM, Strand WG. Sea-ice effects on climate model sensitivity and low frequency variability. Climate Dynamics 2000;16:257-271.

Meehl GA, Gent PR, Arblaster JM, Otto-Bliesner BL, Brady EC, Craig A. Factors that affect the amplitude of El Niño in global coupled models. Climate Dynamics 2001;17:515-526.

Meehl GA, Washington WM, Wigley TML, Arblaster JM, Dai A. Solar and greenhouse gas forcing and climate response in the twentieth century. Journal of Climate 2003;16:426-444.

Trenberth KE, Hoar TJ. El Niño and climate change. Geophysical Research Letters 1997;24(23):3057-3060.

Washington WM, Weatherly JW, Meehl GA, Semtner AJ, Bettge TW, Craig AP, Strand WG, Arblaster J, Wayland VB, James R, Zhang Y. Parallel climate model (PCM) control and transient simulations. Climate Dynamics 2000;16:755-774.

Future Activities:

Statistical Approaches to Data Assimilation and Downscaling

During the next year, we plan to incorporate additional observations of soil moisture, humidity and cloudiness as well as atmospheric circulation indices corresponding to the ENSO, Arctic Oscillation (AO), and its close counterpart the North Atlantic Oscillation (NAO) to assess the degree to which large-scale climate patterns and localized climate feedbacks may reveal and/or mask the climate change signal in the Midwest.

This is of interest because an upward trend has been detected in the AO (Delworth and Dixon, 2000) and a significant shift has been detected in the mean ENSO indices (Trenberth and Hoar, 1997). Cook et al. (2004) recently assessed the statistical link between NAO indices and temperature through a classification technique very similar to our approach for ENSO. However, this simple simulation does not take into account the complex interactions captured by global model-derived indices or the interactions between multiple oscillations. For this reason, we plan to improve on this method and to apply it to ENSO patterns in order to assess the impact of large-scale atmospheric features on surface conditions, an impact that has already been demonstrated for heat waves (Meehl and Tebaldi, 2004).

We also plan to expand the spatial area covered by the analysis. A sophisticated gridding program that averages randomly-spaced geographic locations into a uniform grid, similar to model output, has been used to produce gridded historical observed temperature and precipitation fields on an identical scale to global model output. Moving outwards from the Aurora station, we will evaluate the correlation between synoptic-scale observed temperature and precipitation fields and model simulations. We will also focus on neighboring weather stations surrounding Aurora to assess the degree to which point-source observed/simulated climate correlations hold over a broader area.
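The idea of averaging irregularly located stations onto a uniform grid can be shown with a deliberately simple cell-averaging sketch. This is not the gridding program referred to above, whose method is not described in this report; the function name, grid origin, 2.5 degree cell size, and station values are hypothetical.

```python
import math
from collections import defaultdict

def grid_average(stations, lat0, lon0, cell_deg):
    """Average station values into regular latitude/longitude cells.

    stations: iterable of (lat, lon, value) tuples.
    lat0, lon0: south-west corner of the grid, in degrees.
    cell_deg: cell size in degrees (an assumed value, not the model resolution).
    Returns {(row, col): mean value} for every cell containing a station.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in stations:
        # Integer cell indices counted north and east from the grid origin.
        row = math.floor((lat - lat0) / cell_deg)
        col = math.floor((lon - lon0) / cell_deg)
        sums[(row, col)] += value
        counts[(row, col)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical station temperatures; the first two fall in the same grid cell.
stations = [
    (41.7, -88.3, 10.0),
    (41.9, -88.0, 12.0),
    (45.1, -93.2, 4.0),
]
gridded = grid_average(stations, lat0=40.0, lon0=-95.0, cell_deg=2.5)
```

A production gridding scheme would typically also weight stations by distance or area and handle empty cells; the sketch only shows the step of placing point observations on the same kind of grid as model output.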

Finally, we propose to build on our current work in order to quantify the contribution of multiple members of modeled ensembles to improving the correlation between modeled and observed climate statistics over spatial and temporal scales. The impact of human-induced change on climate over the next few decades will then be assessed relative to the changes implied from natural variability alone. Statistical methods will be refined and a paper presenting these results prepared for submission to a peer-reviewed journal.

Statistical Applications for Regional Simulations of Climate Change

We plan to develop a statistical weather typing approach to classify projected changes in large-scale circulation features and assess the impact on surface temperature and precipitation. Although statistical mixture models have frequently been applied in climate studies, mixtures of distributions have rarely been used for clustering in this field. The proposed approach can be used either with or without regional climate simulations, but, for simplicity, we describe a version that only requires global-scale PCM output.

For the region of interest, the first step is to define clusters based on model output and assign actual weather patterns to these clusters. The clusters must be defined in terms of quantities available from model output, but the goal of the clusters is to define groups of days for which the difference between the average observed weather and the average model output has large across-cluster variation. Defining clusters based on one set of variables, such that they produce meaningful clustering for another set of variables, will require close interaction between the statisticians and atmospheric scientists on this project in terms of variable selection, choice of metric, and evaluation strategies. In addition to applying principal components to geopotential heights, we plan to explore clusters based on vertical atmospheric profiles using copula functions to model joint dependencies between profiles.

The second critical step is selection of the set of variables to be used in the clustering algorithm. One possible approach is to try the first few principal components of geopotential heights at some pressure level (e.g., 500 or 700 mb) in the region of interest. The clustering would be carried out on the daily loadings for these principal components using some appropriately chosen metric and using a historical model simulation or ensemble. Representative days from a future model simulation would then be categorized into the same set of clusters, and the changes in frequencies of these clusters between the present and future scenarios would be evaluated. Observed weather patterns would be assigned to the clusters, and their effect on surface conditions would be evaluated using historical surface observations and National Centers for Environmental Prediction (NCEP) reanalysis to obtain geopotential heights and assign each day to a cluster based on the loadings of the PCM-based principal components.
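The cluster-then-compare step can be sketched with plain k-means on two-dimensional loadings. The choice of k-means, the Euclidean metric, the number of clusters, and the tiny data set are all assumptions for illustration; the report leaves the clustering algorithm and metric open.

```python
import random
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        members = [[] for _ in range(k)]
        for p in points:
            members[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(col) / len(m) for col in zip(*m)) if m else centroids[i]
            for i, m in enumerate(members)
        ]
    return centroids

def cluster_frequencies(points, centroids):
    """Fraction of days assigned to each cluster."""
    counts = Counter(nearest(p, centroids) for p in points)
    return [counts[i] / len(points) for i in range(len(centroids))]

# Hypothetical daily loadings on the first two principal components.
historical = [(-1.0, 0.0), (-1.2, 0.1), (-0.8, -0.1),
              (1.0, 0.0), (1.2, 0.1), (0.8, -0.1)]
future = [(-1.1, 0.0), (0.9, 0.0), (1.1, 0.1), (1.0, -0.1)]

# Clusters are defined on the historical simulation, then reused for the future run.
centroids = kmeans(historical, k=2)
hist_freq = cluster_frequencies(historical, centroids)
fut_freq = cluster_frequencies(future, centroids)
change = [f - h for f, h in zip(fut_freq, hist_freq)]
```

Holding the centroids fixed when classifying the future days is what makes the frequency change interpretable: the weather types stay the same and only their occurrence rates are allowed to differ.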

Downscaled estimates of surface conditions, such as average daily maximum temperature at a specific site in the future scenario, would then be obtained in two stages. First, for each cluster, the average daily maximum temperature in the PCM output under present conditions, taken over all days assigned to the cluster, would be subtracted from the observed average daily maximum temperature for days assigned to that cluster using NCEP reanalysis. The difference is the downscaling adjustment for that cluster. Second, to get an overall adjustment, a weighted average of these adjustments would be used, with weights given by the cluster frequencies obtained from the future model simulation(s). This procedure will thus give greater weight to adjustments for clusters that are more frequent in the future simulation(s).
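The arithmetic of the adjustment can be written out in a few lines. The function names and all numerical values below are hypothetical and serve only to make the weighting explicit.

```python
def cluster_adjustments(obs_mean_by_cluster, model_mean_by_cluster):
    """Per-cluster adjustment: observed mean minus present-day model mean."""
    return {
        k: obs_mean_by_cluster[k] - model_mean_by_cluster[k]
        for k in obs_mean_by_cluster
    }

def overall_adjustment(adjustments, future_freq):
    """Average the per-cluster adjustments, weighted by future cluster frequency."""
    total = sum(future_freq[k] for k in adjustments)
    return sum(adjustments[k] * future_freq[k] for k in adjustments) / total

# Hypothetical mean daily maximum temperatures (degrees C) for two clusters.
obs_mean = {0: 30.0, 1: 24.0}      # observations, days assigned via reanalysis
model_mean = {0: 28.0, 1: 25.0}    # model output under present conditions
future_freq = {0: 0.75, 1: 0.25}   # cluster frequencies in the future simulation

adjustments = cluster_adjustments(obs_mean, model_mean)
adjustment = overall_adjustment(adjustments, future_freq)
```

In this example cluster 0 carries an adjustment of +2.0 and cluster 1 an adjustment of -1.0; because cluster 0 occurs three times as often in the future run, the overall adjustment is 0.75 × 2.0 + 0.25 × (-1.0) = 1.25.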

Regional simulations could be brought to bear on this problem in a number of ways. First, even a short run of a high-resolution regional simulation in the future would provide for a more direct assessment of how well the clustering method is working. Longer regional simulations, driven by both present and future condition PCM runs, would make it possible to use the clustering method to estimate differences between the observed weather and regional climate simulations, which should be considerably smaller than the differences between observed conditions and PCM output. The issue of judiciously selecting the time periods on which the regional climate simulations should be run, which was a main proposed focus of this subproject, will be critical here as well.

Once the clustering approach to downscaling global-scale climate projections has been developed and evaluated against historical data, it will be applied to future projections corresponding to the Special Report on Emissions Scenarios (SRES) A1FI (high) and B1 (low) emission scenarios, as projected by the PCM, to evaluate the spatial diversity in climate change impacts on the U.S. Midwest. A paper describing the method and its application will be prepared for submission to a peer-reviewed journal.

Journal Articles:

No journal articles were submitted with this report. Twelve publications are listed for this subproject.

Supplemental Keywords:

RFA, Scientific Discipline, Economic, Social, & Behavioral Science Research Program, Air, Health Risk Assessment, climate change, Air Pollution Effects, Environmental Statistics, Ecological Risk Assessment, biostatistics, environmental monitoring, particulate matter, risk assessment, health risk analysis, environmental risks, air pollution, climate models, data analysis, environmental indicators, infant mortality, climate variability, statistical methods

Relevant Websites:

http://www.stat.uchicago.edu/~cises/

Main Center Abstract and Reports:

R829402 Center for Integrating Statistical and Environmental Science

Subprojects under this Center:

R829402C001 Detection of a Recovery in Stratospheric and Total Ozone
R829402C002 Integrating Numerical Models and Monitoring Data
R829402C003 Air Quality and Reported Asthma Incidence in Illinois
R829402C004 Quasi-Experimental Evidence on How Airborne Particulates Affect Human Health
R829402C005 Model Choice Stochasticity, and Ecological Complexity
R829402C006 Statistical Approaches to Detection and Downscaling of Climate Variability and Change