2004 Progress Report: Computational Requirements of Statistical Learning within a Decision-Making Framework for Sustainable Technology
EPA Grant Number: R828207
Title: Computational Requirements of Statistical Learning within a Decision-Making Framework for Sustainable Technology.
Investigators: Chen, Victoria C.P.
Institution: The University of Texas at Arlington
Current Institution: Georgia Institute of Technology
EPA Project Officer: Hahn, Intaek
Project Period: July 1, 2000 through June 30, 2003 (Extended to June 30, 2005)
Project Period Covered by this Report: July 1, 2003 through June 30, 2004
Project Amount: $335,000
RFA: Technology for a Sustainable Environment (1999)
Research Category: Sustainability, Pollution Prevention/Sustainable Development
This research project addresses the development of a decision-making framework (DMF) for creating more sustainable urban environments. The development of this DMF requires a novel collaboration of current research in sustainability, optimization, and statistics. The objective of the DMF is to explore hypothetical paradigms on various scales, subject to technical and societal constraints, and measure their effect on other internal and external systems. The possibility of DMF prototypes in two important arenas will be investigated: (1) Water Quality - comparison of current and emerging technologies in a wastewater treatment system; and (2) Air Quality - evaluation of spatial and temporal actions for reducing ground-level ozone pollution.
The DMF will be based on a stochastic dynamic programming (SDP) approach that permits optimization of a system changing over time. Although dynamic programming yields provably optimal solutions and has been successful in many applications, it is computationally intensive. A critical objective of the proposed research project is to develop computationally practical, high-dimensional, continuous-state SDP solution methods for use within a DMF for sustainable technology. Our methodology will follow the orthogonal arrays (OA)/multivariate adaptive regression splines (MARS) method. The generalized solution method uses statistical experimental designs to select the points at which statistical learning within the SDP is carried out. Prior applications of OA/MARS have demonstrated that memory requirements are well within the capacity of modern technology. However, the computational effort required by the learning process (i.e., MARS), although polynomial in growth, may not be practical for very large problems. The development of statistical learning for this research is separated into four categories: (1) restructuring MARS to reduce computation with minimal loss in learning; (2) investigating other flexible methods of statistical learning, such as artificial neural networks (ANNs); (3) employing smaller statistical experimental designs, such as Latin hypercube designs; and (4) parallelizing the statistical learning process. A successful DMF for air quality will push the boundaries of SDP research.
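As a rough illustration of the approach described above, the following Python sketch runs a backward SDP recursion over a continuous state space. It evaluates the value function only at the points of a Latin hypercube design and fits a regression surface to those values at each stage. It is not the project's OA/MARS implementation: a quadratic least-squares fit stands in for MARS, and the dynamics, cost function, noise level, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def latin_hypercube(n_points, n_dims, rng):
    """Return n_points samples in [0, 1]^n_dims, one per stratum in each dimension."""
    # Each column is a random permutation of the strata 0..n-1, jittered within its stratum.
    strata = np.column_stack([rng.permutation(n_points) for _ in range(n_dims)])
    return (strata + rng.random((n_points, n_dims))) / n_points

def features(x):
    """Quadratic basis [1, x_i, x_i^2]; a simple stand-in for MARS basis functions."""
    x = np.atleast_2d(x)
    return np.hstack([np.ones((x.shape[0], 1)), x, x ** 2])

def solve_sdp(n_stages=3, n_dims=2, n_design=50, n_noise=20, seed=0):
    """Backward recursion; returns the design and one coefficient vector per stage."""
    rng = np.random.default_rng(seed)
    design = latin_hypercube(n_design, n_dims, rng)         # state design points
    controls = np.linspace(0.0, 1.0, 11)                    # candidate control levels
    noise = rng.normal(0.0, 0.05, size=(n_noise, n_dims))   # fixed sample of disturbances
    coef = np.zeros(1 + 2 * n_dims)                         # terminal value function is zero
    coefs = [coef]
    for _ in range(n_stages):
        targets = np.empty(n_design)
        for i, x in enumerate(design):
            best = np.inf
            for u in controls:
                # Hypothetical dynamics: the control shrinks the state, noise perturbs it.
                next_states = np.clip((1.0 - 0.5 * u) * x + noise, 0.0, 1.0)
                # Hypothetical stage cost: penalty on the state plus a cost of control.
                stage_cost = x.sum() + 0.3 * u
                # Expected future value, estimated by averaging over the noise sample.
                future = (features(next_states) @ coef).mean()
                best = min(best, stage_cost + future)
            targets[i] = best
        # "Statistical learning" step: fit the value function at the design points.
        coef, *_ = np.linalg.lstsq(features(design), targets, rcond=None)
        coefs.append(coef)
    return design, coefs

design, coefs = solve_sdp()
first_stage_value = features(design) @ coefs[-1]  # fitted value at each design point
```

The cost of each stage is driven by the number of design points and by the fitting step, which is why the report's four categories target the learning method, the design size, and parallelization.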
For the statistical learning process of SDP, a literature review of statistical data mining methods for computer experiments has been completed (Tsui, et al., 2005), and comparisons of experimental designs have been conducted (Wen, et al., 2005a and 2005b). Supervised learning data mining methods are potentially applicable within the Atmospheric Chemistry Module of the Ozone Pollution DMF. Although both regression and MARS were applied for this purpose, ANNs are a good competitor to MARS (Cervellera, et al., 2004).
A DMF prototype for ozone pollution has been completed (Yang, et al., 2004). We studied an episode in urban Atlanta on July 31–August 1, 1987, which remains one of the worst episodes on record. July 31 is the key day for controlling this episode, and our ozone pollution DMF prototype focuses on this day. Our work on the modules, particularly the Atmospheric Chemistry Module, is described in Yang, et al. (2004). The results from the prototype are described in Yang, et al. (2005a and 2005b).
As demonstrated in our previous report, the DMF prototype successfully maintains ozone levels below the U.S. Environmental Protection Agency (EPA) standard. Our additional work conducts sensitivity testing by varying the initial conditions to see how the control strategies change. Across 50 hypothetical scenarios for the initial conditions, 38 emission variables were controlled in the DMF: 13 had no change, 10 had some variation, and 15 had large variation. The 13 emission variables with no change were those that were reduced by either 100 percent or 0 percent. The fact that 15 emission variables had large variation from scenario to scenario demonstrates the dynamic nature of the problem and the SDP solution. Therefore, if the initial conditions of the day are different, then the optimal control strategies will be different.
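The kind of tally reported above can be sketched in a few lines of Python. The report does not state how "some" and "large" variation were distinguished, so the spread measure (maximum minus minimum across scenarios), the cutoff `small_threshold`, and the toy data below are all assumptions made for illustration, not the project's actual procedure or results.

```python
import numpy as np

def classify_variation(reductions, small_threshold=0.2):
    """Label each control variable by how much its optimal value moves across scenarios.

    reductions: array of shape (n_scenarios, n_variables), fractions in [0, 1].
    small_threshold: hypothetical cutoff separating "some" from "large" variation.
    """
    # Spread of each variable's optimal reduction over all scenarios.
    spread = reductions.max(axis=0) - reductions.min(axis=0)
    return np.where(spread == 0.0, "none",
                    np.where(spread <= small_threshold, "some", "large"))

# Toy data (hypothetical): 3 scenarios by 4 emission variables.
toy = np.array([
    [1.0, 0.0, 0.50, 0.10],
    [1.0, 0.0, 0.55, 0.90],
    [1.0, 0.0, 0.45, 0.40],
])
labels = classify_variation(toy)
counts = {name: int((labels == name).sum()) for name in ("none", "some", "large")}
```

In the toy data, the first two variables sit at a bound (100 percent and 0 percent) in every scenario and so show no variation, which mirrors the pattern the report describes for its 13 unchanged variables.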
We also compared the optimal emission reductions to the typical “across-the-board” control strategy that dictates the same percent reduction in emissions everywhere and all day. The across-the-board strategy required a 60 percent reduction to achieve the EPA 1-hour standard of 125 ppb. Our DMF required a maximum of 55 percent reduction in total emissions from 7:00 AM to 10:00 AM, with lower percent reductions required during the later time periods, and no reductions taken at any other times (i.e., before 7:00 AM and after 7:00 PM). This demonstrates the potential cost-effectiveness of targeted control strategies, as well as the potential to solve real-world problems using our DMF.
The investigator did not report any future activities.