Abstract
In a testbed model based on EPA's Regional Oxidant Model, the Quasi-Steady State Approximation (QSSA) gas-phase chemistry solver dominates the computation, as is typical of Eulerian grid cell air quality models. We report results from optimizing the testbed solver on a Cray T3D parallel system. A simple processor mapping assigns a block of grid cells to each T3D processing element (PE). To minimize execution time, we applied programming optimization techniques such as cache collision avoidance and loop unrolling. Proper application of such techniques has a significant effect on performance, leading to improved speedup compared with relying on optimizing compilers alone. Some PEs become idle while the remaining PEs complete their tasks within a simulation time step. Based on experience with optimizing the QSSA on Cray vector supercomputers, we describe a dynamic task allocation approach to deal with this load imbalance and give performance results.