2000 Progress Report: Development of Object-Based Simulation Tools for Distributed Modular Ecological Modeling
EPA Grant Number: R827960
Title: Development of Object-Based Simulation Tools for Distributed Modular Ecological Modeling
Investigators: Bolte, John P. , Budd, Timothy
Institution: Oregon State University
EPA Project Officer: Carleton, James N
Project Period: October 1, 1999 through September 30, 2002 (Extended to March 31, 2004)
Project Period Covered by this Report: October 1, 1999 through September 30, 2000
Project Amount: $247,394
RFA: Computing Technology for Ecosystem Modeling (1999)
Research Category: Environmental Statistics
Objective: The goal of this project is to develop a modular simulation system for coupling existing simulations and developing new simulation modules. The primary objective of the study is to provide a language-neutral, object-oriented simulation framework for ecological modeling in four focused areas: (1) underlying object-based technology for implementing and coordinating collections of simulation objects (modules) in an integrated simulation environment; (2) inter-object communication technology supporting both single-machine and network-based communication between components of a modular simulation; (3) spatial and nonspatial data input, collection, analysis, and visualization; and (4) development of visual programming tools for rapid definition and assembly of model modules into complete, fully functional ecological models and visual interpretation of model results.
Progress Summary: Progress during the first year of the project has been made on four fronts: development of model concepts and design requirements, prototyping, simulation framework implementation, and design specification for a visual modeling tool. Year one efforts focused initially on developing a requirements specification for modular model component development. This was accomplished internally through collaboration between computer science and engineering researchers funded through the project, and externally through collaboration with an international group of ecological and agroecological modelers via workshop participation, direct contact, and an electronic list server established by the project to promote communication about requirements and design specifications for modular model development. The resulting requirements specification led to a design specification that is guiding the implementation of a language-neutral software system addressing the requirements. The design specification covers core simulation services, data management, intermodule communication, security, and model construction services.
Development of Model Concepts and Design Requirements. The ModCom project began in the spring of 2000. At that time, we initiated regular meetings to discuss model concepts and plan the development of ModCom. One of the first concepts developed was a general picture of how users will see ModCom, shown in Figure 1. ModCom is composed (conceptually) of three levels of components: foundation level, tool level, and model level.
The foundation level contains simulation components that are common to all simulations. This includes elements like simulation clocks, facilities for solving differential equations, inter-object communication, and synchronizing execution of various model components. The foundation level also is where users find the general definitions of how models interact. The tool level contains simulation components that are used by most simulations. This includes elements such as statistical tools, graphing, and database management. Components in the tool level represent "features" that exist in addition to the basic components of the foundation level. The model level represents domain-specific components. This includes components that contain implementation-specific calculations. It is assumed that when existing simulation models are integrated into ModCom, they will reside at the model level.
The basis for the design specification of ModCom was an existing object-based simulation toolkit. We undertook a refactoring process to assess the strengths and weaknesses of the original design, resulting in a significantly reengineered architecture. The purpose of the refactoring was to familiarize all participants with the existing framework design and to recognize which features of the existing framework should be part of the core ModCom implementation and which should be moved to supporting components. We also were able to recognize what features need to be added or enhanced to support ModCom. The result of these discussions was the development of the requirements specifications described below.
Prototyping. During the spring and summer of 2000, several prototypes were constructed in collaboration with an international group of ecological modelers. The purpose of these prototypes was to test the efficacy of our initial design strategies. We found out early in the process that one of our design methodologies would not work. Originally, we had intended to construct "wrapper" classes (in COM, Microsoft's language-neutral object protocol) that would contain components from the existing framework. This strategy would save time and still permit reimplementation in native COM/C++ if performance limitations required it. However, COM places certain limitations on how information can be exchanged between components. These limits, combined with the way the existing BAG components exchanged information, meant that "wrappers" could not be used.
The prototypes that followed examined different styles of defining how the components exchanged information. From these we concluded that the best approach would be to reimplement in COM the core components of the existing BAG framework. Doing this allows us to move directly to a fast native COM implementation. The prototypes also showed that a native COM implementation does not produce a significant performance reduction.
Framework Implementation. During the late fall and winter, we implemented (in C++ using COM interfaces) the core components of the framework. These classes and their relationships are shown in Figure 2. The framework design is based largely on the existing BAG framework and enhancements suggested by Dr. Budd. Although the limitations imposed by COM required changes in how the components exchange information, the limitations did not require any significant changes to the overall object-oriented design of the framework. Several new features were added to the framework, including more advanced event processing, listener interfaces for synchronous inter-object communication, and the Integrator interfaces.
The following general requirements were determined.
A. General Requirements
- Language-neutral interfaces for all services utilizing standard protocols
(COM). Our goal is that simulation services should be available to all
developers regardless of the language or development environment that they
- Keep it as simple as necessary, but no more so. The central elements/interfaces
for simulation should only contain what is common to all simulations and is
necessary for performing simulations. Added features, extensions, or advanced
simulation tools should not be part of the core.
- Models should be understandable and self evidently correct. The
expressions/equations for defining a simulation should, when expressed in
the simulation system, be uncluttered by technical details and easily understood.
The manner of expressing a model should be similar to how a model is expressed
in a conceptual (paper and pencil or analytical) context.
- Models should be easy to modify. When you want to add a new feature, it should be obvious how to do it. The simulation framework (and the simulation system itself) should be designed such that the modules that developers create will be easy to modify later. Our goal here is to minimize and manage software accretion.
B. Core Simulation Services
- Registration and coordination of simulation objects. The simulation
environment should be able to accept new simulation objects at runtime and
manage their (simultaneous) execution.
- Clock services. For synchronizing the flow of time between simulation
objects, supporting continuous, discrete-event, and mixed simulations.
The same mechanisms present in the existing BAG simulation libraries should
- Data storage for multi-run (e.g., Monte Carlo) model analysis.
- Network-distributed simulation object coordination. Simulation users should be able to use simulation components that reside on other computers without needing to copy/download the files that implement the components. This feature should be as transparent as possible to simulation users and designers.
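The registration and clock requirements above can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the project's COM/C++, and every name in it (`Clock`, `Environment`, `Recorder`, `register`, `update`) is hypothetical; it is not the ModCom or BAG interface. It covers only fixed-step time flow, not discrete-event or network-distributed coordination.

```python
class Clock:
    """Fixed-step simulation clock (hypothetical; not the ModCom clock)."""

    def __init__(self, start, stop, step):
        self.start, self.stop, self.step = start, stop, step
        self.time = start

    def ticks(self):
        # Compute each time from the step index to avoid accumulating
        # floating-point error from repeated addition.
        n_steps = round((self.stop - self.start) / self.step)
        for i in range(n_steps):
            self.time = self.start + i * self.step
            yield self.time


class Recorder:
    """Hypothetical simulation object: records the times at which it ran."""

    def __init__(self):
        self.times = []

    def update(self, time, step):
        self.times.append(time)


class Environment:
    """Accepts simulation objects by name and steps them on a shared clock."""

    def __init__(self, clock):
        self.clock = clock
        self.objects = {}

    def register(self, name, obj):
        # Objects can be added at any point before or between steps.
        self.objects[name] = obj

    def run(self):
        for time in self.clock.ticks():
            # Copy the values so an object may register others during update.
            for obj in list(self.objects.values()):
                obj.update(time, self.clock.step)


env = Environment(Clock(start=0.0, stop=1.0, step=0.25))
recorder = Recorder()
env.register("recorder", recorder)
env.run()
```

All registered objects see the same clock time on each step, which is the synchronization property the clock service is meant to guarantee.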
C. Core Simulation Object/Module Services
- 2-D time series data collection/input/output. Simulations should
be able to store and retrieve time series data at runtime. This requirement
applies to any 2-D data set, regardless of how it is stored (i.e., any mechanism
that can retrieve data should be able to do so at runtime).
- 3-D time/x/y data collection/input/output. Simulations should be able
to store and retrieve spatially distributed data at runtime, to support
spatially distributed models, particularly with links to ArcInfo/OpenGIS datasets.
- Capabilities Exposure. The ability to expose data/functionality in
a standardized manner, to allow more intelligent multi-module simulation assembly
and efficient object-to-object communication. This requirement also facilitates
the use of a visual model building tool in that users can see at runtime what
capabilities a module has without having to delve into the module code.
- Intellectual property control. The ability to control the use of
specific simulation objects. This will allow some form of licensing for modules
containing commercially developed code.
- Simulation object self-documentation facilities. Each module should
contain some information about what it does, what data it needs, and what
information it provides. The goal is that information (in a human readable
format) should always be available with a module. This feature also facilitates
use of a visual tool and simplifies management of modules because there
are fewer files to manage.
- Provide standard, pre-built simulation objects for common tasks/data structures (e.g., lists, queues, timers, etc.).
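As a rough illustration of the capabilities-exposure and self-documentation requirements above, the sketch below has each module declare its inputs, outputs, and a human-readable description, so that a tool can inspect them without reading the module code. It is a Python sketch under stated assumptions: the class names, the example modules, and the unit strings are all hypothetical and do not come from ModCom.

```python
class Module:
    """Hypothetical base class: a module declares what it needs and provides."""

    description = ""
    inputs = {}   # name -> unit
    outputs = {}  # name -> unit

    @classmethod
    def describe(cls):
        # Human-readable metadata that travels with the module itself.
        return {
            "name": cls.__name__,
            "description": cls.description,
            "inputs": dict(cls.inputs),
            "outputs": dict(cls.outputs),
        }


class Weather(Module):
    description = "Supplies daily air temperature."
    outputs = {"temperature": "degC"}


class CropGrowth(Module):
    description = "Accumulates biomass as a function of temperature."
    inputs = {"temperature": "degC"}
    outputs = {"biomass": "kg/ha"}


def connectable(source, target):
    """Names that source provides and target needs, with matching units."""
    return sorted(
        name
        for name, unit in source.outputs.items()
        if target.inputs.get(name) == unit
    )
```

A visual assembly tool could call something like `connectable(Weather, CropGrowth)` to decide which links to offer the user.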
D. Data Services
- Standardized data organization. A database schema for storing/retrieving
data used or produced by a simulation. This should include multi-run simulations.
Also, the schema should include a representation of simulation objects to
facilitate object persistence.
- Transparent access to data. A uniform (single API) method should
be available for storing/retrieving data from the simulation database.
- Support for 2-D and 3-D datasets. The database schema/database implementation must be able to store time series and spatial data in addition to simulation objects.
E. Numerical Services
- Integration methods for ordinary differential equations. It is assumed
that many models will be expressed in terms of differential equations. ModCom
will provide methods (e.g., Euler, Runge-Kutta) that allow any simulation
object to solve differential equations. These methods are essentially the
same integration services that are present in the existing BAG framework.
- Integration methods for partial differential equations. Same as above,
except that the existing BAG framework does not support PDEs (e.g., difference
grids), so a new implementation will be necessary.
- Parameter estimation methods. Large models may have hundreds of parameters. Automated numerical estimation of parameter values is an essential feature for calibration. It is assumed that either a Marquardt or genetic algorithm method will be implemented.
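To make the ODE integration requirement concrete, the following Python sketch implements the two methods named above, forward Euler and the classical fourth-order Runge-Kutta method, for a scalar equation y' = f(t, y). The function names and the exponential-decay test problem are illustrative assumptions; they are not the BAG or ModCom integrator interfaces.

```python
import math


def euler_step(f, t, y, h):
    """One forward Euler step for y' = f(t, y)."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, y0, t0, t1, n):
    """Advance y from t0 to t1 in n equal steps using the given step method."""
    h = (t1 - t0) / n
    y = y0
    for i in range(n):
        y = step(f, t0 + i * h, y, h)
    return y


def decay(t, y):
    """Hypothetical test model: first-order decay with rate 0.5."""
    return -0.5 * y


exact = math.exp(-0.5)  # analytical solution at t = 1 for y(0) = 1
euler_error = abs(integrate(euler_step, decay, 1.0, 0.0, 1.0, 10) - exact)
rk4_error = abs(integrate(rk4_step, decay, 1.0, 0.0, 1.0, 10) - exact)
```

Because both methods share the same `step(f, t, y, h)` signature, a simulation object can switch integrators without changing its model equations. With the same step size, the Runge-Kutta error on this problem is several orders of magnitude smaller than the Euler error.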
F. Inter-Object Communication Services
- Asynchronous message posting (Blackboard). A mechanism should be
available that allows simulation objects to exchange information. Furthermore,
the method of exchange should not limit or impose any structure on how the
information is formatted. To this end, two styles of inter-object communication
will be implemented: asynchronous (blackboard) and synchronous (listeners).
Both methods will allow simulation objects to communicate directly (via their
- Synchronous message broadcasting (Listeners). See above.
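The difference between the two communication styles can be sketched as follows. This is a Python illustration with hypothetical names (`Blackboard`, `Broadcaster`, `post`, `read`, `add_listener`), not the ModCom interfaces. In the blackboard style, a sender posts a message and a receiver collects it later; in the listener style, registered callbacks run immediately when the sender broadcasts. Neither imposes a structure on the payload.

```python
class Blackboard:
    """Asynchronous style: messages wait on the board until they are read."""

    def __init__(self):
        self._messages = {}

    def post(self, topic, payload):
        # The payload can be any object; no format is imposed.
        self._messages.setdefault(topic, []).append(payload)

    def read(self, topic):
        # Return and clear everything posted under this topic so far.
        return self._messages.pop(topic, [])


class Broadcaster:
    """Synchronous style: listeners are called at the moment of broadcast."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def broadcast(self, payload):
        for callback in self._listeners:
            callback(payload)


board = Blackboard()
board.post("rainfall", 3.2)
board.post("rainfall", {"mm": 5.0, "station": "A"})
pending = board.read("rainfall")  # collected later by the receiver

received = []
sender = Broadcaster()
sender.add_listener(received.append)
sender.broadcast("harvest")  # the listener has run by the time this returns
```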
G. Statistical Services
- Random number generators for common distributions. These should include
all the distributions normally associated with stochastic simulations. A uniform
means of using the distributions also should be provided.
- Analysis of Monte Carlo simulations. The module should be able to
compute statistics associated with Monte Carlo simulations.
- Sensitivity Analysis. The module should be able to compute basic statistics associated with sensitivity analysis. Automated methods also should be present that allow simulation designers to run sensitivity analyses without significant modification to existing simulations.
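A minimal sketch of the Monte Carlo requirement is shown below: a parameter is drawn from a distribution for each run, the model is executed, and summary statistics are computed over the runs. It uses only Python's standard `random` and `statistics` modules. The toy growth model, the normal distribution, and its parameters are assumptions chosen purely for illustration.

```python
import random
import statistics


def run_model(growth_rate, steps=10, initial=1.0):
    """Hypothetical model: compound growth over a fixed number of steps."""
    value = initial
    for _ in range(steps):
        value *= 1.0 + growth_rate
    return value


def monte_carlo(n_runs, seed=0):
    """Run the model n_runs times with a randomly drawn growth rate."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    results = [run_model(rng.gauss(0.05, 0.01)) for _ in range(n_runs)]
    return {
        "runs": n_runs,
        "mean": statistics.mean(results),
        "stdev": statistics.stdev(results),
        "min": min(results),
        "max": max(results),
    }


summary = monte_carlo(200, seed=1)
```

Seeding the generator makes a multi-run experiment reproducible, which matters when stored results from separate runs are compared.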
H. Analysis and Visualization Services
- Visualization of 2-D datasets. A module should be available for visual
(on screen) display of 2-D graphs (e.g., variables x time). The module should
be able to graph input data and simulation results.
- Visualization of 3-D datasets. A module should be available that
can display (on screen) spatially distributed data (GIS data sets), either
input or generated.
- Statistical analysis of simulation results. Facilities should exist to compute common statistics for analysis and hypothesis testing.
I. Model Construction Services
- Model construction environment. The purpose is to allow assembly/connection
of existing compiled simulation objects/modules. One of the primary goals
of this project is to provide tools that will allow users to form new simulations
by "connecting" existing modules without writing any code. The model
construction environment will provide a visual (GUI driven) environment for
establishing connections between modules, running a simulation with the connected
modules, and using visualization modules to view simulation results.
- Object-oriented simulation language. The purpose is to create compiled/interpreted simulation objects. The goal is to provide a machine independent specification of connected modules (i.e., ones specified with the model construction environment). In addition, the language should be able to specify new simulation objects that can be used in the model construction environment.
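One way to picture a machine-independent specification of connected modules is as plain data that names module types and the connections between them, which a builder then turns into live objects. The Python sketch below is an assumption-laden illustration: the specification layout, the `Source` and `Doubler` modules, and the `build` function are hypothetical and are not the ModCom simulation language.

```python
class Source:
    """Hypothetical module that emits a constant value."""

    def __init__(self, value):
        self.value = value

    def output(self):
        return self.value


class Doubler:
    """Hypothetical module that doubles whatever its input provides."""

    def __init__(self):
        self.input = None

    def output(self):
        return 2 * self.input.output()


# Module types available to the builder (a stand-in for installed components).
MODULE_TYPES = {"Source": Source, "Doubler": Doubler}

# The specification is plain data, so it could be stored or exchanged as text.
SPEC = {
    "modules": {
        "rain": {"type": "Source", "args": {"value": 4}},
        "runoff": {"type": "Doubler", "args": {}},
    },
    "connections": [("rain", "runoff")],
}


def build(spec):
    """Instantiate the named modules, then wire each source to its target."""
    modules = {
        name: MODULE_TYPES[entry["type"]](**entry["args"])
        for name, entry in spec["modules"].items()
    }
    for source, target in spec["connections"]:
        modules[target].input = modules[source]
    return modules


model = build(SPEC)
result = model["runoff"].output()
```

A visual environment could write such a specification when the user draws a connection, so the user never edits code directly.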
Visual Modeling Tool
The ultimate goal of the Visual Modeling Tool (VMT in this report) is to enable modelers to concentrate on the science involved in their simulations and to free them from the details of manually programming their simulations. Very few of the users comprising the target community for the ModCom framework are C++ programmers. In fact, they are much more likely to be Visual Basic or Delphi programmers. It also is hoped that eventually the user base of the ModCom framework will grow to include nonprogrammers. Modelers who wish to write their own simulations will always be able to extend the ModCom framework. It is a goal of the developers of the ModCom framework to enable the creators of new models to work in the programming language of their choice.
It is worth noting that the ModCom framework itself has been designed and implemented using Microsoft's Component Object Model (COM). This implementation choice dictates that the VMT be COM-aware. There are several possible ways in which the VMT could satisfy this constraint. One of the more intriguing possibilities, and one that we have been actively investigating, would be to use Microsoft's forthcoming .NET framework. One of our current prototype efforts makes use of the available public beta of .NET.
In the past year, we have made progress on the as-yet-unnamed VMT in the following areas: defining the requirements, analyzing specification and implementation issues, and building a prototype.
Requirements. The first iteration of the VMT must satisfy three primary requirements:
- Assemble collections of predefined components
- Define simulation parameters
- Execute a simulation.
We also have identified a few areas that may become requirements for VMT in the future:
- Create new simulation components
- A runtime environment to support these components
- A facility for running simulations in a distributed/networked fashion.
Specification and Implementation Issues. The ModCom framework is designed to allow its users to employ a component-based development strategy. Although the notion of a component defies exact characterization, ModCom uses an operational definition: a component is an object instantiated from the class SimEnv or SimObj. Given this definition, for the VMT to satisfy its first requirement (assemble collections of predefined components), it must be able to create and manipulate SimEnv and SimObj objects, as well as objects created from classes that inherit from SimEnv and SimObj. The VMT makes use of the fact that the SimEnv and SimObj classes are exposed via COM interfaces.
The VMT must be able to satisfy two different usage scenarios. In the first case, the VMT will only be required to work with those objects that are created from classes that are delivered as a part of the ModCom framework. In the second case, the VMT must also be able to work with classes that inherit from SimEnv and SimObj and that may not have been available at compile time. This enables users to define their own custom, inheritance-based SimEnv and SimObj classes. The users can then expect to be able to integrate their newly created classes with the predefined ModCom classes via the VMT.
Preliminary investigations have shown that Microsoft's C# programming language combined with the .NET framework is capable of satisfying both of these usage scenarios. The first scenario can be satisfied via a mechanism known as "early binding." In this case, the predefined ModCom classes, in the form of COM classes, are available at compile time. The second scenario makes use of a technique known as "late binding." The use of late binding allows objects and their public methods to be discovered at runtime. This approach is not as efficient as early binding, but the flexibility that late binding allows for extending the ModCom framework is believed to outweigh any performance risk. Nonetheless, any use of late binding will need to be carefully thought out. Both of these methods are documented in a wide variety of currently available literature.
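The late-binding idea described above can be illustrated by analogy in Python, where a class is looked up by name at runtime, its public methods are discovered by inspection, and a method is invoked by name. This is only an analogy under stated assumptions: it does not use COM or .NET, the `REGISTRY` dictionary merely stands in for a component registry, and the `Pond` subclass and its methods are hypothetical.

```python
import inspect


class SimObj:
    """Stand-in for a predefined framework class known to the tool."""

    def update(self, time):
        return time


class Pond(SimObj):
    """Hypothetical user-defined subclass, unknown when the tool was written."""

    def update(self, time):
        return 2 * time

    def drain(self):
        return "drained"


# Stands in for a component registry that maps names to classes.
REGISTRY = {"SimObj": SimObj, "Pond": Pond}


def create(class_name):
    """Late binding: the class is chosen by a name supplied at runtime."""
    return REGISTRY[class_name]()


def public_methods(obj):
    """Discover the object's public methods without knowing its class."""
    return sorted(
        name
        for name, _ in inspect.getmembers(obj, inspect.ismethod)
        if not name.startswith("_")
    )


def invoke(obj, method_name, *args):
    """Call a method whose name is known only at runtime."""
    return getattr(obj, method_name)(*args)


early = SimObj().update(3)          # the tool names the class directly
late_obj = create("Pond")           # the tool only knows a string
discovered = public_methods(late_obj)
late = invoke(late_obj, "update", 3)
```

The name lookup and inspection steps are the extra runtime work that makes late binding slower than a direct call, which is the trade-off noted above.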
As mentioned previously, it is one of the aims of this project to enable model developers to create new models in their preferred programming language. Our current work leads us to believe that the .NET framework will be able to fulfill this goal. In addition to support for Microsoft's own stable of programming language environments (Visual Basic, C#, and C++), various commercial entities have announced that they will be providing support for languages such as Cobol, Delphi, Haskell, Mercury, Oberon, Scheme, and Smalltalk. This is not, by any means, an exhaustive list.
Prototypes. In support of the current year's funding, work on two separate prototypes of the VMT has begun. The first prototype was built by Dr. Timothy Budd. This prototype captured the essential aspects of the problem domain via an all-Java solution. The need to integrate the VMT with the rest of the ModCom framework, which, as described earlier, is written using COM, has sidetracked the current investigation of Java-based solutions. At this time, a suitably current Java-COM binding is not known to exist. The most recent version of Microsoft's J++ conforms to version 1.1.6 of the JDK. This version lacks key user interface tools such as the Swing API.
The second prototype is actually three prototypes that need to be integrated into a single solution. These prototypes are more closely tied to the existing ModCom framework than Dr. Budd's. The first prototype is a GUI-based tool that demonstrates the use of Microsoft's new GUI API, WinForms. The second and third prototypes demonstrate the integration of programs written for the .NET framework with classes written to the COM interface specification. In fact, these prototypes make use of the ModCom classes. Both are console applications that demonstrate the concepts of early and late binding.
During the first year, the project dealt primarily with requirements and design specification issues, which resulted in the development of early prototype software. We also have established communications with several other groups involved in model development, and are pursuing interactions with these groups. A large issue has been the need for a language-neutral approach that allows object-oriented constructs to be used in model development. We are pursuing this by taking advantage of Microsoft's COM architecture and related datatypes. By careful use of COM-like interface definitions, without reliance on COM's internals, we have achieved a language-neutral approach that is not strictly tied to COM; for example, the basic libraries have been successfully compiled and executed on a non-COM platform (Linux).