Grantee Research Project Results
Final Report: Partitioning Algorithms and Their Applications to Massively Parallel Computations of Multiphase Fluid Flows in Porous Media
EPA Grant Number: R825207Title: Partitioning Algorithms and Their Applications to Massively Parallel Computations of Multiphase Fluid Flows in Porous Media
Investigators: Ewing, Richard E. , Djidjev, Hristo , Lazarov, Raytcho D. , Vardi, Moshe
Institution: Texas A & M University , Rice University
EPA Project Officer: Aja, Hayley
Project Period: November 1, 1996 through October 31, 1999
Project Amount: $290,760
RFA: High Performance Computing (1996) RFA Text | Recipients Lists
Research Category: Human Health , Aquatic Ecosystems , Environmental Statistics
Objective:
This project focused on the development of methods for simulation of multi-phase and multi-component fluid flows in porous media that combine recent advances in numerical methods with techniques for partitioning and load balancing based on graph theory and graph algorithms. According to the research plan, we have concentrated our efforts and achieved our goals in the following interrelated groups of problems, which combine theoretical investigations with computational experimental work:
- Construction, analysis, and implementations of finite volume element methods for reactive ows in porous media,
- Local error estimators and refinement strategies for finite volume methods,
- Algorithms for partitioning and load balancing,
- Domain decomposition for parallel computations in uid ow modeling, and
- Testing the developed methods on uid ows in porous media.
The first four points are connected in the following computational strategy. We use a mesh generator (triangle for 2-D meshes and NETGEN for 3-D meshes) to generate a good coarse mesh. Then the considered problem is solved sequentially on the coarse mesh (by every processor). The so- lution is used to compute a posteriori error estimates, which are used as weights in an element-based splitting of the coarse mesh into sub-domains. Such splitting ensures that the local refinements that follow will produce a computational mesh with a number of triangles/tetrahedrons balanced over the sub-domains. Every sub-domain is \mapped" to a processor. Then, based on a posteriori error analysis, each processor refines consecutively its region independently. After every step of independent refinement, there is communication between the processors to make the mesh on that level globally conforming. The last point concentrates on computer simulation and testing the developed techniques on uid ows in porous media. In Section 3 we give the results from applying the developed computational methodology over the so-called Bioscreen Problem.
Summary/Accomplishments (Outputs/Outcomes):
Mathematical model
The mathematical model of steady state ground-water flows and transport in porous media yields two basic equations. These are the Darcy equation for the pressure, discussed in Subsection 6.1.1, and the advection-dispersion (transport) equation, discussed in Subsection 6.1.2. The transport equation describes the steady-state distribution of a passive substance dissolved in the water, transported by the flow, and absorbed by the soil. Further, this method can be extended to the case of transport of multiple chemicals that react. Since many clean-up, remediation, and exploration strategies in aquifers and petroleum reservoirs are based on treatment/injection/production through wells we also briefly discuss various well models.
Diffusion (pressure) equation
The fluid flow is due to the velocity defined by the Darcy's law:υ = -D∇ρ, where p is the pressure, D is the permeability of the porous media. The pressure p satisfies the following equation subject to various boundary conditions:
Here Ω is a bounded polyhedral domain in R3 with boundary ∂Ω ≡ Γ = ΓD UΓNUΓw , D is symmetric, bounded and uniformly positive definite matrix in Ω , n is the outer unit vector normal to the boundary of Ω , Pd , Pn , and γ≥Ο are given functions, Pw is a given constant (called well-bore pressure), and f is the given source term. The last three equations prescribe Dirichlet, Neumann, and well boundary conditions, correspondingly. The last one models injection/extraction of fluid through a well, which is assumed to be a cylinder with radius Γw . Since the well radius Γw is very small compared to the reservoir size the wells can be classified as small features of the media and well boundary conditions will lead to solutions with almost singular behavior. For discussion of well boundary conditions, including nonlinear ones, we refer to [20].
Another boundary condition that models injection/extraction of fluid from the reservoir is well condition with a prescribed production rate Q, but with unknown pressure on the well surface:
ρ = ρw, οη Γw, ρw unknown constant and ∫Γw D∇p • nds = Q.
And finally, on Γw we can prescribe the same type of boundary condition as on ΓN . Namely, we have the boundary condition
-D∇p • n = γp + Ǭ on Γw
with Ǭ a given constant. Note that here Ǭ is the pointwise flux, while Q in (2) is the total debit of the well. For γ = O they are related by Q = ǬS , where Sw is the area of the lateral surface of the well. Our computations involving well model were done for condition (3) with γ = U .
Convection-diffusion-reaction (transport) equation
The second basic equation gives the concentration of a passive chemical dissolved and distributed in the water due to the processes of advection, diffusion, and absorption. The equation describes the conservation of mass of the chemical. The steady-state distribution of the concentration c is described by the following general boundary value problem for convection-diffusion-reaction equation:
Again, is a bounded polyhedral domain in
with boundary
, that is split into Dirichlet, Neumann, and well parts, namely
. Further, the Neumann boundary is divided into two parts:
, where
and
. We assume that the diffusion-dispersion tensor K is symmetric, bounded and uniformly positive definite matrix in
,
is the given convection vector field,
as before is the outer unit vector normal to
,
, f,
,
and
are given functions. The boundary condition on
models the case of a given concentration on the well surface, which corresponds to the case of injection well.
In our computations we take the advection vector-field , where the Darcy velocity
is obtained after solving the problem (6.1.1). Then the diffusion-dispersion tensor is given by
, where
,
, and
are constants characterizing correspondingly the diffusion, transverse dispersion, and longitudal dispersion.
In the case of production well we have to impose boundary conditions that model appropriately the extraction of the dissolved substance by the well activity. In the case when the flow is determined by the solution of problem (6.1.1) with well boundary condition (3) we get the following boundary condition for the concentration:
Finite volume element methods for reactive flows in porous media
Many clean-up, remediation, and exploration strategies in aquifers and petroleum reservoirs are based on treatments/actions through wells. Therefore, accurate numerical simulation of the well is an important problem. We have restricted our research on the following two main directions:
- accurate numerical well models;
- adaptive local grid refinement.
Firstly, we have theoretically studied the well models for Darcy`s and Forchheimer's flows treated numerically by finite differences, finite elements, and mixed finite elements. This has resulted in a paper by Ewing, R. Lazarov, S. Lyons, D. Papavassiliou, and J. Pasciak, which has been submitted for publishing. This study is a basis for our further research in 3-D problems, as well as more complex flows. The results in this direction are summarized in the Technical report by R.Ewing, R.Lazarov, S.Lyons, D.Papavassiliou, J.Pasciak ...
Mathematical formulations in term of parabolic problems with integral terms in time have been used in various engineering models, such as nonlocal reactive transport in underground water flows in porous media done by Dagan [[]] and Cushman and collaborates [[], []], and radioactive nuclear decay in fluid flows investigated by Shelukhin [[]]. These equations model quite adequately the process of multi-component flows in porous media and lead to better understanding of the physical processes and their properties. A very important characteristic of all these models is that they all express a conservation of a certain quantity (mass, heat, etc.) in any moment for any sub-domain. In many applications this is a desirable feature of the approximation method when it comes to numerical solution and application of the corresponding models to real life problems.
This year we have studied finite volume element methods for one-dimensional and two-dimensional problems of this type. Namely, we have developed a general framework for obtaining finite volume element approximations and studying their error analysis. In 1-D we consider the lowest-order (linear and L-splines) elements. These schemes are locally conservative and have optimal approximation properties. We have shown that the finite volume element approximations are convergent with optimal order in norm, suboptimal in the
norm and super-convergent in a discrete
norm. In 2-D we have also studied the convergence in
-norm and shown optimal error estimates. This work has resulted in two papers by Ewing, Lazarov, and Lin.
Local error estimators and refinement strategies
The solutions of flow and transport problems in porous media exhibit local behavior. Therefore it is essential that local grid refinement, based on a posteriori error analysis, is applied. Our goal was to implement and experiment with various local error estimators and indicators, to develop adaptive finite element and finite volume codes for both 2-D and 3-D problems. The methods were based on the previous research by Babuska and Rheinboldt [3, 4], Zienkiewicz and Zhu [23], Becker and Rannacher [9], Bank and Weiser [7], Chen, Ewing and Lazarov [].
We started the problem investigation by developing, implementing and testing a 2-D grid refinement strategy based on the known error estimator and indicators. In the context of the finite element method our work was mostly concentrated on the "h-version" adaptive refinement, where the error is reduced by adaptively refining the grid and leaving the degree of the approximating polynomials the same. The other technique , "p-version", increases the order of the polynomials used in the approximation. The results in this direction are summarized in the Technical report : R. Lazarov, J. Pasciak, S. Tomov , Multilevel grid refinement for elliptic problems based on a posteriori error analysis. Here we show 2-D local refinement examples of a homogeneous reservoir and a nonhomogeneous reservoir with two wells. The pictures show the pressure distributions and the grids on which they were computed.
We extended the developed 2-D adaptive code to 3-D. We worked on, tested and implemented three error indicators, namely residual based refinement (first introduced by Babuska and Rheinboldt [3]), Zienkiewicz-Zhu technique (see [23]), and hierarchical refinement (see [7]). We adapted the first two techniques for the finite volume element method in R. Ewing, R. Lazarov, Y. Lin, S. Tomov , A-posteriori error estimates for finite volume element approximations. In R. Lazarov, S. Tomov: Adaptive finite volume element method for convection-diffusion-reaction problems in 3-D we present an adaptive numerical technique for solving steady-state diffusion and convection-diffusion-reaction equations in 3-D (including the case of dominant convection) using finite volume approximations. Computational results of various model simulations of fluid flow and transport of passive chemicals in non-homogeneous aquifers are presented and discussed.
We have used triangular and tetrahedral meshes for the 2-D and 3-D case correspondingly. The actual mesh refinement is important since the consecutive refinements have to be done in a way such that the element shapes don't degenerate. Our adaptive 3-D mesh generation is based on the bisection algorithm (see, e.g. Arnold et all.). In the 2-D case the marked for refinement triangles are split uniformly into 4 and the mesh is refined to conformity by bisection through the longest element edge.
Finally, we implemented the refinement in parallel using the Message Passing Interface Library (MPI). Every processor refines its part of the domain independently. After conformity is reached locally there is communication between the processors to make the mesh on that level globally conforming.
As an illustration of the local refinement we present here the meshes for two 2-D and two 3-D problems. Realistic 2-D and 3-D examples of flows in porous media are given in the Bioscreen Problem.
Algorithms for Partitioning and Load Balancing
We have concentrated our efforts on implementation issues, working on two versions of our parti- tioning algorithms. The first implementation uses a simpler strategy for searching the graph based on the levels of a breadth-first search combined with recursive bisection. In addition, separators that give a better ratio of component to boundary sizes were were given higher priority than those that give a better balance. In addition, issues related to the optimal combination of smaller com- ponents into components of the final partition were addressed. The advantages of this approach are that the resulting algorithms are very fast and produce good results for triangulations with good quality characteristics and that the algorithms will work with almost no modification for 3-D meshes. The second implementation is of an algorithm that exploits the planar nature of the mesh and divides the graph directly into p parts, instead of recursively subdividing. This algorithm works better for triangulations of high aspect ratios (see paper 3 of the List of Publications Resulting from This Grant and [13, 16, 17]).
Constructing good separators under complex criteria
We studied graph separation as a basic tool for achieving even distribution of data among processors and to reduce the cost of communication between processors in a parallel computer. In graph separation algorithms, one has to construct a small set of vertices or edges called a separator whose deletion divides the input graph into small and roughly equal parts. If the input graph describes a Finite element mesh, then the data associated with the elements corresponding to each component of the partition will be assigned to the same processor. The separator theorems give a guarantee that the data partition is optimal according to certain criteria (see paper 7 of the List of Publications Resulting from This Grant and [14, 16, 17]).
According to the research plan, we studied separation in graphs where vertices or edges will have one or more types of weights and costs associated with them. We constructed an algorithm that finds in O(n) time a separator of an arbitrary planar graph of n vertices. In case of vertex costs in the original graph the total sum of vertex costs of the separator is at most and in case of edge costs the total sum of edge costs is at most
, where d is the maximum vertex degree. This result is tight within a constant factor.
Furthermore, we studied applications of our result for solving problems in diverse areas. In graph embedding, we showed that any n-vertex planar graph of maximum degree d and non- negative edge costs whose squares sum up to n can be embedded into a binary tree, a hypercube, a butter y graph, a shue-exchange graph, a cube-connected-cycles graph, or a de Brujn graph with O(log d) average weighted dilation. For the graph pebbling problem, which has applications in register allocation and memory management, we showed that any n-vertex planar acyclic directed graph G with non-negative vertex costs can be pebbled so that, the sum of the costs of all vertices with pebbles be (Here I denotes the maximum sum of costs on the predecessors of any vertex of G.) We also found e?cient solution of the tree congestion problem using our separation algorithms. No previous e?cient algorithms were known for those problems. These results have been presented in paper 11 of the List of Publications Resulting from This Grant and in [2, 15, 16, 17]).
Maintaining partitions in a dynamic environment
Using graph partitioning algorithms, one can distribute the data among the processors so that the computational load is balanced and the communications are minimized. This initial partitioning, however, might later need to be updated after such operations on the mesh as mesh refinement (adding new points or edges) and mesh coarsening (removing existing points and edges). The currently existing graph partitioning algorithms construct the new partitioning from scratch, which is time consuming, and they also may result in a new data reallocation that di ers much from the original one and thus might require many data transfers. We develop a new algorithm for maintaining a balanced partitioning for dynamic planar graphs, i.e., such that can be changed by local vertex or edge addition or deletion operations. Our algorithm uses a tree-like data structure that takes only O(n) space, where n is the size of the graph, and can be updated in O(log n) time after any local change of the graph of the above type. Furthermore, the updated partition is close to the original one, which means that only a few (O(log n) in the worst case) blocks of data have tobemoved between processors in order to maintain the required properties of the partition (see paper 1 of the List of Publications Resulting from This Grant).
Using separators for adaptive mesh partitioning
The generation of a mesh with appropriate characteristics is a very important phase of the simu- lation, since the shapes of the finite elements have in uence on the accuracy of the approximation and the numerical stability of the method. We investigated issues of mesh smoothing based on the so call spring embedding method, which has recently been studied in graph drawing applications. The idea of this method is to find a placement of the vertices of the mesh in such positions that minimize the total sum of squares of the edge lengths. This simulates a mechanical system, where edges are replaced by strings which attract the corresponding endpoint of of the edge with force proportional to the length of the edge. There are also variants where repulsive forces are assigned to some of the edges. Our study showed that in many cases this method produces bad solutions, but there are cases where the resulting mesh be very bad or even the method can produce an invalid mesh. We investigated an improved version of the method, where real valued weights are assigned on the vertices and edges of the mesh. We showed that appropriate combination of weights can produce an optimal placement for any mesh.
Ready-made software
We have also used METIS in our codes. This is a software package for partitioning large irregular graphs/meshes and computing fill, thus reducing the sequencing of sparse matrices. This software has been developed by George Karypis and Vipin Kumar from the University of Minnesota. The algorithms in METIS are based on multilevel graph partitioning, where the graph is made pro- gressively coarse, the coarse graph is partitioned, and then the computed partitions are projected into the fine graph. Below we give several 2-D and 3-D examples of sub-domain partitioning using METIS.
Implementation of the Domain Decomposition Structure on Parallel Computers
We have completed a domain decomposition implementation featuring the following:
- Sub-domain partitioning providing good load balance and minimization of the number of shared nodes between the sub-domains nodes (called interface nodes);
- A posteriori error estimators and indicators leading the refinement process and providing input for weighted sub-domain partitioning;
- Parallel multilevel local refinement preserving the mesh quality and providing (combined with the above two features) load balance and minimization of the interface node number over the different refinement levels;
- Relatively independent sub-domains \mapped" to di erent processors with data structures providing e?cient access to the interface nodes; and
- Eficient solvers, visualization etc.
The implementation, using C++, is in an object-oriented style. The parallel computations are done using the Message Passing Interface Library (MPI). The main classes, their hierarchy, data members, and methods are given in Figure 1.
The PCG method that solves a symmetric and positive definite system of linear equations and the GMRES method that solves non-singular non-symmetric systems of linear equations are implemented as C++ templates. To perform the algorithm, the templates rely on four routines (to solve Ax = b):
- Given x compute Ax;
- Given x and y, compute the inner product (x, y);
- Given scalars a and b, and vectors x and y, compute ax + by; and
- Given x, compute Bx, where B is a preconditioner to A.
As a result of this template mechanism, we have highly reusable pieces of software.
Figure 1: Code structure. Main classes along with their data members and methods.
We have also investigated a new and very promising technique for domain decomposition com- putations using non-matching grids. Namely, a multi-grid technique for uniformly preconditioning linear systems arising from a mortar finite element discretization of second-order elliptic bound- ary value problems has been introduced and analyzed by Gopalakrishnan and Pasciak (paper 16 of the List of Publications Resulting from This Grant). These problems are posed on domains partitioned into sub-domains, each of which is independently triangulated in a multilevel adaptive fashion. The multilevel mortar finite element spaces based on such triangulations (which need not align across sub-domain interfaces) are in general not nested. Suitable grid transfer operators and smoothers have been developed that lead to a variable V-cycle preconditioner resulting in a uniformly preconditioned algebraic systems.
We have also implemented and tested initialization procedures connecting our software to pack- ages such as HYPRE and PETSc. HYPRE is being developed at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory. It is developed to fill a need for highly parallel preconditioners with scalable performance. PETSc is a toolkit providing variety of scalable (parallel) solvers. This direction of our work is in close cooperation with the LLNL team developing HYPRE and is now under intensive development.
Conclusions:
Here we present a particular problem of ow in porous media. It is based on data supplied to us by the Center for Biofilm Engineering at Montana State University. The setting is described below.
A steady state flow, with Darcy velocity (measured in ft/yr), has been established in a parallelepiped shaped reservoir of size
. The problem setting (see below) gives us symmetry with respect to the plane
, so the equations are solved only in half of the domain, the parallelepiped
. The transport of a contaminant, in our case benzene, dissolved in the water is described by the convection-diffusion-reaction equation (4), where c is the benzene concentration,
is the Darcy velocity
, K is the dispersion-diffusion tensor, and a is the biodegradation rate. We assume that the Darcy velocity
is obtained by solving the pressure equation (1) for fully saturated porous media under appropriate boundary conditions.
Simulation results
Here we present simulation results for the case of homogeneous reservoir, non-homogeneous reservoir and non-homogeneous reservoir with well (the 3-D case). Similar computations, for the 2-D case, are also included.
The pressures at the the faces and
are constants, correspondingly 1000 and 0 and the permeability tensor is D = 32I, I is the identity matrix. On the rest of the boundary a homogeneous Neumann condition is specified. This setting creates a uniform Darcy velocity
ft/yr. A steady state leakage on boundary strip
of 30 mg/l has been established. The dispersion/convection process causes the dissolved benzene to disperse in the reservoir. The dispersion tensor has the form
, where
,
and
. The biodegradation is transforming the pollutant into a solid substance which is absorbed by the soil. This leads to a decrease in the benzene. Its concentration level curves are shown on Figure
for the case of low biodegradation rate a=0.1 mg/yr and on Figure
for medium biodegradation rate a = 0.2 mg/yr. We have started with an initial coarse mesh with 52 nodes.
Figure 6.3: Homogeneous reservoir; the 3-D mesh on refinement level 6 (left) with 54,129 nodes; the concentration level curves (right) in the plane for the case of low biodegradation rate
Figure 6.4: Homogeneous reservoir; the mesh in plane on refinement level 5 (globally with 36,427 nodes); the concentration level curves in plane
for the case of medium biodegradation rate
Here the problem setting is as above but a layer is added (see Figures 2, 3, and 4). In the layer strip we take the permeability to be twice smaller than in the rest of the domain, i.e.
. Now the Darcy velocity is not constant and the error estimators force the grid to be refined around the layer. The obtained grid is shown on Figure 2.
Figure 6.5: Pressure computations for a non-homogeneous reservoir; (left) the locally refined 3-D Mesh on level 2 with 4,053 nodes; (right) Contour curves of the pressure for level 2
Figure 6.6: Concentration computations for a non-homogeneous reservoir; (left) the 3-D mesh on refinement level 4 with 39,445 nodes; (right) contour curves of the concentration for the cross-section ; the permeability in the layer is two time smaller than the rest of the domain
Figure 6.7: Concentration computations for a non-homogeneous reservoir; contour curves of the concentration for permeabilities in the layer 5 times (left) and 10 times (right) smaller than the permeability in the rest of the domain
After the pressure is found with prescribed accuracy we solve the corresponding problem for the concentration. Figure 3 shows the obtained mesh and the isolines for the concentration in the reservoir cross-section on grid refinement level 4. Two more experiments varying the permeability tensor are shown on Figure 4. The first one shows the concentration isoline in the reservoir cross-section
when the permeability in the layer is 5-times and 10-times smaller than the permeability in the rest of the reservoir. The initial coarse mesh in both cases has 235 nodes.
Non-homogeneous reservoir with a well
Finally, we consider a problem with one well using the well model described in Section and the approximation given in Section
. The well has an axis along the segment
,
,
and its production rate is
. On Figure 5 we show the adapted mesh and the level curves for the pressure in the reservoir cross-section
. On Figure 6 we show the obtained computational mesh and the level curves for the concentration in the reservoir cross-section
on grid refinement level 5.
Figure 6.8: Pressure computations for a non-homogeneous reservoir with a well; (left) the 3-D Mesh on level 3 with 34,236 nodes; (right) contour curves of the pressure on level 3
Figure 6.9: Concentration computations for a non-homogeneous reservoir with a well; (left) the 3-D mesh with 67,509 nodes in half of the domain obtained after 5 levels of refinement; (right) contour curves of the concentration in the plane
Journal Articles on this Report : 4 Displayed | Download in RIS Format
Other project views: | All 11 publications | 8 publications in selected types | All 4 journal articles |
---|
Type | Citation | ||
---|---|---|---|
|
Ewing RE, Lazarov RD. Finite volume element approximations of nonlocal in time one-dimensional plows in porous media. Computing 2000;64(2):157-182. |
R825207 (1999) R825207 (Final) |
not available |
|
Ewing R, Lazarov R, Lin YP. Finite volume element approximations of nonlocal reactive flows in porous media. Numerical Methods for Partial Differential Equations 2000;16(3):285-311. |
R825207 (1999) R825207 (Final) |
not available |
|
Gopalakrishnan J, Pasciak JE. Multigrid for the mortar finite element method. SIAM Journal on Numerical Analysis 2000, Volume: 37, Number: 3 (MAR 9), Page: 1029-1052. |
R825207 (1999) R825207 (Final) |
not available |
|
Lazarov RD, Pasciak JE, Vassilevski PS. Iterative solution of a coupled mixed and standard Galerkin discretization method for elliptic problems. Numerical Linear Algebra with Applications 2001;8(1):13-31. |
R825207 (1999) R825207 (Final) |
not available |
Supplemental Keywords:
supercomuting, modeling, parallel algorithms, porous media, fluid flow., RFA, Scientific Discipline, Ecosystem Protection/Environmental Exposure & Risk, Mathematics, computing technology, Environmental Monitoring, Environmental Engineering, adaptive grid model, ecosystem modeling, fate and transport, HPCC, computer science, multiphase fluid flows, data analysis, information technology, parallel computing, partitioning algorithmsProgress and Final Reports:
Original AbstractThe perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.