Efficient parallelization of geostatistical inversion using the quasi-linear approach

doi:10.1016/j.cageo.2012.03.014

Computers & Geosciences

Volume 44, July 2012, Pages 78-85

https://doi.org/10.1016/j.cageo.2012.03.014 Get rights and content

Abstract

Hydraulic conductivity is a key parameter for the simulation of groundwater flow and transport. Typically, it is highly variable in space and difficult to determine by direct methods. The most common approach is to infer hydraulic-conductivity values from measurements of dependent quantities, such as hydraulic head and concentration. In geostatistical inversion, the parameters are estimated as continuous, spatially auto-correlated fields, the most likely values of which are obtained by conditioning on the indirect data. In order to identify small-scaled features, a fine three-dimensional discretization of the domain is needed. This leads to high computational demands in the solution of the forward problem and the calculation of sensitivities. In realistic three-dimensional settings with many measurements parallel computing becomes mandatory.

In the present study, we investigate how parallelization of the quasi-linear geostatistical approach of inversion can be made most efficient. We suggest a two-level approach of parallelization, in which the computational domain is subdivided and the evaluation of sensitivities is also parallelized. We analyze how these two levels of parallelization should be balanced to optimally exploit a given number of computing nodes.

Highlights

► Parallel geostatistical inversion. ► Two level parallelization. ► Efficient C++ programming.

Introduction

Hydraulic conductivity is a key parameter in subsurface hydrology. In inverse modeling its spatial distribution is inferred from indirect measurements, such as hydraulic heads or concentrations. In principle, hydraulic conductivity can vary on all spatial scales, implying a high resolution of the estimated conductivity field. In three-dimensional applications, this may result in millions of hydraulic-conductivity values to be estimated and correspondingly large groundwater flow problems to be solved. The associated high demands with respect to computational power and memory make the application of high-performance computing techniques necessary. To be efficient, a parallelization strategy has to be adapted for each computational problem individually. In this paper, we explore the implementation of efficient parallel methods for geostatistical inversion.

All quantities that depend directly or indirectly on hydraulic conductivity may be used in the inversion procedure. These include, among others, direct estimates of hydraulic conductivity (e.g., from grain-size analysis, flowmeter tests, or small-scale hydraulic tests), hydraulic heads, and solute concentrations from tracer tests. For the sake of readability we limit ourselves to the inversion of hydraulic heads in this paper, but the same general inversion strategy can be applied to other types of data as well (Cirpka and Kitanidis, 2000, Nowak and Cirpka, 2006, Li et al., 2005, Li et al., 2007, Li et al., 2008, Schwede and Cirpka, 2009, Pollock and Cirpka, 2010).

The general objective of our approach is to obtain the hydraulic-conductivity field as a continuous spatial field. Upon discretization, this leads to as many discrete parameter values as elements of the grid, which is typically a number much higher than the number of the measurements available. The inversion problem is therefore mathematically ill posed (Hadamard, 1902). For regularization, one may try to define a small set of zones with constant conductivity values (Carrera and Neuman, 1986), or introduce smoothness constraints (Tikhonov, 1963). While classical Tikhonov regularization is achieved by adding a norm of the spatial parameter gradients to the objective function, geostatistical regularization is obtained by assuming that the parameter field is a random, auto-correlated space variable with certain spatial correlation structure (Matheron, 1971). In this context, inversion becomes (geo)statistical conditioning.

Various versions of geostatistical inversion exist, differing, among others, in the exact representation of the generated fields, the approach of minimizing the objective function (most commonly a variant of the Gauss–Newton or conjugate-gradient methods), and the question whether only a smooth best estimate is sought for or multiple conditional realizations (Sahuquillo et al., 1992, Gómez-Hernández et al., 1997, RamaRao et al., 1995, Benett, 1992, Valstar et al., 2004, Kitanidis, 1995, McLaughlin and Townley, 1996, Yeh et al., 1996, Hernandez et al., 2006).

Only few studies exist using methods of parallel computation in the context of geostatistical inversion. Berg and Illman (2011) implemented the sequential successive linear estimator (SSLE) (e.g., Yeh et al., 1996, Yeh and Liu, 2000) on parallel computers. Doherty, 2010a, Doherty, 2010b introduced a parallel version of the general parameter estimation package PEST, allowing parallel model runs. Specifically using parallel PEST, or the more recent BeoPEST (Schreuder, 2009), with the option of the pilot-point method (e.g. RamaRao et al., 1995), it is possible to estimate hydraulic-conductivity fields as continuous spatial distributions based on geostatistical rules exploiting parallel computer architecture (Hunt et al., 2010, Fienen et al., 2011). However, there are no solid rules of choosing the number and locations of pilot points, leading to non-unique user-dependent results. Cardiff and Barrash (2011) used the quasi-linear geostatistical approach (QLGA) of inversion (Kitanidis, 1995) on a parallel computer implementing simple parallelization of the sensitivity calculations. To the best of our knowledge, the QLGA has not yet been implemented using domain decomposition techniques on parallel computers.

In this paper, we choose the quasi-linear geostatistical approach (QLGA) (Kitanidis, 1995) as inverse kernel, using extensions and modifications to make the scheme more stable and efficient (e.g., Nowak and Cirpka, 2004, Sun and Yeh, 1990). In Section 2.3, we will review the existing approach of quasi-linear geostatistical inversion used as our basis.

In recent years, parallel computing has become increasingly important in the numerical solution of partial differential equations arising in many fields of computational physics, including flow and transport in porous media. In the present study, we will analyze the possibilities for parallelization of geostatistical inversion on mid-range and large computer clusters. The software framework “Distributed Unified Numerics Environment”—DUNE (Bastian et al., 2008a, Bastian et al., 2008b), providing a programming library with tools to solve partial differential equations in parallel, is used to implement a parallelized geostatistical inversion environment in C++.

Section snippets

Governing equations

For the sake of simplification, we restrict our analysis to steady-state groundwater flow without internal sources or sinks $- \nabla \cdot (K \nabla h) = 0 on Ω,$ in which K is the hydraulic conductivity, h is the hydraulic head, and $Ω$ denotes the domain. We set Dirichlet boundary conditions at the in- and outflow of the domain, $Γ_{in}$ and $Γ_{out}$ , and no-flow Neumann boundary conditions along all other boundaries of the domain, $Γ_{no}$ $h = h_{in} on Γ_{in},$ $h = h_{out} on Γ_{out},$ $n \cdot (K \nabla h) = 0 on Γ_{no},$ in which $h_{in}$ and $h_{out}$ are given functions along the

Parallelizing the quasi-linear geostatistical inverse approach

Given a sequential implementation of the quasi-linear geostatistical inverse approach, the question arises which parts can be parallelized. The computation of sensitivities for different measurements may easily be parallelized, because the individual sensitivities are independent from each other. Each of these computations requires the solution of an adjoint problem, Eq. (16). On a second level, the solution of both the adjoint and forward problem, Eq. (1), can be parallelized using domain

Implementation

The software framework DUNE (Bastian et al., 2008a, Bastian et al., 2008b) provides an easy access to different grids (regular and non-regular, conforming and non-conforming) and facilitates the implementation of various finite element schemes with different shape functions. Here, we choose the continuous Galerkin method with bilinear finite elements on a regular cuboidal grid.

For the implementation of efficient parallel programs, three principal programming models exist: distributed memory

Conclusion and outlook

In this study, we have realized a parallelized implementation of the quasi-linear geostatistical approach (QLGA). We have demonstrated that three-dimensional groundwater flow in heterogeneous structures can be simulated with adequate spatial resolutions of the model domain on a mid-range cluster. With the two-level parallelization approach, the code can be adjusted to the number of available computer nodes in clusters of different sizes.

We have shown for our test examples that,

(a)
a two-level

Acknowledgments

This study has been funded by the Baden-Württemberg Stiftung in its high-performance computing program, contract HPC-8, and by the German Federal Ministry for Education and Research (BMBF) within the geotechnologies program under the Grant 03G0742A. We thank Randall J. Hunt and an anonymous reviewer for helpful comments on the manuscript.

References (45)

P.K. Kitanidis
On the geostatistical approach to the inverse problem
Advances Water Resources
(1996)
W. Nowak et al.
A modified Levenberg–Marquardt algorithm for quasi-linear geostatistical inversing
Advances in Water Resources
(2004)
R.L. Schwede et al.
Use of steady-state concentration measurements in geostatistical inversion
Advances Water Resources
(2009)
P. Bastian et al.
A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE
Computing
(2008)
P. Bastian et al.
A generic grid interface for parallel and adaptive scientific computing. Part I: abstract framework
Computing
(2008)
A.F. Benett
Inverse Methods in Physical Oceanography
(1992)
S.J. Berg et al.
Three-dimensional transient hydraulic tomography in a highly heterogeneous glaciofluvial aquifer-aquitard system
Water Resources Research
(2011)
Blatt, M., 2010. A parallel algebraic multigrid method for elliptic problems with highly discontinuous coefficients....
D. Braess
Towards algebraic multigrid for elliptic problems of second order
Computing
(1995)
M. Cardiff et al.
3-D transient hydraulic tomography in unconfined aquifers with fast drainage response
Water Resources Research
(2011)

J. Carrera et al.

Estimation of aquifer parameters under transient and steady-state conditions. 3. Application to synthetic and field data

Water Resources Research

(1986)

O.A. Cirpka et al.

Characterization of mixing and dilution in heterogeneous aquifers by means of local temporal moments

Water Resources Research

(2000)

P. Davis

Circulant Matrices, Pure and Applied Mathematics

(1979)

J. Doherty

Addendum to the PEST Manual

(2010)

J. Doherty

PEST, Model-Independent Parameter Estimation-User Manual

(2010)

Fienen, M.N., Kunicki, T.C., Kester, D.E., 2011. cloudPEST—A Python Module for Cloud-Computing Deployment of PEST, a...

M. Frigo et al.

The design and implementation of fftw3

Proceedings of the IEEE

(2005)

J.J. Gómez-Hernández et al.

Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 1. Theory

Journal of Hydrology

(1997)

I.J. Good

On the inversion of circulant matrices

Biometrika

(1950)

J. Hadamard

Sur les problèmes aux dérivées partielles et leur signification physique

(1902)

A.F. Hernandez et al.

Inverse stochastic moment analysis of steady state flow in randomly heterogeneous media

Water Resources Research

(2006)

R.J. Hunt et al.

Using a cloud to replenish parched groundwater modeling efforts

Ground Water

(2010)

Cited by (7)

Efficient geostatistical inversion of transient groundwater flow using preconditioned nonlinear conjugate gradients
2017, Advances in Water Resources
Citation Excerpt :
While the new approach turned out to be more expensive for steady-state flow problems with isolated head observations, it leads to a significant reduction of computational effort in the case of transient data. While this work presents the algorithm in its basic form, we omitted aspects that are necessary for real-world application, namely the handling of three-dimensional observations and the inclusion of several types of observations (hydraulic heads, tracer data, data obtained by geophysical monitoring, among others Schwede et al., 2012). Preliminary results in this regard have been promising (Klein, 2016), and we will elaborate on these points in future work.
In the geostatistical inverse problem of subsurface hydrology, continuous hydraulic parameter fields, in most cases hydraulic conductivity, are estimated from measurements of dependent variables, such as hydraulic heads, under the assumption that the parameter fields are autocorrelated random space functions. Upon discretization, the continuous fields become large parameter vectors with $O (10^{4} - 10^{7})$ elements. While cokriging-like inversion methods have been shown to be efficient for highly resolved parameter fields when the number of measurements is small, they require the calculation of the sensitivity of each measurement with respect to all parameters, which may become prohibitive with large sets of measured data such as those arising from transient groundwater flow.
We present a Preconditioned Conjugate Gradient method for the geostatistical inverse problem, in which a single adjoint equation needs to be solved to obtain the gradient of the objective function. Using the autocovariance matrix of the parameters as preconditioning matrix, expensive multiplications with its inverse can be avoided, and the number of iterations is significantly reduced. We use a randomized spectral decomposition of the posterior covariance matrix of the parameters to perform a linearized uncertainty quantification of the parameter estimate.
The feasibility of the method is tested by virtual examples of head observations in steady-state and transient groundwater flow. These synthetic tests demonstrate that transient data can reduce both parameter uncertainty and time spent conducting experiments, while the presented methods are able to handle the resulting large number of measurements.
HT2DINV: A 2D forward and inverse code for steady-state and transient hydraulic tomography problems
2015, Computers and Geosciences
Citation Excerpt :
In this approach, the measurements are introduced sequentially and the weights coefficients are derived by finding a Best Linear Unbiased Estimation (BLUE methods). The SSLE has been applied in several studies (e.g., Berg and Illman, 2011b; Huang et al., 2011; Illman et al., 2009; Mao et al. 2013; Straface et al., 2007; Yeh and Liu, 2000; Yeh and Zhu, 2007; Zhu and Yeh, 2005). Another method, the pilot point method, has been applied as a parameterization approach for reducing the number of unknown parameters to estimate by evaluating a few key-locations (called pilot points) in the aquifer where the values of the transmissivity can be assessed to fit the hydraulic head measurements via a deterministic or stochastic optimization process (e.g., Doherty, 2003).
Hydraulic tomography is a technique used to characterize the spatial heterogeneities of storativity and transmissivity fields. The responses of an aquifer to a source of hydraulic stimulations are used to recover the features of the estimated fields using inverse techniques. We developed a 2D free source Matlab package for performing hydraulic tomography analysis in steady state and transient regimes. The package uses the finite elements method to solve the ground water flow equation for simple or complex geometries accounting for the anisotropy of the material properties. The inverse problem is based on implementing the geostatistical quasi-linear approach of Kitanidis combined with the adjoint-state method to compute the required sensitivity matrices. For undetermined inverse problems, the adjoint-state method provides a faster and more accurate approach for the evaluation of sensitivity matrices compared with the finite differences method. Our methodology is organized in a way that permits the end-user to activate parallel computing in order to reduce the computational burden. Three case studies are investigated demonstrating the robustness and efficiency of our approach for inverting hydraulic parameters.
Numerical solution of steady-state groundwater flow and solute transport problems: Discontinuous Galerkin based methods compared to the Streamline Diffusion approach
2015, Computer Methods in Applied Mechanics and Engineering
Citation Excerpt :
For the solution of the steady-state convection-dominant transport equation, we have compared a higher order symmetric interior penalty discontinuous Galerkin (DG) method to the streamline diffusion (SDFEM) method. Putting special emphasis on their applicability to geostatistical inversion problems in 3-D, aiming at simulations with high mesh resolutions [57,58], we have analyzed efficiency and accuracy. Two main issues occurring in the solution of the convection-dominated transport equation were tackled:
In this study, we consider the simulation of subsurface flow and solute transport processes in the stationary limit. In the convection-dominant case, the numerical solution of the transport problem may exhibit non-physical diffusion and under- and overshoots. For a higher order symmetric interior penalty discontinuous Galerkin (DG) discretization, we present a novel approach for reducing numerical under- and overshoots near sharp fronts, that are not resolved by the mesh, using a diffusive $L^{2}$ -projection. In the context of geostatistical inversion, where a small amount of oscillations is tolerated by a proper treatment of measurement errors, this may serve as an efficient alternative to adaptive mesh refinement. Furthermore, we realize a fast solver for the arising linear system by reordering the degrees of freedom in flow direction and exploiting the upwind character of the DG scheme. In 2-D and 3-D examples, we compare the DG-based method to the streamline diffusion approach with respect to computing time and their ability to resolve steep fronts.
Three-dimensional geostatistical inversion of synthetic tomographic pumping and heat-tracer tests in a nested-cell setup
2014, Advances in Water Resources
Citation Excerpt :
This leads to nested parallelization, including domain decomposition methods for each PDE to be solved and parallel evaluation of sensitivities. Details on the parallelization strategy have been published elsewhere [59]. The artificial test case analyzed in this study resembles the Lauswiesen field site near Tübingen in Germany [55,32], where field experiments on heat-tracer tomography are currently performed, even though not yet with the full setup described in the following.
A main purpose of groundwater inverse modeling lies in estimating the hydraulic conductivity field of an aquifer. Traditionally, hydraulic head measurements, possibly obtained in tomographic setups, are used as data. Because the groundwater flow equation is diffusive, many pumping and observation wells would be necessary to obtain a high resolution of hydraulic conductivity, which is typically not possible. We suggest performing heat tracer tests using the same already installed pumping wells and thermometers in observation planes to amend the hydraulic head data set by the arrival times of the heat signals. For each tomographic combinations of wells, we recommend installing an outer pair of pumping wells, generating artificial ambient flow, and an inner well pair in which the tests are performed. We jointly invert heads and thermal arrival times in 3-D by the quasi-linear geostatistical approach using an efficiently parallelized code running on a mid-range cluster. In the present study, we evaluate the value of heat tracer versus head data in a synthetic test case, where the estimated fields can be compared to the synthetic truth. Because the sensitivity patterns of the thermal arrival times differ from those of head measurements, the resolved variance in the estimated field is 6 to 10 times higher in the joint inversion in comparison to inverting head data only. Also, in contrast to head measurements, reversing the flow field and repeating the heat-tracer test improves the estimate in terms of reducing the estimation variance of the estimate. Based on the synthetic test case, we recommend performing the tests in four principal directions, requiring in total eight pumping wells and four intersecting observation planes for heads and temperature in each direction.
A real-time interactive simulation framework for watershed decision making using numerical models and virtual environment
2013, Journal of Hydrology
Citation Excerpt :
Not only is it difficult to substantially improve the calculation speed of CPU frequency and thereby take full advantage of multi-core computers, neither can we realize computer cluster computing using network technology. In watershed numerical simulation, many parallel computation methods to reduce computing time have been studied (Iannone et al., 2007; Lecca et al., 2011; Singh et al., 2011; Vivoni et al., 2011; Schwede et al., 2012). However, most of these methods require large investments in equipment purchase and installation, as well as designated space for computer installation and related cooling system capacity.
Decision support systems based on a virtual environment (VE) are becoming a popular platform in watershed simulation and management. Simulation speed and data visualization is of great significance to decision making, especially in urgent events. Real-time interaction during the simulation process is also very important for dealing with different conditions and for making timely decisions. In this study, a VE-based real-time interactive simulation framework (VERTISF) is developed and applied to simulation and management of the Dujiangyan Project in China. In VERTISF development, a virtual reality platform and numerical models were hosted on different computers and connected by a network to improve simulation speed. Different types of numerical models were generalized in a unified architecture based on time step, and interactive control was realized by modifying model boundary conditions at each time step. The “instruction-response” method and data interpolation were used to synchronize virtual environment visualization and numerical model calculation. Implementation of the framework was based on modular software design; various computer languages can be used to develop the appropriate module. Since only slight modification was needed for current numerical model integration in the framework, VERTISF was easy to extend. Results showed that VERTISF could take full advantage of hardware development, and it was a simple and effective solution for complex watershed simulation.
Combining 3D Hydraulic Tomography with Tracer Tests for Improved Transport Characterization
2016, Groundwater

View all citing articles on Scopus

View full text

Efficient parallelization of geostatistical inversion using the quasi-linear approach

Abstract

Highlights

Introduction

Section snippets

Governing equations

Parallelizing the quasi-linear geostatistical inverse approach

Implementation

Conclusion and outlook

Acknowledgments

Advances Water Resources

Advances in Water Resources

Advances Water Resources

A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE

Computing

A generic grid interface for parallel and adaptive scientific computing. Part I: abstract framework

Computing

Inverse Methods in Physical Oceanography

Three-dimensional transient hydraulic tomography in a highly heterogeneous glaciofluvial aquifer-aquitard system

Water Resources Research

Towards algebraic multigrid for elliptic problems of second order

Computing

3-D transient hydraulic tomography in unconfined aquifers with fast drainage response

Water Resources Research

Estimation of aquifer parameters under transient and steady-state conditions. 3. Application to synthetic and field data

Water Resources Research

Characterization of mixing and dilution in heterogeneous aquifers by means of local temporal moments

Water Resources Research

Circulant Matrices, Pure and Applied Mathematics

Addendum to the PEST Manual

PEST, Model-Independent Parameter Estimation-User Manual

The design and implementation of fftw3

Proceedings of the IEEE

Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 1. Theory

Journal of Hydrology

On the inversion of circulant matrices

Biometrika

Sur les problèmes aux dérivées partielles et leur signification physique

Inverse stochastic moment analysis of steady state flow in randomly heterogeneous media

Water Resources Research

Using a cloud to replenish parched groundwater modeling efforts

Ground Water