Elsevier

Computers & Geosciences

Volume 44, July 2012, Pages 78-85
Computers & Geosciences

Efficient parallelization of geostatistical inversion using the quasi-linear approach

https://doi.org/10.1016/j.cageo.2012.03.014Get rights and content

Abstract

Hydraulic conductivity is a key parameter for the simulation of groundwater flow and transport. Typically, it is highly variable in space and difficult to determine by direct methods. The most common approach is to infer hydraulic-conductivity values from measurements of dependent quantities, such as hydraulic head and concentration. In geostatistical inversion, the parameters are estimated as continuous, spatially auto-correlated fields, the most likely values of which are obtained by conditioning on the indirect data. In order to identify small-scaled features, a fine three-dimensional discretization of the domain is needed. This leads to high computational demands in the solution of the forward problem and the calculation of sensitivities. In realistic three-dimensional settings with many measurements parallel computing becomes mandatory.

In the present study, we investigate how parallelization of the quasi-linear geostatistical approach of inversion can be made most efficient. We suggest a two-level approach of parallelization, in which the computational domain is subdivided and the evaluation of sensitivities is also parallelized. We analyze how these two levels of parallelization should be balanced to optimally exploit a given number of computing nodes.

Highlights

► Parallel geostatistical inversion. ► Two level parallelization. ► Efficient C++ programming.

Introduction

Hydraulic conductivity is a key parameter in subsurface hydrology. In inverse modeling its spatial distribution is inferred from indirect measurements, such as hydraulic heads or concentrations. In principle, hydraulic conductivity can vary on all spatial scales, implying a high resolution of the estimated conductivity field. In three-dimensional applications, this may result in millions of hydraulic-conductivity values to be estimated and correspondingly large groundwater flow problems to be solved. The associated high demands with respect to computational power and memory make the application of high-performance computing techniques necessary. To be efficient, a parallelization strategy has to be adapted for each computational problem individually. In this paper, we explore the implementation of efficient parallel methods for geostatistical inversion.

All quantities that depend directly or indirectly on hydraulic conductivity may be used in the inversion procedure. These include, among others, direct estimates of hydraulic conductivity (e.g., from grain-size analysis, flowmeter tests, or small-scale hydraulic tests), hydraulic heads, and solute concentrations from tracer tests. For the sake of readability we limit ourselves to the inversion of hydraulic heads in this paper, but the same general inversion strategy can be applied to other types of data as well (Cirpka and Kitanidis, 2000, Nowak and Cirpka, 2006, Li et al., 2005, Li et al., 2007, Li et al., 2008, Schwede and Cirpka, 2009, Pollock and Cirpka, 2010).

The general objective of our approach is to obtain the hydraulic-conductivity field as a continuous spatial field. Upon discretization, this leads to as many discrete parameter values as elements of the grid, which is typically a number much higher than the number of the measurements available. The inversion problem is therefore mathematically ill posed (Hadamard, 1902). For regularization, one may try to define a small set of zones with constant conductivity values (Carrera and Neuman, 1986), or introduce smoothness constraints (Tikhonov, 1963). While classical Tikhonov regularization is achieved by adding a norm of the spatial parameter gradients to the objective function, geostatistical regularization is obtained by assuming that the parameter field is a random, auto-correlated space variable with certain spatial correlation structure (Matheron, 1971). In this context, inversion becomes (geo)statistical conditioning.

Various versions of geostatistical inversion exist, differing, among others, in the exact representation of the generated fields, the approach of minimizing the objective function (most commonly a variant of the Gauss–Newton or conjugate-gradient methods), and the question whether only a smooth best estimate is sought for or multiple conditional realizations (Sahuquillo et al., 1992, Gómez-Hernández et al., 1997, RamaRao et al., 1995, Benett, 1992, Valstar et al., 2004, Kitanidis, 1995, McLaughlin and Townley, 1996, Yeh et al., 1996, Hernandez et al., 2006).

Only few studies exist using methods of parallel computation in the context of geostatistical inversion. Berg and Illman (2011) implemented the sequential successive linear estimator (SSLE) (e.g., Yeh et al., 1996, Yeh and Liu, 2000) on parallel computers. Doherty, 2010a, Doherty, 2010b introduced a parallel version of the general parameter estimation package PEST, allowing parallel model runs. Specifically using parallel PEST, or the more recent BeoPEST (Schreuder, 2009), with the option of the pilot-point method (e.g. RamaRao et al., 1995), it is possible to estimate hydraulic-conductivity fields as continuous spatial distributions based on geostatistical rules exploiting parallel computer architecture (Hunt et al., 2010, Fienen et al., 2011). However, there are no solid rules of choosing the number and locations of pilot points, leading to non-unique user-dependent results. Cardiff and Barrash (2011) used the quasi-linear geostatistical approach (QLGA) of inversion (Kitanidis, 1995) on a parallel computer implementing simple parallelization of the sensitivity calculations. To the best of our knowledge, the QLGA has not yet been implemented using domain decomposition techniques on parallel computers.

In this paper, we choose the quasi-linear geostatistical approach (QLGA) (Kitanidis, 1995) as inverse kernel, using extensions and modifications to make the scheme more stable and efficient (e.g., Nowak and Cirpka, 2004, Sun and Yeh, 1990). In Section 2.3, we will review the existing approach of quasi-linear geostatistical inversion used as our basis.

In recent years, parallel computing has become increasingly important in the numerical solution of partial differential equations arising in many fields of computational physics, including flow and transport in porous media. In the present study, we will analyze the possibilities for parallelization of geostatistical inversion on mid-range and large computer clusters. The software framework “Distributed Unified Numerics Environment”—DUNE (Bastian et al., 2008a, Bastian et al., 2008b), providing a programming library with tools to solve partial differential equations in parallel, is used to implement a parallelized geostatistical inversion environment in C++.

Section snippets

Governing equations

For the sake of simplification, we restrict our analysis to steady-state groundwater flow without internal sources or sinks·(Kh)=0onΩ,in which K is the hydraulic conductivity, h is the hydraulic head, and Ω denotes the domain. We set Dirichlet boundary conditions at the in- and outflow of the domain, Γin and Γout, and no-flow Neumann boundary conditions along all other boundaries of the domain, Γno h=hinonΓin,h=houtonΓout,n·(Kh)=0onΓno,in which hin and hout are given functions along the

Parallelizing the quasi-linear geostatistical inverse approach

Given a sequential implementation of the quasi-linear geostatistical inverse approach, the question arises which parts can be parallelized. The computation of sensitivities for different measurements may easily be parallelized, because the individual sensitivities are independent from each other. Each of these computations requires the solution of an adjoint problem, Eq. (16). On a second level, the solution of both the adjoint and forward problem, Eq. (1), can be parallelized using domain

Implementation

The software framework DUNE (Bastian et al., 2008a, Bastian et al., 2008b) provides an easy access to different grids (regular and non-regular, conforming and non-conforming) and facilitates the implementation of various finite element schemes with different shape functions. Here, we choose the continuous Galerkin method with bilinear finite elements on a regular cuboidal grid.

For the implementation of efficient parallel programs, three principal programming models exist: distributed memory

Conclusion and outlook

In this study, we have realized a parallelized implementation of the quasi-linear geostatistical approach (QLGA). We have demonstrated that three-dimensional groundwater flow in heterogeneous structures can be simulated with adequate spatial resolutions of the model domain on a mid-range cluster. With the two-level parallelization approach, the code can be adjusted to the number of available computer nodes in clusters of different sizes.

We have shown for our test examples that,

  • (a)

    a two-level

Acknowledgments

This study has been funded by the Baden-Württemberg Stiftung in its high-performance computing program, contract HPC-8, and by the German Federal Ministry for Education and Research (BMBF) within the geotechnologies program under the Grant 03G0742A. We thank Randall J. Hunt and an anonymous reviewer for helpful comments on the manuscript.

References (45)

  • P.K. Kitanidis

    On the geostatistical approach to the inverse problem

    Advances Water Resources

    (1996)
  • W. Nowak et al.

    A modified Levenberg–Marquardt algorithm for quasi-linear geostatistical inversing

    Advances in Water Resources

    (2004)
  • R.L. Schwede et al.

    Use of steady-state concentration measurements in geostatistical inversion

    Advances Water Resources

    (2009)
  • P. Bastian et al.

    A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE

    Computing

    (2008)
  • P. Bastian et al.

    A generic grid interface for parallel and adaptive scientific computing. Part I: abstract framework

    Computing

    (2008)
  • A.F. Benett

    Inverse Methods in Physical Oceanography

    (1992)
  • S.J. Berg et al.

    Three-dimensional transient hydraulic tomography in a highly heterogeneous glaciofluvial aquifer-aquitard system

    Water Resources Research

    (2011)
  • Blatt, M., 2010. A parallel algebraic multigrid method for elliptic problems with highly discontinuous coefficients....
  • D. Braess

    Towards algebraic multigrid for elliptic problems of second order

    Computing

    (1995)
  • M. Cardiff et al.

    3-D transient hydraulic tomography in unconfined aquifers with fast drainage response

    Water Resources Research

    (2011)
  • J. Carrera et al.

    Estimation of aquifer parameters under transient and steady-state conditions. 3. Application to synthetic and field data

    Water Resources Research

    (1986)
  • O.A. Cirpka et al.

    Characterization of mixing and dilution in heterogeneous aquifers by means of local temporal moments

    Water Resources Research

    (2000)
  • P. Davis

    Circulant Matrices, Pure and Applied Mathematics

    (1979)
  • J. Doherty

    Addendum to the PEST Manual

    (2010)
  • J. Doherty

    PEST, Model-Independent Parameter Estimation-User Manual

    (2010)
  • Fienen, M.N., Kunicki, T.C., Kester, D.E., 2011. cloudPEST—A Python Module for Cloud-Computing Deployment of PEST, a...
  • M. Frigo et al.

    The design and implementation of fftw3

    Proceedings of the IEEE

    (2005)
  • J.J. Gómez-Hernández et al.

    Stochastic simulation of transmissivity fields conditional to both transmissivity and piezometric data, 1. Theory

    Journal of Hydrology

    (1997)
  • I.J. Good

    On the inversion of circulant matrices

    Biometrika

    (1950)
  • J. Hadamard

    Sur les problèmes aux dérivées partielles et leur signification physique

    (1902)
  • A.F. Hernandez et al.

    Inverse stochastic moment analysis of steady state flow in randomly heterogeneous media

    Water Resources Research

    (2006)
  • R.J. Hunt et al.

    Using a cloud to replenish parched groundwater modeling efforts

    Ground Water

    (2010)
  • Cited by (7)

    • Efficient geostatistical inversion of transient groundwater flow using preconditioned nonlinear conjugate gradients

      2017, Advances in Water Resources
      Citation Excerpt :

      While the new approach turned out to be more expensive for steady-state flow problems with isolated head observations, it leads to a significant reduction of computational effort in the case of transient data. While this work presents the algorithm in its basic form, we omitted aspects that are necessary for real-world application, namely the handling of three-dimensional observations and the inclusion of several types of observations (hydraulic heads, tracer data, data obtained by geophysical monitoring, among others Schwede et al., 2012). Preliminary results in this regard have been promising (Klein, 2016), and we will elaborate on these points in future work.

    • HT2DINV: A 2D forward and inverse code for steady-state and transient hydraulic tomography problems

      2015, Computers and Geosciences
      Citation Excerpt :

      In this approach, the measurements are introduced sequentially and the weights coefficients are derived by finding a Best Linear Unbiased Estimation (BLUE methods). The SSLE has been applied in several studies (e.g., Berg and Illman, 2011b; Huang et al., 2011; Illman et al., 2009; Mao et al. 2013; Straface et al., 2007; Yeh and Liu, 2000; Yeh and Zhu, 2007; Zhu and Yeh, 2005). Another method, the pilot point method, has been applied as a parameterization approach for reducing the number of unknown parameters to estimate by evaluating a few key-locations (called pilot points) in the aquifer where the values of the transmissivity can be assessed to fit the hydraulic head measurements via a deterministic or stochastic optimization process (e.g., Doherty, 2003).

    • Numerical solution of steady-state groundwater flow and solute transport problems: Discontinuous Galerkin based methods compared to the Streamline Diffusion approach

      2015, Computer Methods in Applied Mechanics and Engineering
      Citation Excerpt :

      For the solution of the steady-state convection-dominant transport equation, we have compared a higher order symmetric interior penalty discontinuous Galerkin (DG) method to the streamline diffusion (SDFEM) method. Putting special emphasis on their applicability to geostatistical inversion problems in 3-D, aiming at simulations with high mesh resolutions [57,58], we have analyzed efficiency and accuracy. Two main issues occurring in the solution of the convection-dominated transport equation were tackled:

    • Three-dimensional geostatistical inversion of synthetic tomographic pumping and heat-tracer tests in a nested-cell setup

      2014, Advances in Water Resources
      Citation Excerpt :

      This leads to nested parallelization, including domain decomposition methods for each PDE to be solved and parallel evaluation of sensitivities. Details on the parallelization strategy have been published elsewhere [59]. The artificial test case analyzed in this study resembles the Lauswiesen field site near Tübingen in Germany [55,32], where field experiments on heat-tracer tomography are currently performed, even though not yet with the full setup described in the following.

    • A real-time interactive simulation framework for watershed decision making using numerical models and virtual environment

      2013, Journal of Hydrology
      Citation Excerpt :

      Not only is it difficult to substantially improve the calculation speed of CPU frequency and thereby take full advantage of multi-core computers, neither can we realize computer cluster computing using network technology. In watershed numerical simulation, many parallel computation methods to reduce computing time have been studied (Iannone et al., 2007; Lecca et al., 2011; Singh et al., 2011; Vivoni et al., 2011; Schwede et al., 2012). However, most of these methods require large investments in equipment purchase and installation, as well as designated space for computer installation and related cooling system capacity.

    View all citing articles on Scopus
    View full text