Workshops on Computational and Mathematical Challenges in Material Science and Engineering: Data Assimilation in the Geosciences

An Overview of the State-of-the-Art in Data Assimilation

Istvan Szunyogh (TAMU)

Abstract

Data assimilation is the process of obtaining an estimate of the state of a complex physical system, such as the terrestrial or a planetary atmosphere or the oceans, based on observations and a dynamical model of the system. The dynamical model represents a collection of our knowledge of the physical, chemical, and biological processes that determine the evolution of the system. The use of the model in the data assimilation process helps extract information from the observations in an intelligent way through a formal statistical-mathematical optimization process.

In this presentation, we will provide a brief survey of the state-of-the-art data assimilation algorithms, including variational and Kalman-filter based sequential data assimilation schemes. Special attention will be paid to the algorithmic solutions that have been developed in the last few years to deal with the high-dimensionality of the resulting computational algorithms, which is the main challenge in most practical applications. The techniques and the most important open scientific and practical challenges will be illustrated on simple dynamical models (often called toy models), a state-of-the-art global atmospheric circulation model, a coastal ocean model, and a model of the Martian atmosphere.
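For orientation, the analysis step shared by the Kalman-filter based schemes can be written in its textbook form. The notation here is an assumption of this sketch and is not taken from the talk: $x^b$ and $P^b$ are the background state and its error covariance, $y$ the observations, $H$ a linear observation operator, and $R$ the observation error covariance.

```latex
\begin{aligned}
K   &= P^b H^T \left( H P^b H^T + R \right)^{-1}, \\
x^a &= x^b + K \left( y - H x^b \right), \\
P^a &= \left( I - K H \right) P^b .
\end{aligned}
```

For a state of dimension $n$, $P^b$ has $n^2$ entries, which suggests why storing and propagating it explicitly is impractical for realistic models. Under the same linear-Gaussian assumptions, the variational (3D-Var) estimate that minimizes $J(x) = \tfrac{1}{2}(x - x^b)^T (P^b)^{-1} (x - x^b) + \tfrac{1}{2}(y - Hx)^T R^{-1} (y - Hx)$ coincides with $x^a$.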

Use of Statistical Estimation Methods for Background Error Covariance in Ensemble Kalman Filter

Mikyoung Jun (TAMU)

Abstract

This study examines the use of both parametric and nonparametric statistical methods to estimate the spatial covariance structure. The parametric method uses a parametric class of covariance functions with an explicit expression and certain parameters as its arguments, while the nonparametric method performs kernel smoothing in space with a pre-specified bandwidth. Both methods are applied to the Lorenz 1996 model as a surrogate for complex atmospheric systems. It is demonstrated that the covariances estimated through these statistical methods, particularly the nonparametric method, are quantitatively better than sample covariances, with or without covariance localization, estimated directly from a limited number of ensemble members. The statistical methods are able to preserve the true correlation length scale in situations where it is a slowly varying or constant function of space, and thus overcome the problem of spurious non-zero covariance values that are commonly seen in sample covariance matrices at large separation distances.
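The Lorenz 1996 model used as the test bed here is small enough to sketch in a few lines. The following is a minimal illustration, not the authors' code: the forcing value of 8, the time step of 0.05, and the fourth-order Runge-Kutta integrator are conventional choices assumed for this example, and the function names are hypothetical.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Return dx/dt for the Lorenz-96 model with cyclic boundary conditions.

    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F
    """
    n = len(x)
    # Negative indices wrap around in Python, which handles x[i-1] and x[i-2].
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, dt=0.05, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step."""

    def shifted(state, k, scale):
        return [s + scale * ki for s, ki in zip(state, k)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, dt / 2), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, dt / 2), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]
```

An ensemble for covariance estimation could then be generated by perturbing an initial state slightly and stepping each member forward with `rk4_step`.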

An Adaptive Approach to Mitigate Background Covariance Limitations in the Ensemble Kalman Filter

Ibrahim Hoteit (KAUST)

Abstract

This contribution presents a new approach to address the background covariance limitations arising from under-sampled ensembles and unaccounted model errors in the ensemble Kalman filter (EnKF). The method enhances the representativeness of the EnKF ensemble by augmenting it with new ensemble members chosen adaptively to add missing information that prevents the EnKF from fully fitting the data to the ensemble. The vectors to be added are obtained by back-projecting the residuals of the observation misfits from the EnKF analysis step onto the state space. The back-projection is done using an optimal interpolation (OI) scheme based on an estimate of the covariance of the subspace missing from the ensemble. The OI uses a pre-selected stationary background covariance matrix, as in the hybrid EnKF/3DVAR approach, but the resulting correction is included as a new ensemble member instead of being added to all existing ensemble members. Results from assimilation experiments with the Lorenz-96 model will be presented and discussed.
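One plausible reading of the back-projection step, written in the standard OI form, is sketched below. The symbols are assumptions of this sketch, since the abstract does not give the exact formulation: $B$ is the pre-selected stationary background covariance, $H$ a linear observation operator, $R$ the observation error covariance, $\bar{x}^a$ the EnKF analysis mean of an $N$-member ensemble, and $r$ the residual misfit after the analysis.

```latex
r = y - H \bar{x}^a, \qquad
\delta x = B H^T \left( H B H^T + R \right)^{-1} r, \qquad
x^a_{N+1} = \bar{x}^a + \delta x .
```

In a hybrid scheme the increment $\delta x$ would be added to every member; here it enters only through the additional member $x^a_{N+1}$, so that the ensemble subspace gains a direction it previously lacked.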

This is joint work with Hajoon Song, Bruce Cornuelle, and Aneesh Subramanian.

Bayesian Estimation of the Drag Coefficient from the Upper Ocean Response to a Hurricane: A Feasibility Study

Sarah Zedler (TAMU)

Abstract

Estimates of the drag coefficient at high wind speeds are largely calculated from atmospheric measurements, and vary by a factor of at least 1.5. Adopting a Bayesian approach using forward models of the ocean’s response to a hurricane, we seek to determine if a small number of measurements of upper ocean temperature and currents can be used to make estimates of the drag coefficient that have a smaller range of uncertainty than previously found.

We calculate the probability distribution function of the estimated drag coefficient parameterization that results from adding realistic levels of noise to the ocean response. Allowing the drag coefficient two degrees of freedom, namely its values at 35 m/s and at 45 m/s, the uncertainty in the optimal value is about 25% for levels of instrument noise up to 1 K for a misfit function based on temperature, or 1.0 m/s for a misfit function based on 15-m currents. The results are robust for several different instrument arrays; the noise levels do not decrease by much for arrays with more than 40 sensors when the sensor positions are random. Having a small number of sensors in a data assimilation problem would likely provide sufficient accuracy in the estimated drag coefficient.

Estimating the State of Large Spatiotemporally Chaotic Systems: Ensemble Weather Forecasting, Etc.

Edward Ott (University of Maryland)

Abstract

State estimation is a general requirement for model-based prediction of a system’s future evolution. As such, the state estimation problem has received intensive study, and a very nice rigorous solution, applicable in many engineering contexts, was presented long ago by Rudolf Kalman.

However, very large spatio-temporally chaotic systems, as occur in geophysical situations, present an extreme challenge for state estimation, because straightforward application of well-understood conventional techniques, like the Kalman filter, is typically not feasible due to computational limitations. This talk will present background material, a proposed method for adapting the Kalman filter to large systems, and illustrative results from application of the technique to weather forecasting and to a laboratory experiment.

Combined State and Parameter Estimation in the Ensemble Kalman Filter Framework

Jonathan Stroud (George Washington University)

Abstract

Kalman filter methods for real-time assimilation of observations and dynamical systems typically assume knowledge of the system parameters. However, relatively little work has been done on extending state estimation procedures to include parameter estimation. Here, in the context of the ensemble Kalman filter, we propose new Monte Carlo algorithms for combined estimation of states and parameters. Our proposed assimilation algorithms can be implemented in a likelihood or Bayesian framework, and extend standard ensemble methods, including serial and square-root assimilation schemes. The methods are illustrated on the Lorenz 40-variable system and a real data example of cloud motion.
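As a point of comparison, the conventional way to estimate parameters within an EnKF is state augmentation: the parameter is appended to the state vector and updated through its sample cross-covariance with the observed quantity. The sketch below illustrates that baseline only, not the new algorithms proposed in the talk. It assumes a scalar state observed directly, a scalar parameter, and a perturbed-observation update; all function names are hypothetical.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def augmented_enkf_update(states, params, y_obs, obs_var, rng=None):
    """Update (state, parameter) ensemble pairs from one observation of the state.

    Perturbed-observation EnKF on the augmented vector z = (x, theta) with
    observation operator H z = x (an assumption of this example).
    """
    rng = rng or random.Random(0)
    innovation_var = covariance(states, states) + obs_var
    gain_state = covariance(states, states) / innovation_var
    # The parameter is unobserved; it is corrected only via its
    # cross-covariance with the observed state.
    gain_param = covariance(params, states) / innovation_var
    new_states, new_params = [], []
    for x, theta in zip(states, params):
        y_pert = y_obs + rng.gauss(0.0, math.sqrt(obs_var))
        new_states.append(x + gain_state * (y_pert - x))
        new_params.append(theta + gain_param * (y_pert - x))
    return new_states, new_params
```

With a small ensemble the cross-covariance in `gain_param` is noisy, which is one of the practical difficulties of parameter estimation by augmentation.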

Ensemble Data Assimilation for Ocean Biogeochemical Models

Michael Dowd (Dalhousie University)

Abstract

Statistical methodologies that estimate a system state and parameters for time dependent stochastic dynamic systems are now well established. A diverse array of biological oceanographic observations is now available from water sampling, satellites and robotic vehicles. Process based biogeochemical models describing the time evolution of these spatial fields are available as complex computer codes implementing PDE based fluid dynamical models coupled to biological processes. There is increasing interest in developing data assimilation approaches that rely on Monte Carlo, or ensemble, based solutions. In this talk, I will give an overview of some new developments using state space models, including ensemble Kalman filters, resampling approaches, and MCMC. Ongoing work using simple ODE based models of ocean biogeochemistry, as well as more complex and realistic PDE based models, will be used to illustrate these ideas. Challenges for adaptation of ensemble data assimilation to large dimension PDE based systems are discussed.

The Use of Streamlines for Flow-Relevant Covariance Localization for the EnKF

Deepak Devegowda (University of Oklahoma)

Abstract

Recent advances in the development and implementation of robust algorithms for data assimilation have considerably reduced the time and effort associated with reservoir characterization and eliminated the subjectivity associated with manual model calibration. The ensemble Kalman filter (EnKF) is one such promising technique for data assimilation and provides a relatively straightforward approach to incorporate diverse data types including production and/or time-lapse seismic data. Unlike traditional sensitivity-based history matching methods, the EnKF relies on a cross-covariance matrix computed from an ensemble of reservoir models to relate reservoir properties to production data. However, reliable and accurate subsurface characterization continues to be challenged by the need for a large number of model replicates to estimate sample-based statistical measures, specifically the covariances and cross-covariances that directly impact the spread of information from the measurement locations to the model parameters. Statistical noise resulting from modest ensemble sizes often leads to poor approximations of the cross-covariance matrix and significantly degrades the model updates, leading to geologically inconsistent subsurface models and a loss of geologic realism.

In this talk, I present the theory and application of a flow-relevant covariance localization scheme that utilizes streamline-derived information to identify regions within the reservoir that will have a maximum impact on the dynamic response, in order to address the difficulties in the implementation of the EnKF for operational data integration problems. Key to the success of streamline-based covariance localization is its close link to the underlying physics of flow compared to a simple distance-dependent covariance function as used in the past. We illustrate the approach with a synthetic example and a large field study that demonstrate the difficulties with the traditional EnKF implementation. In both numerical experiments, it is shown that these challenges are addressed using flow-relevant conditioning of the cross-covariance matrix. By mitigating sampling error in the cross-covariance estimates, the proposed approach provides significant computational savings by enabling the use of modest ensemble sizes, and consequently offers the opportunity for use with large field-scale groundwater and reservoir characterization studies.
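Both kinds of localization amount to an element-wise (Schur) product between the sample cross-covariance and a set of weights; they differ in how the weights are chosen. The sketch below contrasts the two under stated assumptions. The Gaspari-Cohn taper is a widely used distance-dependent choice and is assumed here as the representative of that family. The binary streamline mask is a hypothetical stand-in for the streamline-derived weights, whose actual construction the abstract does not specify.

```python
def gaspari_cohn(distance, half_width):
    """Gaspari-Cohn fifth-order taper: 1 at zero distance, 0 beyond 2 * half_width."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return -0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3 - (5.0 / 3.0) * r**2 + 1.0
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3 + (5.0 / 3.0) * r**2
                - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def streamline_weights(n_cells, cells_on_streamlines):
    """Hypothetical flow-relevant mask: 1 for cells crossed by streamlines
    that reach the observed well, 0 elsewhere."""
    return [1.0 if i in cells_on_streamlines else 0.0 for i in range(n_cells)]


def localize(cross_cov, weights):
    """Schur (element-wise) product of a cross-covariance vector with weights.

    cross_cov[i] is the sample covariance between the property in grid cell i
    and one observed datum.
    """
    return [c * w for c, w in zip(cross_cov, weights)]
```

For example, `localize(cross_cov, streamline_weights(len(cross_cov), {0, 2}))` keeps the covariance only in cells 0 and 2, whereas weights built from `gaspari_cohn` would damp it smoothly with distance from the well regardless of the flow pattern.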

Merging Data Inversion with Data Assimilation

Charles Jackson (UT Austin)

Abstract

Apart from interest in using data assimilation to estimate state variables, there is also interest in using data to improve the representation of system physics. The presentation will review efforts that have attempted to bridge these interests and the scientific, mathematical, and computational challenges that are involved.

A Simple Method for Assimilating Surface Currents into a Texas-Louisiana Shelf Circulation Model

Robert Hetland (TAMU)

Abstract

The Texas General Land Office funds two projects to estimate and predict surface currents over the Texas-Louisiana continental shelf for oil spill trajectory prediction. The Texas Automated Buoy System publishes real-time measurements of surface currents over the Texas-Louisiana shelf (http://tabs.gerg.tamu.edu/). An associated modeling effort uses predictions of winds from the National Center for Atmospheric Prediction to predict wind-driven surface currents in the Gulf of Mexico (http://seawater.tamu.edu/tglo/rxindex.html).

Presently, the numerical predictions are run operationally as a forward model, with no data assimilation. Simple estimates of the background error covariance are investigated for use in the operational model. The hypothesis is that the structure of the background error covariance does not change in time, and is proportional to the covariance of currents predicted within the model. Using this simple estimate for the background error covariance would allow the model to assimilate data without resorting to an inverse model, but would be superior to traditional optimal interpolation since the covariance in the currents is strongly anisotropic and typically follows the coastline.
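The hypothesis can be illustrated with a short sketch that builds a time-invariant background error covariance from stored model output. This is an illustration under stated assumptions rather than the operational procedure: the function name, the flattened list-of-snapshots layout, and the proportionality constant `scale` are all hypothetical.

```python
def static_background_covariance(snapshots, scale=1.0):
    """Estimate B as scale * (sample covariance over time of model snapshots).

    snapshots: list of model states, each a list of surface-current values
               at the same grid points, taken at different times.
    scale:     assumed proportionality constant between model variability
               and background error.
    """
    n_times = len(snapshots)
    n_vars = len(snapshots[0])
    means = [sum(s[j] for s in snapshots) / n_times for j in range(n_vars)]
    anomalies = [[s[j] - means[j] for j in range(n_vars)] for s in snapshots]
    return [
        [
            scale * sum(a[i] * a[j] for a in anomalies) / (n_times - 1)
            for j in range(n_vars)
        ]
        for i in range(n_vars)
    ]
```

Because the covariance is computed from the model's own currents, any anisotropy in the simulated flow, such as alignment with the coastline, carries over into the off-diagonal entries without having to be prescribed by an analytic correlation function.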