# Low-memory filtering for large-scale data assimilation

##### Bibov, Alexander (2017-05-22)

Doctoral dissertation

Bibov, Alexander

22.05.2017

Lappeenranta University of Technology

Acta Universitatis Lappeenrantaensis

**The permanent address of the publication is**

http://urn.fi/URN:ISBN:978-952-335-077-9

#### Abstract

Data assimilation is the process of combining information obtained from a mathematical model with observed data in an attempt to increase the accuracy of both. Real-world phenomena are hard to model exactly. In addition, even when certain processes allow a very accurate mathematical description, the model often cannot be represented in closed form.

These facts make it necessary to deal with modelling errors in numerical simulations of real-life phenomena. One of the usual ways to improve the quality of simulated data is to use information from (possibly indirect) observations. The observations, in turn, are prone to measurement errors, which should also be taken into consideration. Therefore, the task commonly addressed by data assimilation can be roughly formulated as follows: given a prediction computed by a certain numerical simulation and the corresponding (possibly indirect) observation, provide an optimal estimate of the state of the system in question. Here optimality can have different meanings, but the commonly assumed case is that the estimate must be unbiased, so that it gives a correct grasp of reality, and that it should have minimal variance, which corresponds to noise reduction. The algorithms that solve the data assimilation task are called data assimilation methods. From this description it is apparent that data assimilation is in essence similar to fitting a model to data. The special term “data assimilation” was borrowed from the meteorological community and is mostly used when the system used to compute predictions is high-dimensional and chaotic, i.e. sensitive to the initial state. This dissertation mainly focuses on such models, which is the reason for using the term here.

In this dissertation we consider data assimilation methods for the case where the dimension of the state space of the simulated system is too “large-scale” for the classical algorithms to be practicable. By “large-scale” we mean the following: if n is the dimension of the state space, then n is considered “large” if an n-by-n matrix cannot fit into the available computer memory in the desired storage format (e.g. single or double precision). A common example of a large-scale model is an operational weather prediction system. At the time of writing, n is on the order of 10^9 for such systems, which means that a 10^9-by-10^9 matrix stored in double-precision format (often required in scientific simulations) would occupy approximately 7275958 TB of memory (8 × 10^18 bytes). This is far beyond the capabilities of all modern supercomputers: when this text was written, the fastest supercomputer was Sunway TaihuLight, located at the National Supercomputing Center in Wuxi, China, and it had only 1310 TB of RAM.
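The memory figure quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `dense_matrix_bytes` is hypothetical, and it assumes that "TB" in the text means 2^40 bytes, which is the convention that reproduces the quoted number.

```python
# Rough check of the storage needed for a dense n-by-n covariance matrix.
# Assumption: "TB" is interpreted as 2**40 bytes (binary terabyte).

TIB = 2**40  # bytes per binary terabyte


def dense_matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed to store a dense n-by-n matrix (8 bytes = double precision)."""
    return n * n * bytes_per_entry


n = 10**9                              # state dimension quoted in the abstract
size_tib = dense_matrix_bytes(n) / TIB
taihulight_tib = 1310                  # RAM figure quoted in the abstract

print(f"dense covariance: about {size_tib:,.0f} TB")
print(f"ratio to quoted RAM: about {size_tib / taihulight_tib:,.0f}x")
```

Under these assumptions a single dense matrix would need several thousand times the quoted memory of the machine, before any working storage for the filter itself is counted.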

The motivation for placing special emphasis on large-scale models is that the classical optimal data assimilation approaches employ covariance matrices of the state vectors (the vectors containing all parameters that fully represent the state of the model at a given time instance). This implies the need to store n-by-n matrices, which makes implementing any such method impractical. In this dissertation we attempt to address this problem by considering low-memory approximations of the classical approaches. The approximations are by nature sub-optimal, but the way they are formulated alleviates the memory issues that arise in the classical algorithms.
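To make the idea of a low-memory covariance approximation concrete, the following is a minimal sketch of one generic option: a diagonal-plus-low-rank representation that never forms the n-by-n matrix. It is not the L-BFGS-based construction developed in the dissertation; the class name `LowRankCovariance` and the sizes used are assumptions chosen for illustration.

```python
import numpy as np


class LowRankCovariance:
    """Represents P ~ diag(d) + U @ U.T using O(n * k) stored numbers.

    Illustrative only: a generic diagonal-plus-low-rank form, not the
    L-BFGS-based approximation studied in the dissertation.
    """

    def __init__(self, d, U):
        self.d = np.asarray(d, dtype=float)   # shape (n,)
        self.U = np.asarray(U, dtype=float)   # shape (n, k), with k << n

    def matvec(self, x):
        # Apply P to x without ever building the dense n-by-n matrix.
        return self.d * x + self.U @ (self.U.T @ x)

    def stored_entries(self):
        return self.d.size + self.U.size


rng = np.random.default_rng(0)
n, k = 1000, 10                     # small sizes so the dense check is feasible
d = np.full(n, 0.5)
U = rng.standard_normal((n, k))
P = LowRankCovariance(d, U)

x = rng.standard_normal(n)
dense = np.diag(d) + U @ U.T        # only possible because n is small here
print(np.allclose(P.matvec(x), dense @ x))
print(P.stored_entries(), "stored entries instead of", n * n)
```

The design point is that a filter only needs products of the covariance with vectors, so storing n(k + 1) numbers can stand in for n^2 when k is much smaller than n, at the cost of the approximation being sub-optimal.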

In the present work we concentrate on low-memory approximations of the Extended Kalman filter based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) unconstrained optimization scheme, and we present a family of stabilizing corrections that circumvent certain stability issues present in some previously known approaches based on this scheme. We also demonstrate that our stabilizing corrections imply better convergence properties, and we use this fact to formulate and solve the parallel filtering task, which is essentially a low-memory approximation of the fixed-lag Kalman smoother.

We study the performance of our methods using a synthetic model, the two-layer Quasi-Geostrophic model, which describes conservative wind motion over a cylindrical surface vertically divided into two layers. The model is a well-known test case and has been used extensively in ongoing research conducted at, e.g., the European Centre for Medium-Range Weather Forecasts, Reading, UK. We analyse the performance of our methods by comparing them against a set of competing low-memory data assimilation techniques, such as the Variational Kalman Filter, the BFGS Low-Memory Kalman Filter, Weak-Constraint 4D-VAR, and a selection of ensemble-based algorithms.

Finally, we analyse the applicability of our approaches by considering the problem of estimating the intensity of blooming in the coastal regions of the Gulf of Finland during the spring and summer months. For this case we use high-resolution satellite images that provide chlorophyll concentrations in the gulf. However, the data taken at certain time instances are incomplete due to cloud cover, and therefore the task of estimating the chlorophyll concentrations turns out to be a perfect candidate for data assimilation. In addition, the problem is naturally large-scale due to the resolution of the original data.

##### Collections

- Doctoral dissertations [859]