Complex systems theory meets phytoplankton “big data”.
Calibration of stochastic rainfall-runoff models using scaling laws for improved prediction of extreme events.
Second-order phase transition in phytoplankton trait dynamics
Key traits of unicellular species, such as cell size, often follow scale-free or self-similar distributions, hinting at an underlying critical process. However, linking such empirical scaling laws to the critical regime of realistic individual-based model classes is difficult. Here, we reveal new empirical scaling evidence associated with a transition in the population and chlorophyll dynamics of phytoplankton. We offer a possible explanation for these observations by deriving scaling laws in the vicinity of the critical point of a new universality class of non-local cell growth and division models. This "criticality hypothesis" can be tested through the new scaling predictions our model class yields for the response of chlorophyll distributions to perturbations. The derived scaling laws may also be generalized to other cellular traits and environmental drivers relevant to phytoplankton ecology.
Held, J.; Lorimer, T.; Pomati, F.; Stoop, R.; Albert, C. (2020) Second-order phase transition in phytoplankton trait dynamics, Chaos: An Interdisciplinary Journal of Nonlinear Science, 30(5), 053109 (9 pp.), doi:10.1063/1.5141755, Institutional Repository
Signature-domain calibration of hydrological models using Approximate Bayesian Computation: theory and comparison to existing applications
This study considers Bayesian calibration of hydrological models using streamflow signatures and its implementation using Approximate Bayesian Computation (ABC). If the modeling objective is to predict streamflow time series and associated uncertainty, a probabilistic model of streamflow must be specified, but the inference equations must be developed in the signature domain. However, even starting from simple probabilistic models of streamflow time series, working in the signature domain makes the likelihood function difficult or impractical to evaluate (in particular, it is unavailable in closed form). This challenge can be tackled using ABC, a general class of numerical algorithms for sampling from conditional distributions, such as (but not limited to) Bayesian posteriors given any calibration data. Using ABC does not avoid the requirement of Bayesian inference to specify a probability model of the data, but rather exchanges the requirement to evaluate the pdf of this model (needed to evaluate the likelihood function) for the requirement to sample model output realizations. For this reason, ABC is attractive for inference in the signature domain. We clarify poorly understood aspects of ABC in the hydrological literature, including similarities and differences between ABC and GLUE, and comment on previous applications of ABC in hydrology. An analysis of ABC approximation errors and their dependence on the tolerance is presented. An empirical case study illustrates the impact of omitting the specification of a probabilistic model (and instead using a deterministic model within the ABC algorithm) and the impact of a coarse ABC tolerance.
Kavetski, D.; Fenicia, F.; Reichert, P.; Albert, C. (2018) Signature-domain calibration of hydrological models using Approximate Bayesian Computation: theory and comparison to existing applications, Water Resources Research, 54(6), 4059-4083, doi:10.1002/2017WR020528, Institutional Repository
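The signature-domain ABC idea can be illustrated with a minimal rejection sampler. Everything below is a hypothetical toy setup (a one-parameter lognormal "streamflow" model, with mean flow and flow variability as signatures), not the paper's hydrological model or algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: one parameter theta sets the scale of a lognormal
# "streamflow"; this stands in for a stochastic hydrological model.
def simulate_streamflow(theta, n=365):
    return np.exp(theta + 0.5 * rng.standard_normal(n))

# Signatures: summary statistics of the flow series (here mean flow and variability).
def signatures(q):
    return np.array([q.mean(), q.std()])

def abc_rejection(s_obs, prior_sample, tol, n_draws=5000):
    """Plain rejection ABC: keep draws whose simulated signatures land within tol."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()
        if np.linalg.norm(signatures(simulate_streamflow(theta)) - s_obs) < tol:
            accepted.append(theta)
    return np.array(accepted)

# "Observed" signatures generated with theta = 1.0
s_obs = signatures(simulate_streamflow(1.0))
posterior = abc_rejection(s_obs, lambda: rng.uniform(0.0, 2.0), tol=0.5)
```

Shrinking `tol` reduces the ABC approximation error discussed in the abstract, at the cost of a lower acceptance rate.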
Accelerating Bayesian inference in hydrological modeling with a mechanistic emulator
As in many fields of dynamic modeling, the long runtime of hydrological models hinders Bayesian inference of model parameters from data. By replacing a model with an approximation of its output as a function of input and/or parameters, emulation allows us to complete this task by trading off accuracy for speed. We combine (i) the use of a mechanistic emulator, (ii) low-discrepancy sampling of the parameter space, and (iii) iterative refinement of the design data set, to perform Bayesian inference with a very small design data set constructed with 128 model runs in a parameter space of up to eight dimensions. In our didactic example we use a model implemented with the hydrological simulator SWMM that allows us to compare our inference results against those derived with the full model. This comparison demonstrates that iterative improvements lead to reasonable results with a very small design data set.
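The low-discrepancy sampling of the parameter space (step ii) can be sketched with a hand-rolled Halton sequence. The 128-point, eight-dimensional design mirrors the setting quoted in the abstract, while the prime bases and unit-cube bounds are illustrative choices:

```python
import numpy as np

def halton(index, base):
    """Van der Corput radical inverse of `index` in `base` (one Halton axis)."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r

def halton_design(n_points, bases):
    """Low-discrepancy design in the unit hypercube, one Halton axis per base."""
    return np.array([[halton(i + 1, b) for b in bases] for i in range(n_points)])

# 128 design points in 8 dimensions, using the first eight primes as bases.
primes = [2, 3, 5, 7, 11, 13, 17, 19]
design = halton_design(128, primes)
```

Rescaling each column to physical parameter bounds and running the model at each row yields the kind of design data set the emulator is conditioned on.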
Boosting Bayesian parameter inference of nonlinear stochastic differential equation models by Hamiltonian scale separation
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included in the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped onto beads that are heavier than those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with multiple time-scale integration. A separation of time scales arises naturally if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
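The building block of the approach, a Hamiltonian Monte Carlo update with a leapfrog integrator, can be sketched as follows. The multiple time-scale integration and the polymer-like posterior of the paper are beyond this toy, which uses a single-scale integrator on a standard 2-d Gaussian target:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative target: a 2-d standard Gaussian; U is the negative log density.
def U(q):      return 0.5 * np.sum(q**2)
def grad_U(q): return q

def hmc_step(q, eps=0.1, n_leap=20):
    """One Hamiltonian Monte Carlo update with a leapfrog integrator."""
    p = rng.standard_normal(q.shape)          # resample momenta
    q_new, p_new = q.copy(), p.copy()
    p_new -= 0.5 * eps * grad_U(q_new)        # half step in momentum
    for _ in range(n_leap - 1):
        q_new += eps * p_new                  # full step in position
        p_new -= eps * grad_U(q_new)          # full step in momentum
    q_new += eps * p_new
    p_new -= 0.5 * eps * grad_U(q_new)
    # Metropolis accept/reject on the total energy error of the trajectory
    dH = (U(q_new) + 0.5 * p_new @ p_new) - (U(q) + 0.5 * p @ p)
    return q_new if rng.random() < np.exp(-dH) else q

q = np.zeros(2)
samples = []
for _ in range(2000):
    q = hmc_step(q)
    samples.append(q)
samples = np.asarray(samples)
```

In the paper's setting, the positions `q` are the simulated state values between measurement points, and the scale separation allows the fast harmonic modes to be integrated with a different (or analytic) step.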
Describing the catchment-averaged precipitation as a stochastic process improves parameter and input estimation
Rainfall input uncertainty is one of the major concerns in hydrological modeling. Unfortunately, during inference, input errors are usually neglected, which can lead to biased parameters and implausible predictions. Rainfall multipliers can reduce this problem but still fail when the observed input (precipitation) has a different temporal pattern from the true one or if the true nonzero input is not detected. In this study, we propose an improved input error model which is able to overcome these challenges and to assess and reduce input uncertainty. We formulate the average precipitation over the watershed as a stochastic input process (SIP) and, together with a model of the hydrosystem, include it in the likelihood function. During statistical inference, we use “noisy” input (rainfall) and output (runoff) data to learn about the “true” rainfall, model parameters, and runoff. We test the methodology with the rainfall-discharge dynamics of a small urban catchment. To assess its advantages, we compare SIP with simpler methods of describing uncertainty within statistical inference: (i) standard least squares (LS), (ii) bias description (BD), and (iii) rainfall multipliers (RM). We also compare two scenarios: accurate versus inaccurate forcing data. Results show that when inferring the input with SIP and using inaccurate forcing data, the whole-catchment precipitation can still be realistically estimated and thus physical parameters can be “protected” from the corrupting impact of input errors. BD, which corrects the output rather than the input, inferred similarly unbiased parameters. This is not the case with LS and RM. During validation, SIP also delivers realistic uncertainty intervals for both rainfall and runoff. Thus, the technique presented is a significant step toward better quantifying input uncertainty in hydrological inference. As a next step, SIP will have to be combined with a technique addressing model structure uncertainty.
Del Giudice, D.; Albert, C.; Rieckermann, J.; Reichert, P. (2016) Describing the catchment-averaged precipitation as a stochastic process improves parameter and input estimation, Water Resources Research, 52(4), 3162-3186, doi:10.1002/2015WR017871, Institutional Repository
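A minimal illustration of treating precipitation as a stochastic process: a latent Ornstein-Uhlenbeck "rainfall potential" whose positive part is the rainfall (a simple device for wet/dry intermittency), routed through a linear reservoir to runoff. The process and all parameters are illustrative stand-ins, not the paper's SIP formulation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Latent Ornstein-Uhlenbeck "rainfall potential"; its positive part is rainfall.
def rainfall_potential(n, dt=1.0, tau=5.0, sigma=1.0):
    a = np.exp(-dt / tau)                  # exact one-step OU autocorrelation
    s = sigma * np.sqrt(1.0 - a**2)        # matching innovation scale
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + s * rng.standard_normal()
    return x

rain = np.maximum(rainfall_potential(10_000), 0.0)   # dry wherever potential < 0

# A linear reservoir turns the stochastic input into runoff; in SIP, the input
# realization and the model parameters are inferred jointly from rain/runoff data.
def linear_reservoir(rain, dt=1.0, k=20.0):
    storage, runoff = 0.0, np.empty_like(rain)
    for t, p in enumerate(rain):
        storage += dt * (p - storage / k)  # mass balance of the store
        runoff[t] = storage / k
    return runoff

runoff = linear_reservoir(rain)
```
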
Computationally efficient implementation of a novel algorithm for the General Unified Threshold model of Survival (GUTS)
The General Unified Threshold model of Survival (GUTS) provides a consistent mathematical framework for survival analysis. However, the calibration of GUTS models is computationally challenging. We present a novel algorithm and its fast implementation in our R package, GUTS, that help to overcome these challenges. We show a step-by-step application example consisting of model calibration and uncertainty estimation as well as making probabilistic predictions and validating the model with new data. Using self-defined wrapper functions, we show how to produce informative text printouts and plots without effort, for the inexperienced as well as the advanced user. The complete ready-to-run script is available as supplemental material. We expect that our software facilitates novel re-analyses of existing survival data and helps address new research questions in a wide range of sciences. In particular, the ability to quickly quantify stressor thresholds in conjunction with dynamic compensating processes, and their uncertainty, is an improvement that complements current survival analysis methods.
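The model behind the package can be sketched in a few lines: in the stochastic-death (SD) special case of GUTS, scaled damage tracks the external concentration, and hazard accrues once damage exceeds a threshold. The sketch below is in Python (not the R package's API), uses a simple Euler step, and all parameter values are made up:

```python
import numpy as np

def guts_sd_survival(conc, dt, kd, z, kk, hb=0.0):
    """Survival probability under the stochastic-death (SD) special case of GUTS.

    conc : external concentration time series
    kd   : dominant rate constant of the scaled damage
    z    : effect threshold
    kk   : killing rate
    hb   : background hazard rate
    """
    damage, cum_hazard = 0.0, 0.0
    survival = [1.0]
    for c in conc:
        damage += dt * kd * (c - damage)                  # damage tracks concentration
        cum_hazard += dt * (kk * max(damage - z, 0.0) + hb)
        survival.append(np.exp(-cum_hazard))              # S(t) = exp(-integrated hazard)
    return np.array(survival)

# Constant exposure above the threshold: survival starts to drop once damage exceeds z.
S = guts_sd_survival(conc=np.full(200, 2.0), dt=0.1, kd=0.5, z=1.0, kk=0.3)
```

Calibration then amounts to fitting `kd`, `z`, `kk`, and `hb` to observed survivor counts, which is the computationally hard step the package addresses.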
Model bias and complexity - understanding the effects of structural deficits and input errors on runoff predictions
Oversimplified models and erroneous inputs play a significant role in impairing environmental predictions. Assessing the contribution of these errors to model uncertainty remains challenging. Our objective is to understand the effect of model complexity on systematic modeling errors. Our method consists of formulating alternative models with increasing detail and flexibility and describing their systematic deviations by an autoregressive bias process. We test the approach in an urban catchment with five drainage models. Our results show that a single bias description produces reliable predictions for all models. The bias decreases with increasing model complexity and then stabilizes. The bias decline can be associated with reduced structural deficits, while the remaining bias is probably dominated by input errors. Combining a bias description with a multimodel comparison is an effective way to assess the influence of structural and rainfall errors on flow forecasts.
Del Giudice, D.; Reichert, P.; Bareš, V.; Albert, C.; Rieckermann, J. (2015) Model bias and complexity - understanding the effects of structural deficits and input errors on runoff predictions, Environmental Modelling and Software, 64, 205-214, doi:10.1016/j.envsoft.2014.11.006, Institutional Repository
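A minimal sketch of the autoregressive bias description: observed runoff is modeled as the deterministic model output plus a stationary AR(1) bias process plus white measurement noise. The hydrograph and all numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def ar1_bias(n, dt=1.0, tau=10.0, sigma_b=0.5):
    """Stationary AR(1) bias process B(t) (a discretized Ornstein-Uhlenbeck process)."""
    a = np.exp(-dt / tau)                     # autocorrelation over one step
    b = np.empty(n)
    b[0] = sigma_b * rng.standard_normal()    # draw from the stationary distribution
    for t in range(1, n):
        b[t] = a * b[t - 1] + sigma_b * np.sqrt(1.0 - a**2) * rng.standard_normal()
    return b

# Observation model: observed runoff = model output + autocorrelated bias + white noise.
y_model = 2.0 + np.sin(np.linspace(0.0, 20.0, 500))   # hypothetical simulated hydrograph
y_obs = y_model + ar1_bias(500) + 0.1 * rng.standard_normal(500)
```

In the paper's comparison, the fitted magnitude of this bias process is what shrinks as model structure improves, while the residual floor is attributed to input errors.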
A simulated annealing approach to approximate Bayes computations
Approximate Bayes computations (ABC) are used for parameter inference when the likelihood function of the model is expensive to evaluate but relatively cheap to sample from. In particle ABC, an ensemble of particles in the product space of model outputs and parameters is propagated in such a way that its output marginal approaches a delta function at the data and its parameter marginal approaches the posterior distribution. Inspired by Simulated Annealing, we present a new class of particle algorithms for ABC, based on a sequence of Metropolis kernels, associated with a decreasing sequence of tolerances with respect to the data. Unlike other algorithms, our class of algorithms is not based on importance sampling. Hence, it does not suffer from a loss of effective sample size due to re-sampling. We prove convergence under a condition on the speed at which the tolerance is decreased. Furthermore, we present a scheme that adapts the tolerance and the jump distribution in parameter space according to some mean-fields of the ensemble, which preserves the statistical independence of the particles, in the limit of infinite sample size. This adaptive scheme aims to converge as close as possible to the correct result with as few system updates as possible via minimizing the entropy production of the process. The performance of this new class of algorithms is compared against two other recent algorithms on two toy examples as well as on a real-world example from genetics.
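The flavor of this algorithm class can be sketched on a toy problem: an ensemble of particles updated by Metropolis kernels under a shrinking tolerance, with no resampling. The fixed geometric cooling and all tuning constants below are illustrative simplifications of the paper's adaptive, entropy-minimizing scheme:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy problem: one observation of a Normal(theta, 1) model, flat prior on [-5, 5].
y_observed = 1.5

def simulate(theta):
    return theta + rng.standard_normal()

def distance(y):
    return abs(y - y_observed)

def sabc(n_particles=500, n_sweeps=100, eps0=5.0, cooling=0.95, step=0.5):
    """Annealed particle ABC: Metropolis kernels under a decreasing tolerance.

    No importance-sampling resampling occurs, so the ensemble size is never
    degraded; a fixed geometric cooling stands in for the adaptive schedule.
    """
    theta = rng.uniform(-5.0, 5.0, n_particles)             # prior draws
    dist = np.array([distance(simulate(t)) for t in theta])
    eps = eps0
    for _ in range(n_sweeps):
        eps *= cooling
        prop = theta + step * rng.standard_normal(n_particles)
        d_prop = np.array([distance(simulate(t)) for t in prop])
        # Metropolis ratio of the tolerance-smoothed kernel exp(-distance/eps)
        log_alpha = np.minimum(0.0, (dist - d_prop) / eps)
        accept = (rng.uniform(size=n_particles) < np.exp(log_alpha)) & (np.abs(prop) <= 5.0)
        theta[accept] = prop[accept]
        dist[accept] = d_prop[accept]
    return theta

posterior_sample = sabc()   # approaches the exact posterior N(1.5, 1) as eps shrinks
```
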
In applied sciences, we often deal with deterministic simulation models that are too slow for simulation-intensive tasks such as calibration or real-time control. In this paper, an emulator for a generic dynamic model, given by a system of ordinary nonlinear differential equations, is developed. The nonlinear differential equations are linearized and Gaussian white noise is added to account for the nonlinearities. The resulting linear stochastic system is conditioned on a set of solutions of the nonlinear equations that have been calculated prior to the emulation. A path-integral approach is used to derive the Gaussian distribution of the emulated solution. The solution reveals that most of the computational burden can be shifted to the conditioning phase of the emulator and the complexity of the actual emulation step only scales like O(Nn) in multiplications of matrices of the dimension of the state space. Here, N is the number of time-points at which the solution is to be emulated and n is the number of solutions the emulator is conditioned on. The applicability of the algorithm is demonstrated with the hydrological model logSPM.
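In linear-algebra terms, the conditioning step at the heart of such an emulator is Gaussian conditioning: given a joint Gaussian over state values, condition on the components fixed by the precomputed solutions. The helper below shows that core computation on a bivariate toy example; the paper's path-integral derivation organizes the same computation efficiently along the time axis, which this sketch does not reproduce:

```python
import numpy as np

def condition_gaussian(mu, cov, idx_obs, y_obs):
    """Mean and covariance of the remaining components of N(mu, cov),
    conditioned on the components idx_obs taking the values y_obs."""
    idx_rest = np.setdiff1d(np.arange(len(mu)), idx_obs)
    S_oo = cov[np.ix_(idx_obs, idx_obs)]
    S_ro = cov[np.ix_(idx_rest, idx_obs)]
    S_rr = cov[np.ix_(idx_rest, idx_rest)]
    gain = np.linalg.solve(S_oo, S_ro.T).T          # S_ro @ S_oo^{-1}
    mu_post = mu[idx_rest] + gain @ (y_obs - mu[idx_obs])
    cov_post = S_rr - gain @ S_ro.T
    return mu_post, cov_post

# Bivariate example: correlation 0.5, observe the first coordinate to equal 1.0.
mu = np.zeros(2)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
mu_post, cov_post = condition_gaussian(mu, cov, np.array([0]), np.array([1.0]))
# mu_post is [0.5], cov_post is [[0.75]]
```
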