---
title: "Bayesian DEB Modelling of Eisenia fetida Growth and DEBtox Analysis"
author: "Branimir K. Hackenberger, Tamara Djerdj, Domagoj K. Hackenberger"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Bayesian DEB Modelling of Eisenia fetida Growth and DEBtox Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  fig.align = "center",
  eval = FALSE
)
```

# Introduction

Dynamic Energy Budget (DEB) theory provides a mechanistic, thermodynamically consistent framework for describing how organisms acquire and utilise energy for maintenance, growth, development, and reproduction (Kooijman 2010). Classical DEB calibration relies on deterministic optimisation, yielding a single best-fit parameter vector without formal uncertainty quantification.

**BayesianDEB** embeds the DEB ordinary differential equation (ODE) system within a Bayesian state-space model, using Stan's Hamiltonian Monte Carlo (HMC) sampler (Carpenter et al. 2017) to explore the full joint posterior distribution of all parameters. This approach naturally provides:

- **Uncertainty propagation** to derived quantities (ultimate length, EC~50~, NEC).
- **Prior incorporation** of biological knowledge from the AmP collection (Marques et al. 2018) or previous studies.
- **Hierarchical modelling** with partial pooling across individuals (Gelman and Hill 2006).
- **Model checking** through posterior predictive checks (Gelman et al. 2013, Ch. 6).

This vignette demonstrates the complete workflow in two case studies:

1. ***Eisenia fetida*** (earthworm): individual growth model, then a hierarchical multi-individual analysis.
2. ***DEBtox analysis***: toxicokinetic-toxicodynamic (TKTD) model on growth data under toxicant exposure, with EC~50~/NEC estimation.

# Prerequisites

```{r prerequisites}
library(BayesianDEB)
library(ggplot2)
library(posterior)  # for summarise_draws()
```

BayesianDEB requires **cmdstanr** and a working CmdStan installation for model fitting. Data preparation, prior specification, and utility functions work without Stan.

```{r check-stan}
check_cmdstanr()  # informative error if missing
```

# Part 1: *Eisenia fetida* Growth {#eisenia}

## Data description

The `eisenia_growth` dataset contains simulated weekly length measurements for 21 *Eisenia fetida* individuals over 84 days (12 weeks). The simulation used the standard 2-state DEB model (reserve $E$, structure $V$) with parameters representative of *E. fetida* from the AmP collection: $\{p_{Am}\} = 5.0$ J d$^{-1}$ cm$^{-2}$, $[p_M] = 0.5$ J d$^{-1}$ cm$^{-3}$, $\kappa = 0.75$, $v = 0.2$ cm d$^{-1}$, $[E_G] = 400$ J cm$^{-3}$. Individual variation in $\{p_{Am}\}$ (CV $\approx$ 10%) and Gaussian observation error ($\sigma_L = 0.015$ cm) were added.

```{r eisenia-explore}
data(eisenia_growth)

# Structure: 273 obs, 3 variables (id, time, length)
str(eisenia_growth)
length(unique(eisenia_growth$id))    # 21 individuals
length(unique(eisenia_growth$time))  # 13 time points (days 0–84)
```

```{r eisenia-plot, fig.cap="Growth trajectories of 21 *E. fetida* individuals. Structural length $L = V^{1/3}$ measured weekly over 12 weeks."}
ggplot(eisenia_growth, aes(time, length, group = id)) +
  geom_line(alpha = 0.3, colour = "steelblue") +
  geom_point(size = 0.8, alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = "Time (days)",
       y = expression(paste("Structural length ", L, " (cm)")),
       title = "Eisenia fetida: 21 individuals, 12 weeks")
```

Key features visible in the data:

- **Sigmoidal growth** consistent with the DEB prediction $L(t) \to L_\infty$ as $t \to \infty$.
- **Individual variation** in growth rate (spread of trajectories).
- **Measurement noise** visible as scatter around the smooth trend.
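
The simulation parameters quoted in the data description fix several compound DEB quantities in advance. The sketch below computes them in base R, using the formulas from the derived-quantities table in Step 7; the variable names are illustrative and not part of the BayesianDEB API.

```{r eisenia-truth}
# Simulation truth quoted in the data description (illustrative variable names)
p_Am  <- 5.0   # J d^-1 cm^-2
p_M   <- 0.5   # J d^-1 cm^-3
kappa <- 0.75
v     <- 0.2   # cm d^-1
E_G   <- 400   # J cm^-3

L_m <- kappa * p_Am / p_M        # maximum structural length (cm)
k_M <- p_M / E_G                 # somatic maintenance rate constant (1/d)
g   <- E_G * v / (kappa * p_Am)  # energy investment ratio (dimensionless)

c(L_m = L_m, k_M = k_M, g = g)
```

The value $L_m = 7.5$ cm matches the maximum structural length listed in the validation table near the end of this vignette, so posterior estimates of $L_m$ can be compared against it.
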

## Individual model: single organism {#individual}

We start with a single individual to validate the approach.

### Step 1: Prepare data

```{r ind-data}
df1 <- eisenia_growth[eisenia_growth$id == 5, ]
dat1 <- bdeb_data(growth = df1, f_food = 1.0)
dat1
```

The `f_food = 1.0` argument specifies ad libitum feeding ($f = 1$, the ratio of actual to maximum ingestion rate; Kooijman 2010, Eq. 2.3).

### Step 2: Specify model and priors

The individual model tracks two state variables: reserve energy $E$ (J) and structural volume $V$ (cm$^3$), governed by:

$$
\frac{dE}{dt} = \dot{p}_A - \dot{p}_C,
\qquad
\frac{dV}{dt} = \frac{\kappa \dot{p}_C - [p_M]V}{[E_G]}
$$

where $L = V^{1/3}$ is structural length, $\dot{p}_A = f\{p_{Am}\}L^2$ is the assimilation flux, and $\dot{p}_C$ is the mobilisation flux, which in the standard DEB model (Kooijman 2010) is

$$
\dot{p}_C = E \, \frac{[E_G] \, v \, L^2 + [p_M] V}{\kappa E + [E_G] V}
$$

Observed lengths are assumed to follow $L_\text{obs} \sim \mathcal{N}(\hat{L}, \sigma_L)$.

We set biologically informed priors based on published AmP values for earthworms:

| Parameter | Prior | Rationale |
|-----------|-------|-----------|
| $\{p_{Am}\}$ | LogNormal(1.5, 0.5) | Median $e^{1.5} \approx 4.5$; AmP range 3–8 |
| $[p_M]$ | LogNormal(−1.0, 0.5) | Median $e^{-1} \approx 0.37$; typical 0.15–0.8 |
| $\kappa$ | Beta(3, 2) | Mode 0.67; earthworms allocate ~60–80% to soma |
| $v$ | LogNormal(−1.5, 0.5) | Median $e^{-1.5} \approx 0.22$ cm/d |
| $[E_G]$ | LogNormal(6.0, 0.5) | Median $e^6 \approx 403$; typical 200–800 |
| $\sigma_L$ | HalfNormal(0.05) | Measurement precision ~0.01–0.03 cm |

```{r ind-model}
mod1 <- bdeb_model(dat1, type = "individual",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05)
  ))
mod1
```

Unspecified priors (`E0`, `L0`) are filled automatically from `prior_default("individual")`.
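
The medians quoted in the prior table can be reproduced, and the implied central 95% ranges inspected, with base R quantile functions. This is a minimal sketch: `lnorm_summary()` is a helper defined here for illustration only and is not a BayesianDEB function.

```{r ind-prior-check}
# Median and central 95% interval implied by a LogNormal(mu, sigma) prior
lnorm_summary <- function(mu, sigma) {
  c(median = exp(mu),
    lower  = qlnorm(0.025, meanlog = mu, sdlog = sigma),
    upper  = qlnorm(0.975, meanlog = mu, sdlog = sigma))
}

prior_check <- rbind(
  p_Am = lnorm_summary(1.5, 0.5),
  p_M  = lnorm_summary(-1.0, 0.5),
  v    = lnorm_summary(-1.5, 0.5),
  E_G  = lnorm_summary(6.0, 0.5)
)
round(prior_check, 3)

# Beta(3, 2) prior on kappa: mode (a - 1) / (a + b - 2) and central 95% interval
kappa_mode <- (3 - 1) / (3 + 2 - 2)
kappa_int  <- qbeta(c(0.025, 0.975), shape1 = 3, shape2 = 2)
kappa_mode
kappa_int
```

Comparing these intervals with the "typical" ranges in the Rationale column is a quick way to see whether a prior is wider or narrower than intended before any model is compiled.
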

### Step 3: Fit via MCMC

The model is compiled to C++ and sampled with the No-U-Turn Sampler (NUTS; Hoffman and Gelman 2014); the ODE system is integrated with the stiff BDF solver.

```{r ind-fit}
fit1 <- bdeb_fit(mod1,
  chains = 4,
  iter_warmup = 1000,
  iter_sampling = 2000,
  adapt_delta = 0.9,
  seed = 42
)
fit1
```

### Step 4: Convergence diagnostics

Diagnostics follow the recommendations of Vehtari et al. (2021):

```{r ind-diag}
diag1 <- bdeb_diagnose(fit1)
```

**What to check:**

- **Divergences = 0**: if any are present, increase `adapt_delta` (e.g., 0.95).
- **$\hat{R} < 1.01$** for all parameters: chains have mixed.
- **Bulk ESS > 400**: sufficient effective samples for posterior means.
- **Tail ESS > 400**: sufficient for credible interval endpoints.
- **E-BFMI > 0.3**: momentum resampling is efficient.

```{r ind-trace, fig.cap="MCMC trace plots for core DEB parameters. Well-mixed chains should appear as overlapping 'hairy caterpillars'."}
plot(fit1, type = "trace", pars = c("p_Am", "p_M", "kappa", "sigma_L"))
```

```{r ind-pairs, fig.cap="Bivariate posterior scatter. Strong correlation between $\\{p_{Am}\\}$ and $[p_M]$ is expected: both control ultimate size $L_\\infty = f \\kappa \\{p_{Am}\\} / [p_M]$."}
plot(fit1, type = "pairs", pars = c("p_Am", "p_M", "kappa", "E_G"))
```

### Step 5: Posterior summary

```{r ind-summary}
bdeb_summary(fit1,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
```

### Step 6: Posterior predictive check

Posterior predictive checks (PPCs) compare data replicated from the fitted model ($L^\text{rep}$) with the observed data ($L^\text{obs}$). If the model fits well, the observed points should fall within the envelope of replicated trajectories (Gelman et al. 2013, Ch. 6).

```{r ind-ppc, fig.cap="Posterior predictive check: grey lines are replicated growth trajectories, red points are observed data."}
ppc1 <- bdeb_ppc(fit1, type = "growth")
plot(ppc1, n_draws = 200)
```

```{r ind-traj, fig.cap="Posterior predicted trajectories (blue) with observed data (black points). The spread reflects parameter uncertainty."}
plot(fit1, type = "trajectory", n_draws = 200)
```

### Step 7: Derived quantities

BayesianDEB computes biologically meaningful quantities directly from the posterior, automatically propagating uncertainty:

| Quantity | Formula | Interpretation |
|----------|---------|----------------|
| $L_m$ | $\kappa \{p_{Am}\} / [p_M]$ | Maximum structural length ($f = 1$) |
| $L_\infty$ | $f \cdot L_m$ | Ultimate structural length at food $f$ |
| $k_M$ | $[p_M] / [E_G]$ | Somatic maintenance rate constant |
| $g$ | $[E_G] \, v / (\kappa \{p_{Am}\})$ | Energy investment ratio |
| $\dot{r}_B$ | $k_M \, g \,/\, \bigl(3(f + g)\bigr)$ | von Bertalanffy growth rate (Eq. 3.23) |

Note: all lengths are **structural** ($L = V^{1/3}$), not physical. Physical length is $L_w = L / \delta_M$, where $\delta_M$ is the species-specific shape coefficient (not estimated by this package).

```{r ind-derived}
der1 <- bdeb_derived(fit1,
  quantities = c("L_m", "L_inf", "k_M", "g", "growth_rate"),
  f = 1.0)
summarise_draws(der1, "mean", "sd",
  "q2.5"  = ~quantile(.x, 0.025),
  "q97.5" = ~quantile(.x, 0.975))
```

### Step 8: Scenario analysis with reduced food

How does ultimate size change if food availability drops to 70%?
Since $L_\infty \propto f$, we can compute this directly:

```{r ind-food, fig.cap="Posterior distributions of $L_\\infty$ at $f = 1.0$ (blue) and $f = 0.7$ (orange)."}
d_f10 <- bdeb_derived(fit1, quantities = "L_inf", f = 1.0)
d_f07 <- bdeb_derived(fit1, quantities = "L_inf", f = 0.7)

df_compare <- data.frame(
  L_inf = c(d_f10$L_inf, d_f07$L_inf),
  food  = rep(c("f = 1.0", "f = 0.7"), each = nrow(d_f10))
)

ggplot(df_compare, aes(x = L_inf, fill = food)) +
  geom_density(alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = expression(L[infinity] ~ "(cm)"),
       y = "Posterior density",
       fill = "Food level")
```

## Hierarchical model: 21 individuals {#hierarchical}

When multiple individuals are available, a hierarchical model is preferred. It estimates population-level distributions for parameters that vary across individuals, while sharing parameters that are species-level constants.

### Model structure

The hierarchical model places a lognormal random effect on the assimilation rate:

$$
\log\{p_{Am}\}_j = \mu_{\log p_{Am}} + \sigma_{\log p_{Am}} \cdot z_j,
\qquad z_j \sim \mathcal{N}(0, 1)
$$

This **non-centred parameterisation** (Betancourt and Girolami 2015) avoids the pathological funnel geometry that arises when $\sigma$ is small. The parameters $[p_M]$, $\kappa$, $v$, $[E_G]$ are shared across all individuals.
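
The non-centred construction above can be illustrated outside Stan with a few lines of base R. This is only a sketch under assumed hyperparameter values ($\mu_{\log p_{Am}} = \log 5$, $\sigma_{\log p_{Am}} = 0.1$, chosen to resemble the simulation settings); it shows the transformation itself, not the package's Stan code.

```{r hier-ncp-sketch}
set.seed(1)
n_ind <- 21

# Illustrative hyperparameter values (assumed here, not posterior estimates)
mu_log_p_Am    <- log(5)
sigma_log_p_Am <- 0.1

# Non-centred form: draw standard-normal offsets, then shift and scale
z        <- rnorm(n_ind)
p_Am_ind <- exp(mu_log_p_Am + sigma_log_p_Am * z)
range(p_Am_ind)

# Coefficient of variation implied by a lognormal with this sigma
cv_p_Am <- sqrt(exp(sigma_log_p_Am^2) - 1)
cv_p_Am
```

With $\sigma_{\log p_{Am}} = 0.1$ the implied coefficient of variation is close to 10%, which is the level of individual variation stated for the simulated dataset. The sampler explores the unit-scale offsets $z_j$ rather than the individual rates directly, which is what removes the dependence between $\sigma$ and the individual-level parameters.
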

### Prepare and specify

```{r hier-data}
dat_all <- bdeb_data(growth = eisenia_growth, f_food = 1.0)
dat_all  # 21 individuals, 273 observations
```

```{r hier-model}
mod_h <- bdeb_model(dat_all, type = "hierarchical",
  priors = list(
    mu_log_p_Am    = prior_normal(mu = 1.5, sigma = 0.5),
    sigma_log_p_Am = prior_exponential(rate = 2),
    p_M            = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa          = prior_beta(a = 3, b = 2),
    v              = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G            = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L        = prior_halfnormal(sigma = 0.05)
  ))
```

The prior on $\sigma_{\log p_{Am}}$ is Exponential(2), which has prior mean 0.5. It is a weakly informative choice for a hierarchical scale parameter, in the spirit of the recommendations of Gelman (2006): the density is highest at zero, yet the tail still allows substantial between-individual variation if the data support it.

### Fit with within-chain parallelism

With 21 individuals, each requiring an independent ODE solve per iteration, the hierarchical model benefits from **within-chain parallelism** via Stan's `reduce_sum`. Setting `threads_per_chain = 4` distributes the 21 ODE solves across 4 threads per chain.
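
Before launching the fit it can help to confirm that the requested thread count fits the machine. The check below is a rough sketch using `parallel::detectCores()`; the variable names are illustrative, and the even split in the last line is only an approximation, since `reduce_sum` decides the actual partitioning at run time.

```{r hier-threads-check}
chains            <- 4
threads_per_chain <- 4
total_threads     <- chains * threads_per_chain

# Physical core count; detectCores() may return NA on some platforms
n_cores <- parallel::detectCores(logical = FALSE)

if (!is.na(n_cores) && total_threads > n_cores) {
  message("Requested ", total_threads, " threads but only ", n_cores,
          " physical cores were detected; consider lowering threads_per_chain.")
}

# Approximate number of individuals per thread if 21 ODE solves are split evenly
ind_per_thread <- ceiling(21 / threads_per_chain)
ind_per_thread
```

If the total exceeds the available cores, reducing `threads_per_chain` (or running fewer chains in parallel) usually gives better wall-clock time than oversubscribing the CPU.
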

```{r hier-fit}
fit_h <- bdeb_fit(mod_h,
  chains = 4,
  iter_warmup = 1000,
  iter_sampling = 2000,
  adapt_delta = 0.95,
  max_treedepth = 12,
  threads_per_chain = 4,
  seed = 123
)
```

### Diagnostics

```{r hier-diag}
bdeb_diagnose(fit_h)
```

```{r hier-trace, fig.cap="Trace plots for population-level hyperparameters $\\mu_{\\log p_{Am}}$ and $\\sigma_{\\log p_{Am}}$."}
plot(fit_h, type = "trace", pars = c("mu_log_p_Am", "sigma_log_p_Am"))
```

```{r hier-post, fig.cap="Marginal posterior densities for shared parameters."}
plot(fit_h, type = "posterior",
  pars = c("mu_log_p_Am", "sigma_log_p_Am", "p_M", "kappa"))
```

### Population-level results

```{r hier-pop}
bdeb_summary(fit_h,
  pars = c("mu_log_p_Am", "sigma_log_p_Am", "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
```

### Shrinkage of individual estimates

A key feature of hierarchical models is **shrinkage**: individuals with sparse or noisy data are pulled toward the population centre. This is partial pooling, a principled compromise between complete pooling (ignoring individual variation) and no pooling (fitting each individual independently).

```{r hier-shrinkage, fig.cap="Individual-level $\\{p_{Am}\\}$ estimates (points: posterior means; bars: 90% CI) compared to the population median $\\exp(\\mu_{\\log p_{Am}})$ (dashed red line). Shrinkage toward the population centre is visible for individuals with noisier data."}
ind_summary <- bdeb_summary(fit_h,
  pars = paste0("p_Am_ind[", 1:21, "]"),
  prob = 0.90)
pop_summary <- bdeb_summary(fit_h, pars = "mu_log_p_Am")

# exp(mu) is the median (geometric mean) of the lognormal population distribution
pop_median_pAm <- exp(as.data.frame(pop_summary)$mean)

ind_df <- as.data.frame(ind_summary)
ind_df$individual <- 1:21

ggplot(ind_df, aes(x = individual, y = mean)) +
  geom_pointrange(aes(ymin = `5%`, ymax = `95%`),
                  colour = "steelblue", size = 0.4) +
  geom_hline(yintercept = pop_median_pAm,
             linetype = "dashed", colour = "red", linewidth = 0.8) +
  theme_bw(base_size = 12) +
  labs(x = "Individual",
       y = expression({p[Am]} ~ "(J/d/cm"^2*")"),
       title = "Individual assimilation rates with 90% CI")
```

### Prediction for a new individual

The `generated quantities` block draws `p_Am_new` from the population distribution, which is useful for predicting the performance of an unobserved individual from the same population:

```{r hier-new}
bdeb_summary(fit_h, pars = "p_Am_new", prob = 0.95)
```

### Individual vs hierarchical: comparison

```{r compare}
s_ind  <- bdeb_summary(fit1, pars = c("p_Am", "p_M", "kappa"), prob = 0.90)
s_hier <- bdeb_summary(fit_h, pars = c("p_M", "kappa"), prob = 0.90)

cat("=== Individual model (id = 5, n = 1) ===\n")
print(as.data.frame(s_ind), digits = 3, row.names = FALSE)

cat("\n=== Hierarchical model (n = 21) ===\n")
print(as.data.frame(s_hier), digits = 3, row.names = FALSE)
```

The hierarchical model yields **narrower credible intervals** for shared parameters ($[p_M]$, $\kappa$) because it pools information across 21 individuals.

# Part 2: DEBtox Analysis {#debtox}

## Background

The DEBtox framework (Jager et al. 2006; Jager and Zimmer 2012) extends DEB with toxicokinetic-toxicodynamic (TKTD) components.
A scaled internal damage variable $D_w$ tracks the toxicant's effect:

$$
\frac{dD_w}{dt} = k_d\bigl(\max(C_w - z_w, 0) - D_w\bigr)
$$

where $k_d$ is the damage recovery rate, $C_w$ is the external concentration, and $z_w$ is the **no-effect concentration** (NEC). The damage causes a stress factor $s = b_w \cdot D_w$ that reduces assimilation:

$$
\dot{p}_A = f\{p_{Am}\}L^2 \cdot \max(1 - s, \; 0)
$$

At steady state ($D_w = C_w - z_w$ for $C_w > z_w$), a 50% reduction in assimilation corresponds to $s = b_w (C_w - z_w) = 0.5$, so the EC~50~ is:

$$
\text{EC}_{50} = z_w + \frac{0.5}{b_w}
$$

## Data exploration

The `debtox_growth` dataset simulates growth under 4 toxicant concentrations (0, 20, 80, 200 arbitrary units), 10 individuals per group, measured weekly over 6 weeks. True parameters: NEC = 15, $b_w = 0.003$, which implies a true EC~50~ of $15 + 0.5 / 0.003 \approx 182$.

```{r debtox-explore, fig.cap="Growth trajectories under 4 toxicant concentrations. Higher concentrations suppress growth through reduced assimilation."}
data(debtox_growth)

ggplot(debtox_growth,
       aes(time, length, colour = factor(concentration), group = id)) +
  geom_line(alpha = 0.3) +
  geom_point(size = 0.8, alpha = 0.4) +
  facet_wrap(~concentration, labeller = label_both) +
  theme_bw(base_size = 11) +
  scale_colour_brewer(palette = "RdYlBu", direction = -1) +
  labs(x = "Time (days)", y = "Structural length (cm)",
       colour = "Concentration") +
  theme(legend.position = "none")
```

## Data preparation

```{r debtox-prep}
conc_levels <- unique(debtox_growth$concentration)
conc_map <- setNames(conc_levels, as.character(conc_levels))

dat_tox <- bdeb_data(
  growth = debtox_growth,
  concentration = conc_map,
  f_food = 1.0
)
dat_tox
```

## Model specification

```{r debtox-model}
mod_tox <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05),
    k_d     = prior_lognormal(mu = -1.0, sigma = 1.0),
    z_w     = prior_lognormal(mu = 2.5, sigma = 1.0),
    b_w     = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))
mod_tox
```

**Prior rationale for toxicological parameters:**

| Parameter | Prior | Median | 95% prior range | Rationale |
|-----------|-------|--------|-----------------|-----------|
| $k_d$ | LogNormal(−1, 1) | 0.37 d$^{-1}$ | 0.05–2.6 | Damage recovery on a timescale of hours to weeks |
| $z_w$ | LogNormal(2.5, 1) | 12.2 | 1.7–86 | NEC within tested range 0–200 |
| $b_w$ | LogNormal(−5, 2) | 0.0067 | 0.00013–0.34 | Weakly informative on effect intensity |

## Fit

```{r debtox-fit}
fit_tox <- bdeb_fit(mod_tox,
  chains = 4,
  iter_warmup = 1000,
  iter_sampling = 2000,
  adapt_delta = 0.95,
  max_treedepth = 12,
  threads_per_chain = 2,
  seed = 77
)
```

## Diagnostics

```{r debtox-diag}
bdeb_diagnose(fit_tox)
```

```{r debtox-trace-tox, fig.cap="Trace plots for the three toxicological parameters. Good mixing is essential for reliable EC$_{50}$ and NEC estimates."}
plot(fit_tox, type = "trace", pars = c("k_d", "z_w", "b_w"))
```

```{r debtox-post-tox, fig.cap="Marginal posterior densities for toxicological parameters."}
plot(fit_tox, type = "posterior", pars = c("k_d", "z_w", "b_w"))
```

```{r debtox-pairs, fig.cap="Posterior pairs for toxicological parameters. A correlation between $z_w$ and $b_w$ is expected since both determine the shape of the dose-response curve."}
plot(fit_tox, type = "pairs", pars = c("z_w", "b_w", "k_d"))
```

## Parameter estimates

```{r debtox-summary}
bdeb_summary(fit_tox,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G",
           "k_d", "z_w", "b_w", "sigma_L"),
  prob = 0.95)
```

## EC~50~ and NEC

The EC~50~ is computed analytically in the Stan `generated quantities` block, giving the full posterior distribution without post-hoc root-finding.

```{r debtox-ec50, fig.cap="Posterior distribution of EC$_{50}$ (blue histogram) with the posterior median (red dashed line). The full distribution, not just a point estimate, is available for regulatory risk assessment."}
ec <- bdeb_ec50(fit_tox, prob = 0.95)
print(ec$summary, digits = 3)

hist(ec$draws, breaks = 50, col = "steelblue", border = "white",
     main = expression("Posterior distribution of EC"[50]),
     xlab = "Concentration", freq = FALSE)
abline(v = ec$summary$median[1], col = "red", lwd = 2, lty = 2)
legend("topright", "Posterior median",
       col = "red", lty = 2, lwd = 2, bty = "n")
```

**Interpretation for risk assessment:**

- **NEC ($z_w$):** the concentration threshold below which no toxic effect is expected. A key regulatory endpoint under REACH (ECHA 2017).
- **EC~50~:** the concentration causing 50% reduction in assimilation. The full posterior quantifies the uncertainty: a decision-maker can use, e.g., the lower 5th percentile as a conservative estimate.

## Dose-response curve

```{r debtox-dr, fig.cap="Dose-response curve with posterior uncertainty bands (blue lines: individual posterior draws). The dashed horizontal line marks 50% effect; vertical dashed lines mark the NEC (green) and EC$_{50}$ (red)."}
plot_dose_response(fit_tox, n_draws = 200)
```

## Prior sensitivity analysis

A Bayesian analysis should always report the sensitivity of key conclusions to prior choices. We refit with a tighter prior on $z_w$:

```{r debtox-sens}
mod_tox2 <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    z_w = prior_lognormal(mu = 3.0, sigma = 0.3),  # tighter
    b_w = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))

fit_tox2 <- bdeb_fit(mod_tox2,
  chains = 4,
  adapt_delta = 0.95,
  threads_per_chain = 2,
  seed = 78)

cat("=== Original: z_w ~ LogNormal(2.5, 1.0) ===\n")
bdeb_summary(fit_tox, pars = c("z_w", "b_w"), prob = 0.95)

cat("\n=== Tighter: z_w ~ LogNormal(3.0, 0.3) ===\n")
bdeb_summary(fit_tox2, pars = c("z_w", "b_w"), prob = 0.95)
```

If the posteriors agree despite different priors, the **data are informative** and the inference is robust.
If they diverge, the parameter is **prior-dominated** and should be reported as weakly identified.

# Part 3: Practical Tools {#tools}

## Structural vs physical length

BayesianDEB works with **structural length** $L = V^{1/3}$ (cube root of structural volume), not the physical body length $L_w$ that you measure with a ruler. The two are related by the species-specific **shape coefficient** $\delta_M$:

$$L = \delta_M \times L_w$$

| Species | $\delta_M$ | Source |
|---------|:----------:|--------|
| *Eisenia fetida* | 0.24 | AmP |
| *Folsomia candida* | 0.19 | AmP |
| *Daphnia magna* | 0.37 | AmP |

**Before fitting**, convert your measured lengths:

```{r convert-length}
# Example: measured body lengths in mm for E. fetida
L_physical_mm <- c(12, 18, 25, 30)
delta_M <- 0.24

# Convert to structural length in cm
L_structural_cm <- delta_M * L_physical_mm / 10
L_structural_cm
# [1] 0.288 0.432 0.600 0.720
```

If you pass physical lengths directly, the estimated DEB parameters will absorb the shape coefficient and will **not** be comparable with AmP values. BayesianDEB warns if maximum length exceeds 10 cm, which is unusually large for structural length.

## Prior predictive check

Before fitting, it is good practice to verify that priors produce biologically plausible predictions (Gabry et al. 2019).
Sample directly from the prior distributions and compute derived quantities:

```{r prior-pred}
set.seed(42)
n_sim <- 4000

# Sample from priors
p_Am_sim  <- rlnorm(n_sim, 1.5, 0.5)
p_M_sim   <- rlnorm(n_sim, -1.0, 0.5)
kappa_sim <- rbeta(n_sim, 3, 2)
v_sim     <- rlnorm(n_sim, -1.5, 0.5)
E_G_sim   <- rlnorm(n_sim, 6.0, 0.5)

# Prior predictive for L_inf
L_inf_prior <- kappa_sim * p_Am_sim / p_M_sim

hist(L_inf_prior, breaks = 50, col = "steelblue", border = "white",
     main = "Prior predictive: ultimate structural length",
     xlab = expression(L[infinity] ~ "(cm)"),
     xlim = c(0, 50))
# Should cover plausible range for earthworms (~2-20 cm structural)
```

If the prior predictive distribution covers unreasonable values (e.g., $L_\infty > 100$ cm for an earthworm), tighten the priors.

## Observation model selection

By default, growth observations use a Gaussian likelihood and reproduction uses negative binomial. You can switch the observation model via the `observation` argument. The Stan likelihood is controlled by integer flags, so **no recompilation is needed**:

```{r obs-models}
# Robust to outliers: Student-t with 5 df
mod_robust <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_student_t(nu = 5)))

# Multiplicative error (constant CV)
mod_logn <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_lognormal()))

# For reproduction: Poisson instead of NegBin
# (appropriate when overdispersion is negligible)
# NOTE: dat_gr stands for a data object holding both growth and reproduction
# observations; it is not created anywhere in this vignette.
mod_pois <- bdeb_model(dat_gr, type = "growth_repro",
  observation = list(growth = obs_normal(),
                     reproduction = obs_poisson()))
```

Available observation families:

| Endpoint | Family | Function | When to use |
|----------|--------|----------|-------------|
| Growth | Gaussian | `obs_normal()` | Default; additive error |
| Growth | Log-normal | `obs_lognormal()` | Multiplicative error (constant CV) |
| Growth | Student-t | `obs_student_t(nu)` | Outlier-robust |
| Reproduction | Neg. binomial | `obs_negbinom()` | Default; overdispersed counts |
| Reproduction | Poisson | `obs_poisson()` | Equidispersed counts |

## Temperature correction

DEB rate parameters scale with temperature via the Arrhenius relationship (Kooijman 2010, Eq. 1.2):

$$
c_T = \exp\!\left(\frac{T_A}{T_\text{ref}} - \frac{T_A}{T}\right)
$$

where $T$ is the experimental temperature, $T_\text{ref}$ the reference temperature, and $T_A$ the Arrhenius temperature, all in kelvin.

```{r arrhenius}
# Experiment at 22 °C, reference 20 °C, typical T_A for ectotherms
cT <- arrhenius(temp = 273.15 + 22, T_ref = 273.15 + 20, T_A = 8000)
cat("Temperature correction factor:", round(cT, 3), "\n")
# Rate at reference temperature: p_Am_ref = p_Am_obs / cT
```

## Energy flux calculator

Inspect the energetics at a specific state:

```{r fluxes}
fl <- deb_fluxes(E = 10, V = 0.5, f = 1.0,
                 p_Am = 5, p_M = 0.5, kappa = 0.75,
                 v = 0.2, E_G = 400)

cat(sprintf("Assimilation  (p_A): %.3f J/d\n", fl$p_A))
cat(sprintf("Mobilisation  (p_C): %.3f J/d\n", fl$p_C))
cat(sprintf("Maintenance   (p_M): %.3f J/d\n", fl$p_M))
cat(sprintf("Growth        (p_G): %.3f J/d\n", fl$p_G))
cat(sprintf("Struct. length (L) : %.3f cm\n", fl$L))
cat(sprintf("Scaled reserve (e) : %.3f\n", fl$e))
```

## Converting cumulative reproduction data

Many protocols report cumulative offspring. The `repro_to_intervals()` function converts these to the interval format required by BayesianDEB:

```{r repro-convert}
cumul <- data.frame(
  id = rep(1, 5),
  time = c(0, 7, 14, 21, 28),
  cumulative = c(0, 10, 30, 60, 100)
)
repro_to_intervals(cumul)
#   id t_start t_end count
# 1  1       0     7    10
# 2  1       7    14    20
# 3  1      14    21    30
# 4  1      21    28    40
```

# Validation Against Published DEB Parameters {#validation}

The `eisenia_growth` dataset was simulated using DEB parameters from the Add-my-Pet (AmP) collection entry for *Eisenia fetida* (Marques et al. 2018).
The table below compares the simulation truth with published AmP estimates:

| Parameter | Symbol | True (simulation) | AmP estimate | Units |
|-----------|--------|:-----------------:|:------------:|-------|
| Assimilation rate | $\{p_{Am}\}$ | 5.0 | 3.9–6.2 | J d$^{-1}$ cm$^{-2}$ |
| Maintenance rate | $[p_M]$ | 0.5 | 0.3–0.8 | J d$^{-1}$ cm$^{-3}$ |
| Allocation fraction | $\kappa$ | 0.75 | 0.6–0.85 | dimensionless |
| Energy conductance | $v$ | 0.2 | 0.1–0.3 | cm d$^{-1}$ |
| Cost of structure | $[E_G]$ | 400 | 200–600 | J cm$^{-3}$ |
| Max. structural length | $L_m$ | 7.5 | 5–12 | cm |

The simulation truth falls within the published AmP ranges for all parameters. When BayesianDEB is fitted to these data (see the [individual model](#individual) section), the posterior medians should recover values close to the simulation truth, providing a **closed-loop validation**: known parameters → simulated data → Bayesian recovery → comparison with truth.

This is not a substitute for fitting real experimental data, but it provides evidence that:

1. The DEB ODE implementation behaves as expected (simulated data match the expected growth pattern).
2. The Bayesian inference machinery recovers known parameters.
3. The prior specification is compatible with realistic parameter ranges from the AmP database.

For real-data applications, we recommend comparing estimated parameters with the AmP entry for the species of interest as a sanity check.

# Summary {#summary}

This vignette demonstrated three analysis types:

| Analysis | Model type | Individuals | Key output |
|----------|-----------|:-----------:|------------|
| Single growth | `"individual"` | 1 | DEB posteriors, PPC, $L_\infty$ |
| Population growth | `"hierarchical"` | 21 | $\mu/\sigma$ of $\{p_{Am}\}$, shrinkage, prediction for new individual |
| Toxicant effect | `"debtox"` | 40 (4 groups) | EC~50~, NEC with full uncertainty |

**What the Bayesian framework provides:**

1. **Full uncertainty quantification**: every derived quantity ($L_\infty$, EC~50~, NEC) comes as a posterior distribution, not a point estimate.
2. **Prior incorporation**: biological knowledge from the AmP database or regulatory guidelines constrains the estimation.
3. **Hierarchical structure**: partial pooling borrows strength across individuals, improving estimates for data-poor organisms.
4. **Principled model checking**: posterior predictive checks and convergence diagnostics provide clear evidence of model adequacy.
5. **Within-chain parallelism**: `reduce_sum` accelerates hierarchical and DEBtox models on multi-core machines.

# References {-}

Betancourt, M. and Girolami, M. (2015). Hamiltonian Monte Carlo for hierarchical models. In: Upadhyay, S.K. et al. (eds) *Current Trends in Bayesian Methodology with Applications*. CRC Press, pp. 79–101.

Carpenter, B., Gelman, A., Hoffman, M.D. et al. (2017). Stan: A probabilistic programming language. *Journal of Statistical Software*, 76(1), 1–32. doi: [10.18637/jss.v076.i01](https://doi.org/10.18637/jss.v076.i01)

ECHA (2017). *Guidance on Information Requirements and Chemical Safety Assessment, Chapter R.10: Characterisation of dose [concentration]-response for environment*. European Chemicals Agency.

Gabry, J., Simpson, D., Vehtari, A., Betancourt, M. and Gelman, A. (2019). Visualization in Bayesian workflow. *Journal of the Royal Statistical Society: Series A*, 182(2), 389–402. doi: [10.1111/rssa.12378](https://doi.org/10.1111/rssa.12378)

Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. *Bayesian Analysis*, 1(3), 515–534. doi: [10.1214/06-BA117A](https://doi.org/10.1214/06-BA117A)

Gelman, A. and Hill, J. (2006). *Data Analysis Using Regression and Multilevel/Hierarchical Models*. Cambridge University Press.

Gelman, A., Carlin, J.B., Stern, H.S. et al. (2013). *Bayesian Data Analysis*. 3rd edition. Chapman & Hall/CRC.

Hoffman, M.D. and Gelman, A. (2014). The No-U-Turn Sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. *Journal of Machine Learning Research*, 15(47), 1593–1623.

Jager, T., Heugens, E.H.W. and Kooijman, S.A.L.M. (2006). Making sense of ecotoxicological test results: towards application of process-based models. *Ecotoxicology*, 15(3), 305–314. doi: [10.1007/s10646-006-0060-x](https://doi.org/10.1007/s10646-006-0060-x)

Jager, T. and Zimmer, E.I. (2012). Simplified Dynamic Energy Budget model for analysing ecotoxicity data. *Ecological Modelling*, 225, 74–81. doi: [10.1016/j.ecolmodel.2011.11.012](https://doi.org/10.1016/j.ecolmodel.2011.11.012)

Kooijman, S.A.L.M. (2010). *Dynamic Energy Budget Theory for Metabolic Organisation*. 3rd edition. Cambridge University Press. doi: [10.1017/CBO9780511805400](https://doi.org/10.1017/CBO9780511805400)

Marques, G.M., Augustine, S., Lika, K. et al. (2018). The AmP project: comparing species on the basis of dynamic energy budget parameters. *PLOS Computational Biology*, 14(5), e1006100. doi: [10.1371/journal.pcbi.1006100](https://doi.org/10.1371/journal.pcbi.1006100)

Vehtari, A., Gelman, A., Simpson, D. et al. (2021). Rank-normalization, folding, and localization: an improved $\hat{R}$ for assessing convergence of MCMC. *Bayesian Analysis*, 16(2), 667–718. doi: [10.1214/20-BA1221](https://doi.org/10.1214/20-BA1221)

# Session info {-}

```{r session}
sessionInfo()
```