---
title: "Bayesian DEB Modelling of Eisenia fetida Growth and DEBtox Analysis"
author: "Branimir K. Hackenberger, Tamara Djerdj, Domagoj K. Hackenberger"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Bayesian DEB Modelling of Eisenia fetida Growth and DEBtox Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  fig.align = "center",
  eval = FALSE
)
```

# Introduction

Dynamic Energy Budget (DEB) theory provides a mechanistic,
thermodynamically consistent framework for describing how organisms
acquire and utilise energy for maintenance, growth, development, and
reproduction (Kooijman 2010).  Classical DEB calibration relies on
deterministic optimisation, yielding a single best-fit parameter
vector without formal uncertainty quantification.

**BayesianDEB** embeds the DEB ordinary differential equation (ODE)
system within a Bayesian state-space model, using Stan's Hamiltonian
Monte Carlo (HMC) sampler (Carpenter et al. 2017) to explore the full
joint posterior distribution of all parameters.  This approach naturally
provides:

- **Uncertainty propagation** to derived quantities (ultimate length,
  EC~50~, NEC).
- **Prior incorporation** of biological knowledge from the AmP
  collection (Marques et al. 2018) or previous studies.
- **Hierarchical modelling** with partial pooling across individuals
  (Gelman and Hill 2006).
- **Model checking** through posterior predictive checks (Gelman et al.
  2013, Ch. 6).

This vignette demonstrates the complete workflow in two case studies:

1. ***Eisenia fetida*** (earthworm): an individual growth model,
   followed by a hierarchical multi-individual analysis.
2. **DEBtox analysis**: a toxicokinetic-toxicodynamic (TKTD) model
   fitted to growth data under toxicant exposure, with EC~50~/NEC
   estimation.


# Prerequisites

```{r prerequisites}
library(BayesianDEB)
library(ggplot2)
library(posterior)  # for summarise_draws()
```

BayesianDEB requires **cmdstanr** and a working CmdStan installation
for model fitting.  Data preparation, prior specification, and utility
functions work without Stan.

```{r check-stan}
check_cmdstanr()  # informative error if missing
```


# Part 1: *Eisenia fetida* Growth {#eisenia}

## Data description

The `eisenia_growth` dataset contains simulated weekly length
measurements for 21 *Eisenia fetida* individuals over 84 days (12
weeks).  The simulation used the standard 2-state DEB model (reserve
$E$, structure $V$) with parameters representative of *E. fetida* from
the AmP collection: $\{p_{Am}\} = 5.0$ J d$^{-1}$ cm$^{-2}$, $[p_M] =
0.5$ J d$^{-1}$ cm$^{-3}$, $\kappa = 0.75$, $v = 0.2$ cm d$^{-1}$,
$[E_G] = 400$ J cm$^{-3}$.  Individual variation in $\{p_{Am}\}$ (CV
$\approx$ 10%) and Gaussian observation error ($\sigma_L = 0.015$ cm)
were added.

```{r eisenia-explore}
data(eisenia_growth)

# Structure: 273 obs, 3 variables (id, time, length)
str(eisenia_growth)

length(unique(eisenia_growth$id))   # 21 individuals
length(unique(eisenia_growth$time)) # 13 time points (days 0–84)
```

```{r eisenia-plot, fig.cap="Growth trajectories of 21 *E. fetida* individuals.  Structural length $L = V^{1/3}$ measured weekly over 12 weeks."}
ggplot(eisenia_growth, aes(time, length, group = id)) +
  geom_line(alpha = 0.3, colour = "steelblue") +
  geom_point(size = 0.8, alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = "Time (days)", y = expression(paste("Structural length ", L, " (cm)")),
       title = "Eisenia fetida: 21 individuals, 12 weeks")
```

Key features visible in the data:

- **Sigmoidal growth** consistent with the DEB prediction $L(t)
  \to L_\infty$ as $t \to \infty$.
- **Individual variation** in growth rate (spread of trajectories).
- **Measurement noise** visible as scatter around the smooth trend.


## Individual model: single organism {#individual}

We start with a single individual to validate the approach.

### Step 1: Prepare data

```{r ind-data}
df1 <- eisenia_growth[eisenia_growth$id == 5, ]
dat1 <- bdeb_data(growth = df1, f_food = 1.0)
dat1
```

The `f_food = 1.0` argument specifies ad libitum feeding ($f = 1$, the
ratio of actual to maximum ingestion rate; Kooijman 2010, Eq. 2.3).

### Step 2: Specify model and priors

The individual model tracks two state variables: reserve energy $E$ (J)
and structural volume $V$ (cm$^3$), governed by:

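$$
\frac{dE}{dt} = f\{p_{Am}\}L^2 - \dot{p}_C,
\qquad
\frac{dV}{dt} = \frac{\kappa \dot{p}_C - [p_M]V}{[E_G]},
\qquad
\dot{p}_C = \frac{E\,\bigl([E_G]\,v\,L^2 + [p_M]V\bigr)}{\kappa E + [E_G]V}
$$

where $L = V^{1/3}$ is structural length and $\dot{p}_C$ is the
mobilisation flux, written here in its standard DEB form (Kooijman
2010).  Observed lengths are assumed to follow
$L_\text{obs} \sim \mathcal{N}(\hat{L}, \sigma_L)$.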
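
As a rough illustration of these dynamics, the base-R sketch below
integrates the two-state system with a fixed-step fourth-order
Runge-Kutta scheme and compares the result with the von Bertalanffy
curve that the model reduces to when reserve density is at equilibrium.
It is not the package's Stan implementation: the helper names
(`deb_rhs`, `rk4_step`), the initial state, and the standard DEB form
assumed for the mobilisation flux are choices made for this example.

```{r sketch-deb-ode}
# Illustrative parameter values (the simulation settings quoted for eisenia_growth)
par <- list(p_Am = 5.0, p_M = 0.5, kappa = 0.75, v = 0.2, E_G = 400)

# Right-hand side of the two-state DEB system; state = c(E, V)
deb_rhs <- function(state, par, f = 1) {
  E <- state[1]
  V <- state[2]
  L <- V^(1 / 3)                       # structural length
  p_A <- f * par$p_Am * L^2            # assimilation flux
  # Mobilisation flux, assumed here to take the standard DEB form
  p_C <- E * (par$E_G * par$v * L^2 + par$p_M * V) /
    (par$kappa * E + par$E_G * V)
  c(p_A - p_C, (par$kappa * p_C - par$p_M * V) / par$E_G)
}

# One classical RK4 step of size h
rk4_step <- function(state, h, par, f) {
  k1 <- deb_rhs(state, par, f)
  k2 <- deb_rhs(state + h / 2 * k1, par, f)
  k3 <- deb_rhs(state + h / 2 * k2, par, f)
  k4 <- deb_rhs(state + h * k3, par, f)
  state + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}

# Hypothetical start: L0 = 0.5 cm, reserve density at equilibrium f * p_Am / v
f  <- 1
L0 <- 0.5
state <- c(f * par$p_Am / par$v * L0^3, L0^3)

h <- 0.5
times <- seq(0, 84, by = h)
L_num <- numeric(length(times))
L_num[1] <- L0
for (i in seq_along(times)[-1]) {
  state <- rk4_step(state, h, par, f)
  L_num[i] <- state[2]^(1 / 3)
}

# Analytical von Bertalanffy curve for the same start and parameters
L_m <- par$kappa * par$p_Am / par$p_M
k_M <- par$p_M / par$E_G
g   <- par$E_G * par$v / (par$kappa * par$p_Am)
r_B <- k_M * g / (3 * (f + g))
L_vb <- f * L_m - (f * L_m - L0) * exp(-r_B * times)

max(abs(L_num - L_vb))  # numerical and analytical curves should agree closely
```

A fixed-step scheme is enough for this smooth example; stiff problems
or strongly perturbed initial reserves would call for an adaptive
solver.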

We set biologically informed priors based on published AmP values for
earthworms:

| Parameter | Prior | Rationale |
|-----------|-------|-----------|
| $\{p_{Am}\}$ | LogNormal(1.5, 0.5) | Median $e^{1.5} \approx 4.5$; AmP range 3–8 |
| $[p_M]$ | LogNormal(−1.0, 0.5) | Median $e^{-1} \approx 0.37$; typical 0.15–0.8 |
| $\kappa$ | Beta(3, 2) | Mode 0.67; earthworms allocate ~60–80% to soma |
| $v$ | LogNormal(−1.5, 0.5) | Median $e^{-1.5} \approx 0.22$ cm/d |
| $[E_G]$ | LogNormal(6.0, 0.5) | Median $e^6 \approx 403$; typical 200–800 |
| $\sigma_L$ | HalfNormal(0.05) | Measurement precision ~0.01–0.03 cm |
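
The medians and modes quoted in the table can be checked with base R
quantile functions.  The sketch below is only a sanity check on the
prior choices; the helper name `prior_interval` is hypothetical.

```{r sketch-prior-quantiles}
# Median and central 95% interval of a LogNormal(mu, sigma) prior
prior_interval <- function(mu, sigma) {
  q <- qlnorm(c(0.025, 0.5, 0.975), meanlog = mu, sdlog = sigma)
  setNames(q, c("q2.5", "median", "q97.5"))
}

priors_ln <- rbind(
  p_Am = prior_interval(1.5, 0.5),
  p_M  = prior_interval(-1.0, 0.5),
  v    = prior_interval(-1.5, 0.5),
  E_G  = prior_interval(6.0, 0.5)
)
round(priors_ln, 3)

# Beta(3, 2) prior on kappa: mode (a - 1) / (a + b - 2) and central 95% interval
kappa_mode <- (3 - 1) / (3 + 2 - 2)
kappa_q    <- qbeta(c(0.025, 0.975), shape1 = 3, shape2 = 2)
c(mode = kappa_mode, lower = kappa_q[1], upper = kappa_q[2])

# HalfNormal(0.05) prior on sigma_L: quantiles of |N(0, 0.05)|
sigma_L_q <- 0.05 * qnorm((1 + c(0.5, 0.95)) / 2)
setNames(sigma_L_q, c("median", "q95"))
```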

```{r ind-model}
mod1 <- bdeb_model(dat1, type = "individual",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05)
  ))
mod1
```

Unspecified priors (`E0`, `L0`) are filled automatically from
`prior_default("individual")`.

### Step 3: Fit via MCMC

The model is compiled to C++ and sampled using the No-U-Turn Sampler
(NUTS; Hoffman and Gelman 2014) with the stiff BDF ODE solver.

```{r ind-fit}
fit1 <- bdeb_fit(mod1,
  chains        = 4,
  iter_warmup   = 1000,
  iter_sampling = 2000,
  adapt_delta   = 0.9,
  seed          = 42
)
fit1
```

### Step 4: Convergence diagnostics

Diagnostics follow the recommendations of Vehtari et al. (2021):

```{r ind-diag}
diag1 <- bdeb_diagnose(fit1)
```

**What to check:**

- **Divergences = 0** — if present, increase `adapt_delta` (e.g., 0.95).
- **$\hat{R} < 1.01$** for all parameters — chains have mixed.
- **Bulk ESS > 400** — sufficient effective samples for posterior means.
- **Tail ESS > 400** — sufficient for credible interval endpoints.
- **E-BFMI > 0.3** — momentum resampling is efficient.
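
To make the $\hat{R}$ criterion concrete, the sketch below computes the
classic split-$\hat{R}$ (Gelman et al. 2013) in base R on synthetic
chains.  It is a simplified illustration, not the rank-normalised
version of Vehtari et al. (2021); the function name `split_rhat` and
the simulated draws are assumptions made for the example.

```{r sketch-split-rhat}
# Classic split-R-hat for one parameter; draws is an iterations x chains matrix
split_rhat <- function(draws) {
  n <- nrow(draws) %/% 2
  halves <- cbind(draws[seq_len(n), , drop = FALSE],
                  draws[n + seq_len(n), , drop = FALSE])
  W <- mean(apply(halves, 2, var))       # within-chain variance
  B <- n * var(colMeans(halves))         # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(1)
good <- matrix(rnorm(4000), nrow = 1000, ncol = 4)  # four well-mixed chains
bad  <- good
bad[, 1] <- bad[, 1] + 2                            # one chain stuck elsewhere

c(good = split_rhat(good), bad = split_rhat(bad))
```

The well-mixed chains give a value close to 1, whereas the displaced
chain inflates the between-chain variance and pushes $\hat{R}$ well
above the 1.01 threshold.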

```{r ind-trace, fig.cap="MCMC trace plots for core DEB parameters.  Well-mixed chains should appear as overlapping 'hairy caterpillars'."}
plot(fit1, type = "trace",
     pars = c("p_Am", "p_M", "kappa", "sigma_L"))
```

```{r ind-pairs, fig.cap="Bivariate posterior scatter.  Strong correlation between $\\{p_{Am}\\}$ and $[p_M]$ is expected: both control ultimate size $L_\\infty = \\kappa \\{p_{Am}\\} / [p_M]$."}
plot(fit1, type = "pairs",
     pars = c("p_Am", "p_M", "kappa", "E_G"))
```

### Step 5: Posterior summary

```{r ind-summary}
bdeb_summary(fit1,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
```

### Step 6: Posterior predictive check

Posterior predictive checks (PPCs) compare data replicated from the
fitted model ($L^\text{rep}$) with the observed data ($L^\text{obs}$).
If the model fits well, the observed points should fall within the
envelope of replicated trajectories (Gelman et al. 2013, Ch. 6).

```{r ind-ppc, fig.cap="Posterior predictive check: grey lines are replicated growth trajectories, red points are observed data."}
ppc1 <- bdeb_ppc(fit1, type = "growth")
plot(ppc1, n_draws = 200)
```

```{r ind-traj, fig.cap="Posterior predicted trajectories (blue) with observed data (black points).  The spread reflects parameter uncertainty."}
plot(fit1, type = "trajectory", n_draws = 200)
```

### Step 7: Derived quantities

BayesianDEB computes biologically meaningful quantities directly from
the posterior, automatically propagating uncertainty:

| Quantity | Formula | Interpretation |
|----------|---------|---------------|
| $L_m$ | $\kappa \{p_{Am}\} / [p_M]$ | Maximum structural length ($f = 1$) |
| $L_\infty$ | $f \cdot L_m$ | Ultimate structural length at food $f$ |
| $k_M$ | $[p_M] / [E_G]$ | Somatic maintenance rate constant |
| $g$ | $[E_G] \, v / (\kappa \{p_{Am}\})$ | Energy investment ratio |
| $\dot{r}_B$ | $k_M \, g \,/\, \bigl(3(f + g)\bigr)$ | von Bertalanffy growth rate (Eq. 3.23) |

Note: all lengths are **structural** ($L = V^{1/3}$), not physical.
Physical length $L_w = L / \delta_M$ where $\delta_M$ is the
species-specific shape coefficient (not estimated by this package).
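
For intuition, the formulas in the table can be evaluated by hand.  The
sketch below does so in base R, first at the `eisenia_growth`
simulation values and then for a set of synthetic draws that stand in
for real posterior draws.  The function name `deb_derived_sketch` is
hypothetical and is not part of the package.

```{r sketch-derived}
deb_derived_sketch <- function(p_Am, p_M, kappa, v, E_G, f = 1) {
  L_m <- kappa * p_Am / p_M            # maximum structural length (cm)
  k_M <- p_M / E_G                     # somatic maintenance rate constant (1/d)
  g   <- E_G * v / (kappa * p_Am)      # energy investment ratio (-)
  data.frame(L_m = L_m, L_inf = f * L_m, k_M = k_M, g = g,
             r_B = k_M * g / (3 * (f + g)))
}

# Point evaluation at the simulation values
truth <- deb_derived_sketch(p_Am = 5, p_M = 0.5, kappa = 0.75,
                            v = 0.2, E_G = 400)
truth$L_m   # 7.5
truth$k_M   # 0.00125

# Vectorised over draws: uncertainty propagates draw by draw
set.seed(1)
n <- 1000
draws <- deb_derived_sketch(
  p_Am  = rlnorm(n, log(5), 0.05),     # synthetic draws, not a fitted posterior
  p_M   = rlnorm(n, log(0.5), 0.05),
  kappa = rep(0.75, n), v = rep(0.2, n), E_G = rep(400, n)
)
quantile(draws$L_inf, c(0.025, 0.5, 0.975))
```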

```{r ind-derived}
der1 <- bdeb_derived(fit1,
  quantities = c("L_m", "L_inf", "k_M", "g", "growth_rate"), f = 1.0)

summarise_draws(der1,
  "mean", "sd",
  "q2.5"  = ~quantile(.x, 0.025),
  "q97.5" = ~quantile(.x, 0.975))
```

### Step 8: Scenario analysis — reduced food

How does ultimate size change if food availability drops to 70%?  Since
$L_\infty \propto f$, we can compute this directly:

```{r ind-food, fig.cap="Posterior distributions of $L_\\infty$ at $f = 1.0$ (blue) and $f = 0.7$ (orange)."}
d_f10 <- bdeb_derived(fit1, quantities = "L_inf", f = 1.0)
d_f07 <- bdeb_derived(fit1, quantities = "L_inf", f = 0.7)

df_compare <- data.frame(
  L_inf = c(d_f10$L_inf, d_f07$L_inf),
  food  = rep(c("f = 1.0", "f = 0.7"), each = nrow(d_f10))
)

ggplot(df_compare, aes(x = L_inf, fill = food)) +
  geom_density(alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = expression(L[infinity] ~ "(cm)"),
       y = "Posterior density",
       fill = "Food level")
```


## Hierarchical model: 21 individuals {#hierarchical}

When multiple individuals are available, a hierarchical model is
preferred.  It estimates population-level distributions for parameters
that vary across individuals, while sharing parameters that are
species-level constants.

### Model structure

The hierarchical model places a lognormal random effect on the
assimilation rate:

$$
\log\{p_{Am}\}_j = \mu_{\log p_{Am}} + \sigma_{\log p_{Am}} \cdot z_j,
\qquad z_j \sim \mathcal{N}(0, 1)
$$

This **non-centred parameterisation** (Betancourt and Girolami 2015)
avoids the pathological funnel geometry that arises when $\sigma$ is
small.  The parameters $[p_M]$, $\kappa$, $v$, $[E_G]$ are shared
across all individuals.
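
The centred and non-centred forms describe the same distribution, which
a short base-R simulation can confirm.  The values of `mu` and `sigma`
below are illustrative (a log-scale SD of 0.1 corresponds to a CV of
roughly 10%), and the comparison relies on `rlnorm()` consuming the
random number stream in the same way as `rnorm()`.

```{r sketch-noncentred}
mu    <- log(5)   # illustrative population location on the log scale
sigma <- 0.1      # illustrative log-scale SD
J     <- 21       # number of individuals

# Non-centred: sample standard normal z_j, then shift and scale
set.seed(42)
z <- rnorm(J)
p_Am_nc <- exp(mu + sigma * z)

# Centred: sample p_Am_j directly from LogNormal(mu, sigma)
set.seed(42)
p_Am_c <- rlnorm(J, meanlog = mu, sdlog = sigma)

all.equal(p_Am_nc, p_Am_c)   # same draws, different parameterisation

# Implied coefficient of variation of a lognormal: sqrt(exp(sigma^2) - 1)
sqrt(exp(sigma^2) - 1)
```

In Stan the benefit comes from sampling $z_j$, whose prior scale does
not depend on $\sigma_{\log p_{Am}}$.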

### Prepare and specify

```{r hier-data}
dat_all <- bdeb_data(growth = eisenia_growth, f_food = 1.0)
dat_all  # 21 individuals, 273 observations
```

```{r hier-model}
mod_h <- bdeb_model(dat_all, type = "hierarchical",
  priors = list(
    mu_log_p_Am    = prior_normal(mu = 1.5, sigma = 0.5),
    sigma_log_p_Am = prior_exponential(rate = 2),
    p_M            = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa          = prior_beta(a = 3, b = 2),
    v              = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G            = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L        = prior_halfnormal(sigma = 0.05)
  ))
```

The prior on $\sigma_{\log p_{Am}}$ is Exponential(2), which has a
prior mean of 0.5 on the log scale.  In the spirit of Gelman (2006), it
is a weakly informative prior placed directly on the group-level
standard deviation: most prior mass lies near zero, yet substantial
variation remains possible if the data support it.
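
What this prior implies can be read off with `qexp()` and `pexp()`.
The threshold of 0.1 used below is an arbitrary illustration point,
chosen because it corresponds to a CV of about 10%.

```{r sketch-exp-prior}
rate <- 2

prior_mean   <- 1 / rate                  # 0.5
prior_median <- qexp(0.5, rate = rate)    # log(2) / 2
prior_q95    <- qexp(0.95, rate = rate)

# Prior probability that the log-scale SD is below 0.1
p_small <- pexp(0.1, rate = rate)

round(c(mean = prior_mean, median = prior_median,
        q95 = prior_q95, p_below_0.1 = p_small), 3)
```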

### Fit with within-chain parallelism

With 21 individuals, each requiring an independent ODE solve per
iteration, the hierarchical model benefits from **within-chain
parallelism** via Stan's `reduce_sum`.  Setting `threads_per_chain = 4`
distributes the 21 ODE solves across 4 threads per chain.

```{r hier-fit}
fit_h <- bdeb_fit(mod_h,
  chains            = 4,
  iter_warmup       = 1000,
  iter_sampling     = 2000,
  adapt_delta       = 0.95,
  max_treedepth     = 12,
  threads_per_chain = 4,
  seed              = 123
)
```

### Diagnostics

```{r hier-diag}
bdeb_diagnose(fit_h)
```

```{r hier-trace, fig.cap="Trace plots for population-level hyperparameters $\\mu_{\\log p_{Am}}$ and $\\sigma_{\\log p_{Am}}$."}
plot(fit_h, type = "trace",
     pars = c("mu_log_p_Am", "sigma_log_p_Am"))
```

```{r hier-post, fig.cap="Marginal posterior densities for shared parameters."}
plot(fit_h, type = "posterior",
     pars = c("mu_log_p_Am", "sigma_log_p_Am", "p_M", "kappa"))
```

### Population-level results

```{r hier-pop}
bdeb_summary(fit_h,
  pars = c("mu_log_p_Am", "sigma_log_p_Am",
           "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
```

### Shrinkage of individual estimates

A key feature of hierarchical models is **shrinkage**: individuals with
sparse or noisy data are pulled toward the population mean.  This is
partial pooling — a principled compromise between complete pooling
(ignoring individual variation) and no pooling (fitting each individual
independently).
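
The mechanics of partial pooling are easiest to see in a normal-normal
toy model with known variances, where the pooled estimate is a
precision-weighted average of the individual estimate and the
population mean.  The sketch below uses that simplified model with
made-up numbers; it is not the DEB hierarchical model itself.

```{r sketch-shrinkage}
# Toy setting: individual estimates y_j with standard errors se_j,
# scattered around a population mean mu with between-individual SD tau
mu  <- 1.6                        # population mean (log scale, illustrative)
tau <- 0.10                       # between-individual SD
y   <- c(1.45, 1.58, 1.62, 1.80)  # no-pooling estimates (made up)
se  <- c(0.05, 0.05, 0.20, 0.20)  # last two individuals are noisier

# Weight on the individual's own data: tau^2 / (tau^2 + se^2)
w <- tau^2 / (tau^2 + se^2)
partial <- w * y + (1 - w) * mu

data.frame(no_pooling = y, weight = round(w, 2),
           partial_pooling = round(partial, 3), complete_pooling = mu)
```

Individuals with larger standard errors receive smaller weights and
therefore move further toward the population mean.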

```{r hier-shrinkage, fig.cap="Individual-level $\\{p_{Am}\\}$ estimates (points: posterior means; bars: 90% CI) compared to the population median $\\exp(\\mu_{\\log p_{Am}})$ (dashed red line).  Shrinkage toward the population centre is visible for individuals with noisier data."}
ind_summary <- bdeb_summary(fit_h,
  pars = paste0("p_Am_ind[", 1:21, "]"),
  prob = 0.90)

# exp() of the log-scale location is the population median of p_Am
# (plug-in estimate based on the posterior mean of mu_log_p_Am)
pop_summary <- bdeb_summary(fit_h, pars = "mu_log_p_Am")
pop_median_pAm <- exp(as.data.frame(pop_summary)$mean)

ind_df <- as.data.frame(ind_summary)
ind_df$individual <- 1:21

ggplot(ind_df, aes(x = individual, y = mean)) +
  geom_pointrange(aes(ymin = `5%`, ymax = `95%`),
                  colour = "steelblue", size = 0.4) +
  geom_hline(yintercept = pop_median_pAm, linetype = "dashed",
             colour = "red", linewidth = 0.8) +
  theme_bw(base_size = 12) +
  labs(x = "Individual", y = expression({p[Am]} ~ "(J/d/cm"^2*")"),
       title = "Individual assimilation rates with 90% CI")
```

### Prediction for a new individual

The `generated quantities` block draws `p_Am_new` from the population
distribution — useful for predicting the performance of an unobserved
individual from the same population:

```{r hier-new}
bdeb_summary(fit_h, pars = "p_Am_new", prob = 0.95)
```

### Individual vs hierarchical: comparison

```{r compare}
s_ind  <- bdeb_summary(fit1, pars = c("p_Am", "p_M", "kappa"), prob = 0.90)
s_hier <- bdeb_summary(fit_h, pars = c("p_M", "kappa"), prob = 0.90)

cat("=== Individual model (id = 5, n = 1) ===\n")
print(as.data.frame(s_ind), digits = 3, row.names = FALSE)

cat("\n=== Hierarchical model (n = 21) ===\n")
print(as.data.frame(s_hier), digits = 3, row.names = FALSE)
```

The hierarchical model should yield **narrower credible intervals** for
the shared parameters ($[p_M]$, $\kappa$) than the single-individual
fit, because it pools information across 21 individuals.


# Part 2: DEBtox Analysis {#debtox}

## Background

The DEBtox framework (Jager et al. 2006; Jager and Zimmer 2012) extends
DEB with toxicokinetic-toxicodynamic (TKTD) components.  A scaled
internal damage variable $D_w$ tracks the toxicant's effect:

$$
\frac{dD_w}{dt} = k_d\bigl(\max(C_w - z_w, 0) - D_w\bigr)
$$

where $k_d$ is the damage recovery rate, $C_w$ is the external
concentration, and $z_w$ is the **no-effect concentration** (NEC).  The
damage causes a stress factor $s = b_w \cdot D_w$ that reduces
assimilation:

$$
\dot{p}_A = f\{p_{Am}\}L^2 \cdot \max(1 - s, \; 0)
$$

At steady state ($D_w = C_w - z_w$ for $C_w > z_w$), the EC~50~ for
50% assimilation reduction is:

$$
\text{EC}_{50} = z_w + \frac{0.5}{b_w}
$$
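
A small base-R sketch makes the steady-state argument concrete.  It
uses the closed-form solution of the damage equation for constant
exposure and $D_w(0) = 0$, with `z_w` and `b_w` set to the simulation
values of the `debtox_growth` dataset (NEC = 15, $b_w = 0.003$) and an
arbitrary `k_d`.  The function names are hypothetical.

```{r sketch-ec50}
z_w <- 15      # no-effect concentration
b_w <- 0.003   # effect intensity
k_d <- 0.3     # damage rate constant (1/d), arbitrary for illustration

# Scaled damage under constant exposure C_w, starting from D_w(0) = 0
damage <- function(t, C_w) pmax(C_w - z_w, 0) * (1 - exp(-k_d * t))

# Fraction of assimilation remaining: max(1 - b_w * D_w, 0)
assim_factor <- function(D_w) pmax(1 - b_w * D_w, 0)

# Steady-state EC50 for a 50% reduction in assimilation
ec50 <- z_w + 0.5 / b_w
ec50

# Damage after 6 weeks is close to steady state for this k_d
conc <- c(0, 20, 80, 200)
D_42 <- damage(42, conc)
round(assim_factor(D_42), 3)

# At C_w = EC50 the steady-state assimilation factor is one half
assim_factor(ec50 - z_w)
```

Below the NEC the damage stays at zero, so the control group
(concentration 0) is unaffected by construction.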


## Data exploration

The `debtox_growth` dataset simulates growth under 4 toxicant
concentrations (0, 20, 80, 200 arbitrary units), 10 individuals per
group, measured weekly over 6 weeks.  True parameters: NEC = 15,
$b_w = 0.003$.

```{r debtox-explore, fig.cap="Growth trajectories under 4 toxicant concentrations.  Higher concentrations suppress growth through reduced assimilation."}
data(debtox_growth)

ggplot(debtox_growth,
       aes(time, length, colour = factor(concentration), group = id)) +
  geom_line(alpha = 0.3) +
  geom_point(size = 0.8, alpha = 0.4) +
  facet_wrap(~concentration, labeller = label_both) +
  theme_bw(base_size = 11) +
  scale_colour_brewer(palette = "RdYlBu", direction = -1) +
  labs(x = "Time (days)", y = "Structural length (cm)",
       colour = "Concentration") +
  theme(legend.position = "none")
```


## Data preparation

```{r debtox-prep}
conc_levels <- unique(debtox_growth$concentration)
conc_map <- setNames(conc_levels, as.character(conc_levels))

dat_tox <- bdeb_data(
  growth        = debtox_growth,
  concentration = conc_map,
  f_food        = 1.0
)
dat_tox
```

## Model specification

```{r debtox-model}
mod_tox <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05),
    k_d     = prior_lognormal(mu = -1.0, sigma = 1.0),
    z_w     = prior_lognormal(mu = 2.5, sigma = 1.0),
    b_w     = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))
mod_tox
```

**Prior rationale for toxicological parameters:**

| Parameter | Prior | Median | 95% prior range | Rationale |
|-----------|-------|--------|-----------------|-----------|
| $k_d$ | LogNormal(−1, 1) | 0.37 d$^{-1}$ | 0.052–2.6 | Damage recovery: hours to days |
| $z_w$ | LogNormal(2.5, 1) | 12.2 | 1.7–86 | NEC within tested range 0–200 |
| $b_w$ | LogNormal(−5, 2) | 0.0067 | 0.00013–0.34 | Weakly informative on effect intensity |
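
Because EC~50~ is a function of $z_w$ and $b_w$, these priors also
imply a prior on EC~50~.  The sketch below samples it in base R; it
uses only the prior distributions listed in the table and does not
involve the fitted model.

```{r sketch-ec50-prior}
set.seed(1)
n_sim <- 10000

z_w_sim <- rlnorm(n_sim, meanlog = 2.5, sdlog = 1.0)
b_w_sim <- rlnorm(n_sim, meanlog = -5.0, sdlog = 2.0)

# Implied prior on the steady-state EC50 = z_w + 0.5 / b_w
ec50_prior <- z_w_sim + 0.5 / b_w_sim

# The implied prior spans several orders of magnitude
quantile(ec50_prior, c(0.025, 0.5, 0.975))

# Share of prior mass inside the tested concentration range (0-200)
mean(ec50_prior <= 200)
```

A wide implied prior is acceptable here as long as the posterior is
clearly narrower, which the prior sensitivity analysis can confirm.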


## Fit

```{r debtox-fit}
fit_tox <- bdeb_fit(mod_tox,
  chains            = 4,
  iter_warmup       = 1000,
  iter_sampling     = 2000,
  adapt_delta       = 0.95,
  max_treedepth     = 12,
  threads_per_chain = 2,
  seed              = 77
)
```

## Diagnostics

```{r debtox-diag}
bdeb_diagnose(fit_tox)
```

```{r debtox-trace-tox, fig.cap="Trace plots for the three toxicological parameters.  Good mixing is essential for reliable EC$_{50}$ and NEC estimates."}
plot(fit_tox, type = "trace", pars = c("k_d", "z_w", "b_w"))
```

```{r debtox-post-tox, fig.cap="Marginal posterior densities for toxicological parameters."}
plot(fit_tox, type = "posterior", pars = c("k_d", "z_w", "b_w"))
```

```{r debtox-pairs, fig.cap="Posterior pairs for toxicological parameters.  A correlation between $z_w$ and $b_w$ is expected since both determine the shape of the dose-response curve."}
plot(fit_tox, type = "pairs", pars = c("z_w", "b_w", "k_d"))
```


## Parameter estimates

```{r debtox-summary}
bdeb_summary(fit_tox,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G",
           "k_d", "z_w", "b_w", "sigma_L"),
  prob = 0.95)
```


## EC~50~ and NEC

The EC~50~ is computed analytically in the Stan `generated quantities`
block, giving the full posterior distribution without post-hoc
root-finding.

```{r debtox-ec50, fig.cap="Posterior distribution of EC$_{50}$ (blue histogram) with the posterior median (red dashed line).  The full distribution — not just a point estimate — is available for regulatory risk assessment."}
ec <- bdeb_ec50(fit_tox, prob = 0.95)
print(ec$summary, digits = 3)

hist(ec$draws, breaks = 50, col = "steelblue", border = "white",
     main = expression("Posterior distribution of EC"[50]),
     xlab = "Concentration", freq = FALSE)
abline(v = ec$summary$median[1], col = "red", lwd = 2, lty = 2)
legend("topright", "Posterior median",
       col = "red", lty = 2, lwd = 2, bty = "n")
```

**Interpretation for risk assessment:**

- **NEC ($z_w$):** the concentration threshold below which no toxic
  effect is expected.  A key regulatory endpoint under REACH
  (ECHA 2017).
- **EC~50~:** the concentration causing 50% reduction in assimilation.
  The full posterior quantifies the uncertainty: a decision-maker can
  use, e.g., the lower 5th percentile as a conservative estimate.


## Dose-response curve

```{r debtox-dr, fig.cap="Dose-response curve with posterior uncertainty bands (blue lines: individual posterior draws).  The dashed horizontal line marks 50% effect; vertical dashed lines mark the NEC (green) and EC$_{50}$ (red)."}
plot_dose_response(fit_tox, n_draws = 200)
```


## Prior sensitivity analysis

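A Bayesian analysis should always report the sensitivity of key
conclusions to prior choices.  We refit with a tighter prior on $z_w$
(median $e^{3} \approx 20$), keeping every other prior and all sampler
settings identical to the original fit so that any change in the
posterior can be attributed to the $z_w$ prior alone:

```{r debtox-sens}
mod_tox2 <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05),
    k_d     = prior_lognormal(mu = -1.0, sigma = 1.0),
    z_w     = prior_lognormal(mu = 3.0, sigma = 0.3),  # tighter
    b_w     = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))
fit_tox2 <- bdeb_fit(mod_tox2,
  chains = 4, iter_warmup = 1000, iter_sampling = 2000,
  adapt_delta = 0.95, max_treedepth = 12,
  threads_per_chain = 2, seed = 78)

cat("=== Original: z_w ~ LogNormal(2.5, 1.0) ===\n")
bdeb_summary(fit_tox,  pars = c("z_w", "b_w"), prob = 0.95)

cat("\n=== Tighter:  z_w ~ LogNormal(3.0, 0.3) ===\n")
bdeb_summary(fit_tox2, pars = c("z_w", "b_w"), prob = 0.95)
```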

If the posteriors agree despite different priors, the **data are
informative** and the inference is robust.  If they diverge, the
parameter is **prior-dominated** and should be reported as weakly
identified.


# Part 3: Practical Tools {#tools}

## Structural vs physical length

BayesianDEB works with **structural length** $L = V^{1/3}$ (cube root
of structural volume), not the physical body length $L_w$ that you
measure with a ruler.  The two are related by the species-specific
**shape coefficient** $\delta_M$:

$$L = \delta_M \times L_w$$

| Species | $\delta_M$ | Source |
|---------|:----------:|--------|
| *Eisenia fetida* | 0.24 | AmP |
| *Folsomia candida* | 0.19 | AmP |
| *Daphnia magna* | 0.37 | AmP |

**Before fitting**, convert your measured lengths:

```{r convert-length}
# Example: measured body lengths in mm for E. fetida
L_physical_mm <- c(12, 18, 25, 30)
delta_M <- 0.24

# Convert to structural length in cm
L_structural_cm <- delta_M * L_physical_mm / 10
L_structural_cm
# [1] 0.288 0.432 0.600 0.720
```

If you pass physical lengths directly, the estimated DEB parameters
will absorb the shape coefficient and will **not** be comparable with
AmP values.  BayesianDEB warns if maximum length exceeds 10 cm, which
is unusually large for structural length.


## Prior predictive check

Before fitting, it is good practice to verify that priors produce
biologically plausible predictions (Gabry et al. 2019).  Sample
directly from the prior distributions and compute derived quantities:

```{r prior-pred}
set.seed(42)
n_sim <- 4000

# Sample from priors
p_Am_sim  <- rlnorm(n_sim, 1.5, 0.5)
p_M_sim   <- rlnorm(n_sim, -1.0, 0.5)
kappa_sim <- rbeta(n_sim, 3, 2)
v_sim     <- rlnorm(n_sim, -1.5, 0.5)
E_G_sim   <- rlnorm(n_sim, 6.0, 0.5)

# Prior predictive for L_inf
L_inf_prior <- kappa_sim * p_Am_sim / p_M_sim

hist(L_inf_prior, breaks = 50, col = "steelblue", border = "white",
     main = "Prior predictive: ultimate structural length",
     xlab = expression(L[infinity] ~ "(cm)"), xlim = c(0, 50))
# Should cover plausible range for earthworms (~2-20 cm structural)
```

If the prior predictive distribution covers unreasonable values (e.g.,
$L_\infty > 100$ cm for an earthworm), tighten the priors.


## Observation model selection

By default, growth observations use a Gaussian likelihood and
reproduction uses negative binomial.  You can switch the observation
model via the `observation` argument — the Stan likelihood is
controlled by integer flags, so **no recompilation is needed**:

```{r obs-models}
# Robust to outliers: Student-t with 5 df
mod_robust <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_student_t(nu = 5)))

# Multiplicative error (constant CV)
mod_logn <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_lognormal()))

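# For reproduction: Poisson instead of NegBin
# (appropriate when overdispersion is negligible).
# NOTE: dat_gr is not created in this vignette; it stands for a
# bdeb_data object that holds both growth and reproduction data.
mod_pois <- bdeb_model(dat_gr, type = "growth_repro",
  observation = list(growth = obs_normal(),
                     reproduction = obs_poisson()))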
```

Available observation families:

| Endpoint | Family | Function | When to use |
|----------|--------|----------|-------------|
| Growth | Gaussian | `obs_normal()` | Default; additive error |
| Growth | Log-normal | `obs_lognormal()` | Multiplicative error (constant CV) |
| Growth | Student-t | `obs_student_t(nu)` | Outlier-robust |
| Reproduction | Neg. binomial | `obs_negbinom()` | Default; overdispersed counts |
| Reproduction | Poisson | `obs_poisson()` | Equidispersed counts |


## Temperature correction

DEB rate parameters scale with temperature via the Arrhenius
relationship (Kooijman 2010, Eq. 1.2):

$$
c_T = \exp\!\left(\frac{T_A}{T_\text{ref}} - \frac{T_A}{T}\right)
$$

```{r arrhenius}
# Experiment at 22 C, reference 20 C, typical T_A for ectotherms
cT <- arrhenius(temp = 273.15 + 22, T_ref = 273.15 + 20, T_A = 8000)
cat("Temperature correction factor:", round(cT, 3), "\n")
# Rate at reference temperature: p_Am_ref = p_Am_obs / cT
```
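
The correction factor is simple enough to reproduce by hand, which
helps when checking units (all temperatures in kelvin).  The function
`arrhenius_sketch` below is a hypothetical base-R stand-in written from
the formula above, not the package's `arrhenius()`, and `p_Am_obs` is
an illustrative value.

```{r sketch-arrhenius}
arrhenius_sketch <- function(temp, T_ref, T_A) {
  exp(T_A / T_ref - T_A / temp)   # all temperatures in kelvin
}

cT <- arrhenius_sketch(temp = 273.15 + 22, T_ref = 273.15 + 20, T_A = 8000)
round(cT, 3)   # about 1.203: rates run roughly 20% faster at 22 C than at 20 C

# Converting an observed rate at 22 C back to the 20 C reference
p_Am_obs <- 6.0            # illustrative value measured at 22 C
p_Am_ref <- p_Am_obs / cT
p_Am_ref

# Sanity check: no correction at the reference temperature
arrhenius_sketch(temp = 293.15, T_ref = 293.15, T_A = 8000)   # 1
```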

## Energy flux calculator

Inspect the energetics at a specific state:

```{r fluxes}
fl <- deb_fluxes(E = 10, V = 0.5, f = 1.0,
                 p_Am = 5, p_M = 0.5, kappa = 0.75,
                 v = 0.2, E_G = 400)

cat(sprintf("Assimilation  (p_A): %.3f J/d\n", fl$p_A))
cat(sprintf("Mobilisation  (p_C): %.3f J/d\n", fl$p_C))
cat(sprintf("Maintenance   (p_M): %.3f J/d\n", fl$p_M))
cat(sprintf("Growth        (p_G): %.3f J/d\n", fl$p_G))
cat(sprintf("Struct. length (L) : %.3f cm\n",  fl$L))
cat(sprintf("Scaled reserve (e) : %.3f\n",     fl$e))
```

## Converting cumulative reproduction data

Many protocols report cumulative offspring.  The `repro_to_intervals()`
function converts these to the interval format required by BayesianDEB:

```{r repro-convert}
cumul <- data.frame(
  id = rep(1, 5),
  time = c(0, 7, 14, 21, 28),
  cumulative = c(0, 10, 30, 60, 100)
)
repro_to_intervals(cumul)
#   id t_start t_end count
# 1  1       0     7    10
# 2  1       7    14    20
# 3  1      14    21    30
# 4  1      21    28    40
```
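
Conceptually the conversion is a per-individual difference of the
cumulative counts.  The base-R sketch below reproduces the output above
under that assumption; `cumul_to_intervals_sketch` is a hypothetical
helper, not the package function.

```{r sketch-repro}
cumul_to_intervals_sketch <- function(df) {
  pieces <- lapply(split(df, df$id), function(d) {
    d <- d[order(d$time), ]          # make sure rows are in time order
    data.frame(
      id      = d$id[-1],
      t_start = head(d$time, -1),
      t_end   = d$time[-1],
      count   = diff(d$cumulative)   # offspring produced in each interval
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

cumul <- data.frame(
  id = rep(1, 5),
  time = c(0, 7, 14, 21, 28),
  cumulative = c(0, 10, 30, 60, 100)
)
res <- cumul_to_intervals_sketch(cumul)
res$count   # 10 20 30 40

# Cumulative counts should never decrease; a negative interval signals a data error
stopifnot(all(res$count >= 0))
```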


# Validation Against Published DEB Parameters {#validation}

The `eisenia_growth` dataset was simulated using DEB parameters from
the Add-my-Pet (AmP) collection entry for *Eisenia fetida* (Marques
et al. 2018).  The table below compares the simulation truth with
published AmP estimates:

| Parameter | Symbol | True (simulation) | AmP estimate | Units |
|-----------|--------|:-----------------:|:------------:|-------|
| Assimilation rate | $\{p_{Am}\}$ | 5.0 | 3.9–6.2 | J d$^{-1}$ cm$^{-2}$ |
| Maintenance rate | $[p_M]$ | 0.5 | 0.3–0.8 | J d$^{-1}$ cm$^{-3}$ |
| Allocation fraction | $\kappa$ | 0.75 | 0.6–0.85 | — |
| Energy conductance | $v$ | 0.2 | 0.1–0.3 | cm d$^{-1}$ |
| Cost of structure | $[E_G]$ | 400 | 200–600 | J cm$^{-3}$ |
| Max. structural length | $L_m$ | 7.5 | 5–12 | cm |

The simulation truth falls within the published AmP ranges for all
parameters.  When BayesianDEB is fitted to these data (see the
[individual model](#individual) section), the posterior medians should
recover values close to the simulation truth, providing a
**closed-loop validation**: known parameters → simulated data →
Bayesian recovery → comparison with truth.

This is not a substitute for fitting real experimental data, but it
demonstrates that:

1. The DEB ODE implementation is correct (simulated data match the
   expected growth pattern).
2. The Bayesian inference machinery recovers known parameters.
3. The prior specification is compatible with realistic parameter
   ranges from the AmP database.

For real-data applications, we recommend comparing estimated parameters
with the AmP entry for the species of interest as a sanity check.


# Summary {#summary}

This vignette demonstrated three analysis types:

| Analysis | Model type | Individuals | Key output |
|----------|-----------|:-----------:|------------|
| Single growth | `"individual"` | 1 | DEB posteriors, PPC, $L_\infty$ |
| Population growth | `"hierarchical"` | 21 | $\mu/\sigma$ of $\{p_{Am}\}$, shrinkage, prediction for new individual |
| Toxicant effect | `"debtox"` | 40 (4 groups) | EC~50~, NEC with full uncertainty |

**What the Bayesian framework provides:**

1. **Full uncertainty quantification** — every derived quantity ($L_\infty$,
   EC~50~, NEC) comes as a posterior distribution, not a point estimate.
2. **Prior incorporation** — biological knowledge from the AmP database
   or regulatory guidelines constrains the estimation.
3. **Hierarchical structure** — partial pooling borrows strength across
   individuals, improving estimates for data-poor organisms.
4. **Principled model checking** — posterior predictive checks and
   convergence diagnostics provide clear evidence of model adequacy.
5. **Within-chain parallelism** — `reduce_sum` accelerates hierarchical
   and DEBtox models on multi-core machines.


# References {-}

Betancourt, M. and Girolami, M. (2015). Hamiltonian Monte Carlo for
hierarchical models. In: Upadhyay, S.K. et al. (eds) *Current Trends
in Bayesian Methodology with Applications*. CRC Press, pp. 79–101.

Carpenter, B., Gelman, A., Hoffman, M.D. et al. (2017). Stan: A
probabilistic programming language. *Journal of Statistical Software*,
76(1), 1–32. doi:
[10.18637/jss.v076.i01](https://doi.org/10.18637/jss.v076.i01)

ECHA (2017). *Guidance on Information Requirements and Chemical Safety
Assessment, Chapter R.10: Characterisation of dose [concentration]-
response for environment*. European Chemicals Agency.

Gabry, J., Simpson, D., Vehtari, A., Betancourt, M. and Gelman, A.
(2019). Visualization in Bayesian workflow. *Journal of the Royal
Statistical Society: Series A*, 182(2), 389–402. doi:
[10.1111/rssa.12378](https://doi.org/10.1111/rssa.12378)

Gelman, A. (2006). Prior distributions for variance parameters in
hierarchical models. *Bayesian Analysis*, 1(3), 515–534. doi:
[10.1214/06-BA117A](https://doi.org/10.1214/06-BA117A)

Gelman, A. and Hill, J. (2006). *Data Analysis Using Regression and
Multilevel/Hierarchical Models*. Cambridge University Press.

Gelman, A., Carlin, J.B., Stern, H.S. et al. (2013). *Bayesian Data
Analysis*. 3rd edition. Chapman & Hall/CRC.

Hoffman, M.D. and Gelman, A. (2014). The No-U-Turn Sampler: adaptively
setting path lengths in Hamiltonian Monte Carlo. *Journal of Machine
Learning Research*, 15(47), 1593–1623.

Jager, T., Heugens, E.H.W. and Kooijman, S.A.L.M. (2006). Making
sense of ecotoxicological test results: towards application of
process-based models. *Ecotoxicology*, 15(3), 305–314. doi:
[10.1007/s10646-006-0060-x](https://doi.org/10.1007/s10646-006-0060-x)

Jager, T. and Zimmer, E.I. (2012). Simplified Dynamic Energy Budget
model for analysing ecotoxicity data. *Ecological Modelling*, 225,
74–81. doi:
[10.1016/j.ecolmodel.2011.11.012](https://doi.org/10.1016/j.ecolmodel.2011.11.012)

Kooijman, S.A.L.M. (2010). *Dynamic Energy Budget Theory for Metabolic
Organisation*. 3rd edition. Cambridge University Press. doi:
[10.1017/CBO9780511805400](https://doi.org/10.1017/CBO9780511805400)

Marques, G.M., Augustine, S., Lika, K. et al. (2018). The AmP project:
comparing species on the basis of dynamic energy budget parameters.
*PLOS Computational Biology*, 14(5), e1006100. doi:
[10.1371/journal.pcbi.1006100](https://doi.org/10.1371/journal.pcbi.1006100)

Vehtari, A., Gelman, A., Simpson, D. et al. (2021).
Rank-normalization, folding, and localization: an improved $\hat{R}$ for
assessing convergence of MCMC. *Bayesian Analysis*, 16(2), 667–718.
doi: [10.1214/20-BA1221](https://doi.org/10.1214/20-BA1221)


# Session info {-}

```{r session}
sessionInfo()
```