---
title: "Using ggeffects with nestedLogit models"
author: "Michael Friendly"
date: "`r Sys.Date()`"
package: nestedLogit
output:
  rmarkdown::html_vignette:
  fig_caption: yes
bibliography: ["references.bib", "packages.bib"]
csl: apa.csl
vignette: >
  %\VignetteIndexEntry{Using ggeffects with nestedLogit models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  warning = FALSE,
  fig.align = "center",
  fig.height = 6,
  fig.width = 7,
  fig.path = "fig/",
  dev = "png",
  comment = "#>"
)

# save some typing
knitr::set_alias(w = "fig.width",
                 h = "fig.height",
                 cap = "fig.cap")

set.seed(47)
.opts <- options(digits = 4)

# packages to be cited in packages.bib
.to.cite <- c("nestedLogit", "ggeffects", "ggplot2")
```

Load the packages we'll use here:
```{r setup}
library(nestedLogit)    # Nested Dichotomy Logistic Regression Models
library(ggeffects)      # Create Tidy Data Frames of Marginal Effects
library(ggplot2)        # Data Visualisations Using the Grammar of Graphics
```

## Overview

The `ggeffects` package [@R-ggeffects; @ggeffects2018] provides a simple and unified interface for
computing and plotting adjusted predictions and marginal effects from a wide
variety of regression models. Its main function, `predict_response()`,
returns a tidy data frame of model predictions that can be plotted directly
with a built-in `plot()` method or further customized with `ggplot2`.

The package now supports `"nestedLogit"` objects, making it easy to visualize predicted
probabilities for each response category across levels of the predictors, without the
manual data wrangling described in `vignette("plotting-ggplot", package = "nestedLogit")`.

**Note**:: `ggeffects` will at some point be superseded by the [`modelbased`](https://easystats.github.io/modelbased/) package from the [`easystats`](https://easystats.github.io/easystats/) project. 


## 👩 Women's labor force participation

We use the standard `Womenlf` example from the main vignette. The response `partic` has
three categories --- not working, working part-time, and working full-time --- modeled
as nested dichotomies against husband's income and presence of young children.

```{r wlf-model}
data(Womenlf, package = "carData")
comparisons <- logits(work = dichotomy("not.work", c("parttime", "fulltime")),
                      full = dichotomy("parttime", "fulltime"))

wlf.nested <- nestedLogit(partic ~ hincome + children,
                          dichotomies = comparisons,
                          data = Womenlf)
```

### Predicted probabilities with `predict_response()`

The simplest way to obtain predicted probabilities and a plot is with
`predict_response()`, specifying the focal predictors in the `terms` argument.
This returns predicted probabilities for each response level, with confidence intervals,
averaged over the non-focal predictors. The default print method displays these ("prettified") for a
small subset of values of the quantitative predictor, `hincome`.

```{r wlf-predict-response}
wlf.pred <- predict_response(wlf.nested, terms = c("hincome", "children"))
wlf.pred
```

The default `plot()` method produces a faceted plot with one panel for each response category
and separate curves for the levels of the other predictor (`children`).

```{r wlf-ggeffects-plot1}
#| out.width = "100%",
#| fig.height = 5,
#| fig.cap = "Predicted probabilities from `predict_response()` with default `ggeffects` plot."
plot(wlf.pred)
```

### Customizing the plot

The `plot()` method returns a `ggplot` object, so it can be further customized with
standard `ggplot2` functions. For example, we can adjust the line size, labels,
theme, and legend position:

```{r wlf-ggeffects-plot2}
#| out.width = "100%",
#| fig.height = 5,
#| fig.cap = "Customized `ggeffects` plot with adjusted labels and theme."
plot(wlf.pred,
     line_size = 2) +
  labs(title = "Predicted Probabilities of Work by Husband's Income",
       y = "Probability",
       x = "Husband's Income") +
  theme_ggeffects(base_size = 14) +
  theme(legend.position = "top")
```

### Plotting on the logit scale

`ggplot2` provides a built-in `"logit"` transformation for axes via
`scale_y_continuous(transform = "logit")`. This displays predicted probabilities
on the logit scale, $\text{logit}(p) = \log(p / (1-p))$, where the axis labels
remain as probabilities but their spacing reflects the logit transformation.
This is useful because the logistic regression model is linear on the logit scale,
so the predicted curves appear as straighter lines.

Since the `plot()` method for `ggeffects` returns a `ggplot` object, we can
simply add this scale transformation. Note that we need to specify breaks
manually, because the automatic break algorithm does not work well with the
logit transformation.

```{r wlf-logit-scale}
#| out.width = "100%",
#| fig.height = 5,
#| fig.cap = "Predicted probabilities on the logit scale."
plot(wlf.pred,
     line_size = 2) +
  scale_y_continuous(
    transform = "logit",
    breaks = c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95)
  ) +
  labs(title = "Predicted Probabilities (logit scale)",
       y = "Probability (logit scale)",
       x = "Husband's Income") +
  theme_ggeffects(base_size = 16) +
  theme(legend.position = "top")
```

These plots show something different, and simpler than what appears on the probability scale. For both the `not.work`
and `fulltime` response categories, the effect of children is essentially an additive one, with having small children
reducing the probability. The response category `parttime` shows an interactive pattern, with different shapes
for those with children present vs. absent.

### Plotting the dichotomies

Recent updates[^issue671] to `ggeffects` now make it possible to calculate and plot the predicted probabilities
for the  sub-models that comprise the nested logit --- for example, plotting
predicted values for the `work` and `full` dichotomies separately. This is done by
passing the argument `submodels = "dichotomies"` to `predict_response()`. 

[^issue671]: I'm grateful to Daniel Lüdecke for considering [this problem](https://github.com/strengejacke/ggeffects/issues/671) and revising the `ggeffects` package to more fully accommodate nestedLogit models.

In such plots,
it is sometimes useful to show explicitly the points on the grid of predictor values
used in estimation. Just add `geom_point()` for this. You can also achieve greater
resolution in the plot by moving the legend inside the plot as shown below.


```{r wlf-dichotomies}
#| out.width = "100%",
#| fig.height = 5,
#| fig.cap = "Predicted probabilities for the two dichotomies."
wlf.pred.dichot <- predict_response(wlf.nested, terms = c("hincome", "children"),
                             submodel = "dichotomies")

plot(wlf.pred.dichot) +
  geom_point() +
  theme(legend.position = "inside",
        legend.position.inside = c(.40, .85))
```


In this view, we see that the decision to work or not varies in a simple decreasing manner
with husband's income, and there is a large decrement in the probability of working
when there are young children in the home.

For those women who are working, the distinction between working fulltime vs. parttime is also
simple to describe.


## 🐊 Alligator food choice

As a simpler example with a single continuous predictor, we fit a nested logit model to the
`gators` data (originally from @Agresti:2002)
predicting primary food choice from alligator length.
The first dichotomy contrasts {Other} vs. {Fish, Invertebrates}, and the second
contrasts {Fish} vs. {Invertebrates}.

```{r gators-model}
data(gators)

# setup the dichotomies
gators.dichots = logits(
     other   = dichotomy("Other", c("Fish", "Invertebrates")),
     fish_inv = dichotomy("Fish", "Invertebrates"))

as.tree(gators.dichots)

# fit the model
gators.nested <- nestedLogit(food ~ length,
                             dichotomies = gators.dichots,
                             data = gators)
```

Now, get the predicted response probabilities, and plot them:
```{r gators-ggeffects}
#| out.width = "100%",
#| fig.height = 4,
#| fig.cap = "Predicted food choice probabilities for alligators by length."
predict_response(gators.nested, terms = "length") |>
  plot(line_size = 2)
```

As you can see, the main thing going on here is that larger alligators prefer fish, which smaller ones prefer invertebrates.
This makes perfect sense if you're an alligator!

For comparison, the basic `nestedLogit::plot()` method using default arguments gives a similar plot, with the three curves
overlaid in a single panel (it uses `graphics::matplot()`). A new feature of the plot method is
to dispense with a legend entirely, by using direct labels (`label = TRUE`) on the predicted curves.

```{r gators-plot, echo=-1}
#| out.width = "100%",
#| fig.height = 5,
#| fig.cap = "Predicted food choice probabilities for alligators by length."
par(mar = c(4, 4, 1, 1) + 0.5)
plot(gators.nested, x.var = "length",
     lty=1, lwd = 4,
     label = TRUE, label.col = "black", cex.lab = 1.3)

```

## Summary

The `ggeffects` package computes and plots predicted probabilities for the
_response categories_ of a nested logit model. As shown above, these can be
displayed on the logit scale using `ggplot2`'s built-in axis transformation.
You can now also display the predicted probabilities for the separate
dichotomies (e.g., `work` and `full`) as illustrated above.

<!--
However, `ggeffects` does not currently provide access to the individual
dichotomy sub-models that comprise the nested logit --- for example, plotting
predicted values for the `work` and `full` dichotomies separately.
(This is due to a conflict in the names of arguments (`model`), and may be resolved
in a future version of `nestedLogit`.)
-->

For  more specialized displays, see `vignette("plotting-ggplot", package = "nestedLogit")`,
which describes a manual workflow using `predict()` with `model = "dichotomies"` and
`as.data.frame()` to construct fully customized `ggplot2` plots.

```{r write-bib, echo=FALSE}
pkgs <- unique(c(.to.cite, .packages()))
knitr::write_bib(pkgs, file = here::here("vignettes", "packages.bib"))
```

## References

```{r, include = FALSE}
options(.opts)
```