---
title: "Regression Tables with huxreg"
output: 
  html_document: default
vignette: >
  %\VignetteIndexEntry{Regression Tables with huxreg}  
  %\VignetteEngine{knitr::rmarkdown}  
  %\VignetteEncoding{UTF-8}
header-includes:
  - \usepackage{placeins}
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(huxtable)

is_latex <- guess_knitr_output_format() == "latex"
knitr::knit_hooks$set(
  barrier = function(before, options, envir) {
    if (! before && is_latex) knitr::asis_output("\\FloatBarrier")
  }
)

if (is_latex) knitr::opts_chunk$set(barrier = TRUE)
options(huxtable.latex_siunitx_align = FALSE)
```


## Regression tables with `huxreg`

Huxtable includes the function `huxreg` to build a table of regressions.

You call `huxreg` with a list of models. These models can be of any class
which has a `tidy` method defined in the
[broom](https://cran.r-project.org/?package=broom) package. The method should
return a list of regression coefficients with names `term`, `estimate`,
`std.error` and `p.value`. That covers most standard regression packages.

Let's start by running some regressions to predict a diamond's price.


```{r}
data(diamonds, package = "ggplot2")
diamonds <- diamonds[1:100,]

lm1 <- lm(price ~ carat + depth, diamonds)
lm2 <- lm(price ~ depth + factor(color, ordered = FALSE), diamonds)
lm3 <- lm(log(price) ~ carat + depth, diamonds)
```


Now, we use `huxreg` to display the regression output side by side.


```{r}

huxreg(lm1, lm2, lm3)
```


The basic output includes estimates, standard errors and summary statistics. 

Some of those variable names are hard to read. We can change them by providing
a named vector of variables in the `coefs` argument.


```{r}
color_names <- grep("factor", names(coef(lm2)), value = TRUE)
names(color_names) <- gsub(".*)(.)", "Color: \\1", color_names)

huxreg(lm1, lm2, lm3, coefs = c("Carat" = "carat", "Depth" = "depth", color_names))
```


Or, since the output from `huxreg` is just a huxtable, we could just
edit its contents directly.


```{r}
diamond_regs <- huxreg(lm1, lm2, lm3)
diamond_regs[seq(8, 18, 2), 1] <- paste("Color:", LETTERS[5:10])

# prints the same as above

```

Of course, we aren't limited to just changing names. We can also make our table
prettier. Let's put our footnote in italic, add a caption, and highlight
the cell background of significant coefficients. All of these are just standard
huxtable commands.


```{r}
suppressPackageStartupMessages(library(dplyr))

diamond_regs |> 
      map_background_color(-1, -1, by_regex(
        "\\*" = "yellow"
      )) |> 
      set_italic(final(1), 1) |> 
      set_caption("Linear regressions of diamond prices")

```


By default, standard errors are shown below coefficient estimates. To display
them in a column to the right, use `error_pos = "right"`:


```{r}
huxreg(lm1, lm3, error_pos = "right")
```


This will give column headings a column span of 2.

To display standard errors in the same cell as estimates, use `error_pos = "same"`:


```{r}
huxreg(lm1, lm3, error_pos = "same")
```


You can change the default column headings by naming the model arguments:


```{r}
huxreg("Price" = lm1, "Log price" = lm3)
```


To display a particular row of summary statistics, use the `statistics`
parameter. This should be a character vector. Valid values are anything returned
from your models by `broom::glance`:


```{r}
gl <- as_hux(broom::glance(lm1))

gl |> 
      restack_down(cols = 3, on_remainder = "fill") |> 
      set_bold(odds, everywhere)
```


Another value you can use is `"nobs"`, which
returns the number of observations from the regression. If the `statistics`
vector has names, these will be used for row headings:


```{r}
huxreg(lm1, lm3, statistics = c("N. obs." = "nobs", 
      "R squared" = "r.squared", "F statistic" = "statistic",
      "P value" = "p.value"))
```


By default, `huxreg` displays significance stars. You can alter the symbols used
and significance levels with the `stars` parameter, or set `stars = NULL` to
turn off significance stars completely.


```{r}
huxreg(lm1, lm3, stars = c(`*` = 0.1, `**` = 0.05, `***` = 0.01)) # a little boastful?
```


You aren't limited to displaying standard errors of the estimates. If you
prefer, you can display t statistics or p values, using the `error_format`
option. Any column from `tidy` can be used by putting it in curly brackets:


```{r}
# Another useful column: p.value
huxreg(
        lm1, lm3, 
        error_format = "[{statistic}]", 
        note         = "{stars}. T statistics in brackets."
      )

```


Here we also changed the footnote, using `note`. If `note` contains the string
`"{stars}"` it will be replaced by a description of the significance stars used.
If you don't want a footnote, just set `note = NULL`.

Alternatively, you can display confidence intervals. Use `ci_level` to set the
confidence level for the interval, then use `{conf.low}` and `{conf.high}` in
`error_format`:


```{r}
huxreg(lm1, lm3, ci_level = .99, error_format = "({conf.low} -- {conf.high})")
```


To change number formatting, set the `number_format` parameter. This works the
same as the `number_format` property for a huxtable - if it is numeric, numbers
will be rounded to that many decimal places; if it is character, it will be
taken as a format to the base R `sprintf` function. `huxreg` tries to be smart
and to format summary statistics like `nobs` as integers.


```{r}
huxreg(lm1, lm3, number_format = 2)
```


Lastly, if you want to bold all significant coefficients, set the parameter
`bold_signif` to a maximum significance level:


```{r}
huxreg(lm1, lm3, bold_signif = 0.05)
```


## Altering data

Sometimes, you want to report different statistics for a model. For example, you
might want to use robust standard errors.

One way to do this is to pass a `tidy`-able test object into `huxreg`. The
function `coeftest` in the "lmtest" package has `tidy` methods defined:


```{r}
library(lmtest)
library(sandwich)
lm_robust <- coeftest(lm1, vcov = vcovHC, save = TRUE)
huxreg("Normal SEs" = lm1, "Robust SEs" = lm_robust)
```


If that is not possible, you can compute statistics yourself and add them to
your model using the `tidy_override` function:


```{r}
lm_fixed <- tidy_override(lm1, p.value = c(0.5, 0.2, 0.06))
huxreg("Normal p values" = lm1, "Supplied p values" = lm_fixed)
```


You can override any statistics returned by `tidy` or `glance`.

If you want to completely replace the output of tidy, use the `tidy_replace()`
function. For example, here's how to print different coefficients for a 
multinomial model.


```{r}
mnl <- nnet::multinom(gear ~ mpg, mtcars)
tidied <- broom::tidy(mnl)
models <- list()
models[["4 gears"]] <- tidy_replace(mnl, tidied[tidied$y.level == 4, ])
models[["5 gears"]] <- tidy_replace(mnl, tidied[tidied$y.level == 5, ])
huxreg(models, statistics = "AIC")
```