---
title: "Summary tables for APA-style reporting"
description: >
  Learn when to use table_categorical(), table_continuous(), and
  table_continuous_lm() for APA-style reporting in R, how their shared
  arguments fit together, and which output format to choose for
  console, Quarto, Word, or Excel workflows.
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Summary tables for APA-style reporting}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

build_rich_tables <- identical(Sys.getenv("IN_PKGDOWN"), "true")

pkgdown_dark_gt <- function(tab) {
  tab |>
    gt::opt_css(
      css = paste(
        ".gt_table, .gt_heading, .gt_col_headings, .gt_col_heading,",
        ".gt_column_spanner_outer, .gt_column_spanner, .gt_title,",
        ".gt_subtitle, .gt_sourcenotes, .gt_sourcenote {",
        "  background-color: transparent !important;",
        "  color: currentColor !important;",
        "}",
        sep = "\n"
      )
    )
}
```

```{r setup}
library(spicy)
```

`table_categorical()`, `table_continuous()`, and
`table_continuous_lm()` share the same reporting grammar: choose
variables with `select`, optionally split the table with `by`, apply
readable labels, and pick an output format that matches your reporting
workflow. This vignette focuses on that shared logic rather than
repeating every function-specific option.

## Choose the right function

Use the function that matches the type of variables you want to report:

| Function | Use for | Optional `by` | Typical additions |
|:--|:--|:--|:--|
| `table_categorical()` | Factors, labelled categorical variables, grouped frequency-style summaries | Yes | Chi-squared test, association measure, confidence interval |
| `table_continuous()` | Numeric or continuous variables | Yes | Group-comparison test, statistic, effect size |
| `table_continuous_lm()` | Continuous outcomes in a linear-model framework | No: `by` is required and must be a single predictor | Robust `HC*` standard errors, model fit, case weights |

In practice:

- use `table_categorical()` for smoking status, education, or activity;
- use `table_continuous()` for BMI, income, or scale scores;
- use `table_continuous_lm()` when the same outcomes should be reported
  through simple weighted or robust linear models;
- keep `by` for the grouping variable you want to compare across.
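If you are unsure which variables belong in which table, a quick class check can give a first split. The sketch below uses a small toy data frame with hypothetical columns rather than `sochealth`, and relies only on base R. Treat the result as a starting point: categories stored as numeric codes (for example 0/1 indicators) would be classified as continuous by this check.

```{r sketch-variable-types}
# Toy data standing in for a survey data frame (hypothetical columns).
toy <- data.frame(
  smoking = factor(c("Yes", "No", "No", "Yes")),
  education = factor(c("Low", "High", "Low", "High")),
  bmi = c(22.1, 27.4, 24.8, 30.2),
  income = c(2100, 3500, 2800, 4100)
)

# Numeric columns are candidates for table_continuous();
# everything else is a candidate for table_categorical().
is_continuous <- vapply(toy, is.numeric, logical(1))
continuous_vars <- names(toy)[is_continuous]
categorical_vars <- names(toy)[!is_continuous]

continuous_vars
categorical_vars
```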

## A shared interface

All three functions use the same core arguments:

```{r grammar-categorical}
table_categorical(
  sochealth,
  select = c(smoking, physical_activity),
  by = education,
  labels = c("Smoking status", "Regular physical activity"),
  output = "tinytable"
)
```

```{r grammar-continuous}
table_continuous(
  sochealth,
  select = c(bmi, wellbeing_score, life_sat_health),
  by = education,
  labels = c(
    bmi = "Body mass index",
    wellbeing_score = "Well-being score",
    life_sat_health = "Satisfaction with health"
  ),
  output = "tinytable"
)
```

```{r grammar-continuous-lm}
table_continuous_lm(
  sochealth,
  select = c(bmi, wellbeing_score, life_sat_health),
  by = education,
  weights = weight
)
```

The same argument pattern runs through these calls:

- `select` chooses the reported variables;
- `by` defines the grouping structure;
- `labels` cleans up the row labels;
- `output` decides how the result is rendered or exported.

For model-based continuous tables, the same pattern applies, but `by`
must be a single predictor because one linear model is fit per outcome.
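To make "one linear model per outcome" concrete, here is a minimal base R sketch using the built-in `mtcars` data. It only illustrates the idea of refitting the same single-predictor model for each outcome; it is not a description of how `table_continuous_lm()` is implemented.

```{r sketch-lm-per-outcome}
# Illustration with the built-in mtcars data, not with spicy itself.
outcomes <- c("mpg", "hp", "wt")

# One model per outcome, always with the same single predictor (am).
fits <- lapply(outcomes, function(y) {
  lm(reformulate("am", response = y), data = mtcars)
})
names(fits) <- outcomes

# For a two-group predictor, the slope is the difference in group means.
slopes <- vapply(fits, function(fit) coef(fit)[["am"]], numeric(1))
slopes
```

With a second predictor in the formula, each slope would become a partial effect and no longer map onto a simple group comparison, which is one way to read the single-predictor constraint.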

## A practical reporting sequence

A common report combines several table types, often with the same
grouping variable. For example, you might first summarize categorical
health behaviors, then continuous well-being indicators, and finally
model-based estimates for the same outcomes.

### Categorical variables

```{r report-categorical, eval = build_rich_tables}
pkgdown_dark_gt(
  table_categorical(
    sochealth,
    select = c(smoking, physical_activity, dentist_12m),
    by = education,
    labels = c(
      "Smoking status",
      "Regular physical activity",
      "Visited a dentist in the last 12 months"
    ),
    output = "gt"
  )
)
```

### Continuous variables

```{r report-continuous, eval = build_rich_tables}
pkgdown_dark_gt(
  table_continuous(
    sochealth,
    select = c(bmi, wellbeing_score, life_sat_health),
    by = education,
    labels = c(
      bmi = "Body mass index",
      wellbeing_score = "Well-being score",
      life_sat_health = "Satisfaction with health"
    ),
    p_value = TRUE,
    effect_size = TRUE,
    output = "gt"
  )
)
```

This keeps the reporting structure consistent while still using the
function that fits each variable type.

### Model-based continuous variables

```{r report-continuous-lm, eval = build_rich_tables}
pkgdown_dark_gt(
  table_continuous_lm(
    sochealth,
    select = c(bmi, wellbeing_score, life_sat_health),
    by = sex,
    vcov = "HC3",
    statistic = TRUE,
    output = "gt"
  )
)
```

Prefer this path when the report is already organized around simple
linear models, weighted analyses, or robust standard errors.
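For readers who want to see what an `HC3` standard error is, the sketch below computes one by hand in base R for a single simple model on `mtcars`, using the standard sandwich form with squared residuals rescaled by `(1 - h)^2`, where `h` is the leverage. It illustrates the estimator only and makes no claim about the package internals.

```{r sketch-hc3}
# Illustration with the built-in mtcars data, not with spicy itself.
fit <- lm(mpg ~ am, data = mtcars)

X <- model.matrix(fit)  # design matrix (intercept and am)
e <- residuals(fit)     # ordinary residuals
h <- hatvalues(fit)     # leverage of each observation

# Sandwich estimator: bread %*% meat %*% bread,
# where meat = X' diag(e^2 / (1 - h)^2) X for HC3.
bread <- solve(crossprod(X))
meat <- crossprod(X * (e / (1 - h)))
vcov_hc3 <- bread %*% meat %*% bread

# Compare classical and HC3 standard errors side by side.
se_table <- cbind(
  classical = sqrt(diag(vcov(fit))),
  HC3 = sqrt(diag(vcov_hc3))
)
se_table
```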

## Choose the output format

All three functions support the same reporting formats:

| Output | Best use |
|:--|:--|
| `"default"` | Quick console review in plain ASCII |
| `"tinytable"` | Quarto or R Markdown documents |
| `"gt"` | HTML output with styled reporting tables |
| `"flextable"` | Office-first workflows; also renders in HTML |
| `"excel"` | Spreadsheet handoff or downstream editing |
| `"word"` | Direct `.docx` export |
| `"clipboard"` | Fast pasting into another application |

Pick the output based on where the table is going, not on the analysis
itself. The underlying selection and grouping pattern stays the same.
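If the same script feeds several destinations, the mapping in the table above can be written down once. `pick_output()` below is a hypothetical helper, not part of spicy; it simply returns one of the `output` strings listed above.

```{r sketch-pick-output}
# Hypothetical helper (not part of spicy): map a reporting destination
# to one of the `output` values listed in the table above.
pick_output <- function(destination = c(
                          "console", "quarto", "html", "office",
                          "excel", "word", "clipboard"
                        )) {
  destination <- match.arg(destination)
  switch(
    destination,
    console = "default",
    quarto = "tinytable",
    html = "gt",
    office = "flextable",
    excel = "excel",
    word = "word",
    clipboard = "clipboard"
  )
}

pick_output("html")
```

Because `output` takes a string, the result should be usable directly, for example as `output = pick_output("office")` in any of the three table functions.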

If you want an object that fits naturally into Word and PowerPoint
workflows but can also be rendered in HTML documents, `flextable` is a
good choice:

```{r output-flextable, eval = FALSE}
if (requireNamespace("flextable", quietly = TRUE)) {
  table_continuous(
    sochealth,
    select = c(bmi, wellbeing_score, life_sat_health),
    by = education,
    output = "flextable"
  )
}
```

## Post-process the returned table object

When you request one of these outputs, the summary-table helpers return
regular `gt`, `tinytable`, or `flextable` objects, so you can keep
styling them with the native package API.

Use `gt::` functions when you want to keep the `gt` workflow:

```{r postprocess-gt, eval = build_rich_tables}
tab <- pkgdown_dark_gt(table_categorical(
  sochealth,
  select = c(smoking, physical_activity),
  by = education,
  labels = c("Smoking status", "Regular physical activity"),
  output = "gt"
))

tab |>
  gt::tab_header(
    title = "Health behaviors by education",
    subtitle = "Categorical summary table"
  ) |>
  gt::tab_source_note(
    gt::md("*Percentages are computed within each education group.*")
  )
```

Use `tinytable::` functions when you want lightweight table-specific
styling:

```{r postprocess-tinytable, eval = build_rich_tables}
tab <- table_categorical(
  sochealth,
  select = c(smoking, physical_activity),
  by = education,
  labels = c("Smoking status", "Regular physical activity"),
  output = "tinytable"
)

tab |>
  tinytable::style_tt(
    i = 2:3,
    j = 2:5,
    background = "red",
    color = "white",
    bold = TRUE
  )
```

Use `flextable::` functions when you want to keep working toward Office
or HTML document output. The example is shown as code here because the
dark pkgdown theme is not a reliable preview of the final `flextable`
HTML rendering:

```{r postprocess-flextable, eval = FALSE}
if (requireNamespace("flextable", quietly = TRUE)) {
  tab <- table_continuous(
    sochealth,
    select = c(bmi, wellbeing_score),
    by = education,
    output = "flextable"
  )

  tab |>
    flextable::theme_booktabs() |>
    flextable::autofit() |>
    flextable::fontsize(size = 10, part = "all")
}
```

## Keep the detailed options in the function-specific articles

The dedicated articles go deeper into each function:

- `table_categorical()` covers missing values, level filtering,
  association measures, and one-way frequency-style tables.
- `table_continuous()` covers grouped descriptive statistics,
  parametric and nonparametric tests, and effect sizes.
- `table_continuous_lm()` covers estimated means or slopes from simple
  linear models, robust standard errors, and case weights.

Use this vignette as the reporting overview, then consult the
function-specific articles when you need the detailed controls.