---
title: Design
vignette: >
  %\VignetteIndexEntry{Design}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---
This page documents the general design of fastreg. It covers the core
requirements, the public-facing interface, and diagrams showing the
expected flow of the main functions.
## Requirements
The core requirements of fastreg are to:
1. Convert Danish register data from SAS files to the modern and
efficient Parquet format.
2. Read register Parquet files into R as a DuckDB table.
3. Provide a [targets](https://docs.ropensci.org/targets/) pipeline
template to convert multiple registers in parallel.
4. Provide functions to list available SAS or Parquet register files
directly from R.
## Interface
The interface (the functions and objects that are exposed to users)
follows a consistent naming convention. We generally name functions by
the **action** they perform and the **object(s)** they perform it on,
in the format `{action}_{object}()`. **Actions** are verbs that describe
what a function does, while **objects** are nouns that represent what
the function operates on. Below is an overview of the main actions and
objects within fastreg.
The actions are:
- `convert`: Convert one or more SAS files from a register to Parquet.
- `list`: List files in a directory, e.g., SAS or Parquet files.
- `read`: Read a Parquet register into R as a DuckDB table.
- `use`: Use a template in the current project.
The objects are:
- `chunk_size`: Number of rows to read per chunk during conversion.
- `path`: A character vector of one or more paths.
- `output_dir`: The directory to save the Parquet output to.
::: callout-tip
For a list of all the public functions, see the
[Reference](https://dp-next.github.io/fastreg/reference/index.html)
page.
:::
### Converting SAS files from a single register
```{mermaid}
%%| label: fig-flow
%%| fig-cap: "Expected workflow for converting SAS files from a single register using `convert_register()`."
%%| fig-alt: "A flowchart showing the expected flow of converting register SAS files to Parquet files."
flowchart TD
identify_paths("Identify register path(s)
with list_sas_files(path)")
path[/"path
[Character vector]"/]
output_dir[/"output_dir
[Character scalar]"/]
chunk_size[/"chunk_size
[Integer scalar]"/]
convert_register("convert_register()")
output[/"Parquet file(s)
written to output_dir"/]
%% Edges
identify_paths -.-> path --> convert_register
output_dir & chunk_size --> convert_register
convert_register --> output
%% Style
style identify_paths fill:#FFFFFF, color:#000000, stroke-dasharray: 5 5
```
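In code, the workflow in the diagram might look like the following
minimal sketch. It rests on several assumptions: the wrapper
`convert_one_register()` and its `register_dir` argument are hypothetical
names introduced for illustration, `list_sas_files()` is assumed to
return a character vector of SAS file paths, and the argument names
`path`, `output_dir`, and `chunk_size` are taken from the diagram labels
rather than from the function reference.

```r
# Hypothetical helper (not part of fastreg): wraps the two steps from the
# diagram so they can be reused for any single register.
convert_one_register <- function(register_dir, output_dir, chunk_size) {
  # Step 1 (optional in the diagram): identify the register's SAS files.
  # Assumption: list_sas_files() returns a character vector of file paths.
  sas_paths <- fastreg::list_sas_files(register_dir)

  # Step 2: convert the SAS files to Parquet, reading `chunk_size` rows at
  # a time and writing the output to `output_dir`.
  # Assumption: the argument names match the labels used in the diagram.
  fastreg::convert_register(
    path = sas_paths,
    output_dir = output_dir,
    chunk_size = chunk_size
  )
}
```

A call could then look like
`convert_one_register("path/to/register", "path/to/parquet", chunk_size = 100000L)`,
where both paths and the chunk size are placeholder values.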
### Converting multiple registers in parallel
```{mermaid}
%%| label: fig-targets-flow
%%| fig-cap: "Expected workflow for converting multiple registers using the targets pipeline."
%%| fig-alt: "A flowchart showing the expected flow of converting register SAS files to Parquet files using the provided targets pipeline template."
flowchart TD
copy_pipeline("use_targets_template()")
edit["Edit _targets.R as needed"]
run_pipeline("targets::tar_make()")
output[/"Parquet file(s)
written to directory
specified in _targets.R"/]
%% Edges
copy_pipeline --> edit --> run_pipeline --> output
%% Style
style edit fill:#FFFFFF, color:#000000, stroke-dasharray: 5 5
```
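The same steps can be sketched in R. This is only an illustration: the
helper names `set_up_pipeline()` and `run_pipeline()` are hypothetical,
and the sketch assumes that `use_targets_template()` can be called
without arguments and places a `_targets.R` file in the current working
directory.

```r
# Hypothetical helpers (not part of fastreg) mirroring the diagram's steps.

# Step 1: copy the pipeline template into the current project.
# Assumption: use_targets_template() needs no arguments.
set_up_pipeline <- function() {
  fastreg::use_targets_template()
}

# Step 2 is manual: edit `_targets.R` as needed.

# Step 3: run the pipeline, which writes the Parquet files to the
# directory specified in `_targets.R`.
run_pipeline <- function() {
  # Fail early with a clear message if the pipeline script is missing.
  if (!file.exists("_targets.R")) {
    stop("No `_targets.R` found; run set_up_pipeline() first.")
  }
  targets::tar_make()
}
```

Keeping the edit step outside the helpers reflects the dashed node in the
diagram: the template is meant to be adjusted by hand before the
pipeline is run.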
### Reading a Parquet register
```{mermaid}
%%| label: fig-flow-use
%%| fig-cap: "Expected workflow for reading a Parquet register as a DuckDB table using `read_register()`."
%%| fig-alt: "A flowchart showing the expected flow of reading a Parquet register created with the fastreg package."
flowchart TD
path[/"path
[Character scalar]"/]
read_register("read_register()")
output[/"Output
[DuckDB table]"/]
%% Edges
path --> read_register --> output
```
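As a minimal sketch of this flow, the hypothetical helper below reads a
register and pulls a few rows into memory. The helper name
`preview_register()` and its `n` argument are introduced here for
illustration only, and the sketch assumes that the DuckDB table returned
by `read_register()` behaves like a lazy, dbplyr-compatible table.

```r
# Hypothetical helper (not part of fastreg): read a Parquet register and
# pull the first `n` rows into memory for a quick look.
preview_register <- function(path, n = 5L) {
  # `path` is a single path (character scalar), as in the diagram.
  stopifnot(is.character(path), length(path) == 1L)

  register <- fastreg::read_register(path)

  # Assumption: the DuckDB table is a lazy, dbplyr-compatible table, so
  # head() limits the rows in the database and collect() fetches them.
  dplyr::collect(utils::head(register, n))
}
```

Collecting only a handful of rows keeps the bulk of the register in
DuckDB, which matters when a register is larger than available memory.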