This page documents the general design of fastreg. It covers the core requirements, the public-facing interface, and diagrams showing the general flow of the main functions.
Requirements
The core requirements of fastreg are to:
- Convert Danish register data from SAS files to the modern and efficient Parquet format.
- Read register Parquet files into R as a DuckDB table.
- Provide a targets pipeline template to convert multiple registers in parallel.
- Provide functions to list available SAS or Parquet register files directly from R.
Interface
The interface (the functions and objects that are exposed to users) follows a specific naming convention: we generally name functions by the action they perform and the object(s) they perform it on, in the format {action}_{object}(). Actions are verbs that describe what a function does, while objects are nouns that represent what the function operates on. Below is an overview of the main actions and objects within fastreg.
The actions are:
- convert: Convert a register SAS file (or multiple) to Parquet.
- list: List files in a directory, e.g., SAS or Parquet files.
- read: Read a Parquet register into R as a DuckDB table.
- use: Use a template in the current project.
The objects are:
- chunk_size: Number of rows to read per chunk during conversion.
- path: A character vector of one or more paths.
- output_dir: The directory to save the Parquet output to.
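As a rough illustration of the list action applied to a path, the sketch below uses only base R. The helper name `list_files_with_ext()` and the file names in the demo are made up for this example; they are not fastreg's actual API, which is documented on the Reference page.

```r
# Hypothetical helper for illustration only; not a fastreg function.
# `path` may hold one or more directories, `ext` is an extension without the dot.
list_files_with_ext <- function(path, ext) {
  # Escape the dot and anchor at the end so "parquet" does not match "x.parquet.bak".
  pattern <- paste0("\\.", ext, "$")
  list.files(
    path,
    pattern = pattern,
    recursive = TRUE,
    full.names = TRUE,
    ignore.case = TRUE
  )
}

# Build a small throwaway directory so the sketch runs end to end.
demo_dir <- file.path(tempdir(), "registers")
dir.create(file.path(demo_dir, "bef"), recursive = TRUE, showWarnings = FALSE)
file.create(
  file.path(demo_dir, "bef", c("bef2019.sas7bdat", "bef2020.sas7bdat", "notes.txt"))
)

# Only the two SAS files are returned; the text file is ignored.
basename(list_files_with_ext(demo_dir, "sas7bdat"))
```

Matching on the extension rather than on a register name keeps the helper usable for both SAS and Parquet files.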
Tip: For a list of all the public functions, see the Reference page.
Converting SAS files from a single register
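One way the chunk_size and output_dir objects could fit together is sketched below: read a fixed number of rows at a time and write each chunk as its own file. This is a minimal sketch under stated assumptions, not fastreg's implementation. The function `convert_in_chunks()`, its `read_chunk` and `write_chunk` arguments, and the `part-0000.parquet` file naming are all hypothetical.

```r
# Hypothetical sketch; not fastreg's actual conversion function.
# `read_chunk` is any function(skip, n_max) that returns a data frame.
# `write_chunk` is any function(data, file) that writes one chunk to disk.
convert_in_chunks <- function(read_chunk, write_chunk, output_dir,
                              chunk_size = 1e6) {
  dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
  part <- 0L
  written <- character()
  repeat {
    # Skip the rows already converted and read at most one chunk.
    chunk <- read_chunk(skip = part * chunk_size, n_max = chunk_size)
    if (nrow(chunk) == 0L) break
    out <- file.path(output_dir, sprintf("part-%04d.parquet", part))
    write_chunk(chunk, out)
    written <- c(written, out)
    part <- part + 1L
    # A short chunk means the end of the input was reached.
    if (nrow(chunk) < chunk_size) break
  }
  written
}

# Possible wiring with the haven and arrow packages (assumed to be installed;
# "register.sas7bdat" is a placeholder path):
# convert_in_chunks(
#   read_chunk = function(skip, n_max) {
#     haven::read_sas("register.sas7bdat", skip = skip, n_max = n_max)
#   },
#   write_chunk = arrow::write_parquet,
#   output_dir = "parquet/register"
# )
```

Passing the reader and writer in as functions keeps the loop itself free of package dependencies, so memory use is bounded by chunk_size regardless of how large the SAS file is.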
Converting multiple registers in parallel
Reading a Parquet register
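The sketch below shows one general way to expose Parquet files to R as a lazy DuckDB table, using the arrow and dplyr packages (with duckdb and dbplyr installed). It is an assumption about how such a read could look, not a description of fastreg's own read function, and the example data and directory name are invented.

```r
library(arrow)
library(dplyr)

# Write a tiny example dataset so the sketch runs end to end.
parquet_dir <- file.path(tempdir(), "example_register")
dir.create(parquet_dir, showWarnings = FALSE)
write_parquet(
  data.frame(id = 1:3, year = c(2019L, 2020L, 2020L)),
  file.path(parquet_dir, "part-0000.parquet")
)

# Open the Parquet files lazily and register them with DuckDB.
register <- open_dataset(parquet_dir) |>
  to_duckdb()

# dplyr verbs are translated to SQL; rows are only loaded into R on collect().
result <- register |>
  filter(year == 2020L) |>
  collect()
```

Because the table is lazy, filters and column selections run inside DuckDB, which matters when a register is too large to load into memory at once.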