submitr is the third package in the From the
Notebook to the Cluster family, alongside toolero
and containr. It provides a workflow for submitting
containerized R analyses to the UW-Madison Center for High Throughput
Computing (CHTC) from inside R.
htc_config() – create or read a project-level
htc.cfg configuration file. On first use, prompts
interactively for username and server, displays ControlMaster SSH setup
guidance to reduce Duo MFA prompts, writes htc.cfg, and
adds it to .gitignore. Subsequent calls read the existing
file and validate server reachability. Returns a named list with
username and server. Errors informatively when
username or server is supplied as an empty
string.
htc_gen_submit() – generate an HTCondor
.sub submit file from project parameters. Supports
single-job and multiple-job modes. Multiple mode reads a manifest from
toolero::write_by_group(manifest = TRUE), extracts
filenames, writes subdatasets.csv, and emits
queue file from subdatasets.csv. Resource presets
(small, medium, large,
custom) are loaded at runtime from
inst/extdata/htc-resources.yaml; a local
./htc-resources.yaml takes precedence over the package
default. GPU support is available via gpu = TRUE and
gpu_options. Setting comments = TRUE annotates each
section of the generated file with explanatory text.
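As a rough illustration of multiple-job mode, the sketch below writes a list of filenames to subdatasets.csv and assembles submit-file lines ending in the queue statement. It is not the htc_gen_submit() implementation: the function name sketch_submit_lines(), its arguments, the log/output/error naming, and the resource values are all assumptions made for the example.

```r
# Illustrative sketch only; not submitr's actual code.
# The helper name, arguments, and resource values are assumptions.
sketch_submit_lines <- function(files, dir = ".", executable = "run.sh",
                                cpus = 1, memory = "4GB", disk = "4GB") {
  # One filename per line; HTCondor reads this file in the queue statement.
  writeLines(files, file.path(dir, "subdatasets.csv"))
  c(
    paste("executable =", executable),
    "arguments = $(file)",            # each job receives one filename
    paste("request_cpus =", cpus),
    paste("request_memory =", memory),
    paste("request_disk =", disk),
    "log = job_$(Cluster).log",
    "output = job_$(Cluster)_$(Process).out",
    "error = job_$(Cluster)_$(Process).err",
    "queue file from subdatasets.csv" # one job per line of the list file
  )
}

# Write the list file and a submit file into a scratch directory.
out_dir <- tempfile("submit-sketch-")
dir.create(out_dir)
sub_lines <- sketch_submit_lines(c("group_a.csv", "group_b.csv"), dir = out_dir)
writeLines(sub_lines, file.path(out_dir, "job.sub"))
```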
htc_gen_executable() – generate the .sh
executable script that HTCondor runs inside the container. Produces a
four-element script: shebang, mkdir, Rscript,
and tar. In multiple-job mode, passes ${1} as
a positional argument to the R script. r_script must be
supplied explicitly – there is no default.
set_executable = TRUE (default) sets executable permissions
via Sys.chmod().
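The four-element script described above might look roughly like the following. This is a sketch under assumptions, not the package's code: sketch_executable(), the results directory name, and the tar flags are hypothetical; only the shebang/mkdir/Rscript/tar order, the ${1} argument, and the use of Sys.chmod() come from the entry itself.

```r
# Illustrative sketch only; helper name, results_dir, and tar flags are assumptions.
sketch_executable <- function(r_script, path, multiple = FALSE,
                              results_dir = "results") {
  rscript_line <- paste("Rscript", r_script)
  # In multiple-job mode, forward the first positional argument to the R script.
  if (multiple) rscript_line <- paste(rscript_line, "${1}")
  lines <- c(
    "#!/bin/bash",
    paste("mkdir -p", results_dir),
    rscript_line,
    paste0("tar -czf ", results_dir, ".tar.gz ", results_dir)
  )
  writeLines(lines, path)
  # Mark the script as executable (has limited effect on Windows).
  Sys.chmod(path, mode = "0755")
  invisible(lines)
}

script_path <- tempfile(fileext = ".sh")
script_lines <- sketch_executable("analysis.R", script_path, multiple = TRUE)
```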
htc_upload() – copy files to the CHTC submit node
via scp. Accepts single files, vectors of files,
directories (transferred recursively), and glob patterns.
remote_path defaults to "~/".
dry_run = TRUE previews the command without executing
it.
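To make the dry-run idea concrete, here is a minimal sketch of building an scp call and previewing it instead of running it. The helper sketch_upload_cmd(), the example username, and the server name submit.example.edu are placeholders; the real htc_upload() may construct its command differently.

```r
# Illustrative sketch only; helper name, username, and server are placeholders.
sketch_upload_cmd <- function(files, username, server,
                              remote_path = "~/", dry_run = TRUE) {
  # Directories need scp's recursive flag.
  flags <- if (any(dir.exists(files))) "-r" else character(0)
  target <- paste0(username, "@", server, ":", remote_path)
  args <- c(flags, files, target)
  if (dry_run) {
    # Preview the command without touching the network.
    message("scp ", paste(args, collapse = " "))
    return(invisible(args))
  }
  system2("scp", args)
}

upload_args <- sketch_upload_cmd(c("job.sub", "run.sh"),
                                 username = "netid",
                                 server = "submit.example.edu")
```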
htc_submit() – run condor_submit on the
remote submit node via SSH from the remote directory where files were
uploaded. Returns the cluster ID invisibly for use with
htc_status(). Supports
dry_run = TRUE.
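One plausible way to recover a cluster ID is to parse the confirmation line that condor_submit prints ("N job(s) submitted to cluster ID."). The helper below is a hypothetical sketch of that parsing step, not necessarily how htc_submit() does it.

```r
# Illustrative sketch only; parse_cluster_id() is a hypothetical helper.
parse_cluster_id <- function(output) {
  # Keep only the lines that contain the confirmation phrase.
  hit <- regmatches(output, regexpr("submitted to cluster [0-9]+", output))
  if (length(hit) == 0) return(NA_integer_)
  as.integer(sub("submitted to cluster ", "", hit[1]))
}

example_output <- c("Submitting job(s).",
                    "1 job(s) submitted to cluster 12345.")
cluster_id <- parse_cluster_id(example_output)
```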
htc_status() – check job progress via
condor_q. Optionally filters by cluster ID.
watch = TRUE polls at interval seconds
(default 60) until the cluster ID leaves the queue. Returns
condor_q output invisibly as a character vector. Supports
dry_run = TRUE.
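The polling behaviour of watch = TRUE can be sketched as a loop that re-queries the queue until no line for the cluster remains. Everything here is an assumption for illustration: watch_until_gone(), the injected query function, the max_polls guard, and the premise that queue lines begin with "<cluster>.<process>". A fake query stands in for condor_q so the example runs without a cluster.

```r
# Illustrative sketch only; names and the output format are assumptions.
watch_until_gone <- function(cluster_id, query, interval = 60, max_polls = Inf) {
  pattern <- sprintf("^%d\\.", as.integer(cluster_id))
  polls <- 0
  repeat {
    out <- query(cluster_id)
    polls <- polls + 1
    # The cluster is still queued if any line starts with "<cluster_id>.".
    still_queued <- any(grepl(pattern, out))
    if (!still_queued || polls >= max_polls) break
    Sys.sleep(interval)
  }
  invisible(out)
}

# Fake query: reports the job as queued twice, then as gone.
fake_query <- local({
  calls <- 0
  function(cluster_id) {
    calls <<- calls + 1
    if (calls < 3) paste0(cluster_id, ".0 netid R") else character(0)
  }
})

final_output <- watch_until_gone(12345, fake_query, interval = 0)
```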
htc_download() – copy result files back from the
submit node via scp. Supports single filenames, vectors of
filenames, and glob patterns ("*.tar.gz",
"job.*"). Glob patterns are single-quoted to prevent local
shell expansion. local_path defaults to ".".
Supports dry_run = TRUE.
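The quoting of glob patterns can be illustrated with base R's shQuote(), which wraps a string in single quotes so the local shell passes it through unexpanded and the remote side resolves it. The helper name, username, and server below are placeholders, and htc_download() may assemble its command differently.

```r
# Illustrative sketch only; helper name, username, and server are placeholders.
sketch_download_cmd <- function(patterns, username, server, local_path = ".") {
  # Single-quote each pattern so the local shell does not expand it.
  sources <- paste0(username, "@", server, ":", shQuote(patterns, type = "sh"))
  paste("scp", paste(sources, collapse = " "), local_path)
}

download_cmd <- sketch_download_cmd("*.tar.gz",
                                    username = "netid",
                                    server = "submit.example.edu")
```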
inst/extdata/htc-resources.yaml ships with the
package and provides default resource presets for
htc_gen_submit().
inst/extdata/hello-world.sub and
inst/extdata/hello-world.sh are included as test files for
end-to-end workflow verification.
inst/extdata/sample.R is included as a sample R script
for use in examples.
The test suite uses a three-layer strategy to handle the fact that
end-to-end testing requires a live HTCondor environment and SSH access.
Layer 1 covers argument validation. Layer 2 covers command construction
using dry_run = TRUE and mocked bindings. Layer 3
integration tests are opt-in via
Sys.setenv(CHTC_USERNAME = "your.netid") and never run on
CRAN or CI. 153 tests pass across seven test files.
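The opt-in gate for Layer 3 can be pictured as a check on the CHTC_USERNAME environment variable. The helper below is a hypothetical base R sketch; in a testthat file, its result could be passed to testthat::skip_if_not() at the top of each integration test.

```r
# Illustrative sketch only; integration_enabled() is a hypothetical helper.
integration_enabled <- function() {
  # Integration tests run only when the variable is set to a non-empty value.
  nzchar(Sys.getenv("CHTC_USERNAME"))
}

# Demonstrate both states, then restore the previous value.
old_value <- Sys.getenv("CHTC_USERNAME", unset = NA)
Sys.unsetenv("CHTC_USERNAME")
disabled_state <- integration_enabled()
Sys.setenv(CHTC_USERNAME = "your.netid")
enabled_state <- integration_enabled()
if (is.na(old_value)) Sys.unsetenv("CHTC_USERNAME") else Sys.setenv(CHTC_USERNAME = old_value)
```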