systemPipeR 2.10.0
Note: the most recent version of this tutorial can be found here.
Note: if you use systemPipeR in published research, please cite:
Backman, T.W.H and Girke, T. (2016). systemPipeR: NGS Workflow and Report Generation Environment. BMC Bioinformatics, 17: 388. 10.1186/s12859-016-1241-0.
The intended way of running systemPipeR workflows is via *.Rmd files, which
can be executed either line by line in interactive mode or with a single command from
R or the command line. This way, comprehensive and reproducible analysis reports
can be generated in PDF or HTML format in a fully automated manner by making use
of the reporting utilities available for R.
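As a minimal sketch of the "single command" route, the helper below wraps rmarkdown::render(). The helper name render_workflow_report and the example file name are illustrative assumptions, not part of systemPipeR, and the rmarkdown package is assumed to be installed.

```r
# Hypothetical helper (not part of systemPipeR): render a workflow *.Rmd file
# into a report with one call.
render_workflow_report <- function(rmd_file, output_format = "html_document") {
    # Fail early with a clear message if the template is missing
    if (!file.exists(rmd_file)) {
        stop("Rmd file not found: ", rmd_file)
    }
    if (!requireNamespace("rmarkdown", quietly = TRUE)) {
        stop("Package 'rmarkdown' is required to render the report.")
    }
    # rmarkdown::render() knits the document and converts it to the
    # requested output format
    rmarkdown::render(rmd_file, output_format = output_format)
}

# Example call (assumes the template is in the working directory):
# render_workflow_report("systemPipeRNAseq.Rmd")
```

From a shell, the same call can be issued non-interactively, for example with Rscript -e "rmarkdown::render('systemPipeRNAseq.Rmd')".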
Templates for setting up custom project reports are provided as *.Rmd files
by the helper package systemPipeRdata and in the vignettes subdirectory of
systemPipeR. HTML versions of these report templates are available for systemPipeRNAseq, systemPipeRIBOseq, systemPipeChIPseq and systemPipeVARseq. To work with *.Rmd files efficiently, basic knowledge of knitr and LaTeX or R Markdown v2 is required.
Figure 1: systemPipeR’s preconfigured directory structure
The working environment of the sample data loaded in the previous step contains the following pre-configured directory structure. Directory names are indicated in green. Users can change this structure as needed, but need to adjust the code in their workflows accordingly.
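Because workflow code refers to these directories by name, it can help to verify the layout before running anything. The sketch below uses base R only. The helper name check_workenvir is hypothetical, and the directory names data, param and results are assumptions about the default layout, so adjust them if the structure has been changed.

```r
# Hypothetical helper: report which of the expected project directories exist.
# The default names are assumptions about the standard layout.
check_workenvir <- function(path = ".",
                            expected = c("data", "param", "results")) {
    present <- dir.exists(file.path(path, expected))
    names(present) <- expected
    present
}

# Check the current working directory and list anything that is missing
status <- check_workenvir(".")
if (!all(status)) {
    message("Missing directories: ",
            paste(names(status)[!status], collapse = ", "))
}
```

The function returns a named logical vector, so the result can also be used programmatically, for example to stop a script before any step that writes into a missing directory.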
CWL param and input.yml files need to be in the same subdirectory.
The following parameter files are included in each workflow template:
targets.txt: initial one provided by user; downstream targets_*.txt files are generated automatically
*.param/cwl: defines parameter for input/output file operations, e.g.:
    hisat2-se/hisat2-mapping-se.cwl
    hisat2-se/hisat2-mapping-se.yml
*_run.sh: optional bash scripts
.batchtools.conf.R: defines the type of scheduler for batchtools pointing to template file of cluster, and located in user’s home directory
*.tmpl: specifies parameters of scheduler used by a system, e.g. Torque, SGE, Slurm, etc.

This workflow demonstrates how to use various utilities for building and running automated end-to-end analysis workflows for RNA-Seq data.
The full workflow can be found here: HTML, .Rmd, and .R.
Load the RNA-Seq sample workflow into your current working directory.
library(systemPipeRdata)
genWorkenvir(workflow = "rnaseq")
setwd("rnaseq")
This template provides some common steps for an RNA-Seq workflow. One can add, remove, or modify
workflow steps by operating on the sal object.
sal <- SPRproject()
sal <- importWF(sal, file_path = "systemPipeRNAseq.Rmd", verbose = FALSE)
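Once the template has been imported, the steps held in sal can be listed and extended. The following is a sketch under stated assumptions rather than a definitive recipe: it uses stepName(), appendStep<- and LineWise() from systemPipeR, the step name session_info and the output file results/sessionInfo.txt are made up for illustration, and the new step is made to depend on the last existing step. The guard keeps the snippet from running unless the package is installed and sal exists. Consult ?appendStep and ?LineWise for the exact arguments in your installed version.

```r
# Illustrative step name, not part of the template
new_step <- "session_info"

# Only run when systemPipeR is installed and `sal` was created with
# SPRproject() and importWF() beforehand
if (requireNamespace("systemPipeR", quietly = TRUE) && exists("sal")) {
    library(systemPipeR)

    # Names of the steps imported from the Rmd template
    print(stepName(sal))

    # Append an R code step that runs after the last existing step.
    # The output path assumes a results/ directory in the project.
    appendStep(sal) <- LineWise(
        code = {
            writeLines(capture.output(sessionInfo()),
                       file.path("results", "sessionInfo.txt"))
        },
        step_name = new_step,
        dependency = tail(stepName(sal), 1)
    )
}
```

Appending the step only registers it in the workflow; it is executed later together with the other steps when the workflow is run.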
Workflow includes following steps:
HISAT2 (or any other RNA-Seq aligner)
sal <- runWF(sal)
plotWF(sal)
sal <- renderReport(sal)
sal <- renderLogs(sal)
This workflow demonstrates how to use various utilities for building and running automated end-to-end analysis workflows for ChIP-Seq data.
The full workflow can be found here: HTML, .Rmd, and .R.
Load the ChIP-Seq sample workflow into your current working directory.
library(systemPipeRdata)
genWorkenvir(workflow = "chipseq")
setwd("chipseq")
Workflow includes following steps:
Bowtie2 or rsubread
MACS2
This template provides some common steps for a ChIP-Seq workflow. One can add, remove, or modify
workflow steps by operating on the sal object.
sal <- SPRproject()
sal <- importWF(sal, file_path = "systemPipeChIPseq.Rmd", verbose = FALSE)
sal <- runWF(sal)
plotWF(sal)
sal <- renderReport(sal)
sal <- renderLogs(sal)
This workflow demonstrates how to use various utilities for building and running automated end-to-end analysis workflows for VAR-Seq data.
The full workflow can be found here: HTML, .Rmd, and .R.
Load the VAR-Seq sample workflow into your current working directory.
library(systemPipeRdata)
genWorkenvir(workflow = "varseq")
setwd("varseq")
Workflow includes following steps:
gsnap, bwa
VariantTools, GATK, BCFtools
VariantTools and VariantAnnotation
VariantAnnotation
This template provides some common steps for a VAR-Seq workflow. One can add, remove, or modify
workflow steps by operating on the sal object.
sal <- SPRproject()
sal <- importWF(sal, file_path = "systemPipeVARseq.Rmd", verbose = FALSE)
sal <- runWF(sal)
plotWF(sal)
sal <- renderReport(sal)
sal <- renderLogs(sal)
This workflow demonstrates how to use various utilities for building and running automated end-to-end analysis workflows for RIBO-Seq data.
The full workflow can be found here: HTML, .Rmd, and .R.
Load the RIBO-Seq sample workflow into your current working directory.
library(systemPipeRdata)
genWorkenvir(workflow = "riboseq")
setwd("riboseq")
Workflow includes following steps:
HISAT2 (or any other RNA-Seq aligner)
This template provides some common steps for a RIBO-Seq workflow. One can add, remove, or modify
workflow steps by operating on the sal object.
sal <- SPRproject()
sal <- importWF(sal, file_path = "systemPipeRIBOseq.Rmd", verbose = FALSE)
sal <- runWF(sal)
plotWF(sal)
sal <- renderReport(sal)
sal <- renderLogs(sal)
sessionInfo()
## R version 4.4.0 beta (2024-04-15 r86425)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] magrittr_2.0.3 systemPipeR_2.10.0
## [3] ShortRead_1.62.0 GenomicAlignments_1.40.0
## [5] SummarizedExperiment_1.34.0 Biobase_2.64.0
## [7] MatrixGenerics_1.16.0 matrixStats_1.3.0
## [9] BiocParallel_1.38.0 Rsamtools_2.20.0
## [11] Biostrings_2.72.0 XVector_0.44.0
## [13] GenomicRanges_1.56.0 GenomeInfoDb_1.40.0
## [15] IRanges_2.38.0 S4Vectors_0.42.0
## [17] BiocGenerics_0.50.0 BiocStyle_2.32.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 viridisLite_0.4.2 dplyr_1.1.4
## [4] farver_2.1.1 bitops_1.0-7 fastmap_1.1.1
## [7] digest_0.6.35 lifecycle_1.0.4 pwalign_1.0.0
## [10] compiler_4.4.0 rlang_1.1.3 sass_0.4.9
## [13] tools_4.4.0 utf8_1.2.4 yaml_2.3.8
## [16] systemPipeRdata_2.7.0 knitr_1.46 S4Arrays_1.4.0
## [19] labeling_0.4.3 htmlwidgets_1.6.4 interp_1.1-6
## [22] DelayedArray_0.30.0 xml2_1.3.6 RColorBrewer_1.1-3
## [25] abind_1.4-5 withr_3.0.0 hwriter_1.3.2.1
## [28] grid_4.4.0 fansi_1.0.6 latticeExtra_0.6-30
## [31] colorspace_2.1-0 ggplot2_3.5.1 scales_1.3.0
## [34] tinytex_0.50 cli_3.6.2 rmarkdown_2.26
## [37] crayon_1.5.2 generics_0.1.3 remotes_2.5.0
## [40] rstudioapi_0.16.0 httr_1.4.7 cachem_1.0.8
## [43] stringr_1.5.1 zlibbioc_1.50.0 parallel_4.4.0
## [46] formatR_1.14 BiocManager_1.30.22 vctrs_0.6.5
## [49] Matrix_1.7-0 jsonlite_1.8.8 bookdown_0.39
## [52] systemfonts_1.0.6 magick_2.8.3 jpeg_0.1-10
## [55] crosstalk_1.2.1 jquerylib_0.1.4 glue_1.7.0
## [58] codetools_0.2-20 DT_0.33 stringi_1.8.3
## [61] gtable_0.3.5 deldir_2.0-4 UCSC.utils_1.0.0
## [64] munsell_0.5.1 tibble_3.2.1 pillar_1.9.0
## [67] htmltools_0.5.8.1 GenomeInfoDbData_1.2.12 R6_2.5.1
## [70] kableExtra_1.4.0 evaluate_0.23 lattice_0.22-6
## [73] highr_0.10 png_0.1-8 bslib_0.7.0
## [76] Rcpp_1.0.12 svglite_2.1.3 SparseArray_1.4.0
## [79] xfun_0.43 pkgconfig_2.0.3
This project is funded by NSF award ABI-1661152.