--- title: "FLAMES 2.3.1" author: "Changqing Wang, Oliver Voogd, Yupei You" package: FLAMES output: BiocStyle::html_document: toc: true vignette: > %\VignetteIndexEntry{FLAMES} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} bibliography: ref.bib --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(FLAMES) ``` # FLAMES The overhauled FLAMES 2.3.1 pipeline provides convenient pipelines for performing single-cell and bulk long-read analysis of mutations and isoforms. The pipeline is designed to take various type of experiments, e.g. with or without known cell barcodes and custome cell barcode designs, to reduce the need of constantly re-inventing the wheel for every new sequencing protocol. ![(#fig:workflow) FLAMES pipeline](FLAMESpipeline-01.png){width=800px} ## Creating a pipeline To start your long-read RNA-seq analysis, simply create a pipeline via either `BulkPipeline()`, `SingleCellPipeline()` or `MultiSampleSCPipeline()`. Let's try `SingleCellPipeline()` first: ```{r, eval=TRUE, echo=TRUE} outdir <- tempfile() dir.create(outdir) # some example data # known cell barcodes, e.g. from coupled short-read sequencing bc_allow <- file.path(outdir, "bc_allow.tsv") R.utils::gunzip( filename = system.file("extdata", "bc_allow.tsv.gz", package = "FLAMES"), destname = bc_allow, remove = FALSE ) # reference genome genome_fa <- file.path(outdir, "rps24.fa") R.utils::gunzip( filename = system.file("extdata", "rps24.fa.gz", package = "FLAMES"), destname = genome_fa, remove = FALSE ) pipeline <- SingleCellPipeline( # use the default configs config_file = create_config(outdir), outdir = outdir, # the input fastq file fastq = system.file("extdata", "fastq", "musc_rps24.fastq.gz", package = "FLAMES"), # reference annotation file annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"), genome_fa = genome_fa, barcodes_file = bc_allow ) pipeline ``` ## Running the pipeline To execute the pipeline, simply call `run_FLAMES(pipeline)`. This will run through all the steps in the pipeline, returning a updated pipeline object: ```{r, eval=TRUE, echo=TRUE} pipeline <- run_FLAMES(pipeline) pipeline ``` If you run into any error, `run_FLAMES()` will stop and return the pipeline object with the error message. After resolving the error, you can run `resume_FLAMES(pipeline)` to continue the pipeline from the last step. There is also `run_step(pipeline, step_name)` to run a specific step in the pipeline. Let's show this by deliberately causing an error via deleting the input files: ```{r, eval=TRUE, echo=TRUE} # set up a new pipeline outdir2 <- tempfile() pipeline2 <- SingleCellPipeline( config_file = create_config(outdir), outdir = outdir2, fastq = system.file("extdata", "fastq", "musc_rps24.fastq.gz", package = "FLAMES"), annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"), genome_fa = genome_fa, barcodes_file = bc_allow ) # delete the reference genome unlink(genome_fa) pipeline2 <- run_FLAMES(pipeline2) pipeline2 ``` Let's then fix the error by re-creating the reference genome file and resume the pipeline: ```{r, eval=TRUE, echo=TRUE} R.utils::gunzip( filename = system.file("extdata", "rps24.fa.gz", package = "FLAMES"), destname = genome_fa, remove = FALSE ) pipeline2 <- resume_FLAMES(pipeline2) pipeline2 ``` After completing the pipeline, a `SingleCellExperiment` object is created (or `SummarizedExperiment` for bulk pipeline and list of `SingleCellExperiment` for multi-sample pipeline). You can access the results via `experiment(pipeline)`: ```{r, eval=TRUE, echo=TRUE} experiment(pipeline) ``` ## Visualizations ### QC plots Quality metrics are collected throughout the pipeline, and FLAMES provide visiualization functions to plot the metrics. For the first demultiplexing step, we can use the `plot_demultiplex` function to see how well many reads are retained after demultiplexing: ```{r, eval=TRUE, echo=TRUE} # example_pipeline provides an example pipeline for each of the three types # of pipelines: BulkPipeline, SingleCellPipeline and MultiSampleSCPipeline mspipeline <- example_pipeline("MultiSampleSCPipeline") # don't have to run the entire pipeline for this # let's just run the demultiplexing step mspipeline <- run_step(mspipeline, "barcode_demultiplex") plot_demultiplex(mspipeline) ``` ### Work in progress More examples coming soon. ## FLAMES on Windows Due to FLAMES requiring minimap2 and pysam, FLAMES is currently unavaliable on Windows. ## Citation Please cite the flames's paper [@flames] if you use flames in your research. As FLAMES used incorporates BLAZE [@blaze], flexiplex [@flexiplex] and minimap2 [@minimap2], samtools, bambu [@bambu]. Please make sure to cite when using these tools. # Session Info ``` {r echo=FALSE} utils::sessionInfo() ``` # References