Bioconductor: Analysis and comprehension of high-throughput genomic data
Packages, vignettes, work flows
Package installation and use
A package needs to be installed once, using the instructions on the package landing page (e.g., DESeq2).
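# Since Bioconductor 3.8, BiocManager replaces the biocLite() script
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install(c("DESeq2", "org.Hs.eg.db"))
BiocManager::install() installs Bioconductor, CRAN, and GitHub packages.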
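Because installation is a one-time step, a script can first check which packages are already available. The following is a minimal sketch using base R's requireNamespace(); the helper name missing_packages is hypothetical and not part of Bioconductor.

```r
# Hypothetical helper: return the packages in 'pkgs' that cannot be loaded.
missing_packages <- function(pkgs) {
    # requireNamespace() tries to load each namespace and returns
    # TRUE or FALSE instead of raising an error
    available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!available]
}

# Packages from the installation example; an empty result means
# there is nothing left to install
missing_packages(c("DESeq2", "org.Hs.eg.db"))
```

Only the names returned by the helper need to be passed to the installation command.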
Once installed, the package can be loaded into an R session
library(GenomicRanges)
and the help system queried interactively, as outlined above:
help(package="GenomicRanges")
vignette(package="GenomicRanges")
vignette(package="GenomicRanges", "GenomicRangesHOWTOs")
?GRanges
Goals
What a few lines of R have to say
x <- rnorm(1000)
y <- x + rnorm(1000)
df <- data.frame(X=x, Y=y)
plot(Y ~ X, df)
fit <- lm(Y ~ X, df)
anova(fit)
## Analysis of Variance Table
## 
## Response: Y
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## X           1 1001.14 1001.14    1013 < 2.2e-16 ***
## Residuals 998  986.27    0.99                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
abline(fit)
Classes and methods – “S3”
data.frame(): creates an instance or object
plot(), lm(), anova(), abline(): methods defined on generics to transform instances
Discovery and help
class(fit)
methods(class=class(fit))
methods(plot)
?"plot"
?"plot.formula"
Tab completion!
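How S3 dispatch works can be seen by defining a toy generic and method. The following is a minimal sketch; the class "temperature" and the generic describe() are made up for illustration and do not come from any package.

```r
# Constructor: an S3 object is ordinary data with a 'class' attribute
temperature <- function(celsius) {
    structure(list(celsius = celsius), class = "temperature")
}

# Generic: UseMethod() dispatches on the class of the first argument
describe <- function(x, ...) UseMethod("describe")

# Method for class "temperature", named <generic>.<class>
describe.temperature <- function(x, ...) {
    sprintf("%.1f degrees Celsius", x$celsius)
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) "no description available"

t1 <- temperature(21.5)
class(t1)          # the class attribute set by the constructor
describe(t1)       # dispatches to describe.temperature()
describe(1:3)      # dispatches to describe.default()
methods(describe)  # lists the methods defined on the generic
```

The same discovery tools used for fit, class() and methods(), apply to these toy definitions.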
Bioconductor classes and methods – “S4”
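S4 differs from S3 in that classes, generics and methods are declared formally. The following is a minimal sketch using only the methods package that ships with R; the class "Sequence" and the generic gcContent() are hypothetical and are not the Biostrings classes or functions.

```r
library(methods)  # ships with R

# Class definition: named slots with declared types
setClass("Sequence", slots = c(bases = "character"))

# Generic: standardGeneric() dispatches on the formal class of 'x'
setGeneric("gcContent", function(x, ...) standardGeneric("gcContent"))

# Method: fraction of letters that are G or C
setMethod("gcContent", "Sequence", function(x, ...) {
    chars <- strsplit(x@bases, "")[[1]]
    mean(chars %in% c("G", "C"))
})

s <- new("Sequence", bases = "GGCGCCT")
is(s)                     # formal class of the instance
gcContent(s)              # dispatches to the "Sequence" method
showMethods("gcContent")  # lists the methods defined on the generic
```

Slots are accessed with @ inside methods; users of a class normally work through its generics rather than touching slots directly.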
Example: working with DNA sequences
library(Biostrings)
dna <- DNAStringSet(c("AACAT", "GGCGCCT"))
reverseComplement(dna)
##   A DNAStringSet instance of length 2
##     width seq
## [1]     5 ATGTT
## [2]     7 AGGCGCC
data(phiX174Phage)
phiX174Phage
##   A DNAStringSet instance of length 6
##     width seq                                                                   names               
## [1]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA Genbank
## [2]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA RF70s
## [3]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA SS78
## [4]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA Bull
## [5]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA G97
## [6]  5386 GAGTTTTATCGCTTCCATGACGCAGAAGTTAAC...TTCGATAAAAATGATTGGCGTATCCAACCTGCA NEB03
letterFrequency(phiX174Phage, "GC", as.prob=TRUE)
##            G|C
## [1,] 0.4476420
## [2,] 0.4472707
## [3,] 0.4472707
## [4,] 0.4470850
## [5,] 0.4472707
## [6,] 0.4470850
Discovery and help
class(dna)
?"DNAStringSet-class"
?"reverseComplement,DNAStringSet-method"
Experimental design
Wet-lab sequence preparation (figure from http://rnaseq.uoregon.edu/)
(Illumina) Sequencing (Bentley et al., 2008, doi:10.1038/nature07517)