Why bother?
GRanges for describing regions of interest
in sequencing experimentsMotivation
'Is a' versus 'has a'
Starting our class definition
matrix() with row- and column data.frame() annotationsConstruct with setClass; returns a simple generating function.
.AnnMat <- setClass("AnnMat",
  representation(matData="matrix", rowData="data.frame",
      colData="data.frame"))
contains= argument;
multiple in inheritance possible.am0 <- .AnnMat()
am1 <- .AnnMat(
    matData=matrix(1:10, 2, dimnames=list(letters[1:2], LETTERS[1:5])),
    rowData=data.frame(roi_id=1:2),
    colData=data.frame(
        sample_id=1:5,
        treatment=c("A", "A", "B", "B", "B")))
A simple method: accessors (“getters”)
Definition of generics
setGeneric("rowData", function(x, ...) standardGeneric("rowData"))
## [1] "rowData"
setGeneric("colData", function(x, ...) standardGeneric("colData"))
## [1] "colData"
setGeneric("matData", function(x, ...) standardGeneric("matData"))
## [1] "matData"
Then methods implemented of class-specific methods on the generics
setMethod("rowData", "AnnMat", function(x, ...) x@rowData)
## [1] "rowData"
setMethod("colData", "AnnMat", function(x, ...) x@colData)
## [1] "colData"
setMethod("matData", "AnnMat", function(x, ...) x@matData)
## [1] "matData"
A simple method: dim and dimnames
Discover existing generic, for signature
getGeneric("dim")
## standardGeneric for "dim" defined from package "base"
## 
## function (x) 
## standardGeneric("dim", .Primitive("dim"))
## <bytecode: 0x103cb4558>
## <environment: 0x103c9ced8>
## Methods may be defined for arguments: x
## Use  showMethods("dim")  for currently available ones.
setMethod("dim", "AnnMat", function(x) dim(matData(x)))
## [1] "dim"
setMethod("dimnames", "AnnMat", function(x) dimnames(matData(x)))
## [1] "dimnames"
nrow() and ncol() (and rownames() and
colnames()) for free! Why is that?dim(am1)
## [1] 2 5
nrow(am1)
## [1] 2
A simple method: show
getGeneric("show")showMethods("show", where=search())Our implementation: brief summary, with an eye toward re-use by derived classes. Avoid direct slot access
setMethod("show", "AnnMat", function(object)
{
    cat("class:", class(object), "\n")
    cat("dim:", dim(object), "\n")
    cat("rowData names():", names(rowData(object)), "\n")
    cat("colData names():", names(colData(object)), "\n")
})
## [1] "show"
am1
## class: AnnMat 
## dim: 2 5 
## rowData names(): roi_id 
## colData names(): sample_id treatment
Question Is there a better overall philosophy for show?
A more complicated method: updating ('replacement', 'setter') methods
Provide the illusion and simple syntax for in-place modification
A familiar example: update the value of a column in a data.frame
df <- data.frame(x=1:5, y=5:1)
df[,"x"] <- log(df$x)
df[,"x"] as
[.data.frame, “the subset method for data.frame”, and df[,"x"]
<- value as
[<-.data.frame “the subset-replace method for data.frame”.[<-.data.framehead(get("[<-.data.frame"))
##                                                        
## 1 function (x, i, j, value)                            
## 2 {                                                    
## 3     if (!all(names(sys.call()) %in% c("", "value"))) 
## 4         warning("named arguments are discouraged")   
## 5     nA <- nargs()                                    
## 6     if (nA == 4L) {
R's parser translates df[,"x"] <- value to
[<-.data.frame(x, , “x”, value) and actually modifies (a copied, 
if necessary) first argument.
Replacement methods, e.g., matData<-, signature takes the object
to be updated, additional optional arguments, and the value to
update the argument with
setGeneric("matData<-", function(x, ..., value)
    standardGeneric("matData<-"))
## [1] "matData<-"
Dispatch on one or both of x, value
Impelement as a method that dispatches on both the object and value, updates the slot, and returns the updated object.
setReplaceMethod("matData", c("AnnMat", "matrix"),
    function(x, ..., value)
{
    x@matData <- value
    x
})
## [1] "matData<-"
Exercise: walk through how that assignment in the body works.
Another replacement method, for dimnames<- (the generic already
exists; what is it?)
setReplaceMethod("dimnames", c("AnnMat", "list"), 
    function(x, value)
{
    dimnames(matData(x)) <- value
    value
})
## [1] "dimnames<-"
Exercise: walk through how that assignment in the body of the method works
Hey neat, we get rownames<- and colnames<- for free!
A more complicated operation: validity
Constraints on row, column and matrix dimensions: all must be equal
validity argument to setClass, or setValidity() function call. 
Validity function is weird
Evaluated frequently, so needs to be efficient / light-weight
setValidity("AnnMat", function(object) {
     msg <- NULL
     if (nrow(rowData(object)) != nrow(matData(object)))
         msg <- c(msg, 
             "number of rowData rows and matData rows differ")
     if (nrow(colData(object)) != ncol(matData(object)))
         msg <- c(msg,
             "number of colData rows and matData columns differ")
     if (is.null(msg)) TRUE else msg
})
## Class "AnnMat" [in ".GlobalEnv"]
## 
## Slots:
##                                        
## Name:     matData    rowData    colData
## Class:     matrix data.frame data.frame
In action:
.AnnMat(matData=matrix(1:10, 2), 
     rowData=data.frame(roi_id=1:2),
     colData=data.frame(sample_id=1:5))
## class: AnnMat 
## dim: 2 5 
## rowData names(): roi_id 
## colData names(): sample_id
cat(try({
     .AnnMat(matData=matrix(1:10, 2), 
         rowData=data.frame(roi_id=1:5),
         colData=data.frame(sample_id=1:2))
}))
## Error in validObject(.Object) : 
##   invalid class "AnnMat" object: 1: number of rowData rows and matData rows differ
## invalid class "AnnMat" object: 2: number of colData rows and matData columns differ
A more complicated method: subsetting
Why do we need this? Part of the informal matrix 'API' expected by a user
Discovery
getGeneric("[")
## standardGeneric for "[" defined from package "base"
## 
## function (x, i, j, ..., drop = TRUE) 
## standardGeneric("[", .Primitive("["))
## <bytecode: 0x102bac148>
## <environment: 0x102c2b190>
## Methods may be defined for arguments: x, i, j, drop
## Use  showMethods("[")  for currently available ones.
Possible methods multiply – x times i times j; e.g., i could be 
integer, logical, character, …
One approach – facade of methods that do minimal work to translate into a common base function
Special variable classes: ANY, missing
Exploit default initialize function, which acts as a copy constructor 
that updates slots in its first argument with values provided by named
arguments.
Pass '…' to allow derived classes to use this method
setMethod("[", c("AnnMat", "ANY", "ANY"),
    function(x, i, j,  ..., drop=TRUE)
{
    ## FIXME: warn user about ignoring 'drop'?
   initialize(x, matData=matData(x)[i, j, drop=FALSE],
        rowData=rowData(x)[i,,drop=FALSE],
        colData=colData(x)[,j,drop=FALSE], ...)
})
## [1] "["
setMethod("[", c("AnnMat", "ANY", "missing"),
    function(x, i, j, ..., drop=TRUE)
{
    initialize(x, matData=matData(x)[i,,drop=FALSE],
        rowData=rowData(x)[i,,drop=FALSE])
})
## [1] "["
setMethod("[", c("AnnMat", "missing", "ANY"),
    function(x, i, j, ..., drop=TRUE)
{
    initialize(x, matData=matData(x)[,j,drop=FALSE],
        colData=colData(x)[j,,drop=FALSE])
})
## [1] "["
setMethod("[", c("AnnMat", "missing", "missing"),
    function(x, i, j, ..., drop=TRUE)
{
    initialize(x, ...)
})
## [1] "["
Exercise: create some simple unit tests for these methods.
Seems 'good enough' for numeric or logical indexes, what about character?
What else have we agreed to in the matrix API?