\section{Convert data from plate order to SGI-array}
Convert data from the primary plate readout to a multi-dimensional SGI-array.

\subsection{Preliminaries}
Load the HD2013SGI library.
<<>>=
library(HD2013SGI)
@

Load the screening data.
<<>>=
data("featuresPerWell", package="HD2013SGI")
@

Create the output directories.
<<>>=
dir.create(file.path("result","data"),recursive=TRUE,showWarnings=FALSE)
@

\subsection{Parse plate barcodes to annotate the plates}
<<>>=
NROW = 15
NCOL = 23
NFIELD = 4
@
Plate barcodes and plate numbers are extracted from the annotation. The
plates have \Sexpr{NROW} rows and \Sexpr{NCOL} columns. \Sexpr{NFIELD}
fields are imaged per well. The plate annotation is summarized in a data
frame.
<<>>=
plates = featuresPerWell$Anno[seq(1,nrow(featuresPerWell$Anno),
                                  by=NFIELD*NCOL*NROW),"plate"]
PlateAnnotation = HD2013SGI:::parsePlateBarcodes(plates)
head(PlateAnnotation)
@

The names of all target designs, query genes, query designs, and
replicates are extracted from the sample plates.
<<>>=
S = which(PlateAnnotation$queryGroup=="sample")
tdnames = unique(PlateAnnotation$targetDesign[S])
qnames = unique(PlateAnnotation$queryGene[S])
qdnames = unique(PlateAnnotation$queryDesign[S])
repnames = unique(PlateAnnotation$replicate[S])
@

\subsection{Reorder data}
The data is reordered and saved as an array. The mean of the measurements
in the four fields per well is taken.
<<>>=
# Allocate the full array: field x col x row x feature x
# target design x query gene x query design x replicate
D = array(0.0, dim=c(field=NFIELD,col=NCOL,row=NROW,
                     features=dim(featuresPerWell$data)[2],
                     targetDesign=length(tdnames),
                     query=length(qnames),queryDesign=length(qdnames),
                     replicate=length(repnames)))
dimnames(D) = list(field=seq_len(NFIELD),
                   col=seq_len(NCOL),row=LETTERS[seq_len(NROW)+1],
                   features=dimnames(featuresPerWell$data)[[2]],
                   targetDesign=tdnames,
                   queryGene=qnames,queryDesign=qdnames,replicate=repnames)

# Copy the measurements of each plate into its slice of D
z=0
for (td in tdnames) {
  for (q in qnames) {
    for (qd in qdnames) {
      for (r in repnames) {
        plate = PlateAnnotation$plate[
          which((PlateAnnotation$targetDesign == td) &
                (PlateAnnotation$queryGene == q) &
                (PlateAnnotation$queryDesign == qd) &
                (PlateAnnotation$replicate == r) ) ]
        z=z+1
        I = which(featuresPerWell$Anno$plate == plate)
        D[,,,,td,q,qd,r] = as.vector(featuresPerWell$data[I,])
      }
    }
  }
}

# Missing values are set to zero before averaging
D[is.na(D)] = 0.0
# Average the four fields of each well
D = (D[1,,,,,,,] + D[2,,,,,,,] + D[3,,,,,,,] + D[4,,,,,,,])/4
# D = apply(D,2:8,mean,na.rm=TRUE)
# Move the feature dimension behind the query design dimension
D = aperm(D,c(1,2,4,5,6,3,7))
# Merge col and row into a single target gene (well) dimension
dn = dimnames(D)
dim(D) = c(prod(dim(D)[1:2]),dim(D)[3:7])
dimnames(D) = c(list(targetGene =
    sprintf("%s%d",rep(LETTERS[seq_len(NROW)+1],each=NCOL),
            rep(seq_len(NCOL),times=NROW))),
  dn[3:7])

datamatrixfull = list(D = D)
save(datamatrixfull, file=file.path("result","data","datamatrixfull.rda"))
@

The raw data is represented as a 6-dimensional array with dimensions\\
\begin{center}
\begin{tabular}{lrrl}
 & \Sexpr{dim(datamatrixfull$D)[1]} & target genes \\
$\times$ & \Sexpr{dim(datamatrixfull$D)[2]} & siRNA target designs \\
$\times$ & \Sexpr{dim(datamatrixfull$D)[3]} & query genes \\
$\times$ & \Sexpr{dim(datamatrixfull$D)[4]} & siRNA query designs \\
$\times$ & \Sexpr{dim(datamatrixfull$D)[5]} & phenotypic features\\
$\times$ & \Sexpr{dim(datamatrixfull$D)[6]} & biological replicates
\end{tabular}
\end{center}
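
The reordering steps (averaging over fields, permuting dimensions with
\texttt{aperm}, and merging the column and row dimensions into one well
dimension) may be easier to follow on a small example. The chunk below is
a minimal sketch on a toy array. The sizes and all names starting with
\texttt{toy} or \texttt{n} are assumptions made for illustration and are
not part of the screening data.
<<>>=
# Hypothetical toy sizes, much smaller than the real screen
nField <- 4; nCol <- 3; nRow <- 2; nFeat <- 2; nDesign <- 2
toy <- array(as.numeric(seq_len(nField * nCol * nRow * nFeat * nDesign)),
             dim = c(nField, nCol, nRow, nFeat, nDesign))

# Step 1: average the fields (first dimension) of every well
toyMean <- (toy[1,,,,] + toy[2,,,,] + toy[3,,,,] + toy[4,,,,]) / 4
stopifnot(isTRUE(all.equal(as.vector(toyMean),
                           as.vector(apply(toy, 2:5, mean)))))

# Step 2: move the feature dimension behind the design dimension
toyPerm <- aperm(toyMean, c(1, 2, 4, 3))   # col x row x design x feature

# Step 3: merge col and row into one well dimension (col varies fastest)
toyWells <- toyPerm
dim(toyWells) <- c(nCol * nRow, nDesign, nFeat)
dimnames(toyWells) <- list(
  well = sprintf("%s%d", rep(LETTERS[seq_len(nRow) + 1], each = nCol),
                 rep(seq_len(nCol), times = nRow)),
  design = NULL, feature = NULL)

# Well "C2" is column 2 of the second row; check design 1, feature 2
stopifnot(toyWells["C2", 1, 2] == toyMean[2, 2, 2, 1])
@
Because R stores arrays in column-major order, resetting \texttt{dim}
merges the two leading dimensions without moving any data, so the well
labels have to be generated with the column index varying fastest.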