ZarrArray 0.99.3
ZarrArray is an infrastructure package that leverages the Rarr package to bring Zarr datasets into R as DelayedArray objects.
Like any other Bioconductor package, ZarrArray should
always be installed with BiocManager::install():
if (!require("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("ZarrArray")
Load the package:
library(ZarrArray)
The main class in the package is the ZarrArray class. A ZarrArray object is an array-like object that represents a Zarr dataset in R.
To create a ZarrArray object, simply call the ZarrArray() constructor
function on the path to a Zarr dataset:
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)
A
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 1 0 0 0 . 0 0 0 0
## [30,] 1 0 0 0 . 0 0 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 0 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 0 0 0 0 . 0 0 0 0
## [30,] 0 0 0 0 . 0 0 0 0
Note that ZarrArray objects are DelayedArray derivatives and therefore support all operations (delayed or block-processed) supported by DelayedArray objects:
class(A)
## [1] "ZarrArray"
## attr(,"package")
## [1] "ZarrArray"
is(A, "DelayedArray")
## [1] TRUE
This allows ZarrArray objects to “look and feel” like ordinary arrays or
matrices in R by mimicking their behavior. In particular, ZarrArray objects
support most of the “standard array API” defined in base R, such as dim(),
length(), dimnames(), [, aperm(), max(), sum(), arithmetic and
comparison operations, math functions, etc.
dim(A)
## [1] 30 20 10
length(A)
## [1] 6000
A[1:5, , 1]
## <5 x 20> DelayedMatrix object of type "integer":
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## [3,] 1 0 0 0 . 0 0 0 0
## [4,] 1 0 0 0 . 0 0 0 0
## [5,] 1 0 0 0 . 0 0 0 0
aperm(A)
## <10 x 20 x 30> DelayedArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [9,] 0 0 0 0 . 0 0 0 0
## [10,] 0 0 0 0 . 0 0 0 0
##
## ...
##
## ,,30
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [9,] 0 0 0 0 . 0 0 0 0
## [10,] 0 0 0 0 . 0 0 0 0
max(A)
## [1] 20
sum(A)
## [1] 239
A - 0.5
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0.5 1.5 2.5 . 18.5 19.5
## [2,] 0.5 -0.5 -0.5 . -0.5 -0.5
## ... . . . . . .
## [29,] 0.5 -0.5 -0.5 . -0.5 -0.5
## [30,] 0.5 -0.5 -0.5 . -0.5 -0.5
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] -0.5 -0.5 -0.5 . -0.5 -0.5
## [2,] -0.5 -0.5 -0.5 . -0.5 -0.5
## ... . . . . . .
## [29,] -0.5 -0.5 -0.5 . -0.5 -0.5
## [30,] -0.5 -0.5 -0.5 . -0.5 -0.5
A^3
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 1 8 27 . 6859 8000
## [2,] 1 0 0 . 0 0
## ... . . . . . .
## [29,] 1 0 0 . 0 0
## [30,] 1 0 0 . 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0 0 0 . 0 0
## [2,] 0 0 0 . 0 0
## ... . . . . . .
## [29,] 0 0 0 . 0 0
## [30,] 0 0 0 . 0 0
A == 0L
## <30 x 20 x 10> DelayedArray object of type "logical":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] FALSE FALSE FALSE . FALSE FALSE
## [2,] FALSE TRUE TRUE . TRUE TRUE
## ... . . . . . .
## [29,] FALSE TRUE TRUE . TRUE TRUE
## [30,] FALSE TRUE TRUE . TRUE TRUE
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] TRUE TRUE TRUE . TRUE TRUE
## [2,] TRUE TRUE TRUE . TRUE TRUE
## ... . . . . . .
## [29,] TRUE TRUE TRUE . TRUE TRUE
## [30,] TRUE TRUE TRUE . TRUE TRUE
sqrt(A)
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 1.000000 1.414214 1.732051 . 4.358899 4.472136
## [2,] 1.000000 0.000000 0.000000 . 0.000000 0.000000
## ... . . . . . .
## [29,] 1 0 0 . 0 0
## [30,] 1 0 0 . 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] ... [,19] [,20]
## [1,] 0 0 0 . 0 0
## [2,] 0 0 0 . 0 0
## ... . . . . . .
## [29,] 0 0 0 . 0 0
## [30,] 0 0 0 . 0 0
In the 2D case, they also support the “standard matrix API” defined
in base R like nrow(), ncol(), rownames(), colnames(), t(),
rbind(), cbind(), rowSums(), colSums(), %*%, etc…, as well
as some row/column summarization operations from the matrixStats
package like rowMaxs(), colVars(), etc…
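The 2D operations listed above can be tried on a single slice of A. The following is a minimal sketch, assuming the same example dataset from the Rarr package used earlier; the variable names (M, m, rs, cs, row_max) are only illustrative, and the in-memory copy m is created purely to cross-check the results.

```r
library(ZarrArray)

# Same example dataset as above, shipped with the Rarr package.
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

# Dropping the 3rd dimension yields a 2D object (a DelayedMatrix).
M <- A[ , , 1]
nrow(M)
ncol(M)

# Row/column summaries return ordinary in-memory vectors.
rs <- rowSums(M)
cs <- colSums(M)

# t() is delayed, so row sums of the transpose should match the column sums.
all.equal(rowSums(t(M)), cs)

# A matrixStats-style summary (generic provided via MatrixGenerics).
row_max <- rowMaxs(M)

# Cross-check against an ordinary in-memory matrix (illustration only).
m <- as.matrix(M)
all.equal(rs, rowSums(m))
```

Because the summaries come back as plain vectors, they can be passed directly to any base R function without further conversion.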
Some other supported operations are specific to DelayedArray objects and their derivatives:
path(A)
## [1] "/home/biocbuild/bbs-3.23-bioc/R/site-library/Rarr/extdata/zarr_examples/column-first/int32.zarr/"
type(A)
## [1] "integer"
chunkdim(A)
## [1] 10 10 5
a <- as.array(A)
See ?ZarrArray for more information.
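To make the distinction between delayed operations and realization more concrete, here is a small sketch, assuming the same example dataset as above. It uses showtree() from the DelayedArray package to display the pending operations, and as.array() to realize the result in memory; the names B, b, and a are only illustrative.

```r
library(ZarrArray)

# Same example dataset as above, shipped with the Rarr package.
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

# Subsetting, arithmetic, and sqrt() are recorded here, not executed.
B <- sqrt(A[1:5, , 1] + 1L)
class(B)

# Display the pending (delayed) operations stacked on the Zarr seed.
showtree(B)

# as.array() realizes the delayed operations and returns an ordinary matrix.
b <- as.array(B)

# The same computation on a fully in-memory copy should agree.
a <- as.array(A)
all.equal(b, sqrt(a[1:5, , 1] + 1L))
```

Loading the whole dataset with as.array(A) is reasonable here only because the example array is small; for large datasets, the delayed form is usually the one to keep around.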
The writeZarrArray() function can be used to write an array-like
object to disk in Zarr format.
For example, we can write A back to disk, but with a different physical chunk geometry:
path1 <- tempfile(fileext=".zarr")
writeZarrArray(A, path1, chunkdim=c(3, 5, 2))
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 1 2 3 4 . 17 18 19 20
## [2,] 1 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 1 0 0 0 . 0 0 0 0
## [30,] 1 0 0 0 . 0 0 0 0
##
## ...
##
## ,,10
## [,1] [,2] [,3] [,4] ... [,17] [,18] [,19] [,20]
## [1,] 0 0 0 0 . 0 0 0 0
## [2,] 0 0 0 0 . 0 0 0 0
## ... . . . . . . . . .
## [29,] 0 0 0 0 . 0 0 0 0
## [30,] 0 0 0 0 . 0 0 0 0
Or, we can transform A and then write it back to disk:
path2 <- tempfile(fileext=".zarr")
A2 <- sqrt(t(A[ , , 1]) + 1) # all these operations are delayed
writeZarrArray(A2, path2) # realizes the delayed operations block by block
## <20 x 30> ZarrMatrix object of type "double":
## [,1] [,2] [,3] ... [,29] [,30]
## [1,] 1.414214 1.414214 1.414214 . 1.414214 1.414214
## [2,] 1.732051 1.000000 1.000000 . 1.000000 1.000000
## [3,] 2.000000 1.000000 1.000000 . 1.000000 1.000000
## [4,] 2.236068 1.000000 1.000000 . 1.000000 1.000000
## [5,] 2.449490 1.000000 1.000000 . 1.000000 1.000000
## ... . . . . . .
## [16,] 4.123106 1.000000 1.000000 . 1 1
## [17,] 4.242641 1.000000 1.000000 . 1 1
## [18,] 4.358899 1.000000 1.000000 . 1 1
## [19,] 4.472136 1.000000 1.000000 . 1 1
## [20,] 4.582576 1.000000 1.000000 . 1 1
Note that writeZarrArray() leverages lower-level functionality implemented
in the Rarr package like create_empty_zarr_array()
and update_zarr_array(). See ?writeZarrArray for more information.
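As a quick sanity check of the write step, the sketch below writes A with a custom chunk geometry, reopens the new dataset with ZarrArray(), and compares its content with the original. It assumes the same example dataset as above; the names B and B2 are illustrative, and chunkdim(B) is expected (not guaranteed here) to report the geometry that was requested.

```r
library(ZarrArray)

# Same example dataset as above, shipped with the Rarr package.
zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

# Write A to a temporary location with a different chunk geometry.
path1 <- tempfile(fileext=".zarr")
B <- writeZarrArray(A, path1, chunkdim=c(3, 5, 2))

# Inspect the chunk geometry of the newly written dataset.
chunkdim(B)

# Reopen the dataset from disk and check that the data round-trips.
B2 <- ZarrArray(path1)
all.equal(as.array(B2), as.array(A))
```

Comparing via as.array() loads both datasets in memory, which is fine for this small example but should be avoided for large arrays.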
sessionInfo()
## R version 4.6.0 alpha (2026-04-05 r89794)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ZarrArray_0.99.3 DelayedArray_0.37.1 SparseArray_1.11.13
## [4] S4Arrays_1.11.1 IRanges_2.45.0 abind_1.4-8
## [7] S4Vectors_0.49.1 MatrixGenerics_1.23.0 matrixStats_1.5.0
## [10] BiocGenerics_0.57.0 generics_0.1.4 Matrix_1.7-5
## [13] BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] jsonlite_2.0.0 crayon_1.5.3 compiler_4.6.0
## [4] BiocManager_1.30.27 Rcpp_1.1.1 jquerylib_0.1.4
## [7] yaml_2.3.12 fastmap_1.2.0 lattice_0.22-9
## [10] R6_2.6.1 XVector_0.51.0 curl_7.0.0
## [13] httr2_1.2.2 knitr_1.51 paws.storage_0.9.0
## [16] bookdown_0.46 paws.common_0.8.9 bslib_0.10.0
## [19] R.utils_2.13.0 rlang_1.2.0 cachem_1.1.0
## [22] xfun_0.57 sass_0.4.10 otel_0.2.0
## [25] cli_3.6.5 magrittr_2.0.5 digest_0.6.39
## [28] grid_4.6.0 rappdirs_0.3.4 lifecycle_1.0.5
## [31] R.methodsS3_1.8.2 R.oo_1.27.1 glue_1.8.0
## [34] evaluate_1.0.5 Rarr_1.11.37 rmarkdown_2.31
## [37] tools_4.6.0 htmltools_0.5.9