1 Introduction

ZarrArray is an infrastructure package that leverages the Rarr package to bring Zarr datasets into R as DelayedArray objects.

2 Install and load the package

Like any other Bioconductor package, ZarrArray should always be installed with BiocManager::install():

if (!require("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("ZarrArray")

Load the package:

library(ZarrArray)

3 ZarrArray objects

The main class in the package is the ZarrArray class. A ZarrArray object is an array-like object that represents a Zarr dataset in R.

3.1 Construction

To create a ZarrArray object, simply call the ZarrArray() constructor function on the path to a Zarr dataset:

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)
A
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     1     2     3     4   .    17    18    19    20
##  [2,]     1     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
## [29,]     1     0     0     0   .     0     0     0     0
## [30,]     1     0     0     0   .     0     0     0     0
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     0     0     0     0   .     0     0     0     0
##  [2,]     0     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
## [29,]     0     0     0     0   .     0     0     0     0
## [30,]     0     0     0     0   .     0     0     0     0

3.2 Array and matrix operations

Note that ZarrArray objects are DelayedArray derivatives and therefore support all operations (delayed or block-processed) supported by DelayedArray objects:

class(A)
## [1] "ZarrArray"
## attr(,"package")
## [1] "ZarrArray"
is(A, "DelayedArray")
## [1] TRUE

This allows ZarrArray objects to “look and feel” like ordinary arrays or matrices in R. In particular, ZarrArray objects support most of the “standard array API” defined in base R, such as dim(), length(), dimnames(), [, aperm(), max(), sum(), arithmetic and comparison operations, and math functions:

dim(A)
## [1] 30 20 10
length(A)
## [1] 6000
A[1:5, , 1]
## <5 x 20> DelayedMatrix object of type "integer":
##       [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
## [1,]     1     2     3     4   .    17    18    19    20
## [2,]     1     0     0     0   .     0     0     0     0
## [3,]     1     0     0     0   .     0     0     0     0
## [4,]     1     0     0     0   .     0     0     0     0
## [5,]     1     0     0     0   .     0     0     0     0
aperm(A)
## <10 x 20 x 30> DelayedArray object of type "integer":
## ,,1
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     1     2     3     4   .    17    18    19    20
##  [2,]     0     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
##  [9,]     0     0     0     0   .     0     0     0     0
## [10,]     0     0     0     0   .     0     0     0     0
## 
## ...
## 
## ,,30
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     1     0     0     0   .     0     0     0     0
##  [2,]     0     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
##  [9,]     0     0     0     0   .     0     0     0     0
## [10,]     0     0     0     0   .     0     0     0     0
max(A)
## [1] 20
sum(A)
## [1] 239
A - 0.5
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]   0.5   1.5   2.5   .  18.5  19.5
##  [2,]   0.5  -0.5  -0.5   .  -0.5  -0.5
##   ...     .     .     .   .     .     .
## [29,]   0.5  -0.5  -0.5   .  -0.5  -0.5
## [30,]   0.5  -0.5  -0.5   .  -0.5  -0.5
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]  -0.5  -0.5  -0.5   .  -0.5  -0.5
##  [2,]  -0.5  -0.5  -0.5   .  -0.5  -0.5
##   ...     .     .     .   .     .     .
## [29,]  -0.5  -0.5  -0.5   .  -0.5  -0.5
## [30,]  -0.5  -0.5  -0.5   .  -0.5  -0.5
A^3
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]     1     8    27   .  6859  8000
##  [2,]     1     0     0   .     0     0
##   ...     .     .     .   .     .     .
## [29,]     1     0     0   .     0     0
## [30,]     1     0     0   .     0     0
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]     0     0     0   .     0     0
##  [2,]     0     0     0   .     0     0
##   ...     .     .     .   .     .     .
## [29,]     0     0     0   .     0     0
## [30,]     0     0     0   .     0     0
A == 0L
## <30 x 20 x 10> DelayedArray object of type "logical":
## ,,1
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,] FALSE FALSE FALSE   . FALSE FALSE
##  [2,] FALSE  TRUE  TRUE   .  TRUE  TRUE
##   ...     .     .     .   .     .     .
## [29,] FALSE  TRUE  TRUE   .  TRUE  TRUE
## [30,] FALSE  TRUE  TRUE   .  TRUE  TRUE
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]  TRUE  TRUE  TRUE   .  TRUE  TRUE
##  [2,]  TRUE  TRUE  TRUE   .  TRUE  TRUE
##   ...     .     .     .   .     .     .
## [29,]  TRUE  TRUE  TRUE   .  TRUE  TRUE
## [30,]  TRUE  TRUE  TRUE   .  TRUE  TRUE
sqrt(A)
## <30 x 20 x 10> DelayedArray object of type "double":
## ,,1
##           [,1]     [,2]     [,3] ...    [,19]    [,20]
##  [1,] 1.000000 1.414214 1.732051   . 4.358899 4.472136
##  [2,] 1.000000 0.000000 0.000000   . 0.000000 0.000000
##   ...        .        .        .   .        .        .
## [29,]        1        0        0   .        0        0
## [30,]        1        0        0   .        0        0
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3] ... [,19] [,20]
##  [1,]     0     0     0   .     0     0
##  [2,]     0     0     0   .     0     0
##   ...     .     .     .   .     .     .
## [29,]     0     0     0   .     0     0
## [30,]     0     0     0   .     0     0

In the 2D case, ZarrArray objects also support the “standard matrix API” defined in base R, such as nrow(), ncol(), rownames(), colnames(), t(), rbind(), cbind(), rowSums(), colSums(), and %*%, as well as some row/column summarization operations from the matrixStats package, such as rowMaxs() and colVars().
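To make the notion of a delayed operation more concrete, here is a minimal sketch that uses the DelayedArray package directly on an ordinary matrix. The matrix m is a hypothetical in-memory stand-in, loosely modelled on the first slice printed above, not data read from a Zarr dataset. The sketch assumes that DelayedArray is installed.

```r
library(DelayedArray)

# Hypothetical in-memory stand-in for one 2D slice of a Zarr dataset.
m <- matrix(0L, nrow = 30, ncol = 20)
m[1, ] <- 1:20
m[, 1] <- 1L

M <- DelayedArray(m)     # wrap the matrix as a DelayedMatrix object
M2 <- sqrt(t(M) + 1)     # delayed: the operations are recorded, not evaluated
class(M2)
showtree(M2)             # display the stack of pending operations

# Realization: the recorded operations are applied only at this point.
m2 <- as.matrix(M2)
all.equal(m2, sqrt(t(m) + 1))
```

Because nothing is computed until realization, chaining several operations on a large on-disk dataset costs almost nothing up front.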

3.3 Other operations

ZarrArray objects also support operations that are specific to DelayedArray objects and their derivatives:

path(A)
## [1] "/home/biocbuild/bbs-3.23-bioc/R/site-library/Rarr/extdata/zarr_examples/column-first/int32.zarr/"
type(A)
## [1] "integer"
chunkdim(A)
## [1] 10 10  5
a <- as.array(A)  # realizes the whole dataset in memory as an ordinary array

See ?ZarrArray for more information.
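Calling as.array(A) loads the whole dataset into memory, which is fine for a 30 x 20 x 10 array but may not be for much larger datasets. Block processing avoids this by visiting one block at a time. The following is a conceptual sketch in base R only, not how DelayedArray implements it. Here a is a hypothetical in-memory stand-in loosely modelled on the slices printed above, and the loop walks a grid aligned with the chunk geometry reported by chunkdim(A).

```r
dims <- c(30L, 20L, 10L)
chunkdim <- c(10L, 10L, 5L)

# Hypothetical stand-in for the dataset (assumption: all other values are 0).
a <- array(0L, dim = dims)
a[1, , 1] <- 1:20
a[, 1, 1] <- 1L

# Start index of every chunk along each dimension, then one row per chunk.
starts <- lapply(seq_along(dims),
                 function(k) seq(1L, dims[k], by = chunkdim[k]))
grid <- expand.grid(starts)

total <- 0
for (b in seq_len(nrow(grid))) {
    from <- unlist(grid[b, ])
    to <- pmin(from + chunkdim - 1L, dims)
    # Only this block needs to be held in memory at any given time.
    block <- a[from[1]:to[1], from[2]:to[2], from[3]:to[3], drop = FALSE]
    total <- total + sum(block)
}

nrow(grid)   # 12 blocks visited
total        # 239, same as sum(a)
```

With a real on-disk dataset, the subsetting step would read a single chunk from disk instead of indexing an in-memory array.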

4 Write an array-like object to disk in Zarr format

The writeZarrArray() function can be used to write an array-like object to disk in Zarr format.

For example, we can write A back to disk with a different physical chunk geometry:

path1 <- tempfile(fileext=".zarr")
writeZarrArray(A, path1, chunkdim=c(3, 5, 2))
## <30 x 20 x 10> ZarrArray object of type "integer":
## ,,1
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     1     2     3     4   .    17    18    19    20
##  [2,]     1     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
## [29,]     1     0     0     0   .     0     0     0     0
## [30,]     1     0     0     0   .     0     0     0     0
## 
## ...
## 
## ,,10
##        [,1]  [,2]  [,3]  [,4] ... [,17] [,18] [,19] [,20]
##  [1,]     0     0     0     0   .     0     0     0     0
##  [2,]     0     0     0     0   .     0     0     0     0
##   ...     .     .     .     .   .     .     .     .     .
## [29,]     0     0     0     0   .     0     0     0     0
## [30,]     0     0     0     0   .     0     0     0     0
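The effect of the new chunk geometry on the number and size of chunks can be worked out with simple arithmetic. The sketch below uses base R only. The helper chunk_layout() is a hypothetical name introduced for illustration and is not part of ZarrArray or Rarr.

```r
dims <- c(30, 20, 10)

# Number of chunks along each dimension, total chunk count, and number of
# elements per full chunk, for a given chunk geometry.
chunk_layout <- function(dims, chunkdim) {
    per_dim <- ceiling(dims / chunkdim)
    list(per_dim = per_dim,
         n_chunks = prod(per_dim),
         chunk_len = prod(chunkdim))
}

orig <- chunk_layout(dims, c(10, 10, 5))  # geometry reported by chunkdim(A)
new  <- chunk_layout(dims, c(3, 5, 2))    # geometry passed to writeZarrArray()

orig$n_chunks   # 12 chunks of 500 elements each
new$n_chunks    # 200 chunks of 30 elements each
```

Smaller chunks make small random reads cheaper but multiply the number of chunks to manage, so the best geometry depends on the expected access pattern.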

Or, we can transform A and then write it back to disk:

path2 <- tempfile(fileext=".zarr")
A2 <- sqrt(t(A[ , , 1]) + 1)  # all these operations are delayed
writeZarrArray(A2, path2)  # realizes the delayed operations block by block
## <20 x 30> ZarrMatrix object of type "double":
##           [,1]     [,2]     [,3] ...    [,29]    [,30]
##  [1,] 1.414214 1.414214 1.414214   . 1.414214 1.414214
##  [2,] 1.732051 1.000000 1.000000   . 1.000000 1.000000
##  [3,] 2.000000 1.000000 1.000000   . 1.000000 1.000000
##  [4,] 2.236068 1.000000 1.000000   . 1.000000 1.000000
##  [5,] 2.449490 1.000000 1.000000   . 1.000000 1.000000
##   ...        .        .        .   .        .        .
## [16,] 4.123106 1.000000 1.000000   .        1        1
## [17,] 4.242641 1.000000 1.000000   .        1        1
## [18,] 4.358899 1.000000 1.000000   .        1        1
## [19,] 4.472136 1.000000 1.000000   .        1        1
## [20,] 4.582576 1.000000 1.000000   .        1        1

Note that writeZarrArray() leverages lower-level functionality implemented in the Rarr package, such as create_empty_zarr_array() and update_zarr_array(). See ?writeZarrArray for more information.
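A simple sanity check after writing is to compare the new dataset with the original one. The sketch below uses only functions shown earlier in this document and assumes that ZarrArray and Rarr are installed. The variable name B is introduced here for illustration.

```r
library(ZarrArray)

zarr_path <- system.file(package="Rarr", "extdata",
                         "zarr_examples", "column-first", "int32.zarr")
A <- ZarrArray(zarr_path)

# Write A back to disk with a different chunk geometry and keep the result.
path1 <- tempfile(fileext=".zarr")
B <- writeZarrArray(A, path1, chunkdim=c(3, 5, 2))

chunkdim(B)                        # chunk geometry of the new dataset
all(as.array(A) == as.array(B))    # do the two datasets hold the same values?
```

Realizing both datasets with as.array() is acceptable here because the array is small. For large datasets a block-wise comparison would be preferable.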

5 Session information

sessionInfo()
## R version 4.6.0 alpha (2026-04-05 r89794)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ZarrArray_0.99.3      DelayedArray_0.37.1   SparseArray_1.11.13  
##  [4] S4Arrays_1.11.1       IRanges_2.45.0        abind_1.4-8          
##  [7] S4Vectors_0.49.1      MatrixGenerics_1.23.0 matrixStats_1.5.0    
## [10] BiocGenerics_0.57.0   generics_0.1.4        Matrix_1.7-5         
## [13] BiocStyle_2.39.0     
## 
## loaded via a namespace (and not attached):
##  [1] jsonlite_2.0.0      crayon_1.5.3        compiler_4.6.0     
##  [4] BiocManager_1.30.27 Rcpp_1.1.1          jquerylib_0.1.4    
##  [7] yaml_2.3.12         fastmap_1.2.0       lattice_0.22-9     
## [10] R6_2.6.1            XVector_0.51.0      curl_7.0.0         
## [13] httr2_1.2.2         knitr_1.51          paws.storage_0.9.0 
## [16] bookdown_0.46       paws.common_0.8.9   bslib_0.10.0       
## [19] R.utils_2.13.0      rlang_1.2.0         cachem_1.1.0       
## [22] xfun_0.57           sass_0.4.10         otel_0.2.0         
## [25] cli_3.6.5           magrittr_2.0.5      digest_0.6.39      
## [28] grid_4.6.0          rappdirs_0.3.4      lifecycle_1.0.5    
## [31] R.methodsS3_1.8.2   R.oo_1.27.1         glue_1.8.0         
## [34] evaluate_1.0.5      Rarr_1.11.37        rmarkdown_2.31     
## [37] tools_4.6.0         htmltools_0.5.9