The 4D Nucleome Data Coordination and Integration Center (DCIC) has developed and actively maintains a data portal providing public access to a wealth of resources for investigating 3D chromatin architecture. Notably, 3D chromatin conformation libraries relying on different technologies (“in situ” or “dilution” Hi-C, Capture Hi-C, Micro-C, DNase Hi-C, …), generated by 50+ collaborating labs, were homogeneously processed, yielding more than 350 sets of processed files.
fourDNData (read 4DN-Data) is a package giving programmatic access
to these uniformly processed Hi-C contact files.
The fourDNData() function provides a gateway to 4DN-hosted Hi-C files,
including contact matrices (in .hic or .mcool) and other Hi-C derived
files such as annotated compartments, domains, insulation scores, or
.pairs files.
library(fourDNData)
head(fourDNData())
#>   experimentSetAccession     fileType     size organism experimentType details
#> 1           4DNES18BMU79        pairs 10151.53    mouse   in situ Hi-C   DpnII
#> 3           4DNES18BMU79          hic  5285.82    mouse   in situ Hi-C   DpnII
#> 4           4DNES18BMU79        mcool  6110.75    mouse   in situ Hi-C   DpnII
#> 5           4DNES18BMU79   boundaries     0.12    mouse   in situ Hi-C   DpnII
#> 6           4DNES18BMU79   insulation     7.18    mouse   in situ Hi-C   DpnII
#> 7           4DNES18BMU79 compartments     0.18    mouse   in situ Hi-C   DpnII
#>                                dataset
#> 1 Hi-C on Mouse Olfactory System cells
#> 3 Hi-C on Mouse Olfactory System cells
#> 4 Hi-C on Mouse Olfactory System cells
#> 5 Hi-C on Mouse Olfactory System cells
#> 6 Hi-C on Mouse Olfactory System cells
#> 7 Hi-C on Mouse Olfactory System cells
#>                                                         condition
#> 1 Mature olfactory sensory neurons with conditional Ldb1 knockout
#> 3 Mature olfactory sensory neurons with conditional Ldb1 knockout
#> 4 Mature olfactory sensory neurons with conditional Ldb1 knockout
#> 5 Mature olfactory sensory neurons with conditional Ldb1 knockout
#> 6 Mature olfactory sensory neurons with conditional Ldb1 knockout
#> 7 Mature olfactory sensory neurons with conditional Ldb1 knockout
#>                 biosource biosourceType             publication
#> 1 olfactory receptor cell  primary cell Monahan K et al. (2019)
#> 3 olfactory receptor cell  primary cell Monahan K et al. (2019)
#> 4 olfactory receptor cell  primary cell Monahan K et al. (2019)
#> 5 olfactory receptor cell  primary cell Monahan K et al. (2019)
#> 6 olfactory receptor cell  primary cell Monahan K et al. (2019)
#> 7 olfactory receptor cell  primary cell Monahan K et al. (2019)
#>                                                                                                                                   URL
#> 1 https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/49504f97-904e-48c1-8c20-1033680b66da/4DNFIC5AHBPV.pairs.gz
#> 3      https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/6cd4378a-8f51-4e65-99eb-15f5c80abf8d/4DNFIT4I5C6Z.hic
#> 4    https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/01fb704f-2fd7-48c6-91af-c5f4584529ed/4DNFIVPAXJO8.mcool
#> 5   https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/5c07cdee-53e2-43e0-8853-cfe5f057b3f1/4DNFIR3XCIMA.bed.gz
#> 6       https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/d1f4beb9-701f-4188-abe2-6271fe658770/4DNFIXKKNMS7.bw
#> 7       https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/3d429647-51c8-4e3a-a18b-eec0b1480905/4DNFIN13N8C1.bw
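Because the table printed above is a regular data frame, it can be narrowed down with ordinary subsetting before anything is downloaded. The sketch below is illustrative only: it rebuilds a few columns of the six rows shown above in a small stand-in data frame called `meta` (a hypothetical name), so that it runs without network access. In a real session, `meta <- fourDNData()` would presumably take its place.

```r
# Stand-in for a few columns of fourDNData(); values copied from the rows
# printed above. `meta` and `prefix` are names chosen for this example.
prefix <- "https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/"
meta <- data.frame(
    experimentSetAccession = rep("4DNES18BMU79", 6),
    fileType = c("pairs", "hic", "mcool", "boundaries", "insulation", "compartments"),
    size = c(10151.53, 5285.82, 6110.75, 0.12, 7.18, 0.18),
    organism = rep("mouse", 6),
    URL = paste0(prefix, c(
        "49504f97-904e-48c1-8c20-1033680b66da/4DNFIC5AHBPV.pairs.gz",
        "6cd4378a-8f51-4e65-99eb-15f5c80abf8d/4DNFIT4I5C6Z.hic",
        "01fb704f-2fd7-48c6-91af-c5f4584529ed/4DNFIVPAXJO8.mcool",
        "5c07cdee-53e2-43e0-8853-cfe5f057b3f1/4DNFIR3XCIMA.bed.gz",
        "d1f4beb9-701f-4188-abe2-6271fe658770/4DNFIXKKNMS7.bw",
        "3d429647-51c8-4e3a-a18b-eec0b1480905/4DNFIN13N8C1.bw"
    )),
    stringsAsFactors = FALSE
)

# Keep only contact matrices (.hic or .mcool) from mouse samples.
matrices <- meta[meta$fileType %in% c("hic", "mcool") & meta$organism == "mouse", ]

# The last path component of each URL is the name of the hosted file.
matrices$fileName <- basename(matrices$URL)
matrices[, c("experimentSetAccession", "fileType", "size", "fileName")]
```

Filtering on `fileType` and `size` first is a cheap way to check what a download would involve, since the raw `.pairs` files can be much larger than the derived tracks.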
cool_file <- fourDNData('4DNESDP9ECMN')
cool_file
#>      experimentSetAccession     fileType   size organism experimentType details
#> 1067           4DNESDP9ECMN        pairs  14.77    human   in situ Hi-C    MboI
#> 1069           4DNESDP9ECMN          hic 197.60    human   in situ Hi-C    MboI
#> 1070           4DNESDP9ECMN        mcool  48.27    human   in situ Hi-C    MboI
#> 1071           4DNESDP9ECMN compartments   0.20    human   in situ Hi-C    MboI
#>                                          dataset
#> 1067 Hi-C on GM12878 cells - protocol variations
#> 1069 Hi-C on GM12878 cells - protocol variations
#> 1070 Hi-C on GM12878 cells - protocol variations
#> 1071 Hi-C on GM12878 cells - protocol variations
#>                                                              condition
#> 1067 in situ Hi-C on GM12878 crosslinking titration - 1% FA, 1 min, RT
#> 1069 in situ Hi-C on GM12878 crosslinking titration - 1% FA, 1 min, RT
#> 1070 in situ Hi-C on GM12878 crosslinking titration - 1% FA, 1 min, RT
#> 1071 in situ Hi-C on GM12878 crosslinking titration - 1% FA, 1 min, RT
#>      biosource          biosourceType              publication
#> 1067   GM12878 immortalized cell line Sanborn AL et al. (2015)
#> 1069   GM12878 immortalized cell line Sanborn AL et al. (2015)
#> 1070   GM12878 immortalized cell line Sanborn AL et al. (2015)
#> 1071   GM12878 immortalized cell line Sanborn AL et al. (2015)
#>                                                                                                                                      URL
#> 1067 https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/c2ae7404-501a-4d80-957b-cd677e2bd38a/4DNFIU5XG6TN.pairs.gz
#> 1069      https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/70c1472d-cf3a-41d7-8682-cd03b7cc978d/4DNFI2AGEBE5.hic
#> 1070    https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/c81d77c0-b57e-4a29-80ac-ec6ab0714f57/4DNFI4988896.mcool
#> 1071       https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/wfoutput/dc07042c-62d5-46ae-905d-8ec99b10cf9a/4DNFIDO8B3C6.bw

The fourDNData package can be installed from Bioconductor using the following
command:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("fourDNData")

The HiCExperiment package can be used to import .mcool files provided by
fourDNData. Refer to the HiCExperiment package documentation for further
information.
library(HiCExperiment)
#> Consider using the `HiContacts` package to perform advanced genomic operations 
#> on `HiCExperiment` objects.
#> 
#> Read "Orchestrating Hi-C analysis with Bioconductor" online book to learn more:
#> https://js2264.github.io/OHCA/
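ID <- '4DNESDP9ECMN'
# Point to the .mcool file of this experimentSet and attach its 4DN metadata
cf <- CoolFile(
    path = fourDNData(ID, type = 'mcool'), 
    metadata = as.list(fourDNData()[fourDNData()$experimentSetAccession == ID,])
)
# Import contacts at 250 kb resolution, restricted to a region of chr5
x <- import(cf, resolution = 250000, focus = 'chr5:10000000-50000000')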
x
#> `HiCExperiment` object with 7,466 contacts over 161 regions 
#> -------
#> fileName: "/home/biocbuild/.cache/R/fourDNData/261bbf2832193a_4DNFI4988896.mcool" 
#> focus: "chr5:10,000,000-50,000,000" 
#> resolutions(13): 1000 2000 ... 5000000 10000000
#> active resolution: 250000 
#> interactions: 2158 
#> scores(2): count balanced 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) 
#> pairsFile: N/A 
#> metadata(12): experimentSetAccession fileType ... publication URL
interactions(x)
#> GInteractions object with 2158 interactions and 4 metadata columns:
#>          seqnames1           ranges1     seqnames2           ranges2 |
#>              <Rle>         <IRanges>         <Rle>         <IRanges> |
#>      [1]      chr5 10000001-10250000 ---      chr5 10000001-10250000 |
#>      [2]      chr5 10000001-10250000 ---      chr5 10250001-10500000 |
#>      [3]      chr5 10000001-10250000 ---      chr5 10500001-10750000 |
#>      [4]      chr5 10000001-10250000 ---      chr5 10750001-11000000 |
#>      [5]      chr5 10000001-10250000 ---      chr5 11250001-11500000 |
#>      ...       ...               ... ...       ...               ... .
#>   [2154]      chr5 46000001-46250000 ---      chr5 46250001-46500000 |
#>   [2155]      chr5 46250001-46500000 ---      chr5 46250001-46500000 |
#>   [2156]      chr5 46250001-46500000 ---      chr5 47000001-47250000 |
#>   [2157]      chr5 49500001-49750000 ---      chr5 49500001-49750000 |
#>   [2158]      chr5 49750001-50000000 ---      chr5 49750001-50000000 |
#>            bin_id1   bin_id2     count  balanced
#>          <numeric> <numeric> <numeric> <numeric>
#>      [1]      3560      3560        30 0.3097516
#>      [2]      3560      3561         7 0.0574021
#>      [3]      3560      3562         2 0.0187244
#>      [4]      3560      3563         6 0.0567218
#>      [5]      3560      3565         1 0.0108409
#>      ...       ...       ...       ...       ...
#>   [2154]      3704      3705         2       NaN
#>   [2155]      3705      3705         5       NaN
#>   [2156]      3705      3708         1       NaN
#>   [2157]      3718      3718        11  0.320998
#>   [2158]      3719      3719         1       NaN
#>   -------
#>   regions: 161 ranges and 4 metadata columns
#>   seqinfo: 24 sequences from an unspecified genome
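Several interactions printed above have a `balanced` score of `NaN` even though their raw `count` is positive. This typically happens when one of the two bins was excluded during matrix balancing. The sketch below shows one way to drop such rows; it uses a plain data frame (`scores`, a hypothetical name) holding five of the rows printed above as a stand-in for the metadata columns, so that it runs without downloading anything.

```r
# Five rows copied from the interactions printed above; a plain data frame
# stands in here for the metadata columns of the GInteractions object.
scores <- data.frame(
    bin_id1  = c(3560, 3560, 3704, 3718, 3719),
    bin_id2  = c(3560, 3561, 3705, 3718, 3719),
    count    = c(30, 7, 2, 11, 1),
    balanced = c(0.3097516, 0.0574021, NaN, 0.320998, NaN)
)

# Keep only the interactions that have a defined balanced score.
kept <- scores[!is.nan(scores$balanced), ]
kept

# With the real object, the same filter would presumably be written as
# (not run here; assumes `x` was imported as shown above):
#   ints <- interactions(x)
#   ints[!is.nan(mcols(ints)$balanced)]
```

Whether to drop these rows or to fall back on raw counts depends on the downstream analysis; the raw `count` column remains available either way.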
as(x, 'ContactMatrix')
#> class: ContactMatrix 
#> dim: 161 161 
#> type: dgCMatrix 
#> rownames: NULL
#> colnames: NULL
#> metadata(0):
#> regions: 161

Rather than importing multiple files corresponding to a single experimentSet
accession ID one by one, one can import all the available files associated with
an experimentSet accession ID into a HiCExperiment object by using the
fourDNHiCExperiment() function.
library(HiCExperiment)
x <- fourDNHiCExperiment('4DNESDP9ECMN')
#> Fetching local Hi-C contact map from Bioc cache
#> Fetching local compartments bigwig file from Bioc cache
#> Insulation not found for the provided experimentSet accession.
#> Borders not found for the provided experimentSet accession.
#> Importing contacts in memory
sessionInfo()
#> R version 4.5.0 RC (2025-04-04 r88126)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] HiCExperiment_1.8.0 fourDNData_1.8.0    BiocStyle_2.36.0   
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.1            dplyr_1.1.4                
#>  [3] blob_1.2.4                  filelock_1.0.3             
#>  [5] Biostrings_2.76.0           bitops_1.0-9               
#>  [7] fastmap_1.2.0               RCurl_1.98-1.17            
#>  [9] BiocFileCache_2.16.0        GenomicAlignments_1.44.0   
#> [11] XML_3.99-0.18               digest_0.6.37              
#> [13] lifecycle_1.0.4             RSQLite_2.3.9              
#> [15] magrittr_2.0.3              compiler_4.5.0             
#> [17] rlang_1.1.6                 sass_0.4.10                
#> [19] tools_4.5.0                 yaml_2.3.10                
#> [21] rtracklayer_1.68.0          knitr_1.50                 
#> [23] S4Arrays_1.8.0              bit_4.6.0                  
#> [25] curl_6.2.2                  DelayedArray_0.34.0        
#> [27] abind_1.4-8                 BiocParallel_1.42.0        
#> [29] withr_3.0.2                 purrr_1.0.4                
#> [31] BiocGenerics_0.54.0         grid_4.5.0                 
#> [33] stats4_4.5.0                Rhdf5lib_1.30.0            
#> [35] SummarizedExperiment_1.38.0 cli_3.6.4                  
#> [37] rmarkdown_2.29              crayon_1.5.3               
#> [39] generics_0.1.3              httr_1.4.7                 
#> [41] tzdb_0.5.0                  rjson_0.2.23               
#> [43] DBI_1.2.3                   cachem_1.1.0               
#> [45] rhdf5_2.52.0                parallel_4.5.0             
#> [47] BiocManager_1.30.25         XVector_0.48.0             
#> [49] restfulr_0.0.15             matrixStats_1.5.0          
#> [51] vctrs_0.6.5                 Matrix_1.7-3               
#> [53] jsonlite_2.0.0              bookdown_0.43              
#> [55] IRanges_2.42.0              S4Vectors_0.46.0           
#> [57] bit64_4.6.0-1               strawr_0.0.92              
#> [59] jquerylib_0.1.4             glue_1.8.0                 
#> [61] codetools_0.2-20            GenomeInfoDb_1.44.0        
#> [63] GenomicRanges_1.60.0        BiocIO_1.18.0              
#> [65] UCSC.utils_1.4.0            tibble_3.2.1               
#> [67] pillar_1.10.2               htmltools_0.5.8.1          
#> [69] rhdf5filters_1.20.0         GenomeInfoDbData_1.2.14    
#> [71] R6_2.6.1                    dbplyr_2.5.0               
#> [73] vroom_1.6.5                 evaluate_1.0.3             
#> [75] lattice_0.22-7              Biobase_2.68.0             
#> [77] Rsamtools_2.24.0            memoise_2.0.1              
#> [79] bslib_0.9.0                 Rcpp_1.0.14                
#> [81] InteractionSet_1.36.0       SparseArray_1.8.0          
#> [83] xfun_0.52                   MatrixGenerics_1.20.0      
#> [85] pkgconfig_2.0.3