MatrixQCvis 1.0.0
Data quality assessment is an integral part of preparatory data analysis
to ensure sound biological information retrieval.
We present here the MatrixQCvis package, which provides shiny-based
interactive visualization of data quality metrics at the per-sample and
per-feature level. It is broadly applicable to quantitative omics data types
that come in matrix-like format (features x samples). It enables the detection
of low-quality samples, drifts, outliers and batch effects in data sets.
Visualizations include amongst others bar- and violin plots of the
(count/intensity) values, mean vs standard deviation plots, MA plots,
empirical cumulative distribution function (ECDF) plots, visualizations
of the distances between samples, and multiple types of dimension reduction
plots.
MatrixQCvis builds upon the Bioconductor SummarizedExperiment S4
class and thus integrates easily into existing workflows.
MatrixQCvis is especially suited to assessing the quality of proteomics and
metabolomics data sets that are characterized by missing values, as it
allows for imputation of missing values and differential expression analysis
using the proDA package (Ahlman-Eltze and Anders 2019). Besides this, MatrixQCvis is
extensible to other types of data (e.g. transcriptomics count data) that can be
represented as a SummarizedExperiment object.
Furthermore, the shiny application allows for simple differential
expression analysis using either moderated t-tests (from the limma package,
Ritchie et al. (2015)) or Wald tests (from the proDA package, Ahlman-Eltze and Anders (2019)).
Within this vignette, the term feature will refer to a probed molecular entity, e.g. gene, transcript, protein, peptide, or metabolite.
In the following, we will describe the major setup of MatrixQCvis and the
navigation through the shiny application, shinyQC.
MatrixQCvis is currently under active development. If you
discover any bugs or typos, or have ideas for improving
MatrixQCvis, feel free to raise an issue via
GitHub or
send an e-mail to the developer.
To install MatrixQCvis, enter the following into the R console:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("MatrixQCvis")
Before starting with the analysis, load the MatrixQCvis package. This
will also load the required packages Biobase, BiocGenerics, GenomeInfoDb,
GenomicRanges, ggplot2, IRanges, MatrixGenerics,
parallel, matrixStats, plotly, S4Vectors, shiny,
shinydashboard, SummarizedExperiment, and stats4.
library(MatrixQCvis)
The central function for assessing data quality is shinyQC,
and its most important argument is se. shinyQC expects a SummarizedExperiment
object. One requirement for the SummarizedExperiment object is that
it contains the column "name" in colData(se), storing the names of the
samples. If the SummarizedExperiment object does not contain this column,
shinyQC will stop and throw an error.
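A check of this requirement can be run before starting the application. The following is a minimal sketch in base R: it uses a plain data.frame as a stand-in for colData(se), and the helper has_name_column is hypothetical (it is not part of MatrixQCvis).

```r
## stand-in for colData(se); with a real object, as.data.frame(colData(se))
## could be used instead (assumption for illustration)
cD <- data.frame(type = c("Type_1", "Type_1", "Type_2"),
    row.names = c("Sample_1-1", "Sample_1-2", "Sample_2-1"))

## hypothetical helper: TRUE if the sample annotation has a column "name"
has_name_column <- function(col_data) "name" %in% colnames(col_data)

has_name_column(cD)

## add the column from the row names if it is missing
if (!has_name_column(cD)) cD$name <- rownames(cD)
has_name_column(cD)
```

Taking the sample names from the row names is only one possible choice; any column with unique sample identifiers would serve the same purpose.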
Alternatively, a SummarizedExperiment object can also be loaded from within
shinyQC (when no se object is supplied to shinyQC).
Objects belonging to the SummarizedExperiment class are containers for one
or more assays,
which are (numerical) matrices containing the quantitative, measured information
of the experiment. The rows represent features of interest (e.g. 
transcripts, peptides, proteins, or metabolites) and the columns represent the
samples. The
SummarizedExperiment object also stores information on the features of
interest (accessible by rowData) and information on the samples
(accessible by colData).
If there is more than one experimental data set (assay)
stored in the SummarizedExperiment object, a select option will appear in the
sidebar allowing for selecting the assay.
Here, we will generate a SummarizedExperiment object with synthetic
quantitative information on proteins and use this object, se, to showcase the
functionality of shinyQC. The object will contain information from 40
samples. The samples consist of four different types (1, 2, 3, and 4)
and each type has two treatments, A and B. In total there are 1000 features.
## create synthetic assay using the generate_synthetic_data function from the
## proDA package
library(proDA)
n_samples <- 40
n_feat <- 1000
data <- generate_synthetic_data(n_feat, n_conditions = n_samples / 10, 
    n_replicates = n_samples / 4, frac_changed = 0.1)
a <- data$Y
colnames(a) <- gsub(colnames(a), pattern = "Condition", replacement = "Sample")
## add some treatment-specific effects
set.seed(1)
a[, 1:5] <- a[, 1:5] + rnorm(5000, mean = 1.0, sd = 0.5)
a[, 11:15] <- a[, 11:15] + rnorm(5000, mean = 0.8, sd = 0.5)
a[, 21:25] <- a[, 21:25] + rnorm(5000, mean = 1.2, sd = 0.5)
a[, 31:35] <- a[, 31:35] + rnorm(5000, mean = 0.7, sd = 0.5)
## create information on the samples
type_sample <- gsub(data$groups, pattern = "Condition", replacement = "Type")
trmt_sample <- paste(
   c(rep("1", 10), rep("2", 10), rep("3", 10), rep("4", 10)),
   c(rep("A", 5), rep("B", 5)), sep = "_")
cD <- data.frame(name = colnames(a), type = type_sample, 
                     treatment = trmt_sample)
## create information on the proteins
rD <- data.frame(spectra = rownames(a))
## create se
se <- SummarizedExperiment(assays = a, rowData = rD, colData = cD)
The actual shiny application can then be started by entering the following
into the R console:
qc <- shinyQC(se)
The assignment to qc or any other object is not mandatory. Upon exiting the
shiny application via the exit button, shinyQC will return a
SummarizedExperiment object containing the imputed data set,
which can then be analyzed further.
Now, we will have a closer look at the user interface of the shiny application.
Please note: Depending on the supplied SummarizedExperiment object the user
interface of shinyQC will differ:
SummarizedExperiment object containing missing values:
    the tabs Samples, Measured Values, Missing Values, Values,
    Dimension Reduction and DE will be displayed,
    in the tabs Values and Dimension Reduction, the
    imputed data set will be visualized.
SummarizedExperiment object containing no missing values (i.e.
with complete observations):
    the tabs Samples, Values, Dimension Reduction and DE
    will be displayed,
    in the tabs Values and DE, the imputed data set
    will not be visualized.
In the following, the vignette will be (mainly) described from the point of view
of a SummarizedExperiment containing missing values.
The tab Samples gives general information on the number of samples in the
se object.
The first panel shows a bar plot of the number of
samples per sample type, treatment, etc. As an example, if we want to display
how many samples are in se for the different types (type is a
column name in colData(se) and any column in colData(se) can be selected),
this panel will show the following output:
The figure in this panel displays the relative proportions of the numbers,
e.g. how many samples (in %) are there for type against treatment. Again,
type and treatment are columns in colData(se) and any column in
colData(se) can be selected.
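The numbers behind these two figures can be approximated outside the application with base R. The sketch below rebuilds the sample annotation of the synthetic example as a plain data.frame; how shinyQC computes the proportions internally is not stated here, so the row-wise percentages are an assumption.

```r
## sample annotation mirroring colData(se) of the synthetic example
cD <- data.frame(
    name = paste0("Sample_", rep(1:4, each = 10), "-", rep(1:10, times = 4)),
    type = paste0("Type_", rep(1:4, each = 10)),
    treatment = paste(rep(1:4, each = 10), rep(c("A", "B"), each = 5),
        sep = "_"))

## number of samples per type (as in the bar plot)
n_type <- table(cD$type)
n_type

## proportion (in %) of each treatment within each type
tab <- table(cD$type, cD$treatment)
prop <- prop.table(tab, margin = 1) * 100
prop["Type_1", ]
```

Each type contributes ten samples, split evenly between its two treatments.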
The figure will tell us that se contains the different types/treatments in a
balanced manner:
The tabs Measured Values and Missing Values are only displayed if the
SummarizedExperiment object contains missing values.
The layout in the tabs Measured Values and Missing Values is similar.
Therefore, the two tabs are described here simultaneously and the differences
are pointed out where necessary.
The plot shows the number of measured or missing values per sample
in the data set (depending on the selected tab). For example for the tab
Measured Values, the plot will look like
In this case, the samples show approximately the same number of measured features, i.e. there is no indication to remove any sample based on this measure.
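As a rough illustration of what is counted here, the number of measured and missing values per sample can be obtained with base R. The toy matrix below is made up for this sketch; with a real object, assay(se) would take its place.

```r
## toy assay: 4 features (rows) x 3 samples (columns), NA = missing value
a <- matrix(c(1, NA, 3, 4,
              2, 5, NA, NA,
              7, 8, 9, 1),
    nrow = 4,
    dimnames = list(paste0("feature_", 1:4), paste0("Sample_", 1:3)))

## number of measured and missing values per sample
n_measured <- colSums(!is.na(a))
n_missing <- colSums(is.na(a))
n_measured
n_missing
```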
The plot shows different data depending on which tab
(Measured Values or Missing Values) is selected. The binwidth will be
determined by the slider input (Binwidth (Measured Values) or
Binwidth (Missing Values)).
The plot shows how often a feature was measured in a certain number of samples.
Examples:
in the case of one sample (x-axis), the y-axis will denote the number of features that were measured in only one sample.
in the case of the total number of samples (x-axis), the y-axis will denote the number of features with complete observations (i.e. the number of features that were quantified in all samples).
The plot shows how often a feature was missing in a certain number of samples.
Examples:
in the case of one sample (x-axis), the y-axis will denote the number of features that were missing in only one sample.
in the case of the total number of samples (x-axis), the y-axis will denote the number of features with completely missing observations (i.e. the number of features that were not quantified in any sample).
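The counts underlying both histograms can be sketched in base R as follows. The toy matrix is made up for illustration, and the binning is fixed to one sample per bin rather than being controlled by a slider.

```r
## toy assay: 4 features (rows) x 3 samples (columns), NA = missing value
a <- matrix(c(1, NA, 3, 4,
              2, 5, NA, NA,
              7, 8, 9, 1),
    nrow = 4,
    dimnames = list(paste0("feature_", 1:4), paste0("Sample_", 1:3)))

## in how many samples was each feature measured / missing?
measured_per_feature <- rowSums(!is.na(a))
missing_per_feature <- rowSums(is.na(a))

## number of features (y-axis) per number of samples (x-axis)
hist_measured <- table(factor(measured_per_feature, levels = 0:ncol(a)))
hist_missing <- table(factor(missing_per_feature, levels = 0:ncol(a)))
hist_measured
hist_missing
```

Here, one feature has complete observations (measured in all three samples) and three features were measured in two samples.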
The plot in this panel can be read analogously to the one in the panel
Histogram Features, but it is split by the specified variable
(e.g. it shows the distribution of measured/missing values among the sample
types).
The plot shows the interaction of sets between different variables depending on their presence or absence. The dots in the UpSet plot specify if the criteria for presence/absence are fulfilled (see below for the definition). In each column the intersections are displayed together with the number of features regarding the type of intersection. The boxplots in the rows display the number of present/absent features per set.
All sets present in the data set are displayed. Depending on the data set,
however, not all intersection sets are displayed (see the help page for
the upset function from the UpSetR package).
Presence is defined by a feature being measured in at least one sample of a set.
Absence is defined by a feature with only missing values (i.e. no measured values) of a set.
This panel retrieves the features specified by an intersection of sets.
It builds upon the UpSet panel. By selecting the check boxes,
the names of the features (taken from the row names) that fulfill
the defined intersection of sets are printed as text.
Example: Four sets (Type_1, Type_2, Type_3, and Type_4) are found for
a specified variable (here: type). When selecting the boxes for Type_1 and
Type_2 (while not selecting the boxes for Type_3 and Type_4) the
features that are present/absent for Type_1 and Type_2
(but not in the sets Type_3 and Type_4) are returned.
For the tab Measured Values, presence is defined by a feature being
measured in at least one sample of a set.
For the tab Missing Values, absence is defined by a feature with only
missing values (i.e. no measured values) of a set.
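The two definitions can be written down in a few lines of base R. This is a sketch on a made-up matrix with two sets (Type_1 and Type_2), not the code used by the application.

```r
## toy assay: 4 features x 4 samples, two samples per type
a <- matrix(c(1, NA, NA, 2,    # Sample_1 (Type_1)
              NA, NA, NA, 3,   # Sample_2 (Type_1)
              4, 5, NA, NA,    # Sample_3 (Type_2)
              6, NA, NA, NA),  # Sample_4 (Type_2)
    nrow = 4,
    dimnames = list(paste0("feature_", 1:4), paste0("Sample_", 1:4)))
type <- c("Type_1", "Type_1", "Type_2", "Type_2")

## presence: measured in at least one sample of the set
present <- sapply(unique(type), function(type_i)
    rowSums(!is.na(a[, type == type_i, drop = FALSE])) > 0)
## absence: only missing values in the set
absent <- !present

## features present in Type_2 but not in Type_1
rownames(present)[!present[, "Type_1"] & present[, "Type_2"]]
## features absent in both sets
rownames(absent)[absent[, "Type_1"] & absent[, "Type_2"]]
```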
The tab Values will take a closer look at the assay slot of the
SummarizedExperiment.
This panel shows the (count/intensity) values of the raw (raw), normalized
(normalized), normalized+transformed (transformed),
normalized+transformed+batch corrected (batch corrected), and
normalized+transformed+batch corrected+imputed (imputed) data sets
(imputed will only be shown if there are
missing values in the SummarizedExperiment). The
different methods for normalization, transformation, batch correction (and
imputation) are specified in
the sidebar panel.
For visualization purposes only, the (count/intensity) values for the raw and
normalized data sets can be log2 transformed (see the radio buttons in
Display log2 values? (only for 'raw' and 'normalized')).
The type of visualization (boxplot or violin plot) can be specified by
selecting boxplot or violin in the radio button panel (Type of display).
The figure (violin plot) using the raw values will look like
This panel shows a trend line for aggregated values to
indicate drifts/trends in data acquisition. It
shows the sum- or median-aggregated values (specified in
Select aggregation). The plot helps to reveal trends in data
acquisition that originate, e.g., from differences in instrument
sensitivity. Aggregated values can be displayed for the
raw (raw), normalized
(normalized), normalized+transformed (transformed),
normalized+transformed+batch corrected (batch corrected), and
normalized+transformed+batch corrected+imputed (imputed) (count/intensity) values
(imputed will only be shown if there are missing
values in the SummarizedExperiment).
The different methods for normalization, transformation, batch correction (and
imputation) are specified in the sidebar panel.
The smoothing is calculated from the
selection of samples that are specified by the drop-down menus
Select variable and Select level to highlight.
The menu Select variable corresponds to the colnames in
colData(se). Here, we can select for the higher-order variable, e.g.
the sample type (containing for example Type_1, Type_2, …, QC).
The drop-down menu Select level to highlight will specify the actual selection
from which
the trend line will be calculated (e.g. Type_1, Type_2, …, QC). Also,
the menu will always include the level all, which will use all points to
calculate the trend line. If we want to calculate the trend line of
aggregated values of all samples belonging to the type QC, we select
QC in the drop-down menu.
The panel allows for further customization after expanding the collapsed box.
The data input is selected in the drop-down menu under Select data input.
The smoothing method (either LOESS or linear model) is selected in the drop-down
menu under Select smoothing method. The aggregation method is selected
in the drop-down menu Select aggregation.
With the drop-down menu Select categorical variable to order samples,
the samples (x-axis) will be ordered alphanumerically according to the
selected level (and the sample name).
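The idea of the panel can be sketched in base R: aggregate the values per sample, then fit a LOESS curve or a linear model along the sample order. The drift in the toy data is simulated for illustration, and the variable names are made up; the application's own implementation may differ.

```r
set.seed(1)
## toy assay: 50 features x 20 samples with a small upward drift along the
## sample order (simulated for illustration only)
n_feat <- 50
n_samples <- 20
drift <- seq(0, 1, length.out = n_samples)
a <- matrix(rnorm(n_feat * n_samples, mean = 10), nrow = n_feat) +
    matrix(drift, nrow = n_feat, ncol = n_samples, byrow = TRUE)
colnames(a) <- paste0("Sample_", seq_len(n_samples))

## median-aggregated values per sample
## (sum aggregation would be colSums(a, na.rm = TRUE))
agg <- apply(a, 2, median, na.rm = TRUE)
df <- data.frame(order = seq_len(n_samples), value = agg)

## trend lines: LOESS and linear model
fit_loess <- loess(value ~ order, data = df)
fit_lm <- lm(value ~ order, data = df)
trend_loess <- predict(fit_loess)
trend_lm <- predict(fit_lm)

## a positive slope points to an upward drift
coef(fit_lm)[["order"]]
```

To restrict the trend line to one level (e.g. QC samples), the data.frame would be subset to the corresponding samples before fitting.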
Here, we are interested in observing if there is a trend/drift for
samples of type Type_1. We select LOESS as the method for the trend line
and median as the aggregation method. The figure will then look as follows:
This panel shows the coefficient of variation values for raw (raw),
normalized (normalized), normalized+transformed (transformed),
normalized+transformed+batch corrected (batch corrected), and
normalized+transformed+batch corrected+imputed (imputed)
(count/intensity) values (imputed will only be shown
if there are missing values in the SummarizedExperiment) among the samples.
The different methods for normalization, transformation, batch correction
(and imputation) are specified in the sidebar panel.
The panel displays the coefficient of variation values from the samples of the
SummarizedExperiment object. The coefficients of variation are calculated
according to the formula sd(x) / mean(x) * 100 with x the sample values
and sd the standard deviation. The plot might be useful when looking at the
coefficient of variation values of a specific sample type (e.g. QCs) and trying
to identify outliers.
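The formula translates directly into R. In the sketch below, the helper cv and the toy matrix are made up for illustration, and dropping missing values via na.rm = TRUE is an assumption.

```r
## coefficient of variation in percent
cv <- function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE) * 100

## sd = 1 and mean = 2
cv(c(1, 2, 3))

## toy assay: 3 features (rows) x 3 samples (columns)
a <- cbind(Sample_1 = c(1, 2, 3), Sample_2 = c(2, 2, 2),
    Sample_3 = c(10, NA, 30))

## one coefficient of variation per sample
cv_samples <- apply(a, 2, cv)
cv_samples
```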
Here, we show the plot of coefficient of variation values from the
raw values (as obtained by assay(se)), normalized values (using sum
normalization), transformed values (using vsn), batch corrected values
(using none) and imputed values (using the MinDet algorithm, Lazar et al. (2016)).
The panel shows the three mean-sd (standard deviation) plots for
normalized+transformed (transformed),
normalized+transformed+batch corrected (batch corrected),
and normalized+transformed+batch corrected+imputed (imputed) values
(imputed will only be shown if there are missing
values in the SummarizedExperiment).
The sd and mean are calculated feature-wise from the values of the respective
data set. The plot shows whether the sd depends on
the mean. The red line depicts the running median estimator (window-width 10%).
In case of sd-mean independence, the running median should be approximately
horizontal.
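A simplified version of the quantities in this plot can be computed in base R. The sketch assumes that features are ordered by their mean and that the window spans about 10% of the features; the plotting function used by the application may differ in details such as rank-based ordering.

```r
set.seed(1)
## toy data: 200 features x 10 samples without a mean-sd dependence
a <- matrix(rnorm(200 * 10, mean = 10, sd = 1), nrow = 200)

## feature-wise mean and standard deviation
feat_mean <- rowMeans(a, na.rm = TRUE)
feat_sd <- apply(a, 1, sd, na.rm = TRUE)

## running median of the sd along features ordered by their mean;
## runmed requires an odd window width, here about 10% of the features
ord <- order(feat_mean)
k <- 2 * floor(0.1 * nrow(a) / 2) + 1
run_med <- runmed(feat_sd[ord], k = k)

plot(feat_mean[ord], feat_sd[ord], xlab = "mean", ylab = "sd")
lines(feat_mean[ord], run_med, col = "red")
```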
For the transformed values, the mean-sd plot will look like
The panel displays MA plots and Hoeffding’s D statistic.
In the first part of the panel the A vs. M plots per sample are depicted.
The values are defined as follows:
\(A = \frac{1}{2} \cdot (I_i + I_j)\) and \(M = I_i - I_j\),
where \(I_i\) and \(I_j\) are transformed values.
In the case of raw or normalized the values are log2-transformed
prior to calculating A and M. In case of transformed,
batch corrected, or imputed the values are taken as they are
(N.B. when the transformation method is set to none the values are not
log2-transformed).
The values for \(I_i\) are taken from the sample \(i\). For \(I_j\), the
feature-wise means are calculated from the values of the group \(j\) of
samples specified by the drop-down menu group. The sample for calculating
\(I_i\) is excluded from the group \(j\). The group can be set to
"all" (i.e. all samples except sample \(i\) are used to calculate \(I_j\)) or
any other column in colData(se). For any group except "all", the group
to which sample \(i\) belongs is used, and sample \(i\) is excluded from
the feature-wise calculation.
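The calculation for group = "all" can be sketched as follows. The sketch assumes the conventional MA definitions \(A = \frac{1}{2}(I_i + I_j)\) and \(M = I_i - I_j\) and values that are already on a log2 scale; ma_values is a hypothetical helper, not a function of MatrixQCvis.

```r
## toy log2-scale values: 3 features x 3 samples
a <- cbind(Sample_1 = c(4, 6, 8), Sample_2 = c(2, 6, 10),
    Sample_3 = c(4, 2, 6))
rownames(a) <- paste0("feature_", 1:3)

## hypothetical helper: A and M values of sample i against all other samples
ma_values <- function(a, i) {
    I_i <- a[, i]
    ## feature-wise mean of the group, excluding sample i
    I_j <- rowMeans(a[, colnames(a) != i, drop = FALSE], na.rm = TRUE)
    data.frame(A = (I_i + I_j) / 2, M = I_i - I_j)
}

ma <- ma_values(a, "Sample_1")
ma
```

For a group other than "all", the columns would first be restricted to the samples that share the group level of sample \(i\).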
The MA values for all samples are by default displayed facet-wise. The MA plot
can be set to a specific sample by changing the selected value in the
drop-down menu plot.
The underlying data set can be selected by the drop-down menu
(Data set for the MA plot).
In the second part of the panel, the Hoeffding’s D statistic values are
visualized for the different data sets raw, normalized, transformed,
batch corrected, and imputed (imputed will only be shown if there are
missing values in the SummarizedExperiment).
D is a measure of the distance between F(A, M) and G(A)H(M), where
F(A, M) is the joint cumulative distribution function (CDF) of A and M,
and G and H are marginal CDFs. The higher the value of D, the more
dependent A and M are. The D values are connected for the same
samples along the different data sets (when lines is selected), which makes it possible to
track the influence of
normalization, transformation, batch correction (and imputation) methods on the
D values.
The MA plot using the raw values and group = "all" will look like the
following (plot = "Sample_1-1"):
## Warning: Removed 259 rows containing non-finite values (stat_binhex).
The plot shows the ECDF of the values of the sample
\(i\) and the feature-wise mean values of a group \(j\) of samples specified
by the drop-down menu Group. The sample for calculating \(I_i\) is
excluded from the group \(j\). The group can be set to "all"
(i.e. all samples except sample \(i\) are used to calculate
\(I_j\)) or any other column in colData(se). For any group except "all", the
group to which sample \(i\) belongs is used, and sample \(i\) is
excluded from the feature-wise calculation.
The underlying data set can be selected by the drop-down menu
Data set for the MA plot. The sample \(i\) can be selected by
the drop-down menu Sample. The group can be selected by
the drop-down menu Group.
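The two curves can be reproduced with the base R function ecdf. The toy matrix is made up for illustration, and the group is set to all other samples.

```r
## toy values: 3 features x 3 samples
a <- cbind(Sample_1 = c(4, 6, 8), Sample_2 = c(2, 6, 10),
    Sample_3 = c(4, 2, 6))

i <- "Sample_1"
## values of sample i and feature-wise means of the remaining samples
I_i <- a[, i]
I_j <- rowMeans(a[, colnames(a) != i, drop = FALSE], na.rm = TRUE)

## empirical cumulative distribution functions
ecdf_i <- ecdf(I_i)
ecdf_j <- ecdf(I_j)

## proportion of values that are smaller than or equal to 5
ecdf_i(5)
ecdf_j(5)

plot(ecdf_i, main = "ECDF", col = "black")
lines(ecdf_j, col = "red")
```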
The ECDF plot for the sample "Sample_1-1", group = "all", and the
raw values (as obtained by assay(se)) will look like:
On the left, the panel depicts heatmaps of distances between samples for
the data sets of
raw (raw), normalized (normalized), normalized+transformed
(transformed), normalized+transformed+batch corrected (batch corrected),
and normalized+transformed+batch corrected+imputed (imputed; imputed will only be shown if
there are missing values in the SummarizedExperiment).
The annotation of the heatmaps can be adjusted by the drop-down
menu annotation.
On the right, the sum of distances to the other samples is depicted.
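Both quantities can be sketched in base R. The distance measure is not named here, so the Euclidean distance used below is an assumption, and the toy matrix is made up for illustration.

```r
## toy assay: 2 features (rows) x 3 samples (columns)
a <- cbind(Sample_1 = c(0, 0), Sample_2 = c(3, 4), Sample_3 = c(6, 8))

## distances between samples: dist works on rows, hence the transpose
d <- as.matrix(dist(t(a), method = "euclidean"))
d

## sum of distances of each sample to all other samples
sum_dist <- rowSums(d)
sum_dist
```

A sample with a markedly larger sum of distances than the others is a candidate outlier.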
The distance matrix and sum of distances (for raw values, as obtained by
assay(se)) will look like:
The first plot shows the values for each sample along the data processing steps. The feature to be displayed is selected via the drop-down menu Select feature.
## Warning: Removed 160 rows containing missing values (geom_point).
## Warning: Removed 160 row(s) containing missing values (geom_path).
The second plot shows the coefficient of variation of the values for all features
in the data set along the data processing steps. The features of same
identity can be connected by lines by clicking on lines.
The element at the bottom of the tab panel (Select features) specifies
the selection of features in the data set (the selection will be propagated through
all tabs):
all: when selected, all features in the uploaded SummarizedExperiment
object will be used,
exclude: when selected, the features specified in the text input field
will be excluded from the uploaded SummarizedExperiment,
select: when selected, the features specified in the text input field
will be selected from the uploaded SummarizedExperiment
(note: a minimum of three features needs to be selected in order for the
selection to take place).
Within this tab several dimension reduction plots are displayed to visualize
the level of similarity of samples of a data set:
principal component analysis (PCA), principal coordinates analysis
(PCoA, also known as multidimensional scaling, MDS),
non-metric multidimensional scaling (NMDS),
t-distributed stochastic neighbor embedding (tSNE, Maaten and Hinton (2008)), and
uniform manifold approximation and projection (UMAP, McInnes, Healy, and Melville (2018)).
Data input for the dimension reduction plots is the imputed data set.
The panel depicts a plot of PCA. The data input can be scaled and
centered prior to calculating the principal components by adjusting the
respective tick marks. The different PCs can be displayed by changing the
values in the drop-down menus for the x-axis and y-axis.
A scree plot, showing the explained variance per PC, is displayed in the right panel.
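Outside the application, a comparable PCA can be computed with prcomp from base R. The sketch uses random toy data without missing values; whether shinyQC calls prcomp internally is not stated here.

```r
set.seed(1)
## toy data without missing values: 100 features x 8 samples
a <- matrix(rnorm(100 * 8), nrow = 100,
    dimnames = list(paste0("feature_", 1:100), paste0("Sample_", 1:8)))

## prcomp expects the samples in rows, hence the transpose
pca <- prcomp(t(a), center = TRUE, scale. = TRUE)

## coordinates of the samples on the principal components
scores <- pca$x

## explained variance per principal component (scree plot)
explained_var <- pca$sdev^2 / sum(pca$sdev^2)
barplot(explained_var, names.arg = colnames(scores),
    ylab = "explained variance")

## PC1 against PC2
plot(scores[, "PC1"], scores[, "PC2"], xlab = "PC1", ylab = "PC2")
```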
The panel depicts a plot of PCoA (= multidimensional scaling). Different distance measures can be used to calculate the distances between the samples (euclidean, maximum, manhattan, canberra, minkowski). The different axes of the transformed data can be displayed by changing the values in the drop-down menus for the x-axis and y-axis.
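A PCoA along these lines can be sketched with dist and cmdscale from base R. The toy data are random; whether the application uses cmdscale internally is an assumption.

```r
set.seed(1)
## toy data without missing values: 100 features x 8 samples
a <- matrix(rnorm(100 * 8), nrow = 100,
    dimnames = list(paste0("feature_", 1:100), paste0("Sample_", 1:8)))

## distances between samples with a selectable distance measure
d_man <- dist(t(a), method = "manhattan")

## classical multidimensional scaling, retaining two axes
pcoa <- cmdscale(d_man, k = 2, eig = TRUE)
coords <- pcoa$points
plot(coords[, 1], coords[, 2], xlab = "axis 1", ylab = "axis 2")

## with Euclidean distances, PCoA recovers the (unscaled) PCA coordinates
## up to the sign of each axis
coords_euc <- cmdscale(dist(t(a), method = "euclidean"), k = 2)
pca_scores <- prcomp(t(a), center = TRUE, scale. = FALSE)$x[, 1:2]
```

The last two lines illustrate why PCoA with Euclidean distances and PCA give the same picture, whereas other distance measures can change the arrangement of the samples.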
The panel depicts a plot of NMDS. Different distance measures can be used to calculate the distances between the samples (euclidean, maximum, manhattan, canberra, minkowski). The different axes of the transformed data can be displayed by changing the values in the drop-down menus for the x-axis and y-axis.
The panel depicts a plot of tSNE. The parameters Perplexity,
Number of iterations, Number of retained dimensions in initial PCA,
and Output dimensionality required for the tSNE algorithm can be
set with the sliders. To choose a value for
Number of retained dimensions in initial PCA, the panel
Principal components can be consulted: on the left, it shows a
scree plot of the data set and of permuted values, together with the corresponding p-values
from the permutation (set the number of principal components to the range where the
explained variance is above that of the permuted data set/where p-values are below
0.05).
The different dimensions of the transformed data can be displayed by changing
the values in the drop-down menus for the x-axis and y-axis (either two or
three dimensions according to the Output dimensionality).
The panel depicts a plot of UMAP. The parameters Minimum distance,
Number of neighbors, and Spread, required for the UMAP algorithm can be
set with the sliders.
The different dimensions of the transformed data can be displayed by changing the values in the drop-down menus for the x-axis and y-axis.
This tab allows testing for differential expression between conditions.
Currently, two methods/tests are implemented for calculating differential
expression between conditions: moderated t-tests from limma and the
Wald test from proDA. The approach of proDA does not require imputed
values and will take the normalized+transformed+batch corrected
(batch corrected) data set as input.
The moderated t-statistic is the ratio of the M-value to its standard error,
where the M-value is the log2-fold change for the feature of interest.
The moderated t-statistic has the same interpretation as
an ordinary t-statistic except that the standard errors have been moderated
across features, borrowing information from the ensemble of features to aid
with inference about each individual feature (Kammers et al. 2015). We use the
eBayes function from limma to compute the moderated t-statistics
(trend and robust are set by default to TRUE, see ?eBayes for
further information).
proDA (Ahlman-Eltze and Anders 2019) was developed for the differential abundance analysis
in label-free proteomics data sets. proDA models missing values in an
intensity-dependent probabilistic manner based on a dropout curve
(for further details see Ahlman-Eltze and Anders (2019)).
In the input field for levels, the formula for the levels is entered.
The formula has to start with a ~ (tilde) and follow the R-specific symbolic form:
+ to add terms,
: to denote an interaction term,
* to denote factor crossing (a*b is interpreted as a+b+a:b), where a and b
are columns in colData(se),
- to remove the specified term, e.g. ~ a - 1 to specify no intercept, where
a is a column in colData(se),
+ 0 to alternatively specify a model without intercept.
The colnames of colData can be added as terms.
The colnames of the model matrix can be used to calculate contrasts, e.g. 
a - b to specify the contrast between a and b. The contrasts can be
specified in the input field in the sidebar panel.
The panels Model matrix and Contrast matrix will show the model and contrast
matrix upon correct specification of the levels and contrasts.
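The formula operators can be tried out with model.matrix from base R before entering a formula into the application. The sample annotation below is made up for this sketch and is not the colData(se) of the example data set.

```r
## toy sample annotation with two columns
cD <- data.frame(type = rep(c("Type_1", "Type_2"), each = 4),
    treatment = rep(rep(c("A", "B"), each = 2), times = 2))

## a*b expands to a + b + a:b
mm_cross <- model.matrix(~ type * treatment, data = cD)
mm_long <- model.matrix(~ type + treatment + type:treatment, data = cD)
colnames(mm_cross)

## "- 1" and "+ 0" both specify a model without intercept
mm_a <- model.matrix(~ treatment - 1, data = cD)
mm_b <- model.matrix(~ treatment + 0, data = cD)
colnames(mm_a)

## a contrast refers to the column names of the model matrix,
## here treatmentB - treatmentA
contrast <- c(treatmentA = -1, treatmentB = 1)
```

Without an intercept, every level of the term gets its own column, which makes contrasts between levels straightforward to write down.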
The panel Top DE will show the differential features in a tabular format,
while the panel Volcano plot will display the
log fold change (for limma) or difference (for proDA) against the
p-values (displayed as -log10(p-value)).
In the following, we will look at the different panels with the
se input as specified above.
The panel Sample meta-data will show the column data of the se object. The
output will help to specify the levels for the model matrix.
## DataFrame with 40 rows and 3 columns
##                    name        type   treatment
##             <character> <character> <character>
## Sample_1-1   Sample_1-1      Type_1         1_A
## Sample_1-2   Sample_1-2      Type_1         1_A
## Sample_1-3   Sample_1-3      Type_1         1_A
## Sample_1-4   Sample_1-4      Type_1         1_A
## Sample_1-5   Sample_1-5      Type_1         1_A
## ...                 ...         ...         ...
## Sample_4-6   Sample_4-6      Type_4         4_B
## Sample_4-7   Sample_4-7      Type_4         4_B
## Sample_4-8   Sample_4-8      Type_4         4_B
## Sample_4-9   Sample_4-9      Type_4         4_B
## Sample_4-10 Sample_4-10      Type_4         4_B
When entering ~ treatment + 0 into the input field
Select levels for Model Matrix, the Model matrix panel will look like
(only the first few rows are shown here):
##            treatment1_A treatment1_B treatment2_A treatment2_B treatment3_A
## Sample_1-1            1            0            0            0            0
## Sample_1-2            1            0            0            0            0
## Sample_1-3            1            0            0            0            0
## Sample_1-4            1            0            0            0            0
## Sample_1-5            1            0            0            0            0
## Sample_1-6            0            1            0            0            0
##            treatment3_B treatment4_A treatment4_B
## Sample_1-1            0            0            0
## Sample_1-2            0            0            0
## Sample_1-3            0            0            0
## Sample_1-4            0            0            0
## Sample_1-5            0            0            0
## Sample_1-6            0            0            0
As an example, we are interested in the differential expression between
1_A and 1_B, i.e. the differential expression between treatment A and
B of the type Type_1. We enter treatment1_B-treatment1_A in the input
field Select contrast(s). The contrast matrix (only the first few rows
are shown here) will then look like:
##               Contrasts
## Levels         treatment1_B-treatment1_A
##   treatment1_A                        -1
##   treatment1_B                         1
##   treatment2_A                         0
##   treatment2_B                         0
##   treatment3_A                         0
##   treatment3_B                         0
Switching to the panel Top DE, we will obtain a table with the differentially
expressed features (normalization method is set to none,
transformation method to none and imputation method to MinDet).
Here, differential expression is tested via moderated
t-tests (from the limma package):
The last panel of the tab DE displays the information from the differential
expression analysis. In the case of moderated t-tests, the plot shows the
log fold changes between the specified contrasts against the p-values.
Using the aforementioned specification for the model matrix and the
contrast matrix, the plot will look like:
All software and respective versions to build this vignette are listed here:
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] limma_3.48.0                proDA_1.6.0                
##  [3] MatrixQCvis_1.0.0           shiny_1.6.0                
##  [5] plotly_4.9.3                ggplot2_3.3.3              
##  [7] SummarizedExperiment_1.22.0 Biobase_2.52.0             
##  [9] GenomicRanges_1.44.0        GenomeInfoDb_1.28.0        
## [11] IRanges_2.26.0              S4Vectors_0.30.0           
## [13] BiocGenerics_0.38.0         MatrixGenerics_1.4.0       
## [15] matrixStats_0.58.0          knitr_1.33                 
## [17] BiocStyle_2.20.0           
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.2.1        circlize_0.4.12        Hmisc_4.5-0           
##   [4] plyr_1.8.6             gmm_1.6-6              lazyeval_0.2.2        
##   [7] shinydashboard_0.7.1   splines_4.1.0          crosstalk_1.1.1       
##  [10] digest_0.6.27          foreach_1.5.1          htmltools_0.5.1.1     
##  [13] magick_2.7.2           fansi_0.4.2            magrittr_2.0.1        
##  [16] checkmate_2.0.0        cluster_2.1.2          doParallel_1.0.16     
##  [19] openxlsx_4.2.3         ComplexHeatmap_2.8.0   imputeLCMD_2.0        
##  [22] sandwich_3.0-1         askpass_1.1            jpeg_0.1-8.1          
##  [25] colorspace_2.0-1       xfun_0.23              dplyr_1.0.6           
##  [28] hexbin_1.28.2          crayon_1.4.1           RCurl_1.98-1.3        
##  [31] jsonlite_1.7.2         impute_1.66.0          survival_3.2-11       
##  [34] zoo_1.8-9              iterators_1.0.13       glue_1.4.2            
##  [37] gtable_0.3.0           zlibbioc_1.38.0        XVector_0.32.0        
##  [40] UpSetR_1.4.0           GetoptLong_1.0.5       DelayedArray_0.18.0   
##  [43] shape_1.4.6            scales_1.1.1           vsn_3.60.0            
##  [46] mvtnorm_1.1-1          DBI_1.1.1              Rcpp_1.0.6            
##  [49] viridisLite_0.4.0      xtable_1.8-4           htmlTable_2.2.1       
##  [52] clue_0.3-59            reticulate_1.20        foreign_0.8-81        
##  [55] preprocessCore_1.54.0  Formula_1.2-4          umap_0.2.7.0          
##  [58] htmlwidgets_1.5.3      httr_1.4.2             RColorBrewer_1.1-2    
##  [61] ellipsis_0.3.2         farver_2.1.0           pkgconfig_2.0.3       
##  [64] nnet_7.3-16            sass_0.4.0             utf8_1.2.1            
##  [67] labeling_0.4.2         tidyselect_1.1.1       shinyhelper_0.3.2     
##  [70] rlang_0.4.11           later_1.2.0            munsell_0.5.0         
##  [73] tools_4.1.0            generics_0.1.0         evaluate_0.14         
##  [76] stringr_1.4.0          fastmap_1.1.0          yaml_2.2.1            
##  [79] zip_2.1.1              purrr_0.3.4            nlme_3.1-152          
##  [82] mime_0.10              compiler_4.1.0         rstudioapi_0.13       
##  [85] curl_4.3.1             png_0.1-7              affyio_1.62.0         
##  [88] statmod_1.4.36         tibble_3.1.2           bslib_0.2.5.1         
##  [91] stringi_1.6.2          highr_0.9              RSpectra_0.16-0       
##  [94] lattice_0.20-44        Matrix_1.3-3           shinyjs_2.0.0         
##  [97] vegan_2.5-7            permute_0.9-5          tmvtnorm_1.4-10       
## [100] vctrs_0.3.8            pillar_1.6.1           norm_1.0-9.5          
## [103] lifecycle_1.0.0        BiocManager_1.30.15    jquerylib_0.1.4       
## [106] GlobalOptions_0.1.2    data.table_1.14.0      bitops_1.0-7          
## [109] httpuv_1.6.1           extraDistr_1.9.1       affy_1.70.0           
## [112] R6_2.5.0               latticeExtra_0.6-29    pcaMethods_1.84.0     
## [115] bookdown_0.22          promises_1.2.0.1       gridExtra_2.3         
## [118] codetools_0.2-18       MASS_7.3-54            assertthat_0.2.1      
## [121] openssl_1.4.4          rjson_0.2.20           withr_2.4.2           
## [124] GenomeInfoDbData_1.2.6 mgcv_1.8-35            grid_4.1.0            
## [127] rpart_4.1-15           tidyr_1.1.3            rmarkdown_2.8         
## [130] Cairo_1.5-12.2         Rtsne_0.15             base64enc_0.1-3
Ahlman-Eltze, C., and S. Anders. 2019. “ProDA: Probabilistic Dropout Analysis for Identifying Differentially Abundant Proteins in Label-Free Mass Spectrometry.” bioRxiv. https://doi.org/10.1101/661496.
Huber, W., A. von Heydebreck, H. Sueltmann, A. Poustka, and M. Vingron. 2002. “Variance Stabilization Applied to Microarray Data Calibration and to the Quantification of Differential Expression.” Bioinformatics, S96–S104. https://doi.org/10.1093/bioinformatics/18.suppl_1.S96.
Kammers, K., R.N. Cole, C. Tiengwe, and I. Ruczinski. 2015. “Detecting Significant Changes in Protein Abundance.” EuPA Open Proteomics, 11–19. https://doi.org/10.1016/j.euprot.2015.02.002.
Lazar, C., L. Gatto, M. Ferro, C. Bruley, and T. Burger. 2016. “Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies.” J. Proteome Res., 1116–26. https://doi.org/10.1021/acs.jproteome.5b00981.
Maaten, L. van der, and G. Hinton. 2008. “Visualizing Data Using t-SNE.” J. Machine Learning Research, 2579–2605.
McInnes, L., J. Healy, and J. Melville. 2018. “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction.” arXiv, 1802.03426.
Ritchie, M.E., B. Phipson, D. Wu, Y. Hu, C.W. Law, W. Shi, and G.K. Smyth. 2015. “Limma Powers Differential Expression Analyses for RNA-Sequencing and Microarray Studies.” Nucleic Acids Research, e47. https://doi.org/10.1093/nar/gkv007.