Mass spectrometry-based proteomics operates in two fundamentally different modes. Discovery proteomics (DDA) casts a wide net, surveying whatever peptides the instrument can detect. Targeted proteomics (PRM/SRM) takes aim at specific targets, monitoring a predefined panel of peptides with high sensitivity and reproducibility.
This document analyzes a PRM dataset, demonstrating how peppwR handles the unique challenges of targeted proteomics, including missing data and FDR considerations.
Even with targeted methods, missing values are ubiquitous in mass spectrometry. But not all missing data is created equal:
MCAR (Missing Completely At Random): missingness is unrelated to the underlying value, e.g. a random injection failure.
MNAR (Missing Not At Random): the probability of missingness depends on the value itself, e.g. low-abundance peptides falling below the detection limit.
peppwR tracks both the rate of missingness AND evidence for MNAR patterns, enabling more realistic power estimates.
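To make the distinction concrete, here is a toy simulation (illustrative values, unrelated to the PRM dataset below): MCAR leaves summary statistics unbiased, while MNAR inflates the observed mean because low values are preferentially lost.

# Toy sketch: simulate MCAR vs MNAR missingness (illustrative, not peppwR)
set.seed(1)
abundance <- rlnorm(1000, meanlog = 0, sdlog = 1)

# MCAR: every value has the same 20% chance of being missing
mcar <- ifelse(runif(1000) < 0.2, NA, abundance)

# MNAR: missingness probability rises as abundance falls (detection limit)
p_miss <- plogis(-2 * log(abundance))   # low values -> high missingness
mnar <- ifelse(runif(1000) < p_miss, NA, abundance)

# Observed means: MCAR stays close to the truth, MNAR is biased upward
c(true = mean(abundance),
  mcar = mean(mcar, na.rm = TRUE),
  mnar = mean(mnar, na.rm = TRUE))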
This analysis uses fungal phosphoproteomics data from a targeted PRM experiment:
| Property | Value |
|---|---|
| Organism | Magnaporthe oryzae (rice blast fungus) |
| Method | PRM (Parallel Reaction Monitoring) |
| Comparison | Early (t=0) vs late (t=6) timepoints |
| Biological replicates | 3 per condition |
| Technical replicates | 2 per biological replicate |
| Peptides | 285 phosphopeptides |
| Missing data | ~17% |
library(peppwR)
library(dplyr)
library(ggplot2)
library(tibble)
# Load the PRM experiment data
prm <- read.csv("../../sample_data/prm_data.csv")
# Examine the structure
glimpse(prm)
## Rows: 20,520
## Columns: 7
## $ molecule_list_name <chr> "MOL_0001", "MOL_0001", "MOL_0001", "MOL_000…
## $ peptide_modified_sequence <chr> "PEP_00001", "PEP_00001", "PEP_00001", "PEP_…
## $ genotype <chr> "Genotype_A", "Genotype_A", "Genotype_A", "G…
## $ timepoint <dbl> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0,…
## $ bio_rep <int> 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2,…
## $ tech_rep <int> 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1,…
## $ total_area <dbl> 1.031275e-03, 1.130267e-03, 5.243216e-04, 2.…
# Check for missing values in raw data
cat("Total observations:", nrow(prm), "\n")
## Total observations: 20520
cat("Missing values:", sum(is.na(prm$total_area)), "\n")
## Missing values: 3547
cat("Missing rate:", round(mean(is.na(prm$total_area)) * 100, 1), "%\n")
## Missing rate: 17.3 %
Technical replicates measure the same biological sample multiple times. They capture measurement variability, not biological variability.
For power analysis, biological replicates are the true unit of replication - they represent independent observations of the biological phenomenon we’re studying. We average technical replicates to get one value per biological replicate.
# Filter to early (0) vs late (6) timepoints
# Average technical replicates within each biological replicate
pilot <- prm |>
filter(timepoint %in% c(0, 6)) |>
group_by(peptide_modified_sequence, genotype, timepoint, bio_rep) |>
summarise(abundance = mean(total_area, na.rm = TRUE), .groups = "drop") |>
transmute(
peptide_id = peptide_modified_sequence,
condition = paste0("t", timepoint),
abundance = abundance
)
# Summary statistics
cat("Unique peptides:", n_distinct(pilot$peptide_id), "\n")
## Unique peptides: 285
cat("Observations per condition:\n")
## Observations per condition:
pilot |> count(condition)
## # A tibble: 2 × 2
## condition n
## <chr> <int>
## 1 t0 1710
## 2 t6 1710
# Check missingness after averaging technical replicates
# (NaN results when both tech reps are NA)
pilot <- pilot |>
mutate(abundance = ifelse(is.nan(abundance), NA, abundance))
cat("Missing after averaging tech reps:", sum(is.na(pilot$abundance)), "\n")
## Missing after averaging tech reps: 429
cat("Missing rate:", round(mean(is.na(pilot$abundance)) * 100, 1), "%\n")
## Missing rate: 12.5 %
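The NaN check above exists because of a base R quirk: removing all NAs leaves an empty vector, and the mean of an empty vector is NaN, not NA.

# Why averaging two missing tech reps yields NaN rather than NA
mean(c(NA, NA), na.rm = TRUE)
## [1] NaN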
Before diving into power analysis, we need to understand our missing data. peppwR automatically computes missingness statistics during distribution fitting.
# Fit distributions - missingness is tracked automatically
fits <- fit_distributions(pilot, "peptide_id", "condition", "abundance")
# Summary includes missingness information
print(fits)
## peppwr_fits object
## ------------------
## 570 peptides fitted
##
## Best fit distribution counts:
## Gamma: 93
## Inverse Gaussian: 36
## InvGamma: 2
## Lognormal: 5
## Normal: 49
## Pareto: 268
## Skew Normal: 117
##
## Missingness: 429/3420 values NA (12.5%)
## Peptides with missing data: 180
##
## Dataset-level MNAR (abundance vs missingness):
## Correlation: r = -0.40 (p = 1.3e-07)
## Moderate evidence of MNAR: low-abundance peptides have more missing values (p = 1.3e-07)
##
## Per-peptide MNAR (within-peptide pattern):
## 0 of 180 peptides with score > 2
## Note: Low power with N~6 observations per peptide
The plot_missingness() function provides three complementary views: the distribution of NA rates, per-peptide MNAR scores, and the relationship between mean abundance and NA rate.
plot_missingness(fits)
Missingness patterns across the PRM peptidome. Left: Distribution of NA rates. Middle: MNAR scores (z-statistic; values > 2 suggest informative missingness). Right: Relationship between mean abundance and NA rate.
Mass spectrometry data can exhibit two distinct types of MNAR, and peppwR tracks both:
| Type | Question | Detection Method | Reliability at Small N |
|---|---|---|---|
| Between-peptide (dataset-level) | Are low-abundance peptides more likely to have missing values? | Spearman correlation: log(mean_abundance) vs NA rate | Good - aggregates across many peptides |
| Within-peptide (per-peptide) | Within a single peptide, are low observations more likely missing? | Rank-based z-score comparing observed to expected ranks | Poor - requires N > 15 per peptide |
Important: The per-peptide MNAR score requires substantial observations per peptide (N > 15) for reliable detection. With typical proteomics sample sizes (N = 3-6 per group), this score has very low statistical power and often reports “0 of N peptides with MNAR” even when clear MNAR patterns exist at the dataset level.
For small sample sizes, rely on the dataset-level
correlation shown in the print output and the third panel of
plot_missingness().
The dataset-level MNAR metric correlates log(mean_abundance) with NA rate across all peptides. A negative correlation indicates that low-abundance peptides have more missing values - the hallmark of detection-limit-driven missingness.
| Correlation (r) | Interpretation |
|---|---|
| r > -0.1 | No evidence of MNAR |
| -0.3 < r < -0.1 | Weak evidence of MNAR |
| -0.5 < r < -0.3 | Moderate evidence of MNAR |
| r < -0.5 | Strong evidence of MNAR |
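To see where the numbers in the print output come from, the dataset-level check can be reproduced by hand from the pilot data. This is a sketch of the calculation described above; peppwR's internal implementation may differ in detail.

# Dataset-level MNAR check by hand: Spearman correlation between
# log mean abundance and NA rate across peptides
mnar_check <- pilot |>
  group_by(peptide_id) |>
  summarise(
    mean_abundance = mean(abundance, na.rm = TRUE),
    na_rate = mean(is.na(abundance)),
    .groups = "drop"
  )
cor.test(log(mnar_check$mean_abundance), mnar_check$na_rate,
         method = "spearman", exact = FALSE)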
The per-peptide MNAR score is a z-statistic testing whether missing observations within a peptide have systematically lower abundance than observed values. While conceptually useful, this metric requires N > 15 observations per peptide for reliable detection.
| MNAR Score | Interpretation | Note |
|---|---|---|
| < 1 | Little evidence for MNAR | |
| 1-2 | Weak evidence | |
| 2-3 | Moderate evidence | |
| > 3 | Strong evidence for MNAR | |
With small sample sizes (N < 15), expect most peptides to show “no evidence” even when MNAR exists. This is a power limitation, not evidence against MNAR.
# Identify peptides with strong MNAR evidence
mnar_peptides <- get_mnar_peptides(fits, threshold = 2)
cat("Peptides with MNAR evidence (z > 2):", nrow(mnar_peptides), "\n")
## Peptides with MNAR evidence (z > 2): 0
if (nrow(mnar_peptides) > 0) {
cat("\nTop MNAR peptides:\n")
print(head(mnar_peptides, 10))
}
# Note: With small N, the dataset-level correlation (shown in print output)
# is more reliable than per-peptide scores
Peptides with high MNAR scores require careful handling: as noted in the recommendations below, report them separately and interpret their results cautiously.
plot(fits)
Best-fit distribution counts for PRM data.
Important: As with the DDA dataset, certain distributions may dominate the “best fit” counts. This is an artifact of small sample size (only ~6 observations per peptide after averaging tech reps), not a statement about the true underlying distributions.
With more biological replicates, we would expect gamma and lognormal distributions to fit better - these are the typical distributions for mass spectrometry abundance data.
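A small base-R simulation (illustrative, with only two candidate families rather than the seven fitted above) shows how unstable AIC-based selection is at this sample size: even when the data are truly lognormal, the wrong family often wins.

# Sketch: AIC-based family selection is unstable at small N.
# Draw N = 6 values from a lognormal and ask whether a normal or lognormal
# fit wins. Both models have 2 parameters, so comparing log-likelihoods
# is equivalent to comparing AICs.
set.seed(42)
pick_best <- function(n = 6) {
  x <- rlnorm(n, meanlog = 0, sdlog = 1)
  ll_norm  <- sum(dnorm(x, mean(x), sd(x), log = TRUE))
  ll_lnorm <- sum(dlnorm(x, mean(log(x)), sd(log(x)), log = TRUE))
  if (ll_lnorm > ll_norm) "lognormal" else "normal"
}
table(replicate(2000, pick_best()))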
p <- plot_param_distribution(fits)
print(p)
Distribution of AIC values across peptides for each fitted distribution.
# Count peptides per best-fit distribution
cat("\nPeptides per best-fit distribution:\n")
##
## Peptides per best-fit distribution:
tibble(distribution = fits$best) |>
count(distribution) |>
arrange(desc(n))
## # A tibble: 7 × 2
## distribution n
## <chr> <int>
## 1 Pareto 268
## 2 Skew Normal 117
## 3 Gamma 93
## 4 Normal 49
## 5 Inverse Gaussian 36
## 6 Lognormal 5
## 7 InvGamma 2
Critical first step: With small samples (N=3), test choice dramatically affects power. Before committing to detailed analyses, we compare available tests.
| Test | Type | Characteristics |
|---|---|---|
| Wilcoxon rank-sum | Non-parametric | Conservative, robust to outliers, needs larger N |
| Bootstrap-t | Resampling | Handles non-normality through resampling |
| Bayes factor | Bayesian | Evidence-based, often more powerful at small N |
# Run all three tests (use n_sim = 100 for faster rendering)
power_wilcox <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "wilcoxon", n_sim = 100)
power_boot <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "bootstrap_t", n_sim = 100)
power_bayes <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "bayes_t", n_sim = 100)
# Create comparison table
comparison <- tibble(
Test = c("Wilcoxon rank-sum", "Bootstrap-t", "Bayes factor"),
`Median Power` = c(
median(power_wilcox$simulations$peptide_power, na.rm = TRUE),
median(power_boot$simulations$peptide_power, na.rm = TRUE),
median(power_bayes$simulations$peptide_power, na.rm = TRUE)
),
`% > 50% Power` = c(
mean(power_wilcox$simulations$peptide_power > 0.5, na.rm = TRUE) * 100,
mean(power_boot$simulations$peptide_power > 0.5, na.rm = TRUE) * 100,
mean(power_bayes$simulations$peptide_power > 0.5, na.rm = TRUE) * 100
),
`% > 80% Power` = c(
mean(power_wilcox$simulations$peptide_power > 0.8, na.rm = TRUE) * 100,
mean(power_boot$simulations$peptide_power > 0.8, na.rm = TRUE) * 100,
mean(power_bayes$simulations$peptide_power > 0.8, na.rm = TRUE) * 100
)
)
knitr::kable(comparison, digits = 2,
caption = "Power comparison across statistical tests (N=3, 2-fold effect)")
| Test | Median Power | % > 50% Power | % > 80% Power |
|---|---|---|---|
| Wilcoxon rank-sum | 0.00 | 0.00 | 0.00 |
| Bootstrap-t | 0.21 | 11.47 | 5.02 |
| Bayes factor | 0.56 | 66.67 | 16.13 |
The table above shows how different tests perform at small sample sizes:
Wilcoxon rank-sum is conservative - with only 3 observations per group, its smallest achievable two-sided p-value is 2/choose(6, 3) = 0.1, so it can never reach significance at alpha = 0.05 (verified in the check below). Non-parametric tests trade distributional assumptions for larger sample size requirements.
Bootstrap-t uses resampling to handle non-normality, potentially offering intermediate power.
Bayes factor tests quantify evidence for an effect, often performing better at small N.
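The zero power for Wilcoxon is therefore not a simulation artifact; the p-value floor at N = 3 can be verified directly:

# Even perfectly separated groups cannot reach p < 0.05 at N = 3
wilcox.test(c(10, 11, 12), c(1, 2, 3))$p.value
## [1] 0.1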
For remaining analyses: We use the Bayes factor test since it provides usable power estimates at N=3.
Power analysis can answer three related questions:
With 3 biological replicates and a 2-fold effect, what power do we achieve?
# Using Bayes factor test based on comparison results
power_current <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "bayes_t", n_sim = 100)
print(power_current)
## peppwr_power analysis
## ---------------------
## Mode: per_peptide
##
## Power: 61%
## Sample size: 3 per group
## Effect size: 2.00-fold
##
## Statistical test: bayes_t
## Decision threshold: BF > 3 (substantial evidence)
plot(power_current)
Distribution of power across peptides with N=3 and 2-fold effect (Bayes factor test).
What N would achieve 80% power to detect a 2-fold change?
sample_size <- power_analysis(fits, effect_size = 2, target_power = 0.8,
find = "sample_size", test = "bayes_t", n_sim = 100)
print(sample_size)
## peppwr_power analysis
## ---------------------
## Mode: per_peptide
##
## Recommended sample size: N=10 per group
## Target power: 80%
## Effect size: 2.00-fold
##
## Statistical test: bayes_t
## Decision threshold: BF > 3 (substantial evidence)
plot(sample_size)
Percentage of peptides achieving 80% power at each sample size.
At N=3, what’s the smallest effect we can reliably detect?
Understanding the two thresholds: In per-peptide mode, there are two distinct thresholds:

- target_power (set to 0.8): the power level each individual peptide must achieve
- proportion_threshold (default 0.5): the fraction of peptides that must reach target_power

The plot shows “% of peptides reaching 80% power” on the y-axis. The answer tells us: “At what effect size do 50% of peptides achieve 80% power?” (A short sketch of how the two thresholds combine follows below.)
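A minimal sketch of how the two thresholds combine, reusing the peptide_power values from the N=3 Bayes-factor run above:

# An effect size "passes" when at least proportion_threshold of peptides
# reach target_power; here, checked by hand on the earlier results
prop_powered <- mean(power_current$simulations$peptide_power >= 0.8, na.rm = TRUE)
prop_powered          # fraction of peptides at >= 80% power
prop_powered >= 0.5   # does it clear the default proportion_threshold?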
min_effect <- power_analysis(fits, n_per_group = 3, target_power = 0.8,
find = "effect_size", test = "bayes_t", n_sim = 100)
print(min_effect)
## peppwr_power analysis
## ---------------------
## Mode: per_peptide
##
## Minimum detectable effect: 5.00-fold
## Sample size: 3 per group
## Target power: 80%
##
## Statistical test: bayes_t
## Decision threshold: BF > 3 (substantial evidence)
plot(min_effect)
Proportion of peptides reaching 80% power at each effect size. The default threshold is 50% of peptides.
This tells us what effect sizes are realistically detectable with
current sample sizes. To require more peptides to be well-powered,
increase proportion_threshold.
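For example, assuming proportion_threshold is an argument of power_analysis() (the text above suggests it is tunable; check the package documentation), a stricter run might look like:

# Hypothetical stricter criterion: require 80% of peptides (rather than
# the default 50%) to reach the target power
min_effect_strict <- power_analysis(fits, n_per_group = 3, target_power = 0.8,
                                    find = "effect_size", test = "bayes_t",
                                    proportion_threshold = 0.8, n_sim = 100)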
How does accounting for missingness affect power estimates?
# Power without accounting for missingness (optimistic)
power_no_miss <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "bayes_t",
include_missingness = FALSE, n_sim = 100)
# Power accounting for missingness (realistic)
power_with_miss <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "bayes_t",
include_missingness = TRUE, n_sim = 100)
cat("Median power WITHOUT missingness:",
round(median(power_no_miss$simulations$peptide_power, na.rm = TRUE), 3), "\n")
## Median power WITHOUT missingness: 0.57
cat("Median power WITH missingness: ",
round(median(power_with_miss$simulations$peptide_power, na.rm = TRUE), 3), "\n")
## Median power WITH missingness: 0.57
Accounting for missingness typically reduces power estimates - this is the realistic cost of missing data on your experiment’s ability to detect effects. (Here the two medians happen to coincide; with a moderate ~12.5% NA rate and only 100 simulations per peptide, the difference can be lost in simulation noise.)
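A back-of-envelope calculation, using the NA rate observed above, shows the cost in effective sample size:

# Expected usable observations per group at a 12.5% NA rate
n_per_group <- 3
na_rate <- 0.125
n_per_group * (1 - na_rate)
## [1] 2.625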
With 285 peptides tested, multiple testing is a concern. If we use alpha = 0.05 for each test, we expect ~14 false positives by chance alone (285 x 0.05 = 14.25).
FDR (False Discovery Rate) control methods like Benjamini-Hochberg adjust p-values to control the expected proportion of false discoveries. This is more stringent than nominal testing and reduces power.
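A toy illustration with placeholder p-values (not real results) shows the effect of BH control under a global null: nominal testing yields roughly the expected ~14 chance hits, while BH discoveries stay near zero.

# Toy illustration of Benjamini-Hochberg on 285 placeholder p-values
set.seed(7)
p_null <- runif(285)                 # all-null scenario: uniform p-values
sum(p_null < 0.05)                   # ~14 nominal "discoveries" by chance
sum(p.adjust(p_null, "BH") < 0.05)   # BH-adjusted: expected ~0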
The prop_null parameter specifies the assumed proportion of true null hypotheses - peptides with no real effect. This affects the stringency of the FDR correction:
| prop_null | Meaning | Impact |
|---|---|---|
| 0.9 | 90% of peptides have no effect | More stringent correction |
| 0.5 | 50% have real effects | Less stringent correction |
In targeted proteomics, you typically select peptides expected to
change, so prop_null might be lower than in discovery
experiments.
Note: FDR-aware mode requires frequentist tests (Wilcoxon or bootstrap-t) because Benjamini-Hochberg correction operates on p-values. Bayes factors cannot be meaningfully converted to p-values for this correction.
# Standard power with Wilcoxon (nominal alpha = 0.05)
power_nominal <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "wilcoxon",
apply_fdr = FALSE, n_sim = 100)
# FDR-aware power (BH correction)
# prop_null = 0.8 means we assume 80% of peptides have no true effect
power_fdr <- power_analysis(fits, effect_size = 2, n_per_group = 3,
find = "power", test = "wilcoxon",
apply_fdr = TRUE, prop_null = 0.8,
fdr_threshold = 0.05, n_sim = 100)
# Extract power values safely
nominal_power <- power_nominal$simulations$peptide_power
fdr_power <- power_fdr$simulations$peptide_power
cat("Nominal power (Wilcoxon, no FDR correction):\n")
## Nominal power (Wilcoxon, no FDR correction):
if (is.numeric(nominal_power) && length(nominal_power) > 0) {
cat(" Median power:", round(median(nominal_power, na.rm = TRUE), 3), "\n")
cat(" % peptides > 80% power:", round(mean(nominal_power > 0.8, na.rm = TRUE) * 100, 1), "%\n")
} else {
print(power_nominal)
}
## Median power: 0
## % peptides > 80% power: 0 %
cat("\nFDR-aware power (Wilcoxon, BH correction, 80% true nulls):\n")
##
## FDR-aware power (Wilcoxon, BH correction, 80% true nulls):
if (is.numeric(fdr_power) && length(fdr_power) > 0) {
cat(" Median power:", round(median(fdr_power, na.rm = TRUE), 3), "\n")
cat(" % peptides > 80% power:", round(mean(fdr_power > 0.8, na.rm = TRUE) * 100, 1), "%\n")
} else {
print(power_fdr)
}
## peppwr_power analysis
## ---------------------
## Mode: per_peptide
##
## Power: 0%
## Sample size: 3 per group
## Effect size: 2.00-fold
##
## Statistical test: wilcoxon
## Significance level: 0.05
##
## FDR-adjusted analysis (Benjamini-Hochberg)
## Proportion true nulls: 80%
## FDR threshold: 5%
FDR correction reduces power because the Benjamini-Hochberg procedure effectively tightens the per-test significance threshold as the number of tests grows. When many peptides carry true effects (i.e., prop_null is low), the correction is less severe. For a targeted panel of 285 peptides, the FDR impact is more modest than for discovery proteomics with thousands of tests.
Missingness patterns: This PRM dataset has ~17% missing values. Some peptides may show evidence of MNAR (informative missingness), where low-abundance values are preferentially missing.
Distribution fitting: With only 6 observations per peptide (after averaging tech reps), distribution selection is unreliable. This is a small sample size artifact, not a reflection of true underlying distributions.
Test selection: Different statistical tests show varying power at small N. Bayes factor tests can provide more informative estimates than conservative non-parametric tests when samples are limited.
Missingness impact: Accounting for missing data reduces power estimates - this reflects the real cost of missingness.
FDR impact: With 285 peptides, FDR correction modestly reduces power compared to nominal testing.
For future experiments: Consider N=6 or more biological replicates for reliable detection of 2-fold changes.
For current data: Set realistic expectations about detectable effect sizes. Subtle changes may not be detectable with N=3.
Test selection: With small samples, Bayes factor tests may be more informative than traditional frequentist approaches.
MNAR peptides: Report peptides with high MNAR scores separately and interpret their results cautiously.
This PRM dataset (285 peptides) has advantages over discovery proteomics: far fewer hypotheses are tested, so the multiple-testing burden is much lighter. However, power at N=3 remains limited, missingness still erodes the effective sample size, and the prop_null parameter requires assumptions about true effect rates.
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.7.3
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] en_GB/UTF-8/en_GB/C/en_GB/en_GB
##
## time zone: Europe/London
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tibble_3.3.1 ggplot2_4.0.2 dplyr_1.2.0 peppwR_0.1.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.6 generics_0.1.4
## [4] tidyr_1.3.2 fGarch_4052.93 lattice_0.22-7
## [7] digest_0.6.37 magrittr_2.0.4 evaluate_1.0.5
## [10] grid_4.5.1 RColorBrewer_1.1-3 fastmap_1.2.0
## [13] jsonlite_2.0.0 Matrix_1.7-3 mgcv_1.9-3
## [16] purrr_1.2.1 scales_1.4.0 gbutils_0.5.1
## [19] jquerylib_0.1.4 Rdpack_2.6.5 cli_3.6.5
## [22] timeSeries_4052.112 rlang_1.1.7 rbibutils_2.4.1
## [25] splines_4.5.1 cowplot_1.2.0 intervals_0.15.5
## [28] withr_3.0.2 cachem_1.1.0 yaml_2.3.10
## [31] otel_0.2.0 cvar_0.6 tools_4.5.1
## [34] assertthat_0.2.1 vctrs_0.7.1 R6_2.6.1
## [37] lifecycle_1.0.5 univariateML_1.5.0 pkgconfig_2.0.3
## [40] pillar_1.11.1 bslib_0.8.0 gtable_0.3.6
## [43] glue_1.8.0 xfun_0.55 tidyselect_1.2.1
## [46] knitr_1.51 farver_2.1.2 spatial_7.3-18
## [49] nlme_3.1-168 htmltools_0.5.8.1 fBasics_4052.98
## [52] labeling_0.4.3 rmarkdown_2.30 timeDate_4052.112
## [55] compiler_4.5.1 S7_0.2.1