Quantifying causal effects with complex, multivariate outcomes remains a challenge in treatment evaluation. For hierarchical outcomes, regulatory agencies such as the U.S. Food and Drug Administration recommend methods based on the Win Ratio (Pocock et al. (2012)) and Generalized Pairwise Comparisons (Buyse (2010)). These approaches typically estimate (in this vignette using the WINS package by Cui and Huang (2025)) a population-level quantity — the probability that a randomly selected treated patient fares better than a randomly selected control patient. In Even and Josse (2025), the authors studied another quantity that can be very interesting in this individual context: the probability that a given individual would have a better outcome under treatment than under control. Unfortunately, this quantity is not identifiable, because for each individual we do not observe both the treated and control outcomes. Even and Josse (2025) thus proposed a proxy for this quantity that is identifiable, and three ways to estimate it. This package implements that methodology and the three complementary estimators: (i) a nearest-neighbor matching estimator (nn_win_ratio), (ii) a distributional regression estimator (drf_win_ratio), and (iii) a semiparametric efficient estimator (eif_win_ratio).
The latest release of the package can be installed through CRAN (soon):
The goal of this example is to illustrate, in a simple setting, how
to use the functions nn_win_ratio,
drf_win_ratio, and eif_win_ratio. The
estimators produced by these functions target the causal
(non-identifiable) quantity
\[ \tau_{\text{indiv}} := \mathbb{E}\Big[w\big(Y_i(1) \big| Y_i(0)\big)\Big], \]
where \(Y_i(0)\) and \(Y_i(1)\) denote the potential outcomes for individual \(i\) under control and treatment, respectively. The function \(w(\cdot| \cdot)\) compares these outcomes to determine whether a patient benefits more from treatment or from control. This measure is not identifiable. The authors of Buyse (2010) introduce the following population quantity estimated via nearest-neighbor matching: \[ \tau_{\text{pop}} := \mathbb{E}\Big[w\big(Y_i(1)\,\big|\, Y_j(0)\big)\Big], \] where \(i\) and \(j\) are two different and independent individuals. In Even and Josse (2025) the authors studied a proxy of the individual quantity,
\[ \tau^\star := \mathbb{E}\Big[ w\big(Y^{(X_i)}(1), Y_i(0)\big) \Big], \quad \text{ where } \tau^\star(x) := \mathbb{E}\Big[ w\big(Y^{(x)}(1), Y_i(0)\big) \;\big|\; X_i = x \Big], \] and where for \(x \in X\), \(\{Y^{(x)}(0), Y^{(x)}(1)\}\) denotes an independent copy of \(\{Y_i(0), Y_i(1)\} \mid X_i = x\).
For more details on how to use them, please refer to the function
documentation ?nn_win_ratio, ?drf_win_ratio
and eif_win_ratio.
# ---------------------------------------------------------
# Generate data with mixed covariates
# ---------------------------------------------------------
set.seed(123)
n <- 100
base <- data.frame(
X1 = rnorm(n, mean = 5, sd = 2), # numeric covariate 1
X2 = rnorm(n, mean = 10, sd = 3), # numeric covariate 2
X3 = factor(sample(c("A", "B", "C"), n, replace = TRUE)), # factor covariate
Y1 = rnorm(n, mean = 50, sd = 10), # numeric outcome 1
Y2 = rnorm(n, mean = 100, sd = 20), # numeric outcome 2
arm = sample(c(0, 1), n, replace = TRUE) # binary treatment arm
)
# ---------------------------------------------------------
# Compute win ratio with nearest neighbor pairing
# ---------------------------------------------------------
res_nn <- nn_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = c("Y1", "Y2")
)
# ---------------------------------------------------------
# Compute win ratio with Distributional Random Forests
# ---------------------------------------------------------
res_drf <- drf_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = c("Y1", "Y2")
)
# ---------------------------------------------------------
# Compute win ratio with Efficient Influence Functions
# ---------------------------------------------------------
res_eif <- eif_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = c("Y1", "Y2")
)Drawing on Example 2 from Even and Josse
(2025), the following simulation illustrates the potential
discrepancy between the historical Win Ratio and the individual-level
estimator proposed in that work. The historical Win Ratio estimated here
using the WINS package (Cui and
Huang (2025)), which is based on complete pairing, differs from
the estimates obtained via the individual-level approach and the
causalWins package, which employs nearest-neighbor (NN)
matching.
library(WINS)
# In this example, there are two subpopulations: A and B. The potential outcomes are
y_0a <- 4 # potential outcome of individuals of subpopulation A without treatment
y_1b <- 3 # potential outcome of individuals of subpopulation B under treatment
y_0b <- 2 # potential outcome of individuals of subpopulation B without treatment
y_1a <- 1 # potential outcome of individuals of subpopulation A under treatment
alpha <- 1 / 3 # proportion of women
n_rounds <- 100
n <- 300*1:5 # number of individuals 300, 600, 900, 1200, 1500
rct_prob <- .5 # probability of assignment to treatment/control group
results_nn <- data.frame(row_id = 1:n_rounds)
results_comp <- data.frame(row_id = 1:n_rounds)
for (n_individuals in n) {
current_result_nn <- c()
current_result_comp <- c()
for (round in 1:n_rounds) {
set.seed(123 + n_individuals + round)
# ---------------------------------------------------
# Generate Data
# ---------------------------------------------------
X <- rbinom(n_individuals, 1, alpha) # Sample element in A with prob `alpha`
treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob`
Y <- X * treatment * y_1a + X * (1 - treatment) * y_0a +
(1 - X) * treatment * y_1b + (1 - X) * (1 - treatment) * y_0b # Outcomes
base <- data.frame(X = X, Y = Y, arm = treatment)
# ---------------------------------------------------
# Compute Win Proportion for Nearest Neighbor Pairing
# ---------------------------------------------------
res_nn <- nn_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = "Y"
)
res_nn <- res_nn$win_proportion$value
current_result_nn <- c(current_result_nn, res_nn)
# ---------------------------------------------------
# Compute Win Proportion for Complete Pairing
# ---------------------------------------------------
data <- base
colnames(data)[colnames(data) == "Y"] <- "Y_1" # Format for use with the WINS Package
data$id <- 1:n_individuals # Format for use with the WINS Package
res_comp <- WINS::win.stat(
data = data,
ep_type = "continuous",
arm.name = c(1, 0),
priority = c(1),
summary.print = FALSE
)
res_comp <- res_comp$Win_prop[[2]]
current_result_comp <- c(current_result_comp, res_comp)
}
results_nn[[as.character(n_individuals)]] <- current_result_nn
results_comp[[as.character(n_individuals)]] <- current_result_comp
}We now present boxplots of the simulated win proportions calculated
using the causalWins and WINS packages. The
results indicate that, under substantial heterogeneity, the two methods
are not equivalent and can result in different treatment
recommendations.
results_nn$row_id <- NULL
results_comp$row_id <- NULL
# # Interleave columns for side-by-side plotting
combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE)
# Set names for x-axis
names(combined) <- rep(names(results_nn), each = 2)
# Plot
boxplot(combined,
col = rep(c("skyblue", "orange"), ncol(results_nn)),
las = 2,
main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings",
ylab = "Win proportion",
xlab = "Number of patients (n)"
)
legend("bottomleft",
legend = c("NN", "Complete"),
fill = c("skyblue", "orange")
)
abline(h = 0.5, lty = 2, col = "red")In contrast with the previous example, the following simulation demonstrates that in an RCT with a more homogeneous population, the complete pairing and nearest neighbor approaches yield broadly similar results.
library(MASS)
library(WINS)
n_rounds <- 100
n <- 300* 1:5 # number of individuals 300, 600, 900, 1200, 1500
mu <- c(0, 0) # Covariates mean
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2) # Covariates covariance
tau_cont <- 2 # ATE of continuous outcome
tau_bin <- 0.8 # ATE on logit scale of binary outcome
rct_prob <- .5 # probability of assignment to treatment/control group
results_nn <- data.frame(row_id = 1:n_rounds)
results_comp <- data.frame(row_id = 1:n_rounds)
for (n_individuals in n) {
current_result_nn <- c()
current_result_comp <- c()
for (round in 1:n_rounds) {
set.seed(123 + n_individuals + round)
# ---------------------------------------------------
# Generate Covariates and Treatment
# ---------------------------------------------------
covariates <- as.data.frame(mvrnorm(n_individuals, mu = mu, Sigma = sigma)) # covariates
colnames(covariates) <- c("X1", "X2")
treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob`
# ---------------------------------------------------
# Generate Binary Outcomes
# ---------------------------------------------------
# Potential Outcomes
lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2
lin_pred1 <- lin_pred0 + tau_bin
# Sample binary potential outcomes
prob0 <- 1 / (1 + exp(-lin_pred0))
prob1 <- 1 / (1 + exp(-lin_pred1))
Y0_bin <- rbinom(n_individuals, size = 1, prob = prob0)
Y1_bin <- rbinom(n_individuals, size = 1, prob = prob1)
# Observed binary outcome
Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment
# ---------------------------------------------------
# Generate Continuous Outcomes
# ---------------------------------------------------
# Potential Outcomes
Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n_individuals, 0, 1)
Y1_cont <- Y0_cont + tau_cont
# Observed continuous outcome
Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment
# ---------------------------------------------------
# Compute Win Proportion for Nearest Neighbor Pairing
# ---------------------------------------------------
base <- data.frame(covariates, arm = treatment, Y_1, Y_2)
res_nn <- nn_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = c("Y_1", "Y_2")
)
res_nn <- res_nn$win_proportion$value
current_result_nn <- c(current_result_nn, res_nn)
# ---------------------------------------------------
# Compute Win Proportion for Complete Pairing
# ---------------------------------------------------
data <- base
data$id <- 1:n_individuals # Format for use with the WINS Package
res_comp <- WINS::win.stat(
data = data,
ep_type = c("binary", "continuous"),
arm.name = c(1, 0),
priority = c(1, 2),
summary.print = FALSE
)
res_comp <- res_comp$Win_prop[[2]]
current_result_comp <- c(current_result_comp, res_comp)
}
results_nn[[as.character(n_individuals)]] <- current_result_nn
results_comp[[as.character(n_individuals)]] <- current_result_comp
}Unlike the previous plot, we now observe that both approaches yield similar treatment recommendations.
results_nn$row_id <- NULL
results_comp$row_id <- NULL
# # Interleave columns for side-by-side plotting
combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE)
# Set names for x-axis
names(combined) <- rep(names(results_nn), each = 2)
# Plot
boxplot(combined,
col = rep(c("skyblue", "orange"), ncol(results_nn)),
las = 2,
main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings",
ylab = "Win proportion",
xlab = "Number of patients (n)"
)
legend("bottomleft",
legend = c("NN", "Complete"),
fill = c("skyblue", "orange")
)
abline(h = 0.5, lty = 2, col = "red")This simulation illustrates the application of the Efficient Influence Function (EIF) method. The data-generating process is similar to the previous example: covariates are produced in the same way. However, the treatment assignment is no longer randomized. Instead, it depends on the covariates, meaning the setting corresponds to an observational study.
library(doParallel)
library(foreach)
n_rounds <- 100
n <- 10000
mu <- c(0, 0)
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
tau_cont <- 2
tau_bin <- 0.8
# Set up cluster
n_cores <- parallel::detectCores() - 1 # leave 1 core free
cl <- makeCluster(n_cores)
registerDoParallel(cl)
results_eif <- foreach(
round = 1:n_rounds,
.combine = c,
.packages = c("MASS", "causalWins")
) %dopar% {
set.seed(123 + n + round)
# -----------------------------------------------------------
# Generate Covariates and Treatment (observational)
# -----------------------------------------------------------
covariates <- mvrnorm(n, mu = mu, Sigma = sigma) # covariates
colnames(covariates) <- c("X1", "X2")
covariates <- as.data.frame(covariates)
p_treatment <- 1 / (1 + exp(-(0.5 * covariates$X1 - 0.3 * covariates$X2))) # non-rct
treatment <- rbinom(n, 1, p_treatment)
# -----------------------------------------------------------
# Generate Binary Outcomes
# -----------------------------------------------------------
# Potential Outcomes
lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2
lin_pred1 <- lin_pred0 + tau_bin
# Sample binary potential outcomes
prob0 <- 1 / (1 + exp(-lin_pred0))
prob1 <- 1 / (1 + exp(-lin_pred1))
Y0_bin <- rbinom(n, size = 1, prob = prob0)
Y1_bin <- rbinom(n, size = 1, prob = prob1)
# Observed binary outcome
Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment
# -----------------------------------------------------------
# Generate Continuous Outcomes
# -----------------------------------------------------------
# Potential Outcomes
Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n, 0, 1)
Y1_cont <- Y0_cont + tau_cont
# Observed continuous outcome
Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment
# -----------------------------------------------------------
# Compute Win Proportion for drf and eif
# -----------------------------------------------------------
base <- data.frame(covariates, arm = treatment, Y_1, Y_2)
res_eif <- eif_win_ratio(
base = base,
treatment_name = "arm",
outcome_names = c("Y_1", "Y_2"),
propensity_model = "glm"
)
res_eif <- res_eif$win_proportion$value
res_eif
}
stopCluster(cl)In what follows, we construct an approximation of the causal measure \(\tau^{\star}\) defined in Even and Josse (2025) and which is revisited in Example 1 for completeness.
library(MASS)
mu <- c(0, 0) # Covariates mean
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
tau_cont <- 2 # ATE of continuous outcome
tau_bin <- 0.8 # ATE on logit scale of binary outcome
# Default win function
win_function <- function(y, z) {
if (all(y == z)) {
return(0) # Tie
}
result <- (y > z)[which.max(y != z)] * 1
return(result)
}
mean_given <- function(x) {
n_samples <- 10000
# Binary potential outcomes
lin_pred0 <- -1 + 0.3 * x[1] - 0.5 * x[2]
lin_pred1 <- lin_pred0 + tau_bin
prob0 <- 1 / (1 + exp(-lin_pred0))
prob1 <- 1 / (1 + exp(-lin_pred1))
Y0_bin <- rbinom(n_samples, size = 1, prob = prob0)
Y1_bin <- rbinom(n_samples, size = 1, prob = prob1)
# Continuous Potential Outcomes
Y0_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1)
Y1_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1) + tau_cont
wins_given_x <- mapply(function(yb, yc, zb, zc) {
win_function(c(yb, yc), c(zb, zc))
}, Y1_bin, Y1_cont, Y0_bin, Y0_cont)
return(mean(wins_given_x))
}
set.seed(123)
covariates <- as.data.frame(mvrnorm(10000, mu = mu, Sigma = sigma))
tau_star <- mean(apply(covariates, 1, function(row) mean_given(row)))Let \(w\) and \(l\) denote the win and loss vectors, respectively, as computed by any of the three methods. The win ratio is defined as \(\bar{w} / \bar{l}\), where \[ \bar{w} = \frac{1}{n} \sum_{i=1}^{n} w_i \quad \text{and} \quad \bar{l} = \frac{1}{n} \sum_{i=1}^{n} l_i, \] with \(n\) denoting the number of individuals. Let \(s_w^2\), \(s_l^2\), and \(s^2_{wl}\) denote the sample variances of \(w\), \(l\) and the sample covariance of \((w,l)\), i.e., \[\begin{equation} \begin{aligned} s^2_w &= \frac{1}{n-1}\sum_i (w_i-\bar w)^2, \quad s^2_l = \frac{1}{n-1}\sum_i (l_i-\bar l)^2\quad \text{ and} \\ s^2_{wl} &= \frac{1}{n-1}\sum_i (w_i-\bar w)(l_i-\bar l) \end{aligned} \end{equation}\] To obtain the confidence interval for the win ratio we first compute \[\begin{equation} s_\text{log} = \sqrt{\frac{s^2_w}{n\bar w^2}+\frac{s^2_l}{n\bar l^2}-2\frac{s^2_{wl}}{n \bar w \bar l}} \end{equation}\] which can be seen to be an approximation of the variance of \(\log(\bar w/ \bar l)=\log(\bar w) -\log(\bar l)\) using the delta method. The confidence interval is then computed by \[\begin{equation} \Big[\exp\big(\log(\bar w/ \bar l)-1.96 s_{\text{log}})\big), \exp\big(\log(\bar w \bar l)+1.96 s_{\text{log}})\big)\Big]; \end{equation}\] which can be seen as the exponential of the confidence interval for the \(\log(\bar w/\bar l)\).