--- title: "Introduction to wstdiff" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to wstdiff} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(wstdiff) ``` ## Overview The `wstdiff` package implements the Welch-Satterthwaite approximation for differences of non-standardized t-distributed random variables. ## Background While the classical Welch-Satterthwaite approximation handles combinations of sample variances, this package extends the framework to t-distribution differences, which arise in: - Bayesian inference with t-distributed priors - Robust statistical modeling - Uncertainty quantification with heavy-tailed data ## Univariate Example Consider two independent t-distributed variables with different parameters: ```{r univariate} # X1 ~ t(mu=0, sigma=1, nu=10) # X2 ~ t(mu=0, sigma=1.5, nu=15) result <- ws_tdiff_univariate( mu1 = 0, sigma1 = 1, nu1 = 10, mu2 = 0, sigma2 = 1.5, nu2 = 15 ) print(result) # The difference Z = X1 - X2 is approximately: # Z ~ t(mu_diff, sigma_star^2, nu_star) ``` ## Using Distribution Functions Once we have the approximation, we can compute probabilities and quantiles: ```{r distributions} # Probability that difference is negative p_negative <- ptdiff(0, result) print(paste("P(X1 - X2 < 0) =", round(p_negative, 4))) # 95% confidence interval for the difference ci_95 <- qtdiff(c(0.025, 0.975), result) print(paste("95% CI:", round(ci_95[1], 3), "to", round(ci_95[2], 3))) # Generate random samples samples <- rtdiff(1000, result) hist(samples, breaks = 30, main = "Simulated Difference Distribution", xlab = "X1 - X2", probability = TRUE) # Overlay theoretical density x_seq <- seq(min(samples), max(samples), length.out = 100) lines(x_seq, dtdiff(x_seq, result), col = "red", lwd = 2) ``` ## Multivariate Example For multiple independent components: ```{r multivariate} result_multi <- ws_tdiff_multivariate_independent( mu1 = c(0, 1, 2), sigma1 = c(1, 1.5, 2), nu1 = c(10, 12, 15), mu2 = c(0, 0, 0), sigma2 = c(1.2, 1, 1.8), nu2 = c(15, 20, 25) ) print(result_multi) ``` ## Special Cases The package handles special cases efficiently: ```{r special} # When both distributions have identical parameters result_equal <- ws_tdiff_equal_params(mu = 0, sigma = 1, nu = 10) print(result_equal) # Verify: nu_star should be 2*(nu - 4) = 12 stopifnot(result_equal$nu_star == 12) ``` ## Approximation Quality The approximation quality depends on: 1. **Degrees of freedom**: Higher values give better approximations 2. **Parameter similarity**: More similar parameters improve accuracy 3. **Dimension**: In multivariate cases, using a single nu* may be less accurate ```{r quality} # Low degrees of freedom - use with caution result_low_df <- ws_tdiff_univariate(0, 1, 5, 0, 1, 6) print(paste("Warning: nu_star =", round(result_low_df$nu_star, 2))) # High degrees of freedom - excellent approximation result_high_df <- ws_tdiff_univariate(0, 1, 100, 0, 1.5, 150) print(paste("Good: nu_star =", round(result_high_df$nu_star, 2))) ``` ## References Yamaguchi, Y., Homma, G., Maruo, K., & Takeda, K. Welch-Satterthwaite Approximation for Difference of Non-Standardized t-Distributed Variables. (unpublished).