Getting Started with missoNet

Yixiao Zeng, Celia M. T. Greenwood

2025-09-02

1 Introduction

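The missoNet package implements a framework for multitask learning with missing responses. It simultaneously estimates the regression coefficients linking predictors to responses and the conditional dependency network among the responses.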

1.1 The Model

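The conditional Gaussian model is: \[ \mathbf{Y} = \mathbf{1}\mu^T + \mathbf{X}\mathbf{B} + \mathbf{E}, \quad \mathbf{E} \sim \mathrm{MVN}(0, \Theta^{-1}) \] where \(\mathbf{Y}\) is the \(n \times q\) response matrix (possibly with missing entries), \(\mathbf{X}\) is the \(n \times p\) predictor matrix, \(\mu\) is the \(q\)-vector of intercepts, \(\mathbf{B}\) is the \(p \times q\) coefficient matrix, and the rows of \(\mathbf{E}\) are independent with \(q \times q\) precision (inverse covariance) matrix \(\Theta\).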

For theoretical details, see Zeng et al. (2025).
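
To make the model concrete, the following base-R sketch draws one data set from it. It does not use missoNet; the dimensions, the intercepts, the sparse coefficient matrix and the tridiagonal precision matrix are all made-up values chosen for illustration.

```r
# Base-R sketch: draw Y = 1 mu^T + X B + E with rows of E ~ MVN(0, Theta^{-1}).
# All sizes and parameter values below are made up for illustration.
set.seed(1)
n <- 100; p <- 5; q <- 3

X  <- matrix(rnorm(n * p), n, p)          # n x p predictors
mu <- c(1, 0, -1)                         # q intercepts
B  <- matrix(0, p, q)                     # p x q sparse coefficients
B[1, 1] <- 2; B[2, 3] <- -1.5

# Tridiagonal precision matrix: responses 1-2 and 2-3 are conditionally
# dependent, while responses 1 and 3 are conditionally independent
Theta <- diag(q)
Theta[cbind(1:2, 2:3)] <- 0.4
Theta[cbind(2:3, 1:2)] <- 0.4

Sigma <- solve(Theta)                              # error covariance
E <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)    # rows ~ MVN(0, Sigma)

Y <- matrix(mu, n, q, byrow = TRUE) + X %*% B + E
dim(Y)
```

A zero in the off-diagonal of Theta corresponds to a missing edge in the response network, which is why sparsity in Theta is of interest alongside sparsity in B.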

1.2 Installation

# Install from CRAN (when available)
install.packages("missoNet")

# Or install the development version from GitHub (requires the devtools package)
devtools::install_github("yixiao-zeng/missoNet")

# Load the package
library(missoNet)

2 Quick Start

2.1 Generate Example Data

The package includes a flexible data generator for testing:

# Generate synthetic data
sim <- generateData(
  n = 200,                # Sample size
  p = 50,                 # Number of predictors
  q = 10,                 # Number of responses
  rho = 0.1,              # Missing rate (10%)
  missing.type = "MCAR"   # Missing completely at random
)

# Examine the data structure
str(sim, max.level = 1)
#> List of 7
#>  $ X           : num [1:200, 1:50] -0.424 0.84 -2.546 1.825 1.217 ...
#>  $ Y           : num [1:200, 1:10] -0.0884 -0.3687 2.7607 -2.1025 3.2892 ...
#>  $ Z           : num [1:200, 1:10] -0.0884 -0.3687 2.7607 -2.1025 3.2892 ...
#>  $ Beta        : num [1:50, 1:10] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ Theta       : num [1:10, 1:10] 1 0 0 0 0 0 0 0 0 0 ...
#>  $ rho         : num [1:10] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
#>  $ missing.type: chr "MCAR"
#>  - attr(*, "class")= chr "missoNet.sim"
# Check dimensions
cat("Predictors (X):", dim(sim$X), "\n")
#> Predictors (X): 200 50
cat("Complete responses (Y):", dim(sim$Y), "\n")
#> Complete responses (Y): 200 10
cat("Observed responses (Z):", dim(sim$Z), "\n")
#> Observed responses (Z): 200 10
cat("Missing rate:", sprintf("%.1f%%", mean(is.na(sim$Z)) * 100), "\n")
#> Missing rate: 10.0%

2.2 Basic Model Fitting

# Fit missoNet with automatic parameter selection
fit <- missoNet(
  X = sim$X,
  Y = sim$Z,  # Use observed responses with missing values
  GoF = "BIC" # Goodness-of-fit criterion
)
#> 
#> =============================================================
#>                           missoNet
#> =============================================================
#> 
#> > Initializing model...
#> 
#> --- Model Configuration -------------------------------------
#>   Data dimensions:      n =   200, p =   50, q =   10
#>   Missing rate (avg):    10.0%
#>   Selection criterion:  BIC
#>   Lambda grid:          standard (dense)
#>   Lambda grid size:     50 x 50 = 2500 models
#> -------------------------------------------------------------
#> 
#> --- Optimization Progress -----------------------------------
#>   Stage 1: Initializing warm starts
#>   Stage 2: Grid search (sequential)
#> -------------------------------------------------------------
#> 
#>   |==================================================| 100%
#> 
#> -------------------------------------------------------------
#> 
#> > Refitting optimal model ...
#> 
#> 
#> --- Optimization Results ------------------------------------
#>   Optimal lambda.beta:   5.8376e-01
#>   Optimal lambda.theta:  1.2260e-01
#>   BIC value:              4305.2563
#>   Active predictors:     31 / 50 (62.0%)
#>   Network edges:         3 / 45 (6.7%)
#> -------------------------------------------------------------
#> 
#> =============================================================

# Extract optimal estimates
Beta.hat <- fit$est.min$Beta
Theta.hat <- fit$est.min$Theta
mu.hat <- fit$est.min$mu
# Model summary
cat("Selected lambda.beta:", fit$est.min$lambda.beta, "\n")
#> Selected lambda.beta: 0.5837619
cat("Selected lambda.theta:", fit$est.min$lambda.theta, "\n")
#> Selected lambda.theta: 0.1225953
cat("Active predictors:", sum(rowSums(abs(Beta.hat)) > 1e-8), "/", nrow(Beta.hat), "\n")
#> Active predictors: 31 / 50
cat("Network edges:", sum(abs(Theta.hat[upper.tri(Theta.hat)]) > 1e-8), 
    "/", ncol(Theta.hat) * (ncol(Theta.hat)-1) / 2, "\n")
#> Network edges: 3 / 45
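
Because the simulated object also stores the true parameters (the Beta and Theta elements in the structure printed in Section 2.1), the same thresholding idea can be turned into a support-recovery check. The helper below, support_metrics, is a hypothetical illustration written in base R and is not a missoNet function; it is shown on toy matrices so that it runs on its own.

```r
# Hypothetical helper (not part of missoNet): compare the nonzero pattern of
# an estimate with that of the truth, entry by entry.
support_metrics <- function(est, truth, tol = 1e-8) {
  est_nz   <- abs(est)   > tol
  truth_nz <- abs(truth) > tol
  c(TPR = sum(est_nz & truth_nz)  / max(1, sum(truth_nz)),    # true positive rate
    FPR = sum(est_nz & !truth_nz) / max(1, sum(!truth_nz)))   # false positive rate
}

# Toy example: 4 x 2 coefficient matrices (filled column by column)
truth <- matrix(c(1, 0, 0, 2,
                  0, 0, 3, 0), nrow = 4)
est   <- matrix(c(0.9, 0, 0.1, 0,
                  0,   0, 2.5, 0), nrow = 4)
support_metrics(est, truth)
```

With the objects from this section, a call such as support_metrics(Beta.hat, sim$Beta) would apply. For the network, passing only the upper triangles (for example Theta.hat[upper.tri(Theta.hat)]) avoids counting the diagonal and double-counting edges.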

2.3 Visualization

# Visualize the regularization path
plot(fit, type = "heatmap")

# Visualize the regularization path in a scatter plot
plot(fit, type = "scatter")

2.4 Making Predictions

# Split data for demonstration
train_idx <- 1:150
test_idx <- 151:200

# Refit on training data
fit_train <- missoNet(
  X = sim$X[train_idx, ],
  Y = sim$Z[train_idx, ],
  GoF = "BIC",
  verbose = 0  # Suppress output
)

# Predict on test data
Y_pred <- predict(fit_train, newx = sim$X[test_idx, ])

# Evaluate predictions (using complete data for comparison)
mse <- mean((Y_pred - sim$Y[test_idx, ])^2)
cat("Test set MSE:", round(mse, 4), "\n")
#> Test set MSE: 1.2326
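
In real data the complete response matrix is not available, so test error can only be computed on the observed entries. The sketch below uses toy values and assumes that predictions take the model form 1 mu^T + X B from Section 1.1 (an assumption about what predict() computes, not a statement about its implementation). masked_mse is a hypothetical helper.

```r
# Toy estimates standing in for fit$est.min$mu and fit$est.min$Beta
mu.hat   <- c(0.5, -1)
Beta.hat <- matrix(c(1, 0,
                     0, 2), nrow = 2, byrow = TRUE)
X.new    <- matrix(c(1, 1,
                     2, 0,
                     0, 3), nrow = 3, byrow = TRUE)

# Manual prediction: add the intercept vector to every row of X.new %*% Beta.hat
Y.pred <- sweep(X.new %*% Beta.hat, 2, mu.hat, "+")

# Observed test responses with missing entries
Z.new <- matrix(c(1.5,  NA,
                  2.0, -1.0,
                   NA,  5.5), nrow = 3, byrow = TRUE)

# Hypothetical helper: mean squared error over observed entries only
masked_mse <- function(pred, obs) mean((pred - obs)^2, na.rm = TRUE)
masked_mse(Y.pred, Z.new)
```

The same helper could be applied to the observed test responses, for example masked_mse(Y_pred, sim$Z[test_idx, ]), when the complete matrix is unavailable.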

3 Understanding Missing Data Mechanisms

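missoNet handles three types of missing data: missing completely at random (MCAR), where missingness is unrelated to the data; missing at random (MAR), where missingness depends on the observed predictors X; and missing not at random (MNAR), where missingness depends on the response values Y themselves.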

# Generate data with different missing mechanisms
n <- 300; p <- 30; q <- 8; rho <- 0.15

sim_mcar <- generateData(n, p, q, rho, missing.type = "MCAR")
sim_mar <- generateData(n, p, q, rho, missing.type = "MAR")
sim_mnar <- generateData(n, p, q, rho, missing.type = "MNAR")

# Visualize missing patterns
par(mfrow = c(1, 3), mar = c(4, 4, 3, 1))

# MCAR pattern
image(1:q, 1:n, t(is.na(sim_mcar$Z)), 
      col = c("white", "darkred"),
      xlab = "Response", ylab = "Observation",
      main = "MCAR: Random Pattern")

# MAR pattern
image(1:q, 1:n, t(is.na(sim_mar$Z)), 
      col = c("white", "darkred"),
      xlab = "Response", ylab = "Observation",
      main = "MAR: Depends on X")

# MNAR pattern
image(1:q, 1:n, t(is.na(sim_mnar$Z)), 
      col = c("white", "darkred"),
      xlab = "Response", ylab = "Observation",
      main = "MNAR: Depends on Y")
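
To make the three mechanisms concrete, the base-R sketch below builds one mask of each kind for a single response. It is only an illustration under simple logistic assumptions and is not the algorithm used by generateData(); the variable names, the slope of 2 and the calibrate helper are arbitrary choices.

```r
set.seed(42)
n <- 5000; rho <- 0.15
x <- rnorm(n)                 # an observed predictor
y <- 0.8 * x + rnorm(n)       # the response that will be masked

# MCAR: every entry is missing with the same probability rho
miss_mcar <- runif(n) < rho

# Find the intercept a so that the average of plogis(a + 2 * v) equals rho
calibrate <- function(v) {
  uniroot(function(a) mean(plogis(a + 2 * v)) - rho, c(-20, 20))$root
}

# MAR: missingness probability depends on the observed predictor x
miss_mar  <- runif(n) < plogis(calibrate(x) + 2 * x)

# MNAR: missingness probability depends on the value of y itself
miss_mnar <- runif(n) < plogis(calibrate(y) + 2 * y)

# Realized missing rates: all close to rho
round(c(MCAR = mean(miss_mcar), MAR = mean(miss_mar), MNAR = mean(miss_mnar)), 3)

# Mean of the observed y: close to the full mean under MCAR, shifted otherwise
round(c(full = mean(y), MCAR = mean(y[!miss_mcar]),
        MAR = mean(y[!miss_mar]), MNAR = mean(y[!miss_mnar])), 3)
```

All three masks have roughly the same overall rate, so the mechanisms cannot be told apart from the missing rate alone; they differ in which observations are lost.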

4 Model Selection Strategies

4.1 Comparing Selection Criteria

# Fit with different criteria
criteria <- c("AIC", "BIC", "eBIC")
results <- list()

for (crit in criteria) {
  results[[crit]] <- missoNet(
    X = sim$X,
    Y = sim$Z,
    GoF = crit,
    verbose = 0
  )
}

# Compare selected models
comparison <- data.frame(
  Criterion = criteria,
  Lambda.Beta = sapply(results, function(x) x$est.min$lambda.beta),
  Lambda.Theta = sapply(results, function(x) x$est.min$lambda.theta),
  Active.Predictors = sapply(results, function(x) 
    sum(rowSums(abs(x$est.min$Beta)) > 1e-8)),
  Network.Edges = sapply(results, function(x) 
    sum(abs(x$est.min$Theta[upper.tri(x$est.min$Theta)]) > 1e-8)),
  GoF.Value = sapply(results, function(x) x$est.min$gof)
)

print(comparison, digits = 4)
#>      Criterion Lambda.Beta Lambda.Theta Active.Predictors Network.Edges GoF.Value
#> AIC        AIC      0.3770     0.007715                42            34      4021
#> BIC        BIC      0.5838     0.122595                31             3      4305
#> eBIC      eBIC      0.5838     0.122595                31             3      4483
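
In the table, AIC selects a much denser model than BIC. This is consistent with the generic textbook definitions of the two criteria, sketched below for hypothetical log-likelihoods and parameter counts. The exact likelihood and degrees-of-freedom conventions used inside missoNet, and the form of its eBIC, are not reproduced here.

```r
# Generic information criteria: loglik = maximized log-likelihood,
# k = number of free (nonzero) parameters, n = sample size
aic <- function(loglik, k)    -2 * loglik + 2 * k
bic <- function(loglik, k, n) -2 * loglik + log(n) * k

n <- 200
log(n)   # per-parameter BIC penalty; AIC always charges 2

# Two hypothetical candidate models: a sparse one and a denser one
loglik <- c(sparse = -2100, dense = -1980)
k      <- c(sparse = 40,    dense = 110)

rbind(AIC = aic(loglik, k), BIC = bic(loglik, k, n))
```

With n = 200, log(n) is about 5.3, so each extra parameter costs more than twice as much under BIC as under AIC. In this made-up example AIC prefers the dense model and BIC the sparse one.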

4.2 Custom Lambda Grids

# Define custom regularization paths
lambda.beta <- 10^seq(0, -2, length.out = 15)
lambda.theta <- 10^seq(0, -2, length.out = 15)

# Fit with custom grid
fit_custom <- missoNet(
  X = sim$X,
  Y = sim$Z,
  lambda.beta = lambda.beta,
  lambda.theta = lambda.theta,
  verbose = 0
)
# Grid coverage summary
cat("  Beta range: [", 
    sprintf("%.4f", min(fit_custom$param_set$gof.grid.beta)), ", ",
    sprintf("%.4f", max(fit_custom$param_set$gof.grid.beta)), "]\n", sep = "")
#>   Beta range: [0.0100, 1.0000]
cat("  Theta range: [", 
    sprintf("%.4f", min(fit_custom$param_set$gof.grid.theta)), ", ",
    sprintf("%.4f", max(fit_custom$param_set$gof.grid.theta)), "]\n", sep = "")
#>   Theta range: [0.0100, 1.0000]
cat("  Total models evaluated:", length(fit_custom$param_set$gof), "\n")
#>   Total models evaluated: 225
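
The custom grid above is evenly spaced on the log10 scale, and every pair of values defines one candidate model. The base-R sketch below only reproduces that arithmetic and does not call missoNet.

```r
lambda.beta  <- 10^seq(0, -2, length.out = 15)
lambda.theta <- 10^seq(0, -2, length.out = 15)

# Consecutive values share a constant ratio (equal spacing on the log scale)
unique(round(diff(log10(lambda.beta)), 10))

# Every combination of the two penalties is one candidate model
grid <- expand.grid(lambda.beta = lambda.beta, lambda.theta = lambda.theta)
nrow(grid)
range(grid$lambda.beta)
```

Because the number of models is the product of the two grid lengths, doubling both lengths quadruples the number of fits.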

5 Working with Real Data Patterns

5.1 Handling Variable Missing Rates

# Create data with variable missing rates across responses
n <- 300; p <- 30; q <- 8
rho_vec <- seq(0.05, 0.30, length.out = q)

sim_var <- generateData(
  n = n,
  p = p,
  q = q,
  rho = rho_vec,  # Different missing rate for each response
  missing.type = "MAR"
)

# Examine missing patterns
miss_summary <- data.frame(
  Response = paste0("Y", 1:q),
  Target = rho_vec,
  Actual = colMeans(is.na(sim_var$Z))
)

print(miss_summary, digits = 3)
#>   Response Target Actual
#> 1       Y1 0.0500 0.0367
#> 2       Y2 0.0857 0.0500
#> 3       Y3 0.1214 0.1167
#> 4       Y4 0.1571 0.1633
#> 5       Y5 0.1929 0.2000
#> 6       Y6 0.2286 0.2267
#> 7       Y7 0.2643 0.2367
#> 8       Y8 0.3000 0.3033

# Fit model accounting for variable missingness
fit_var <- missoNet(
  X = sim_var$X,
  Y = sim_var$Z,
  adaptive.search = TRUE, # Fast adaptive search
  verbose = 0
)

# Visualize
plot(fit_var)
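
The Target and Actual columns in the summary above differ by a few percentage points, which is expected with 300 observations per response. The sketch below gives a rough sense of that sampling variation. It assumes independent Bernoulli masking for each column, which is an MCAR-style simplification and not how generateData() produces MAR patterns.

```r
set.seed(7)
n <- 300; q <- 8
rho_vec <- seq(0.05, 0.30, length.out = q)

# One independent Bernoulli mask per column (simplifying assumption)
mask <- sapply(rho_vec, function(r) runif(n) < r)

# Binomial standard error of a realized missing rate at sample size n
se <- sqrt(rho_vec * (1 - rho_vec) / n)

print(data.frame(
  Response = paste0("Y", 1:q),
  Target   = rho_vec,
  Actual   = colMeans(mask),
  SE       = se
), digits = 3)
```

Under this simplification the standard error ranges from roughly 0.013 to 0.026, so deviations of one or two percentage points from the target are unremarkable.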

5.2 Incorporating Prior Knowledge

# Use penalty factors to incorporate prior information
p <- ncol(sim$X)
q <- ncol(sim$Z)

# Example: We know predictors 1-10 are important
beta.pen.factor <- matrix(1, p, q)
beta.pen.factor[1:10, ] <- 0.1  # Lighter penalty for known important predictors

# Example: We expect certain response pairs to be connected
theta.pen.factor <- matrix(1, q, q)
theta.pen.factor[1, 2] <- theta.pen.factor[2, 1] <- 0.1
theta.pen.factor[3, 4] <- theta.pen.factor[4, 3] <- 0.1

# Fit with prior information
fit_prior <- missoNet(
  X = sim$X,
  Y = sim$Z,
  beta.pen.factor = beta.pen.factor,
  theta.pen.factor = theta.pen.factor
)
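
A penalty factor rescales the penalty applied to an individual entry. The sketch below illustrates the idea with the soft-thresholding operator that underlies lasso-type updates, assuming the effective penalty is lambda multiplied by the factor. It is a generic weighted-lasso illustration with made-up numbers, not missoNet's internal code.

```r
# Soft-thresholding: shrink z toward zero by t, setting small values to exactly 0
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

lambda     <- 0.5
z          <- c(0.3, 0.3, -1.2, -1.2)   # unpenalized coefficient values
pen.factor <- c(1,   0.1,  1,    0.1)   # per-coefficient penalty factors

# Entries with factor 0.1 are shrunk far less and survive the threshold
soft_threshold(z, lambda * pen.factor)

# A symmetric factor matrix keeps the penalty on Theta[i, j] and Theta[j, i] equal
q <- 4
theta.pen.factor <- matrix(1, q, q)
theta.pen.factor[1, 2] <- theta.pen.factor[2, 1] <- 0.1
isSymmetric(theta.pen.factor)
```

Checking isSymmetric(theta.pen.factor) before fitting is a cheap way to catch a factor set on only one side of the diagonal.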

6 Tips for Optimal Performance

6.1 Standardization

# Standardization is recommended (default: TRUE)
# for numerical stability and comparable penalties
fit_std <- missoNet(X = sim$X, Y = sim$Z, 
                    standardize = TRUE, 
                    standardize.response = TRUE)

# Without standardization (for pre-scaled data)
fit_no_std <- missoNet(X = scale(sim$X), Y = scale(sim$Z), 
                       standardize = FALSE,
                       standardize.response = FALSE)
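
When pre-scaling a response matrix that contains missing values, it helps to know that base R's scale() computes column means and standard deviations from the observed entries only and leaves NA entries in place. The sketch below checks this on toy data and shows how the stored constants map values back to the original scale.

```r
set.seed(3)
Z <- matrix(rnorm(40, mean = 10, sd = 4), nrow = 10)
Z[c(2, 15, 33)] <- NA                      # a few missing responses

Zs <- scale(Z)                             # NAs stay NA; stats use observed values

round(colMeans(Zs, na.rm = TRUE), 10)      # all 0
round(apply(Zs, 2, sd, na.rm = TRUE), 10)  # all 1

# The centering and scaling constants are stored as attributes, so values can
# be mapped back to the original scale
Z.back <- sweep(sweep(Zs, 2, attr(Zs, "scaled:scale"), "*"),
                2, attr(Zs, "scaled:center"), "+")
all.equal(c(Z.back), c(Z))
```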

6.2 Convergence Settings

# Adjust convergence settings based on problem difficulty and time constraints
fit_tight <- missoNet(
  X = sim$X,
  Y = sim$Z,
  beta.tol = 1e-6,      # Tighter tolerance
  theta.tol = 1e-6,
  beta.max.iter = 10000,  # More iterations allowed
  theta.max.iter = 10000
)

# For quick exploration, use looser settings
fit_quick <- missoNet(
  X = sim$X,
  Y = sim$Z,
  beta.tol = 1e-3,      # Looser tolerance
  theta.tol = 1e-3,
  beta.max.iter = 1000,   # Fewer iterations
  theta.max.iter = 1000,
  adaptive.search = TRUE # Fast adaptive search
)
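
The trade-off between tolerance and iteration count can be seen on a small generic example. The sketch below is a plain proximal-gradient (ISTA) lasso solver written for illustration only; it is not the algorithm used by missoNet, and the function name ista_lasso and its stopping rule (largest coefficient update below tol) are assumptions of this sketch.

```r
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Minimize (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1 by proximal gradient
ista_lasso <- function(X, y, lambda, tol, max.iter) {
  n <- nrow(X)
  # Step size 1 / L, where L is the largest eigenvalue of X'X / n
  step <- n / max(eigen(crossprod(X), symmetric = TRUE, only.values = TRUE)$values)
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max.iter)) {
    grad     <- -crossprod(X, y - X %*% beta) / n
    beta_new <- soft_threshold(beta - step * grad, step * lambda)
    delta    <- max(abs(beta_new - beta))
    beta     <- beta_new
    if (delta < tol) break            # converged: largest update below tol
  }
  list(beta = drop(beta), iterations = iter, converged = delta < tol)
}

set.seed(11)
X <- matrix(rnorm(100 * 10), 100, 10)
y <- X[, 1] - 2 * X[, 2] + rnorm(100)

loose <- ista_lasso(X, y, lambda = 0.1, tol = 1e-3, max.iter = 1000)
tight <- ista_lasso(X, y, lambda = 0.1, tol = 1e-6, max.iter = 10000)

c(loose = loose$iterations, tight = tight$iterations)
max(abs(loose$beta - tight$beta))     # how much the looser fit differs
```

Both runs follow the same sequence of iterates, so the tighter tolerance can only stop later. Whether the extra iterations matter depends on how much the estimates still change, which the last line reports.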

7 Summary

Key features of missoNet:
  ✓ Handles missing responses naturally through unbiased estimating equations
  ✓ Joint estimation of regression and network structures
  ✓ Flexible regularization with separate penalties for coefficients and network
  ✓ Multiple selection criteria (AIC/BIC/eBIC, cross-validation)
  ✓ Efficient algorithms with warm starts and adaptive search
  ✓ Comprehensive visualization tools
For advanced features including cross-validation, see the companion vignettes.