Introduction to the matrixNormal Package

Installation

You can install the released version of matrixNormal from CRAN [1] with install.packages("matrixNormal") and load the package with:

> library(matrixNormal)

Attaching package: 'matrixNormal'
The following object is masked from 'package:base':

    I

Distribution functions

The matrix normal distribution is a generalization of the multivariate normal distribution to matrix-valued random variables. The parameters consist of an n x p mean matrix M and two positive-definite covariance matrices, U (n x n) for the rows and V (p x p) for the columns. Suppose that an n x p random matrix \(A \sim MatNorm_{n,p}(M, U, V)\). Then the mean of A is M, and the covariance matrix of vec(A) is the Kronecker product of V and U, \(V \otimes U\).
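
The covariance structure can be checked numerically. The following is a minimal base R sketch, not the package's own sampler: it assumes small illustrative values for M, U, and V, builds each draw as M + L_U Z R_V using Cholesky factors, and compares the empirical covariance of vec(A) with the Kronecker product of V and U. The names draw_one, L_U, and R_V are introduced here for illustration only.

```r
set.seed(1)
n <- 3; p <- 2
M <- matrix(0, n, p)
U <- matrix(c(2.0, 0.5, 0.0,
              0.5, 1.0, 0.3,
              0.0, 0.3, 1.5), n, n)     # assumed row covariance (n x n)
V <- matrix(c(1.0, 0.4,
              0.4, 2.0), p, p)          # assumed column covariance (p x p)

# A = M + L_U Z R_V, where L_U L_U' = U, R_V' R_V = V, and Z has iid N(0, 1) entries
L_U <- t(chol(U))
R_V <- chol(V)
draw_one <- function() M + L_U %*% matrix(rnorm(n * p), n, p) %*% R_V

# Each column of `draws` is vec(A) for one simulated matrix
draws <- replicate(20000, as.vector(draw_one()))
emp_cov <- cov(t(draws))

# Largest absolute gap between the empirical covariance and V %x% U
max(abs(emp_cov - kronecker(V, U)))
```

The gap should shrink toward zero as the number of draws grows, which is what the Kronecker structure predicts.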

The matrix normal distribution is a conjugate prior for the coefficients used in multivariate regression. Suppose there are p predictors for k dependent variables. In univariate regression, only one dependent variable exists, so the conjugate prior for the p regression coefficients is the multivariate normal distribution. In multivariate regression, the coefficients form a matrix, and the corresponding conjugate prior is the matrix normal distribution. This package contains the PDF, CDF, and random number generation for the matrix normal distribution (*matnorm). See the matrixNormal_Distribution file for more information.

For instance, the USArrests dataset in the datasets package contains arrests per 100,000 residents for assault, murder, and rape in each of the 50 US states in 1973, along with the percentage of the population living in urban areas. Suppose that a researcher wants to determine whether the outcomes of urban population and rape rate are associated with the assault and murder rates. The researcher decides to use a Bayesian multivariate linear regression and assumes that the states are independent. However, the covariance between the two outcomes, urban population and rape, is nonzero and needs to be taken into account. In fact, their correlation is 0.411, as shown below.

> library(datasets)
> data(USArrests)
> X <- cbind(USArrests$Assault, USArrests$Murder)
> Y <- cbind(USArrests$UrbanPop, USArrests$Rape)
> cor(Y)
          [,1]      [,2]
[1,] 1.0000000 0.4113412
[2,] 0.4113412 1.0000000

We can assume that the outcome Y follows a matrix normal distribution with mean matrix M, which is the matrix product of X and the coefficient matrix \(\Psi\). The states are assumed to be independent, so the covariance across states is the identity matrix, and the 2 x 2 covariance across the outcomes is \(\Sigma\). Or succinctly, with \(k=2\) outcomes, \[ Y \sim MatNorm_{n \times 2}( M = X\Psi, U = I(n), V = \Sigma_2) \]
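
To make the mean structure concrete, the sketch below estimates the coefficient matrix by least squares in base R and forms the fitted mean matrix. It uses the same X and Y as above, omits an intercept for brevity, and is only one simple way to obtain a plug-in value; the names Psi_hat, M_hat, and Sigma_hat are introduced here for illustration.

```r
library(datasets)
X <- cbind(USArrests$Assault, USArrests$Murder)   # predictors, as in the text
Y <- cbind(USArrests$UrbanPop, USArrests$Rape)    # outcomes, as in the text

# Least-squares estimate of the coefficient matrix (no intercept, for brevity)
Psi_hat <- solve(crossprod(X), crossprod(X, Y))   # 2 x 2
M_hat <- X %*% Psi_hat                            # fitted mean matrix, 50 x 2

# Residual covariance across the two outcomes, a rough stand-in for Sigma
Sigma_hat <- crossprod(Y - M_hat) / nrow(Y)

dim(M_hat)
Sigma_hat
```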

Because the states are independent, the overall covariance matrix of Y, stacked state by state, is block diagonal with one copy of \(\Sigma\) for each state. For instance, suppose that Y has the following distribution. If we know the parameters, we can calculate its density using dmatnorm().

> # Y is n = 50 x p = 2 and follows a matrix normal distribution with mean matrix M, the
> # product of X and the coefficient matrix. An illustrative M is used here.
> M <- (100 * toeplitz(50:1))[, 1:2]
> dim(M)
[1] 50  2
> head(M)
     [,1] [,2]
[1,] 5000 4900
[2,] 4900 5000
[3,] 4800 4900
[4,] 4700 4800
[5,] 4600 4700
[6,] 4500 4600
> U <- I(50)  # Covariance across states: Assumed to be independent
> U[1:5, 1:5]
  1 2 3 4 5
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
5 0 0 0 0 1
> V <- cov(X)  # Covariance across predictors
> V
          [,1]      [,2]
[1,] 6945.1657 291.06237
[2,]  291.0624  18.97047
> # Evaluate the density of Y under these parameters. The negative value indicates a log density.
> matrixNormal::dmatnorm(Y, M, U, V)
[1] -30446783
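
For readers who want to see what such a density evaluation involves, here is a minimal base R sketch of the matrix normal log density. It is not the package's implementation; log_dmatnorm_sketch is a hypothetical helper, and the small A, M, U, and V are assumed values. The result is cross-checked against the multivariate normal log density of vec(A) with covariance equal to the Kronecker product of V and U.

```r
# Hypothetical helper: log density of an n x p matrix A under MatNorm(M, U, V)
log_dmatnorm_sketch <- function(A, M, U, V) {
  n <- nrow(A); p <- ncol(A)
  R <- A - M
  # Quadratic form tr(V^{-1} R' U^{-1} R)
  quad <- sum(diag(solve(V, t(R)) %*% solve(U, R)))
  ldU <- as.numeric(determinant(U, logarithm = TRUE)$modulus)
  ldV <- as.numeric(determinant(V, logarithm = TRUE)$modulus)
  -0.5 * (n * p * log(2 * pi) + p * ldU + n * ldV + quad)
}

# Small assumed example
set.seed(7)
n <- 3; p <- 2
A <- matrix(rnorm(n * p), n, p)
M <- matrix(0, n, p)
U <- diag(n) + 0.2
V <- matrix(c(1, 0.4, 0.4, 2), p, p)

# Same quantity via the multivariate normal density of vec(A)
Sigma <- kronecker(V, U)
r <- as.vector(A - M)
log_mvn <- -0.5 * (n * p * log(2 * pi) +
                   as.numeric(determinant(Sigma, logarithm = TRUE)$modulus) +
                   sum(r * solve(Sigma, r)))

all.equal(log_dmatnorm_sketch(A, M, U, V), log_mvn)
```

Working with U and V separately avoids forming the full np x np covariance, which matters once n is as large as the 50 states used here.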

The coefficient matrix, \(\Psi\), for urban population and rape has dimensions 2 x 3 for the k = 2 outcomes and the p = 3 predictors. A semi-conjugate prior can be constructed on \(\Psi\) as a matrix normal distribution with mean \(\Psi_0\) equal to a matrix of ones, covariance across the predictors \(X'X\), and no covariance across states (due to independence). The conjugate prior for the covariance between the outcomes, \(\Sigma\), is the inverse-Wishart distribution with mean covariance \(\Sigma_0\) and degrees of freedom \(\nu\). After defining the parameters, one random matrix is generated from this prior.

> # Generate a random matrix from this prior.  The prior mean of regression matrix
> J(2, 3)
  1 2 3
1 1 1 1
2 1 1 1
> # The prior variance between rape and population
> t(X) %*% X
        [,1]    [,2]
[1,] 1798262 80756.0
[2,]   80756  3962.2
> # The prior variance between regression parameters
> I(3)
  1 2 3
1 1 0 0
2 0 1 0
3 0 0 1
> # Random draw for prior would have these values
> A <- matrixNormal::rmatnorm(M = J(2, 3), U = t(X) %*% X, V = I(3))
> A
              1        2           3
[1,] 1400.73961 97.90007 -777.463079
[2,]   96.93588 25.53955    4.699484
> # Predicted Counts for y can be given as:
> ceiling(rowSums(X %*% A))
 [1] 171877 190942 213057 138143 200190 148125  79750 172391 243553 154382
[11]  33849  86872 180896  82409  40666  83699  79842 181532  60125 217791
[21] 108015 185439  52269 188833 129515  79372  74107 183289  41375 115609
[31] 206986 184591 244690  32555  87470 109738 115291  77246 125918 203040
[41]  62505 137260 146572  86949  34897 113585 105080  59141  38554 116975

However, the predicted counts do not have much meaning because the prior is not informed by the data. We should use the posterior distribution to predict Y. Iranmanesh et al. (2010) [2] show the posterior distributions to be \[ \Psi | \Sigma, Y, X \sim MatNorm_{k \times p}\left( \frac{\hat{\Psi} +\Psi_0}{2}, (2X^TX)^{-1}, \Sigma\right)\] and \[ \Sigma | Y, X \sim W^{-1}_p(S^*, 2\nu+n)\] where:
* \(\hat{\Psi}\) is the maximum likelihood estimate (MLE) for the coefficient matrix \(\Psi\).
* S is the MLE for the covariance matrix \(\Sigma\): \((Y-X\hat{\Psi})^T(Y-X\hat{\Psi})\).
* \(S^{*}\) is the data-adjusted matrix of the inverse Wishart, \[ S^{*} = 2\Sigma_0 + S + (\hat{\Psi}-\Psi_0)^T(2X^TX)^{-1}(\hat{\Psi}-\Psi_0) \]
* c is the dimension of \(\Sigma\).
* n is the sample size, the number of rows in Y and X.
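
The quantities in this list can be computed directly in base R. The sketch below transcribes the expressions above as written, under several assumptions that are not in the original: \(\Psi\) is taken as p x k so that the products conform, no intercept is used, and Sigma_0 and nu are arbitrary illustrative values. It is meant to show the arithmetic, not to serve as a validated posterior sampler.

```r
library(datasets)
X <- cbind(USArrests$Assault, USArrests$Murder)   # n x p, p = 2 (no intercept)
Y <- cbind(USArrests$UrbanPop, USArrests$Rape)    # n x k, k = 2
n <- nrow(Y); p <- ncol(X); k <- ncol(Y)

Psi_0   <- matrix(1, p, k)   # prior mean: matrix of ones (assumed p x k here)
Sigma_0 <- diag(k)           # assumed prior covariance, for illustration only
nu      <- k + 2             # assumed prior degrees of freedom

XtX     <- crossprod(X)
Psi_hat <- solve(XtX, crossprod(X, Y))   # estimate of Psi
S       <- crossprod(Y - X %*% Psi_hat)  # S as defined in the list above

# Posterior quantities, transcribed from the expressions above
post_mean <- (Psi_hat + Psi_0) / 2
D         <- Psi_hat - Psi_0
S_star    <- 2 * Sigma_0 + S + t(D) %*% solve(2 * XtX) %*% D
post_df   <- 2 * nu + n

post_mean
S_star
post_df
```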

Additional details can be found in Pocuca et al. (2019) [3]. In short, the matrixNormal package can be used in Bayesian multivariate linear regression.

Matrix functions

In the matrixNormal package, useful but simple matrix operations have been coded:

Filename	Description
is.symmetric.matrix	Tests whether a matrix is square, symmetric, positive definite, or positive semi-definite, within a tolerance.
Special_matrices	Creates the identity matrix I and matrix of 1’s J.
tr	Calculates the usual trace of a matrix.
vec	Stacks the columns of a matrix using the vec() operator, with an option to keep names.
vech	Stacks only the lower-triangular elements of a numeric symmetric matrix A (half-vectorization, vech()).

For example, these functions can be applied to the following matrices.

> # Make a 3 x 3 Identity matrix
> I(3)
  1 2 3
1 1 0 0
2 0 1 0
3 0 0 1
> # Make a 3 x 4 J matrix
> J(3, 4)
  1 2 3 4
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
> # Make a 3 x 3 J matrix
> J(3, 3)
  1 2 3
1 1 1 1
2 1 1 1
3 1 1 1
> # Calculate the trace of a J matrix
> tr(J(3, 3))  # Should be 3
[1] 3
> # Stack a matrix (used in distribution functions)
> A <- matrix(c(1:4), nrow = 2, dimnames = list(NULL, c("A", "B")))
> A
     A B
[1,] 1 3
[2,] 2 4
> vec(A)
1:A 2:A 1:B 2:B 
  1   2   3   4 
> # Test if matrix is symmetric (used in distribution function)
> is.symmetric.matrix(A)
[1] "A is not symmetric. Top of the matrix: "
     A B
[1,] 1 3
[2,] 2 4
[1] FALSE
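
For comparison, the same operations can be written in a few lines of base R. The helpers below (tr_sketch, vec_sketch, vech_sketch) are hypothetical stand-ins introduced for illustration, not the package's implementations; unlike the package versions, they do not keep names or print diagnostic messages.

```r
tr_sketch   <- function(A) sum(diag(A))                   # trace
vec_sketch  <- function(A) as.vector(A)                   # stack columns
vech_sketch <- function(A) A[lower.tri(A, diag = TRUE)]   # lower triangle, column by column

J33 <- matrix(1, 3, 3)
tr_sketch(J33)        # 3, matching tr(J(3, 3)) above

A <- matrix(1:4, nrow = 2)
vec_sketch(A)         # 1 2 3 4, matching vec(A) above (names dropped)
isSymmetric(A)        # FALSE, since A[1, 2] != A[2, 1]

S <- matrix(c(1, 2, 2, 3), nrow = 2)
isSymmetric(S)        # TRUE
vech_sketch(S)        # 1 2 3
```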

Conclusion

Although other packages on CRAN have some matrix normal functionality, this package provides a general approach to randomly sampling a matrix normal random variate. The MBSP::matrix_normal, matrixsampling::rmatrixnormal, and LaplacesDemon::rmatrixnorm functions also randomly sample from the matrix normal distribution [4–6]. The function in MBSP uses the Cholesky decomposition of the individual matrices U and V [4]. Similarly, the function in matrixsampling uses the spectral decomposition of the individual matrices U and V [5]. In comparison, the new function rmatnorm() in the matrixNormal package is flexible in how it decomposes the covariance matrix, which is the Kronecker product of U and V. While the rmatrixnormal() function from the matrixsampling package can simulate many samples in one call, rmatnorm() generates only one variate per call; many samples can be generated by placing rmatnorm() inside a for-loop. The LaplacesDemon package also has a density function, dmatrixnorm(), which calculates the log determinant of the Cholesky decomposition of a positive definite matrix [6]. However, a random matrix A that follows a matrix normal distribution does not need to be positive definite [2]. There is no such restriction on the random matrix in the matrixNormal package.
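
The two ideas in this paragraph, a choice of decomposition for the Kronecker covariance and a loop to collect many draws, can be sketched in base R as follows. This is an illustration under stated assumptions, not the code of rmatnorm(): draw_matnorm_sketch is a hypothetical helper, the covariance of vec(A) is taken as the Kronecker product of V and U, and M, U, and V are small assumed values. The same loop pattern would apply to a function that returns one variate per call.

```r
set.seed(42)
n <- 2; p <- 3
M <- matrix(1, n, p)
U <- matrix(c(2, 0.5, 0.5, 1), n, n)   # assumed row covariance
V <- diag(p)                           # assumed column covariance

# Hypothetical helper: one draw, factoring Sigma = V %x% U by either method
draw_matnorm_sketch <- function(M, U, V, method = c("chol", "eigen")) {
  method <- match.arg(method)
  Sigma <- kronecker(V, U)             # covariance of vec(A)
  fac <- if (method == "chol") {
    t(chol(Sigma))                     # lower-triangular Cholesky factor
  } else {
    e <- eigen(Sigma, symmetric = TRUE)
    e$vectors %*% diag(sqrt(pmax(e$values, 0)))   # spectral factor
  }
  z <- rnorm(length(M))
  matrix(as.vector(M) + fac %*% z, nrow(M), ncol(M))
}

# One variate per call, so many samples come from a loop
N <- 1000
draws <- array(NA_real_, dim = c(n, p, N))
for (i in seq_len(N)) {
  draws[, , i] <- draw_matnorm_sketch(M, U, V, method = "eigen")
}
apply(draws, c(1, 2), mean)   # should be close to M
```

Either factor reproduces the same covariance; the spectral route also tolerates a covariance that is only positive semi-definite, whereas the Cholesky route requires positive definiteness.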

In conclusion, the matrixNormal package collects the main functions for the matrix normal distribution in one place: calculating the PDF and CDF and simulating a random variate. The package gives users flexibility in how the random variate is generated. Its main application is Bayesian multivariate regression.

Computational Details

This vignette was successfully processed using the following session.

 -- Session info ---------------------------------------------------
 setting  value
 version  R version 4.5.2 (2025-10-31)
 os       macOS Tahoe 26.2
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  C
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2026-02-22
 pandoc   3.6.3 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
 quarto   1.8.25 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/quarto
--  Packages -------------------------------------------------------
 package        * version date (UTC) lib source
 LaplacesDemon    16.1.8  2026-02-17 [3] CRAN (R 4.5.2)
 matrixcalc       1.0-6   2022-09-14 [3] CRAN (R 4.5.0)
 matrixsampling   2.0.0   2019-08-24 [3] CRAN (R 4.5.0)
 MBSP             5.0     2025-07-19 [3] CRAN (R 4.5.0)

 [1] /private/var/folders/xt/b5cmsrmn26j8v9pvsz8_z8v00000gn/T/RtmpmkuaCp/Rinst129b4266ee61
 [2] /private/var/folders/xt/b5cmsrmn26j8v9pvsz8_z8v00000gn/T/RtmpvSFclD/temp_libpath15bf47cf447cf
 [3] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library

References

1. R Core Team. R: A Language and Environment for Statistical Computing; 2018.

2. Iranmanesh, A.; Arashi, M.; Tabatabaey, S.M.M. On Conditional Applications of Matrix Variate Normal Distribution. IJMSI 2010, 5, 33–43.

3. Pocuca, N.; Gallaugher, M.P.B.; Clark, K.M.; McNicholas, P.D. Assessing and visualizing matrix variate normality. arXiv: Methodology 2019.

4. Bai, R.; Ghosh, M. MBSP: Multivariate Bayesian Model with Shrinkage Priors; 2018.

5. Laurent, S. Matrixsampling: Simulations of Matrix Variate Distributions; 2018.

6. Statisticat, LLC. LaplacesDemon: Complete Environment for Bayesian Inference; Bayesian-Inference.com, 2018.