Package {morpheus}


Encoding: UTF-8
Title: Estimate Parameters of Mixtures of Logistic Regressions
Description: Mixture of logistic regressions parameters (H)estimation with (U)spectral methods. The main methods take d-dimensional inputs and a vector of binary outputs, and return parameters according to the GLMs mixture model (General Linear Model). For more details see chapter 3 in the PhD thesis of Mor-Absa Loum: https://theses.fr/s156435, available here https://theses.hal.science/tel-01877796/document.
Version: 1.0-5
Depends: R (≥ 3.5.0),
Imports: MASS, jointDiag, methods, pracma
Suggests: devtools, flexmix, parallel, testthat (≥ 3.0.0), roxygen2
License: MIT + file LICENSE
Date: 2026-04-22
RoxygenNote: 7.3.3
URL: https://github.com/yagu0/morpheus
Collate: 'utils.R' 'A_NAMESPACE.R' 'computeMu.R' 'multiRun.R' 'optimParams.R' 'plot.R' 'sampleIO.R'
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2026-05-11 00:19:11 UTC; ba
Author: Benjamin Auder [aut, cre], Mor-Absa Loum [aut]
Maintainer: Benjamin Auder <benjamin.auder@universite-paris-saclay.fr>
Repository: CRAN
Date/Publication: 2026-05-21 09:30:02 UTC

Estimate Parameters of Mixtures of Logistic Regressions

Description

Mixture of logistic regressions parameters (H)estimation with (U)spectral methods. The main methods take d-dimensional inputs and a vector of binary outputs, and return parameters according to the GLMs mixture model (General Linear Model). For more details see chapter 3 in the PhD thesis of Mor-Absa Loum: <https://theses.fr/s156435>, available here <https://theses.hal.science/tel-01877796/document>.

Details

The package devtools should be useful in development stage, since we rely on testthat for unit tests, and roxygen2 for documentation. knitr is used to generate the package vignette. jointDiag allows to solve a joint diagonalization problem, providing a more robust solution compared to a single diagonalization. Concerning the other suggested packages:

The two main functions are located in R/computeMu.R and R/optimParams.R:

See also multiRun(), which is a flexible method to run Monte-Carlo or bootstrap estimations using different models in various contexts.

Author(s)

Benjamin Auder [aut, cre], Mor-Absa Loum [aut]

Maintainer: Benjamin Auder <benjamin.auder@universite-paris-saclay.fr>


alignMatrices

Description

Align a set of parameters matrices, with potential permutations.

Usage

alignMatrices(Ms, ref, ls_mode = c("exact", "approx1", "approx2"))

Arguments

Ms

A list of matrices, all of same size DxK

ref

A reference matrix to align other matrices with

ls_mode

How to compute the labels assignment: "exact" for exact algorithm (default, but might be time-consuming, complexity is O(K^3) ), or "approx1", or "approx2" to apply a greedy matching algorithm (heuristic) which for each column in reference (resp. in current row) compare to all unassigned columns in current row (resp. in reference)

Value

The aligned list (of matrices), of same size as Ms

Examples

m1 <- matrix(c(1,1,0,0),ncol=2)
m2 <- matrix(c(0,0,1,1),ncol=2)
ref <- m1
Ms <- list(m1, m2, m1, m2)
a <- alignMatrices(Ms, ref, "exact")
# a[[i]] is expected to contain m1 for all i


computeMoments

Description

Compute cross-moments of order 1,2,3 from X,Y

Usage

computeMoments(X, Y)

Arguments

X

Matrix of input data (size nxd)

Y

Vector of binary outputs (size n)

Value

A list L where L[[i]] is the i-th cross-moment

Examples

X <- matrix(rnorm(100), ncol=2)
Y <- rbinom(100, 1, .5)
M <- computeMoments(X, Y)


Compute mu

Description

Estimate the normalized columns mu of the beta matrix parameter in a mixture of logistic regressions models, with a spectral method described in the package vignette.

Usage

computeMu(X, Y, optargs = list())

Arguments

X

Matrix of input data (size nxd)

Y

Vector of binary outputs (size n)

optargs

List of optional argument:

  • 'jd_method', joint diagonalization method from the package jointDiag: 'uwedge' (default) or 'jedi'.

  • 'jd_nvects', number of random vectors for joint-diagonalization (or 0 for p=d, canonical basis by default)

  • 'M', moments of order 1,2,3: will be computed if not provided.

  • 'K', number of populations (estimated with rank of M2 if not given)

Value

The estimated normalized parameters as columns of a matrix mu of size dxK

See Also

multiRun to estimate statistics based on mu, and generateSampleIO for I/O random generation.

Examples

io <- generateSampleIO(10000, 1/2, matrix(c(1,0,0,1),ncol=2), c(0,0), "probit")
mu <- computeMu(io$X, io$Y, list(K=2)) #or just X and Y for estimated K


Generate sample inputs-outputs

Description

Generate input matrix X of size nxd and binary output of size n, where Y is subdivided into K groups of proportions p. Inside one group, the probability law P(Y=1) is described by the corresponding column parameter in the matrix beta + intercept b.

Usage

generateSampleIO(n, p, beta, b, link)

Arguments

n

Number of individuals

p

Vector of K(-1) populations relative proportions (sum (<)= 1)

beta

Vectors of model parameters for each population, of size dxK

b

Vector of intercept values (use rep(0,K) for no intercept)

link

Link type; "logit" or "probit"

Value

A list with

Examples

# K = 3 so we give first two components of p: 0.3 and 0.3 (p[3] = 0.4)
io <- generateSampleIO(1000, c(.3,.3),
  matrix(c(1,3,-1,1,2,1),ncol=3), c(.5,-1,0), "logit")
io$index[1] #number of the group of X[1,] and Y[1] (in 1...K)


multiRun

Description

Estimate N times some parameters, outputs of some list of functions. This method is thus very generic, allowing typically bootstrap or Monte-Carlo estimations of matrices mu or beta. Passing a list of functions opens the possibility to compare them on a fair basis (exact same inputs). It's even possible to compare methods on some deterministic design of experiments.

Usage

multiRun(
  fargs,
  estimParams,
  packages = c("morpheus"),
  prepareArgs = function(x, i) x,
  N = 10,
  ncores = 3,
  agg = lapply,
  verbose = FALSE
)

Arguments

fargs

List of arguments for the estimation functions

estimParams

List of nf function(s) to apply on fargs

packages

Vector of packages to load on each node (default: morpheus)

prepareArgs

Prepare arguments for the functions inside estimParams

N

Number of runs

ncores

Number of cores for parallel runs (<=1: sequential)

agg

Aggregation method (default: lapply)

verbose

TRUE to indicate runs + methods numbers

Value

A list of nf aggregates of N results (matrices).

Examples


beta <- matrix(c(1,-2,3,1),ncol=2)

# This example requires a patch for the flexmix package written by Bettina GrĂ¼n:
patch_path <- system.file("extdata", "FLXMRglm.R", package = "morpheus")
source(patch_path)

# Bootstrap + computeMu, morpheus VS flexmix
io <- generateSampleIO(n=1000, p=1/2, beta=beta, b=c(0,0), "logit")
mu <- normalize(beta)
res <- multiRun(list(X=io$X,Y=io$Y,K=2), list(
  # morpheus
  function(fargs) {
    ind <- fargs$ind
    computeMu(fargs$X[ind,], fargs$Y[ind], list(K=fargs$K))
  },
  # flexmix
  function(fargs) {
    ind <- fargs$ind
    K <- fargs$K
    dat <- as.data.frame( cbind(fargs$Y[ind],fargs$X[ind,]) )
    out <- refit( flexmix( cbind(V1, 1 - V1) ~ 0+., data=dat, k=K,
      model=FLXMRglm(family="binomial") ) )
    normalize( matrix(out@coef[1:(ncol(fargs$X)*K)], ncol=K) )
  } ),
  packages = c("morpheus", "flexmix"),
  prepareArgs = function(fargs,index) {
    if (index == 1)
      fargs$ind <- 1:nrow(fargs$X)
    else
      fargs$ind <- sample(1:nrow(fargs$X),replace=TRUE)
    fargs
  }, N=10, ncores=1) #ncores=3
for (i in 1:2)
  res[[i]] <- alignMatrices(res[[i]], ref=mu, ls_mode="exact")

# Monte-Carlo + optimParams from X,Y, morpheus VS flexmix
res <- multiRun(list(n=1000,p=1/2,beta=beta,b=c(0,0),link="logit"), list(
  # morpheus
  function(fargs) {
    K <- fargs$K
    mu <- computeMu(fargs$X, fargs$Y, list(K=fargs$K))
    o <- optimParams(fargs$X, fargs$Y, fargs$K, fargs$link, fargs$M)
    o$run(list(beta=mu))$beta
  },
  # flexmix
  function(fargs) {
    K <- fargs$K
    dat <- as.data.frame( cbind(fargs$Y,fargs$X) )
    out <- refit( flexmix( cbind(V1, 1 - V1) ~ ., data=dat, k=K,
      model=FLXMRglm(family="binomial") ) )
    sapply( seq_len(K), function(i)
      as.double( out@components[[1]][[i]][2:(1+ncol(fargs$X)),1] ) )
  } ),
  packages = c("morpheus", "flexmix"),
  prepareArgs = function(fargs,index) {
    library(morpheus)
    io <- generateSampleIO(fargs$n, fargs$p, fargs$beta, fargs$b, fargs$link)
    fargs$X <- io$X
    fargs$Y <- io$Y
    fargs$K <- ncol(fargs$beta)
    fargs$link <- fargs$link
    fargs$M <- computeMoments(io$X,io$Y)
    fargs
  }, N=10, ncores=1) #ncores=3
for (i in 1:2)
  res[[i]] <- alignMatrices(res[[i]], ref=beta, ls_mode="exact")


normalize

Description

Normalize a vector or a matrix (by columns), using euclidian norm

Usage

normalize(x)

Arguments

x

Vector or matrix to be normalized

Value

The normalized matrix (1 column if x is a vector)

Examples

x <- matrix(c(1,2,-1,3), ncol=2)
normalize(x) #column 1 is 1/sqrt(5) (1 2),
             #and column 2 is 1/sqrt(10) (-1, 3)


optimParams

Description

Wrapper function for OptimParams class

Usage

optimParams(X, Y, K, link = c("logit", "probit"), M = NULL, nc = 0)

Arguments

X

Data matrix of covariables

Y

Output as a binary vector

K

Number of populations.

link

The link type, 'logit' or 'probit'.

M

the empirical cross-moments between X and Y (optional)

nc

Number of cores (default: 0 to use all)

Value

An object 'op' of class OptimParams, initialized so that op$run(theta0) outputs the list of optimized parameters

theta0 is a list containing the initial parameters. Only beta is required (p would be set to (1/K,...,1/K) and b to (0,...0)).

See Also

multiRun to estimate statistics based on beta, and generateSampleIO for I/O random generation.

Examples

# Optimize parameters from estimated mu
io <- generateSampleIO(100,
  1/2, matrix(c(1,-2,3,1),ncol=2), c(0,0), "logit")
mu <- computeMu(io$X, io$Y, list(K=2))
o <- optimParams(io$X, io$Y, 2, "logit")

theta0 <- list(p=1/2, beta=mu, b=c(0,0))
par0 <- o$run(theta0)
# Compare with another starting point
theta1 <- list(p=1/2, beta=2*mu, b=c(0,0))
par1 <- o$run(theta1)
# Look at the function values at par0 and par1:
o$f( o$linArgs(par0) )
o$f( o$linArgs(par1) )


plotBox

Description

Draw compared boxplots of a single parameter (scalar)

Usage

plotBox(mr, x, y, ...)

Arguments

mr

Output of multiRun(), list of lists of functions results

x

Row index of the element inside the aggregated parameter

y

Column index of the element inside the aggregated parameter

...

Additional graphical parameters (xlab, ylab, ...)

Value

No return value, called for side effects.

Examples

# mr[[i]] is a list of estimated parameters matrices (here random matrices).
# Should be mr <- multiRun(...) --> see bootstrap example in ?multiRun.
simmr_path <- system.file("extdata", "simulateMr.R", package = "morpheus")
source(simmr_path)
source(simmr_path)
mr <- simulateMr(c(2,2), 10)$mr
plotBox(mr, 2, 1) #second row, first column


plotCoefs

Description

Draw a graph of (averaged) coefficients estimations with their standard, deviations ordered by mean values. Note that the drawing does not correspond to a function; it is just a convenient way to visualize the estimated parameters.

Usage

plotCoefs(mr, params, ...)

Arguments

mr

List of parameters matrices

params

True value of the parameters matrix

...

Additional graphical parameters

Value

No return value, called for side effects.

Examples

# mr[[i]] is a list of estimated parameters matrices (here random matrices).
# Should be mr <- multiRun(...) --> see bootstrap example in ?multiRun.
simmr_path <- system.file("extdata", "simulateMr.R", package = "morpheus")
source(simmr_path)
mr_theta <- simulateMr(c(3,2), 10)
mr <- mr_theta$mr ; theta <- mr_theta$theta
plotCoefs(mr[[1]], theta)


plotHist

Description

Plot compared histograms of a single parameter (scalar)

Usage

plotHist(mr, x, y, ...)

Arguments

mr

Output of multiRun(), list of lists of functions results

x

Row index of the element inside the aggregated parameter

y

Column index of the element inside the aggregated parameter

...

Additional graphical parameters (xlab, ylab, ...)

Value

No return value, called for side effects.

Examples

# mr[[i]] is a list of estimated parameters matrices (here random matrices).
# Should be mr <- multiRun(...) --> see bootstrap example in ?multiRun.
simmr_path <- system.file("extdata", "simulateMr.R", package = "morpheus")
source(simmr_path)
mr <- simulateMr(c(2,2), 10)$mr
plotHist(mr, 2, 1) #second row, first column


pvalue

Description

Compute the p-values of the tests beta[i,j] == 0 vs != 0

Usage

pvalue(mr)

Arguments

mr

A list of matrices as output by multiRun()

Value

The matrix of p-values (same size as mr[[1]])

Examples

# Next line should be a real call to multiRun()
mr <- list( list(matrix(c(1,2,3,4),ncol=2),matrix(c(2,2,1,1),ncol=2)) )
p <- pvalue(mr[[1]])