A Quick Demo of trustOptim

Michael Braun

2026-01-12

This is a quick demo of how to use the trustOptim package. For this example, the objective function is the Rosenbrock function. \[ f(x_{1:N},y_{1:N})=\sum_{i=1}^N \left[100\left(x^2_i-y_i\right)^2+\left(x_i-1\right)^2\right] \]

The parameter vector contains \(2N\) variables ordered as \(x_1, y_1, x_2, y_2, ... x_n, y_n\). The optimum of the function is a vector of ones, and the value at the minimum is zero.

The following functions return the objective, gradient, and Hessian (in sparse format) of this function.

require(trustOptim)
require(Matrix)
f <- function(V) {

    N <- length(V)/2
    x <- V[seq(1,2*N-1,by=2)]
    y <- V[seq(2,2*N,by=2)]
    return(sum(100*(x^2-y)^2+(x-1)^2))
}

df <- function(V) {
    N <- length(V)/2
    x <- V[seq(1,2*N-1,by=2)]
    y <- V[seq(2,2*N,by=2)]

    t <- x^2-y
    dxi <- 400*t*x+2*(x-1)
    dyi <- -200*t
    return(as.vector(rbind(dxi,dyi)))
 }

hess <- function(V) {

    N <- length(V)/2
    x <- V[seq(1,2*N-1,by=2)]
    y <- V[seq(2,2*N,by=2)]
    d0 <- rep(200,N*2)
    d0[seq(1,(2*N-1),by=2)] <- 1200*x^2-400*y+2
    d1 <- rep(0,2*N-1)
    d1[seq(1,(2*N-1),by=2)] <- -400*x

    H <- bandSparse(2*N,
                    k=c(-1,0,1),
                    diagonals=list(d1,d0,d1),
                    symmetric=FALSE,
                    repr='C')
    return(drop0(H))
}

For this demo, we start at a random vector.

set.seed(1234)
N <- 3
start <- as.vector(rnorm(2*N, -1, 3))

Next, we call trust.optim, with all default arguments.

opt <- trust.optim(start, fn=f, gr=df, hs=hess, method="Sparse")
# Beginning optimization
# 
# iter            f           nrm_gr                     status
#   1   10015.031437   10987.613876     Continuing - TR expand
#   2   10015.031437   10987.613876   Continuing - TR contract
#   3     219.817975    1588.461373     Continuing - TR expand
#   4     219.817975    1588.461373   Continuing - TR contract
#   5     219.817975    1588.461373   Continuing - TR contract
#   6     219.817975    1588.461373   Continuing - TR contract
#   7      82.628847     703.991138                 Continuing
#   8      17.092097     196.558177     Continuing - TR expand
#   9      17.092097     196.558177   Continuing - TR contract
#  10      17.092097     196.558177   Continuing - TR contract
#  11      17.092097     196.558177   Continuing - TR contract
#  12      15.903949      87.156911                 Continuing
#  13      10.985480      39.443763     Continuing - TR expand
#  14      10.985480      39.443763   Continuing - TR contract
#  15       9.991011      96.253959                 Continuing
#  16       7.829903      26.847358                 Continuing
#  17       7.829903      26.847358   Continuing - TR contract
#  18       6.689435      33.066635     Continuing - TR expand
#  19       6.689435      33.066635   Continuing - TR contract
#  20       6.221075      58.822744                 Continuing
#  21       4.451638      16.401176                 Continuing
#  22       4.451638      16.401176   Continuing - TR contract
#  23       3.834185      30.511945     Continuing - TR expand
#  24       2.924576       8.993872                 Continuing
#  25       2.924576       8.993872   Continuing - TR contract
# 
# iter            f           nrm_gr                     status
#  26       2.924576       8.993872   Continuing - TR contract
#  27       2.532653      20.239445                 Continuing
#  28       1.786237       5.608209                 Continuing
#  29       1.786237       5.608209   Continuing - TR contract
#  30       1.380482       7.023595     Continuing - TR expand
#  31       1.019554       5.780869                 Continuing
#  32       0.725310       4.438916                 Continuing
#  33       0.502544       4.865620                 Continuing
#  34       0.324202       3.268843                 Continuing
#  35       0.207341       5.565848                 Continuing
#  36       0.111685       1.780737                 Continuing
#  37       0.072993       7.082883                 Continuing
#  38       0.022568       0.361306                 Continuing
#  39       0.022568       0.361306   Continuing - TR contract
#  40       0.022568       0.361306   Continuing - TR contract
#  41       0.009448       2.928907     Continuing - TR expand
#  42       0.001544       0.247652                 Continuing
#  43       0.000153       0.498492                 Continuing
#  44       0.000001       0.005509                 Continuing
#  45       0.000000       0.000359                 Continuing
#  46       0.000000       0.000000                 Continuing
# 
# Iteration has terminated
#  46       0.000000       0.000000                    Success

In the above output, f is the objective function, and nrm_gr is the norm of the gradient. The status messages illustrate how the underlying trust region algorithm is progressing, and are useful mainly for debugging purposes. Note that the objective value is non-increasing at each iteration, but the norm of the gradient is not. The algorithm will continue until either the norm of the gradient is less than the control parameter prec, the trust region radius is less than stop.trust.radius, or the iteration count exceeds maxit. See the package manual for details of the control parameters. We use the default control parameters for this demo (hence, there is no control list here.

The result contains the objective value, the minimum, the gradient at the minimum (should be numerically zero for all elements), and the Hessian at the minimum.

opt
# $fval
# [1] 2.233e-19
# 
# $solution
# [1] 1 1 1 1 1 1
# 
# $gradient
# [1]  2.331e-09 -1.631e-09  0.000e+00  0.000e+00  0.000e+00  0.000e+00
# 
# $hessian
# 6 x 6 sparse Matrix of class "dsCMatrix"
#                                   
# [1,]  802 -400    .    .    .    .
# [2,] -400  200    .    .    .    .
# [3,]    .    .  802 -400    .    .
# [4,]    .    . -400  200    .    .
# [5,]    .    .    .    .  802 -400
# [6,]    .    .    .    . -400  200
# 
# $iterations
# [1] 46
# 
# $status
# [1] "Success"
# 
# $trust.radius
# [1] 0.5006
# 
# $nnz
# [1] 9
# 
# $method
# [1] "Sparse"

Note that opt$fval, and all elements of opt$gradient are zero, within machine precision. The solution is correct, and the Hessian is returned as a compressed sparse Matrix object (refer to the Matrix package for details).

One way to potentially speed up convergence (but not necessarily compute time) is to apply a preconditioner to the algorithm. Other than the identity matrix (the default), the package current supports only a modified Cholesky preconditioner. This is implemented with a control parameter preconditioner=1. To save space, we report the optimizer status only ever 10 iterations.

opt1 <- trust.optim(start, fn=f, gr=df, hs=hess, method="Sparse",
      control=list(preconditioner=1, report.freq=10))
# Beginning optimization
# 
# iter            f           nrm_gr                     status
#  10      13.648174       7.496606   Continuing - TR contract
#  20       7.151988      37.094473     Continuing - TR expand
#  30       3.408749      18.596902     Continuing - TR expand
#  40       0.836703      12.994474                 Continuing
#  50       0.128263       6.112045                 Continuing
#  60       0.000011       0.142876                 Continuing
# 
# Iteration has terminated
#  63       0.000000       0.000000                    Success

Here, we see that adding the preconditioner actually increases the number of iterations. Sometimes preconditioners help a lot, and sometimes not at all.

##Quasi-Newton Methods

The trust.optim function also supports quasi-Newton approximations to the Hessian. The two options are BFGS and SR1 updates. See Nocedal and Wright (2006) for details. You do not need to provide the Hessian for these methods, and they are often preferred when the Hessian is dense. However, they may take longer to converge, which is why we need to change the maxit control parameter. To save space, we report the status of the optimizer only every 10 iterations.

opt.bfgs <- trust.optim(start, fn=f, gr=df, method="BFGS", control=list(maxit=5000, report.freq=10))
# Beginning optimization
# 
# iter            f           nrm_gr                     status
#   10      88.806530     354.018026                 Continuing
#   20       1.823163       6.328353   Continuing - TR contract
#   30       1.389553       4.246382                 Continuing
#   40       0.802565       3.855513     Continuing - TR expand
#   50       0.571448       2.816390     Continuing - TR expand
#   60       0.208713      10.847588   Continuing - TR contract
#   70       0.007267       1.328120   Continuing - TR contract
#   80       0.005292       0.670026     Continuing - TR expand
#   90       0.000001       0.004486                 Continuing
#  100       0.000000       0.000000                 Continuing
# 
# Iteration has terminated
#  101       0.000000       0.000000                    Success
opt.bfgs
# $fval
# [1] 8.829e-23
# 
# $solution
# [1] 1 1 1 1 1 1
# 
# $gradient
# [1] -1.149e-10  6.506e-11  8.306e-12 -6.639e-12 -7.881e-11  3.622e-11
# 
# $iterations
# [1] 101
# 
# $status
# [1] "Success"
# 
# $trust.radius
# [1] 0.3383
# 
# $method
# [1] "BFGS"
# 
# $hessian.update.method
# [1] 2

And we can do the same thing with SR1 updates.

opt.sr1 <- trust.optim(start, fn=f, gr=df, method="SR1", control=list(maxit=5000, report.freq=10))
# Beginning optimization
# 
# iter            f           nrm_gr                     status
#   10     175.256780     287.816322   Continuing - TR contract
#   20       2.931556       2.996846                 Continuing
#   30       2.131656       6.411590                 Continuing
#   40       1.127632       3.784477     Continuing - TR expand
#   50       0.315880       6.964198                 Continuing
#   60       0.208635       1.302632     Continuing - TR expand
#   70       0.132457       5.187552   Continuing - TR contract
#   80       0.108929       2.243054   Continuing - TR contract
#   90       0.103658       0.417444     Continuing - TR expand
#  100       0.039978       2.964965   Continuing - TR contract
#  110       0.007481       2.932641   Continuing - TR contract
#  120       0.006169       1.885494   Continuing - TR contract
#  130       0.000005       0.033300   Continuing - TR contract
#  140       0.000000       0.002521                 Continuing
# 
# Iteration has terminated
#  144       0.000000       0.000000                    Success
opt.sr1
# $fval
# [1] 1.657e-19
# 
# $solution
# [1] 1 1 1 1 1 1
# 
# $gradient
# [1] -6.302e-10  3.902e-10 -4.991e-11  1.402e-10 -1.377e-08  6.698e-09
# 
# $iterations
# [1] 144
# 
# $status
# [1] "Success"
# 
# $trust.radius
# [1] 0.005085
# 
# $method
# [1] "SR1"
# 
# $hessian.update.method
# [1] 1

Note that the quasi_Newton updates do not return a Hessian. We do not think that the final approximations from BFGS or SR1 updates are particularly reliable. If you need the Hessian, you can use the sparseHessianFD package.

References

Nocedal, Jorge, and Stephen J Wright. 2006. Numerical Optimization. Second edition. Springer-Verlag.