Introduction to optimflex

library(optimflex)

Introduction

The optimflex package provides a highly flexible suite of non-linear optimization algorithms designed for robustness and numerical precision. It is particularly suited for complex models—such as those found in Structural Equation Modeling (SEM)—where convergence stability and verification are paramount.

Rigorous Convergence Control

A defining feature of optimflex is its rigorous convergence control. Instead of relying on a single, hard-coded stopping rule, optimflex allows users to select and combine up to eight distinct convergence criteria.

When multiple criteria are enabled (by setting their respective use_* flags to TRUE in the control list), the package applies a strict “AND” rule: all chosen conditions must be satisfied simultaneously before the algorithm declares success. This multi-faceted approach ensures that the solution is stable from several numerical perspectives. A sketch of such a control list follows.
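
For instance, a control list that demands both a small gradient and a stable parameter vector might look like the sketch below. The tol_rel_x name is inferred from the use_*/tol_* naming convention described in the next section and should be checked against the package documentation.

# Hypothetical control list: the run only counts as converged when BOTH
# the gradient-norm test and the relative parameter-change test pass
ctrl <- list(
  use_grad  = TRUE, tol_grad  = 1e-6,
  use_rel_x = TRUE, tol_rel_x = 1e-8
)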

Convergence Criteria and Formulas

Each criterion is managed via a logical flag (use_*) and a corresponding tolerance parameter (tol_*) within the control list.

1. Function Value Changes

These monitor the stability of the objective function \(f\).

  • Absolute Change (use_abs_f): \[|f_{k+1} - f_k| < \epsilon_{abs\_f}\]
  • Relative Change (use_rel_f): \[\frac{|f_{k+1} - f_k|}{\max(1, |f_k|)} < \epsilon_{rel\_f}\]
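
As a plain-R sketch (independent of the package internals), these two tests amount to:

# Absolute and relative change in the objective between iterations
abs_f_ok <- function(f_new, f_old, tol) {
  # |f_{k+1} - f_k| < tol
  abs(f_new - f_old) < tol
}
rel_f_ok <- function(f_new, f_old, tol) {
  # |f_{k+1} - f_k| / max(1, |f_k|) < tol
  abs(f_new - f_old) / max(1, abs(f_old)) < tol
}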

2. Parameter Space Changes

These ensure that the parameter vector \(x\) has stabilized.

  • Absolute Change (use_abs_x): \[\|x_{k+1} - x_k\|_\infty < \epsilon_{abs\_x}\]
  • Relative Change (use_rel_x): \[\frac{\|x_{k+1} - x_k\|_\infty}{\max(1, \|x_k\|_\infty)} < \epsilon_{rel\_x}\]
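
In plain R, using the infinity norm max(|x_i|), the two tests can be sketched as:

# Infinity norm of a vector
inf_norm <- function(v) max(abs(v))

# Absolute and relative change in the parameter vector
abs_x_ok <- function(x_new, x_old, tol) inf_norm(x_new - x_old) < tol
rel_x_ok <- function(x_new, x_old, tol) {
  inf_norm(x_new - x_old) / max(1, inf_norm(x_old)) < tol
}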

3. Gradient Norm

The standard measure of stationarity (first-order optimality).

  • Gradient Infinity Norm (use_grad): \[\|g_{k+1}\|_\infty < \epsilon_{grad}\]
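
The corresponding check is a one-line test on the gradient vector (an illustrative sketch, not package code):

# First-order optimality: infinity norm of the gradient below tolerance
grad_ok <- function(g, tol) max(abs(g)) < tol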

4. Hessian Verification

  • Positive Definiteness (use_posdef): \[\lambda_{min}(H) > 0\] This criterion verifies that the Hessian at the final point is positive definite, which confirms that the algorithm has reached a true local minimum rather than a saddle point.
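
A straightforward way to perform this check in base R is to inspect the smallest eigenvalue of the (symmetric) Hessian, as sketched below:

# Positive definiteness: smallest eigenvalue must be strictly positive
posdef_ok <- function(H) {
  ev <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
  min(ev) > 0
}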

5. Model-based Predicted Decrease

These check whether the local quadratic model of the objective function still predicts a meaningful decrease; the tests pass only when that predicted decrease is negligible. A small illustrative sketch follows the list.

  • Predicted Decrease (use_pred_f): \[\Delta m_k < \epsilon_{pred\_f}\]
  • Average Predicted Decrease (use_pred_f_avg): \[\frac{\Delta m_k}{n} < \epsilon_{pred\_f\_avg}\]
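
For a candidate step \(p_k\), one common definition of the model-predicted decrease is \(\Delta m_k = -(g_k^\top p_k + \tfrac{1}{2} p_k^\top H_k p_k)\). The sketch below uses that definition and assumes \(n\) is the number of parameters; optimflex's exact internal definitions may differ, so treat this purely as an illustration.

# Decrease predicted by the local quadratic model for step p (sketch)
pred_decrease <- function(g, H, p) {
  -(sum(g * p) + 0.5 * drop(t(p) %*% H %*% p))
}
# Averaged form for the use_pred_f_avg test, assuming n = length(p)
pred_decrease_avg <- function(g, H, p) pred_decrease(g, H, p) / length(p)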

Basic Usage

The following example demonstrates how to minimize a simple quadratic function using the BFGS algorithm with customized convergence criteria.

# Define a simple objective function
quad_func <- function(x) {
  (x[1] - 5)^2 + (x[2] + 3)^2
}

# Run optimization
res <- bfgs(
  start = c(0, 0),
  objective = quad_func,
  control = list(
    use_grad = TRUE,
    tol_grad = 1e-6,
    use_rel_x = TRUE
  )
)

# Inspect results
res$par
#> [1]  5 -3
res$converged
#> [1] TRUE

Algorithm Comparison: The Rosenbrock Function

optimflex shines when dealing with difficult landscapes like the Rosenbrock “banana” function. You can easily compare how different algorithms (e.g., Quasi-Newton vs. Trust-Region) navigate the narrow valley.

rosenbrock <- function(x) {
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}

start_val <- c(-1.2, 1.0)

# Compare DFP and Double Dogleg
res_dfp <- dfp(start_val, rosenbrock)
res_dd  <- double_dogleg(start_val, rosenbrock, control = list(initial_delta = 2.0))

cat("DFP Iterations:", res_dfp$iter, "\n")
#> DFP Iterations: 39
cat("Double Dogleg Iterations:", res_dd$iter, "\n")
#> Double Dogleg Iterations: 70