API Reference

Module

Solver

Marguerite.solve – Function
solve(f, lmo, x0; grad=nothing, kwargs...) -> (x, Result)

Solve

\[\min_{x \in C} f(x)\]

via the Frank-Wolfe algorithm.

lmo is a linear minimization oracle: either a callable (v, g) -> v that writes $\operatorname{argmin}_{v \in C} \langle g, v \rangle$ into v in place, or a subtype of AbstractOracle.

Keyword Arguments

  • grad: in-place gradient grad(g, x). If nothing (default), computed automatically via DifferentiationInterface using backend.
  • backend: AD backend (default: DEFAULT_BACKEND)
  • max_iters::Int = 10000: maximum iterations
  • tol::Real = 1e-4: convergence tolerance ($\mathrm{gap} \le \mathrm{tol} \cdot (1 + |f(x)|)$)
  • step_rule = MonotonicStepSize(): step size rule (callable t -> γ)
  • monotonic::Bool = true: reject non-improving updates
  • verbose::Bool = false: print progress
  • cache::Union{Cache, Nothing} = nothing: pre-allocated buffers
source
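To make the callable-oracle form concrete, here is a self-contained sketch of the Frank-Wolfe iteration that solve performs, minimizing a quadratic over the probability simplex. The objective, the target y, and the lmo! closure are illustrative, not part of Marguerite:

```julia
# Minimal sketch of the Frank-Wolfe loop, using the in-place callable
# oracle form. Everything here is illustrative, not Marguerite API.
y = [0.2, 0.5, 0.3]                      # lies in the simplex, so f* = 0
f(x) = 0.5 * sum(abs2, x .- y)
grad(x) = x .- y

# Simplex LMO: the vertex minimizing ⟨g, v⟩ is e_i with i = argmin(g).
lmo!(v, g) = (fill!(v, 0.0); v[argmin(g)] = 1.0; v)

x = [1.0, 0.0, 0.0]                      # feasible start (a vertex)
v = similar(x)
for t in 0:999
    g = grad(x)
    lmo!(v, g)
    gap = g' * (x - v)                   # Frank-Wolfe duality gap
    gap <= 1e-6 && break
    γ = 2 / (t + 2)                      # MonotonicStepSize rule
    @. x = (1 - γ) * x + γ * v
end
```

With the package itself, the same problem would go through the documented entry point, e.g. solve(f, lmo!, x0).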
solve(f, lmo, x0, θ; grad=nothing, kwargs...) -> (x, Result)

Solve

\[\min_{x \in C(\theta)} f(x, \theta)\]

with parameters $\theta$.

If lmo is a ParametricOracle, the constraint set $C(\theta)$ is materialized at $\theta$ via materialize. Otherwise, $C$ is fixed.

A ChainRulesCore.rrule enables $\partial x^* / \partial \theta$ via implicit differentiation.

Keyword Arguments

  • grad: in-place gradient grad(g, x, θ). If nothing (default), auto-computed.
  • backend: AD backend for first-order gradients
  • hvp_backend: AD backend for Hessian-vector products
  • diff_cg_maxiter::Int=50: max CG iterations for the Hessian solve
  • diff_cg_tol::Real=1e-6: CG convergence tolerance
  • diff_lambda::Real=1e-4: Tikhonov regularization
  • assume_interior::Bool=false: differentiated calls error by default for custom oracles lacking an active_set method; when true, the interior active set approximation is used instead
source

Bilevel

Marguerite.bilevel_solve – Function
bilevel_solve(outer_loss, inner_loss, lmo, x0, θ; grad=nothing, cross_deriv=nothing, kwargs...) -> (x_star, θ_grad, cg_result)

Solve the inner problem and compute the gradient of $L(x^*(\theta))$ w.r.t. $\theta$.

If lmo is a ParametricOracle, computes gradients through both the objective and constraint set via KKT adjoint differentiation.

Returns (x_star, θ_grad, cg_result) where x_star is the inner solution, θ_grad is $\nabla_\theta L(x^*(\theta))$, and cg_result::CGResult contains CG solver diagnostics.

outer_loss(x) -> Real takes only the inner solution. If the outer loss also depends on $\theta$ directly, close over $\theta$ and add the direct gradient contribution manually.

Keyword Arguments

  • grad: in-place gradient grad(g, x, θ). If nothing (default), auto-computed.
  • cross_deriv: manual cross-derivative cross_deriv(u, θ) -> dθ. Must return $-(\partial^2 f / \partial\theta\partial x)^T u$ (note the negative sign). Bypasses AD-based cross-derivative computation. This can be much faster when the cross-derivative has closed-form structure (e.g. linear in $u$).
  • backend: AD backend for first-order gradients (default: DEFAULT_BACKEND)
  • hvp_backend: AD backend for Hessian-vector products (default: SECOND_ORDER_BACKEND)
  • diff_cg_maxiter::Int=50: max CG iterations for the Hessian solve
  • diff_cg_tol::Real=1e-6: CG convergence tolerance
  • diff_lambda::Real=1e-4: Tikhonov regularization for the Hessian
  • assume_interior::Bool=false: custom oracles lacking an active_set method error by default; when true, the interior active set approximation is used instead
  • tol::Real=1e-4: inner solve convergence tolerance (also used for active set identification)

All other kwargs are forwarded to solve.

source
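To make the returned θ_grad concrete, here is a hedged, self-contained sketch of the adjoint computation for an unconstrained quadratic inner problem where every piece is analytic. A, c, θ, and both losses are illustrative, not Marguerite API:

```julia
# Adjoint gradient of an outer loss through an analytic inner solve:
#   inner:  f(x, θ) = ½xᵀAx − θᵀx   ⇒   x*(θ) = A⁻¹θ,  ∇²ₓₓf = A
#   outer:  L(x) = ½‖x − c‖²
using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]
c = [1.0, -1.0]
θ = [2.0, 0.5]

x_star = A \ θ                 # inner solution
u = A \ (x_star .- c)          # adjoint solve: ∇²ₓₓf · u = ∇L(x*)
θ_grad = u                     # dθ = −(∂²f/∂θ∂x)ᵀu, and here ∂²f/∂θ∂x = −I
```

Since A is symmetric, this matches the chain rule directly: ∇θ L(x*(θ)) = (∂x*/∂θ)ᵀ ∇L(x*) = A⁻¹(x* − c).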

Jacobian

Marguerite.solution_jacobian – Function
solution_jacobian(f, lmo, x0, θ; kwargs...) -> (J, result)

Compute the full Jacobian $\partial x^*/\partial\theta \in \mathbb{R}^{n \times m}$ via direct reduced-Hessian factorization.

Forms the reduced Hessian $P^\top \nabla^2 f\, P$ explicitly ($n_{\text{free}}$ HVPs), Cholesky-factors it once, then solves all $m$ right-hand sides in one shot. Much faster than assembling the full Jacobian from one pullback call per component of $x^*$.

Supports ParametricOracle inputs: the constraint sensitivity is included via bound shift coupling and normal space equality displacement.

See solution_jacobian! for the in-place version.

Returns (J, result) where result is a SolveResult.

source
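The factor-once, solve-many idea can be sketched in the unconstrained case, where P = I and the reduced Hessian is the full Hessian. The quadratic model below is illustrative, not Marguerite API:

```julia
# f(x, θ) = ½xᵀAx − θᵀx: stationarity A·x*(θ) = θ gives ∂x*/∂θ = A⁻¹.
# Differentiating ∇ₓf(x*, θ) = 0 in θ:  A·J + C = 0 with C = ∂(∇ₓf)/∂θ = −I.
using LinearAlgebra

A = [4.0 1.0; 1.0 3.0]
C = -Matrix(1.0I, 2, 2)

F = cholesky(Symmetric(A))     # Cholesky-factor the (reduced) Hessian once…
J = -(F \ C)                   # …then solve all m right-hand sides in one shot
```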
Marguerite.solution_jacobian! – Function
solution_jacobian!(J, f, lmo, x0, θ; kwargs...) -> (J, result)

In-place version of solution_jacobian. Writes the Jacobian $\partial x^*/\partial\theta$ into the pre-allocated matrix J.

Supports ParametricOracle inputs: the constraint sensitivity (how the feasible set changes with $\theta$) is included via bound shift coupling and normal space equality displacement, requiring up to $n_{\text{bound}}$ additional HVPs (for oracle types with parametric bounds).

source

Types

Marguerite.Result – Type
Result{T<:Real}

Immutable record of a Frank-Wolfe solve.

Fields

  • objective::T – final objective value $f(x^*)$
  • gap::T – final Frank-Wolfe duality gap
  • iterations::Int – iterations taken
  • converged::Bool – whether $\mathrm{gap} \le \mathrm{tol} \cdot (1 + |f(x)|)$
  • discards::Int – rejected non-improving updates (monotonic mode)
source
Marguerite.CGResult – Type
CGResult{T<:Real}

Diagnostics from the linear solve in implicit differentiation.

The rrule pullback uses a cached direct factorization (Cholesky/LU) and returns nominal values (0, 0.0, true). The CG fields are meaningful only for bilevel_solve, which uses iterative CG.

Fields

  • iterations::Int – CG iterations taken (0 for direct-solve path)
  • residual_norm::T – final residual $\|r\|$ (0 for direct-solve path)
  • converged::Bool – whether solve succeeded
source
Marguerite.SolveResult – Type
SolveResult{T}

Wrapper for solve output. Supports tuple unpacking and pretty-printing.

x, result = solve(...) still works via Base.iterate. Provides cleaner REPL display than a raw tuple.

Fields

  • x::Vector{T} – optimal solution $x^*$
  • result::Result{T} – convergence diagnostics
source
Marguerite.BilevelResult – Type
BilevelResult{T, S}

Wrapper for bilevel_solve output. Supports tuple unpacking and pretty-printing.

x, θ_grad, cg_result = bilevel_solve(...) still works via Base.iterate.

Fields

  • x::Vector{T} – inner problem solution $x^*(\theta)$
  • theta_grad::S – gradient $\nabla_\theta L(x^*(\theta))$
  • cg_result::CGResult{T} – CG solver diagnostics
source
Marguerite.Cache – Type
Cache{T<:Real}

Pre-allocated working buffers for the Frank-Wolfe inner loop. Includes sparse vertex buffers used internally by fused LMO+gap computation.

Construct via Cache{T}(n) or let solve allocate one automatically.

source

Step Size Rules

Marguerite.MonotonicStepSize – Type
MonotonicStepSize()

The standard Frank-Wolfe step size

\[\gamma_t = \frac{2}{t+2}\]

yielding $O(1/t)$ convergence. Carderera, Besançon & Pokutta (2024) establish this rate for generalized self-concordant objectives.

source
Marguerite.AdaptiveStepSize – Type
AdaptiveStepSize(L0::Real=1.0; eta=2.0)

Backtracking line-search step size with Lipschitz estimation.

Starting from estimate $L$, multiplies $L$ by $\eta$ until sufficient decrease holds, then sets

\[\gamma = \mathrm{clamp}\!\left(\frac{\langle \nabla f,\, x - v \rangle}{L \,\|d\|^2},\; 0,\; 1\right)\]

source
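The backtracking loop can be sketched as follows. adaptive_step and its argument convention are hypothetical, but the acceptance test is the standard quadratic upper bound with the current Lipschitz estimate L:

```julia
# Grow L by η until f at the candidate step satisfies the quadratic upper
# bound, then return the clamped closed-form step.  Here d = x − v.
function adaptive_step(f, g, x, v, L; η = 2.0)
    d = x .- v
    dn2 = sum(abs2, d)
    fx = f(x)
    while true
        γ = clamp((g' * d) / (L * dn2), 0.0, 1.0)
        x_new = x .- γ .* d                         # i.e. x + γ(v − x)
        if f(x_new) <= fx - γ * (g' * d) + 0.5 * L * γ^2 * dn2
            return γ, L                             # sufficient decrease holds
        end
        L *= η                                      # inflate the estimate
    end
end
```

For f(x) = ½‖x‖² (true Lipschitz constant 1) starting from the underestimate L₀ = 0.25, the loop doubles L twice and then accepts the full step γ = 1.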

AD Backends

Marguerite.DEFAULT_BACKEND – Constant
DEFAULT_BACKEND

Default AD backend for first-order gradients (DI.AutoForwardDiff()). Override by passing backend= to the auto-gradient and parameterized solve variants, bilevel_solve, etc.

source
Marguerite.SECOND_ORDER_BACKEND – Constant
SECOND_ORDER_BACKEND

Default AD backend for Hessian-vector products in implicit differentiation (DI.SecondOrder(DI.AutoForwardDiff(), DI.AutoForwardDiff())). Override by passing hvp_backend= to parameterized solve variants, bilevel_solve, etc.

source

Oracle Types

See Oracles for full documentation and constraint set diagrams.

Parametric Oracle Types

See Parametric Oracles for full documentation.

Internals

These are not part of the public API and may change without notice.

Solver

Marguerite._solve_core – Function
_solve_core(f, ∇f!, lmo, x0; kwargs...) -> SolveResult

Core Frank-Wolfe loop. Requires lmo <: AbstractOracle for dispatch on _lmo_and_gap! specializations.

Callers should use solve instead; this is an internal function.

source
Marguerite._to_oracle – Function
_to_oracle(lmo) -> AbstractOracle
_to_oracle(lmo, θ) -> AbstractOracle

Normalize lmo to an AbstractOracle. If lmo is a ParametricOracle and θ is provided, materializes the constraint set at θ.

source
Marguerite._partial_sort_negative! – Function
_partial_sort_negative!(perm, g, k) -> count

Find up to k indices with the most negative values in g, stored sorted in perm[1:count]. Zero-allocation: only uses the pre-allocated perm buffer. $O(n \cdot k)$ — fine since k is typically small.

source
Marguerite._lmo_and_gap! – Function
_lmo_and_gap!(lmo, c::Cache, x, n) -> (fw_gap, nnz)

Fused linear minimization oracle + Frank-Wolfe gap computation.

Returns (fw_gap, nnz) where nnz encodes the vertex representation:

  • nnz = -1: dense vertex stored in c.vertex
  • nnz = 0: origin vertex (all zeros)
  • nnz > 0: sparse vertex in c.vertex_nzind[1:nnz], c.vertex_nzval[1:nnz]

Specializations exist for Simplex, Knapsack, and MaskedKnapsack to avoid materializing the full dense vertex vector. The generic fallback calls lmo(c.vertex, c.gradient) and returns nnz = -1.

Indices in c.vertex_nzind[1:nnz] must be distinct.

source
Marguerite._ensure_vertex! – Function
_ensure_vertex!(c::Cache, nnz, step_rule)

Materialize the dense vertex buffer c.vertex from the sparse representation (c.vertex_nzind[1:nnz], c.vertex_nzval[1:nnz]). When nnz = 0, fills the vertex buffer with zeros (origin vertex). No-op when nnz = -1 (already dense) or for MonotonicStepSize.

source
Marguerite._trial_update! – Function
_trial_update!(c::Cache, x, γ, nnz, n)

Compute the trial point x_trial = (1-γ)*x + γ*v using the sparse vertex representation when available. When nnz ≥ 0, avoids touching the dense vertex buffer by scaling x by (1-γ) and adding sparse corrections. When nnz = -1, uses the equivalent form x + γ*(v - x).

source
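The two update paths compute the same trial point; a small sketch with illustrative values:

```julia
x = [0.2, 0.3, 0.5]
γ = 0.25

# Sparse path (nnz ≥ 0): scale x by (1 − γ), then add γ·v only at the
# stored nonzero entries of the vertex.
nzind, nzval = [2], [1.0]               # sparse vertex v = e₂
x_sparse = (1 - γ) .* x
for (i, vi) in zip(nzind, nzval)
    x_sparse[i] += γ * vi
end

# Dense path (nnz = -1): x + γ(v − x).
v = [0.0, 1.0, 0.0]
x_dense = x .+ γ .* (v .- x)
```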

Differentiation

Pullback State

Marguerite._PullbackState – Type
_PullbackState{T, FT, HB, P, AS, HF, TM}

Pre-computed state for the rrule pullback closure. Built once in the rrule body and reused across all pullback calls (e.g. when computing a full Jacobian via $n$ pullback calls).

Caches the active set, HVP preparation, tangent space geometry (_TangentMap), pre-allocated work buffers, and the Cholesky (or LU) factorization of the reduced Hessian for direct backsubstitution.

source
Marguerite._build_pullback_state – Function
_build_pullback_state(f, hvp_backend, x_star, θ, oracle, tol; ...)

Build _PullbackState once in the rrule body. Performs active set identification, tangent map construction, reduced Hessian factorization, and buffer allocation — all invariant across pullback calls.

source
Marguerite._kkt_adjoint_solve_cached – Function
_kkt_adjoint_solve_cached(state::_PullbackState, dx)

KKT adjoint solve using precomputed _PullbackState. Projects dx into the tangent space, solves via the cached Hessian factorization, and expands back to full space. For polyhedral oracles, also recovers KKT multipliers via one HVP.

source
Marguerite._build_cross_matrix! – Function
_build_cross_matrix!(C_red, state::_PullbackState, f, grad, x_star, θ, backend, hvp_backend)

Build the d × m cross-derivative matrix $P^\top (\partial^2 f / \partial x \partial \theta)$ in the tangent space.

Manual-gradient path: differentiates θ → ∇ₓf(x*, θ) via DI.jacobian, then projects each column into the tangent space. Auto-gradient path: runs m joint HVPs with v = [0; eⱼ] and projects each result.

source
Marguerite._has_active_set – Function
_has_active_set(oracle)

Trait: returns true if the oracle type has a specialized active_set method. Built-in oracles are enumerated. Custom oracle types should also define Marguerite._has_active_set(::MyOracle) = true to enable differentiation.

The fallback inspects method signatures via which to check whether a typed (non-generic) active_set method exists for the oracle type.

source
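The opt-in is ordinary Julia trait dispatch. A standalone sketch of the pattern (has_active_set, MyBoxOracle, and OtherOracle are illustrative stand-ins; with Marguerite you would extend Marguerite._has_active_set on your oracle type instead):

```julia
abstract type AbstractOracle end

has_active_set(::AbstractOracle) = false    # fallback: no typed method known

struct MyBoxOracle <: AbstractOracle
    lo::Vector{Float64}
    hi::Vector{Float64}
end
struct OtherOracle <: AbstractOracle end

# Opting in promises that a specialized active_set method exists.
has_active_set(::MyBoxOracle) = true
```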

CG and Hessian Solves

Marguerite._cg_solve – Function
_cg_solve(hvp_fn, rhs; maxiter=50, tol=1e-6, λ=1e-4)

Conjugate gradient solver for

\[(H + \lambda I)\, u = \text{rhs}\]

where $H$ is accessed only via Hessian-vector products hvp_fn(d) -> Hd.

Tikhonov regularization $\lambda$ ensures well-conditioned systems near singular Hessians (e.g. on boundary of feasible set).

source
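A minimal matrix-free regularized CG in the spirit of the signature above (a standalone illustration, not Marguerite's implementation):

```julia
# Solve (H + λI)u = rhs, touching H only through hvp_fn(d) -> H*d.
function cg_solve(hvp_fn, rhs; maxiter = 50, tol = 1e-6, λ = 1e-4)
    u = zero(rhs)
    r = copy(rhs)                    # residual at u = 0
    p = copy(r)
    rs = r' * r
    for _ in 1:maxiter
        Ap = hvp_fn(p) .+ λ .* p     # regularized Hessian-vector product
        α = rs / (p' * Ap)
        u .+= α .* p
        r .-= α .* Ap
        rs_new = r' * r
        sqrt(rs_new) <= tol && return u, true
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return u, false
end
```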
Marguerite._hessian_cg_solve – Function
_hessian_cg_solve(f, hvp_backend, x_star, θ, dx; cg_maxiter=50, cg_tol=1e-6, cg_λ=1e-4)

Solve

\[(\nabla^2_{xx} f + \lambda I)\, u = dx\]

via CG with HVPs.

Shared Hessian-solve step used by _kkt_adjoint_solve (fast path when active set is empty) and the KKT implicit pullback functions.

source
Marguerite._kkt_adjoint_solve – Function
_kkt_adjoint_solve(f, hvp_backend, x_star, θ, dx, as::ActiveConstraints;
                    cg_maxiter=50, cg_tol=1e-6, cg_λ=1e-4)

Solve the KKT adjoint system at $x^*$ with active constraints.

The KKT system is:

\[\begin{bmatrix} \nabla^2 f & G^T \\ G & 0 \end{bmatrix} \begin{bmatrix} u \\ \mu \end{bmatrix} = \begin{bmatrix} dx \\ 0 \end{bmatrix}\]

Solved via reduced Hessian CG on the null space of $G$ (active face):

  1. Set $u[\text{bound}] = 0$, work only in free variable subspace
  2. Project $dx_{\text{free}}$ to null(eq_normals)
  3. CG solve: $(P H_{\text{free}} P + \lambda I) w = P dx_{\text{free}}$
  4. Recover $\mu$ from KKT residual
source
Marguerite._factor_reduced_hessian – Function
_factor_reduced_hessian(H_red, diff_lambda) -> factorization

Regularize H_red with diff_lambda, then Cholesky-factor. Falls back to LU if the Hessian is indefinite. Errors with an actionable message if both fail.

source

Null Space Projection

Marguerite._orthogonalize! – Function
_orthogonalize!(a_vecs, a_norm_sqs)

Modified Gram-Schmidt orthogonalization of constraint normals in-place. Updates both a_vecs and a_norm_sqs so that sequential projection in _null_project! is correct for non-orthogonal normals.

source
Marguerite._null_project! – Function
_null_project!(out, w, a_frees, a_norm_sqs)

Project w (in free variable space) onto the null space of pre-computed equality constraint normals. Writes result into out (may alias w).

Assumes a_frees have been orthogonalized via _orthogonalize! so that sequential subtraction is exact.

\[P(w) = w - \sum_j \frac{a_j^\top w}{\|a_j\|^2}\, a_j\]

source
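A worked sketch of the orthogonalize-then-project sequence (the normals and w are illustrative):

```julia
a1 = [1.0, 1.0, 0.0]
a2 = [1.0, 0.0, 1.0]
a2 .-= (a1' * a2) / (a1' * a1) .* a1    # Gram-Schmidt: afterwards a2 ⟂ a1

# Sequential subtraction is exact because the normals are now orthogonal.
function null_project(w, normals)
    out = copy(w)
    for a in normals
        out .-= (a' * out) / (a' * a) .* a
    end
    return out
end

p = null_project([3.0, -1.0, 2.0], (a1, a2))
```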
Marguerite._recover_μ_eq – Function
_recover_μ_eq(a_vecs, residual)

Recover equality-constraint multipliers by solving the small normal-equations system $(A A^\top) \mu = A \, \text{residual}$ where the rows of $A$ are the (original, non-orthogonalized) constraint normals.

For a single constraint this reduces to the familiar inner-product formula.

source
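A sketch of the normal-equations recovery with illustrative data:

```julia
a1 = [1.0, 1.0, 1.0]            # original (non-orthogonalized) normals
a2 = [1.0, 0.0, -1.0]
A = [a1'; a2']                  # rows of A are the constraint normals

μ_true = [0.7, -0.3]
residual = A' * μ_true          # a residual of the form Aᵀμ

μ = (A * A') \ (A * residual)   # solve (AAᵀ)μ = A·residual
```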
Marguerite._recover_face_multipliers – Function
_recover_face_multipliers(residual, as::ActiveConstraints)

Recover multipliers for the active face from a residual of the form $G^\top \lambda = \text{residual}$, where G contains the active face normals.

This is used both for adjoint multipliers (with residual = dx - Hu) and for primal multipliers (with residual = -∇f(x*, θ)).

source
Marguerite._primal_face_multipliers – Function
_primal_face_multipliers(f, grad, x_star, θ, as, backend)

Recover the primal active-face multipliers from stationarity

\[\nabla_x f(x^*; \theta) + G(\theta)^T \lambda = 0.\]

When the active normals depend on $\theta$, these multipliers contribute to the implicit gradient through the term $-\lambda^T (\partial_\theta G) u$.

source
Marguerite._correct_bound_multipliers! – Function
_correct_bound_multipliers!(μ_bound, μ_eq, as::ActiveConstraints)

Correct raw bound multipliers by subtracting equality-constraint overlap.

The KKT residual at a bound index iₖ contains both the bound multiplier and projections of equality multipliers onto that index. This function removes the equality contributions so that μ_bound reflects only the bound constraint force:

\[\mu_{\text{bound},k} \mathrel{-}= \sum_j \mu_{\text{eq},j} \, a_j[i_k]\]

source
Marguerite._objective_gradient – Function
_objective_gradient(f, grad, x_star, θ, backend)

Evaluate $\nabla_x f(x^*, \theta)$ using either the user's manual gradient or the configured AD backend.

source

Cross-Derivatives

Marguerite._make_∇ₓf_of_θ – Function
_make_∇ₓf_of_θ(∇f!, x_star)

Build the map $\theta \mapsto \nabla_x f(x^*, \theta)$ from a mutating gradient ∇f!(g, x, θ) and a fixed solution x_star.

The returned closure allocates a type-promoted buffer so that forward-mode AD through $\theta$ propagates correctly. It is consumed by _cross_derivative_manual to compute the cross-derivative $(\partial \nabla_x f / \partial \theta)^\top u$.

source
Marguerite._cross_derivative_manual – Function
_cross_derivative_manual(∇ₓf_of_θ, u, θ, backend)

Compute $d\theta = -(\partial(\nabla_x f)/\partial\theta)^T u$ via AD through the scalar $\theta \mapsto \langle \nabla_x f(\theta), u \rangle$.

source
Marguerite._cross_derivative_hvp – Function
_cross_derivative_hvp(f, x_star, θ, u, hvp_backend)

Compute $d\theta = -\nabla^2_{\theta x} f \cdot u$ via a joint HVP on $g(z) = f(z_{1:n}, z_{n+1:\text{end}})$ with $z = [x; \theta]$.

Note

Uses @view slicing which requires the AD backend to support SubArray differentiation. ForwardDiff and Mooncake support this; other backends (e.g. Enzyme, Zygote) may not. If using an unsupported backend, provide a manual grad function to bypass this joint-HVP path.

source

Constraint Pullback

Marguerite._constraint_scalar – Function
_constraint_scalar(plmo::ParametricOracle, θ, x_star, u, μ_bound, μ_eq, λ_bound, λ_eq, as)

Compute the scalar function $\Phi(\theta)$ whose gradient gives the constraint sensitivity contribution to $d\theta$.

For simple RHS-parametric constraints, $\Phi(\theta) = \mu^T h(\theta)$ where $h(\theta)$ are the active constraint RHS values. For constraints with $\theta$-dependent normals, the scalar also includes the primal-multiplier term $-\lambda^T G(\theta) u$ required by the full linear-face KKT pullback.

source
Marguerite._constraint_pullback – Function
_constraint_pullback(plmo::ParametricOracle, θ, x_star, u, μ_bound, μ_eq, λ_bound, λ_eq, as, backend)

Compute $d\theta_{\text{constraint}}$ via AD through the constraint scalar function.

source

Tangent Maps

Marguerite._TangentMap – Type
_TangentMap{T}

Oracle-specific tangent space geometry for implicit differentiation.

At a constrained optimum, only certain directions are "free" — the rest are pinned by active constraints. A tangent map encodes how to move between the full variable space and the reduced space of free directions (the "tangent space" of the constraint surface).

All polyhedral oracles (Simplex, Box, Knapsack, etc.) share one tangent map (_PolyhedralTangentMap) because they all have the same structure: some variables hit bounds, linear equality constraints restrict the rest. The Spectraplex needs its own (_SpectralTangentMap) because its constraint surface (the PSD cone) has curved geometry requiring eigendecomposition-based operations.

Interface methods are dispatched on the concrete subtype; see _expand_tangent!, _tangent_correction!, and _reduced_dim below.

source
Marguerite._PolyhedralTangentMap – Type
_PolyhedralTangentMap{T} <: _TangentMap{T}

Tangent map for polyhedral constraint sets (Simplex, Box, Knapsack, etc.).

At the solution, variables split into two groups:

  • Bound: variables on a constraint boundary that cannot move.
  • Free: interior variables that form the reduced search space.

Equality constraints (e.g. $\sum x_i = r$ for the Simplex) further restrict the free variables to a subspace. The orthogonalized constraint normals a_frees and their squared norms a_norm_sqs enable null space projection: removing components that would violate equality constraints.

Fields

  • free::Vector{Int} – indices of free (non-bound) variables
  • bound::Vector{Int} – indices of variables pinned to constraint boundaries
  • a_frees::Vector{Vector{T}} – orthogonalized equality constraint normals restricted to free variables
  • a_frees_orig::Vector{Vector{T}} – original (pre-orthogonalization) constraint normals, retained for multiplier recovery
  • a_norm_sqs::Vector{T} – squared norms of orthogonalized normals
source
Marguerite._SpectralTangentMap – Type
_SpectralTangentMap{T} <: _TangentMap{T}

Tangent map for the Spectraplex constraint ($X \succeq 0$, $\operatorname{tr}(X) = r$).

The solution matrix $X^*$ has an eigendecomposition with rank nonzero eigenvalues. The eigenvectors split into:

  • U (n × rank): eigenvectors with nonzero eigenvalues — the "active face" of the PSD cone, where $X^*$ has support.
  • V_perp (n × nullity): eigenvectors with zero eigenvalues — the boundary of the PSD cone, where $X^*$ touches the cone edge.

The tangent space at $X^*$ consists of symmetric perturbations that preserve trace and PSD structure. Its dimension is rank*(rank+1)/2 - 1 (trace zero face block) plus rank * nullity (active×null cross block).

G_uu and G_vv are the objective's Hessian projected onto the active and null eigenspaces. They appear in a curvature correction term: unlike polyhedral constraints, moving along the active×null cross block incurs additional curvature B·G_vv - G_uu·B from the cone boundary itself.

Remaining fields (tmp_face, tmp_null, face, mixed, mixed_curv, full, cross) are pre-allocated work buffers for compress/expand/curvature operations.

source
Marguerite._expand_tangent! – Function
_expand_tangent!(out, z, tm::_TangentMap)

Expand tangent space vector z back to the full variable space, writing to out.

Polyhedral: scatters into free variable positions, zeros elsewhere. Spectral: reconstructs via _spectraplex_expand!.

source
Marguerite._tangent_correction! – Function
_tangent_correction!(out, z, tm::_TangentMap)

Post-HVP correction when building the reduced Hessian matrix.

Polyhedral: null space projection — removes components along equality constraint normals that are not truly free. Spectral: adds the mixed curvature term from the PSD cone geometry. At a rank deficient optimum, perturbing in the active×null cross block sees additional curvature $B G_{vv} - G_{uu} B$ from the cone boundary.

source
Marguerite._reduced_dim – Function
_reduced_dim(tm::_TangentMap) -> Int

Working dimension for the reduced Hessian. For polyhedral, this is the number of free variables (equality constraints are handled via null space projection, not dimensional reduction).

source

Spectraplex

Oracle

Marguerite.SpectraplexEqNormals – Type
SpectraplexEqNormals{T}

Lightweight representation of equality constraints for the spectraplex active set. Stores the active eigenvectors U (rank columns) and null space eigenvectors V_perp (nullity columns). The constraint count encodes antisymmetry, trace, mixed, and null-null constraints without materializing them as dense vectors — the differentiation pipeline dispatches on ActiveConstraints{AT, <:SpectraplexEqNormals} and works with U/V_perp directly via tangent space compress/expand operations.

source
Marguerite._spectraplex_min_eigen – Function
_spectraplex_min_eigen(g, n[, buf]) -> (λ_min, v_min)

Symmetrize the gradient (reshaped as n×n) and return the minimum eigenvalue and eigenvector. Shared by the oracle callable and _lmo_and_gap!.

The buf argument is an n×n pre-allocated matrix used for symmetrization and destroyed by eigen!. When omitted, a fresh buffer is allocated.

source

Tangent Space Coordinates

Marguerite._spectraplex_trace_zero_dim – Function
_spectraplex_trace_zero_dim(rank) -> Int

Number of free parameters in a rank × rank symmetric, trace zero matrix. Equal to rank*(rank+1)/2 - 1 (upper triangle minus the trace constraint).

source
Marguerite._spectraplex_tangent_dim – Function
_spectraplex_tangent_dim(rank, nullity) -> Int

Total dimension of the spectraplex tangent space: face block (trace zero symmetric, rank*(rank+1)/2 - 1) plus mixed block (rank * nullity).

source
Marguerite._spectraplex_pack_trace_zero! – Function
_spectraplex_pack_trace_zero!(out, M) -> out

Encode a k × k symmetric trace zero matrix M as a flat vector.

Layout: first k-1 entries are diagonal differences M[i,i] - M[k,k], then upper-triangle off-diagonal sums M[i,j] + M[j,i]. The last diagonal is implicit via the trace zero constraint.

Inverse: _spectraplex_unpack_trace_zero!.

source
Marguerite._spectraplex_unpack_trace_zero! – Function
_spectraplex_unpack_trace_zero!(S, z) -> S

Recover a k × k symmetric trace zero matrix S from its packed vector z.

Inverts _spectraplex_pack_trace_zero!: recovers diagonals from the stored differences plus the trace zero constraint (S[k,k] = -∑ z[1:k-1] / k), then fills the symmetric off-diagonals from the stored sums (z[p] / 2).

source
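The pack/unpack pair round-trips exactly. A standalone re-implementation of the layout for illustration (not Marguerite's code):

```julia
function pack_trace_zero(M)
    k = size(M, 1)
    z = Float64[]
    for i in 1:k-1
        push!(z, M[i, i] - M[k, k])          # diagonal differences
    end
    for j in 2:k, i in 1:j-1
        push!(z, M[i, j] + M[j, i])          # off-diagonal sums
    end
    return z
end

function unpack_trace_zero(z, k)
    S = zeros(k, k)
    S[k, k] = -sum(z[1:k-1]) / k             # from the trace-zero constraint
    for i in 1:k-1
        S[i, i] = z[i] + S[k, k]
    end
    p = k
    for j in 2:k, i in 1:j-1
        S[i, j] = S[j, i] = z[p] / 2
        p += 1
    end
    return S
end
```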
Marguerite._spectraplex_compress! – Function
_spectraplex_compress!(out, x, U, V_perp, tmp_face, tmp_null, face, mixed, full) -> out

Compress a full vector x to tangent space coordinates out.

Symmetrizes x as an n × n matrix, then extracts:

  • Face block: $U^\top \operatorname{sym}(X) \, U$ → packed trace zero
  • Mixed block: $U^\top \operatorname{sym}(X) \, V_\perp$ → packed column major

Inverse: _spectraplex_expand!.

source
Marguerite._spectraplex_expand! – Function
_spectraplex_expand!(out, z, U, V_perp, face, mixed, tmp_face, tmp_null, full, cross) -> out

Expand tangent space coordinates z to a full vector out.

Reconstructs the symmetric matrix perturbation: $\Delta X = U \, S \, U^\top + U \, B \, V_\perp^\top + V_\perp \, B^\top \, U^\top$ where S is the trace zero face block and B is the mixed block, both unpacked from z. Then vectorizes column major into out.

Inverse: _spectraplex_compress!.

source
Marguerite._spectraplex_add_mixed_curvature! – Function
_spectraplex_add_mixed_curvature!(out, z, G_uu, G_vv, mixed, mixed_curv) -> out

Add the PSD cone curvature correction to a reduced Hessian-vector product.

At a rank deficient optimum, perturbing in the active×null cross block B (extracted from z) sees curvature from the cone boundary: $\Delta_{\text{out}} \mathrel{+}= B \, G_{vv} - G_{uu} \, B$ where G_uu and G_vv are the objective Hessian restricted to the active and null eigenspaces. This term is zero when the solution is full-rank.

source

Index