Releases · Entrolution/echidna

14 Mar 22:47

gvonness-apolitical

v0.5.0

32f8a75

v0.5.0 Latest

Added

GPU cast safety audit: added SAFETY comments to every `as u32` cast in the GPU paths (mod.rs, cuda_backend.rs, wgpu_backend.rs, stde_gpu.rs), and added debug_assert! guards on user-provided direction and batch counts in stde_gpu.rs.
#[must_use] annotations: 19 pure functions now carry #[must_use] (support module helpers, GPU codegen, solver wrappers, Laurent::zero/one).
#![warn(missing_docs)]: enabled crate-wide. All public items now have doc comments: 35 OpCode variants, ~190 elemental methods across Dual, DualVec, Taylor, TaylorDyn, and Laurent, plus struct fields and trait methods.
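
The first two items can be pictured with a small sketch. The helper name `batch_count_u32` is hypothetical and is not taken from the crate; it only shows a `#[must_use]` function whose `as u32` cast carries a SAFETY comment and a `debug_assert!` guard.

```rust
/// Converts a user-provided batch count into the `u32` that a GPU dispatch
/// call typically expects. (Hypothetical helper for illustration.)
#[must_use]
pub fn batch_count_u32(n: usize) -> u32 {
    // Guard the narrowing cast in debug builds.
    debug_assert!(
        n <= u32::MAX as usize,
        "batch count {} does not fit in u32",
        n
    );
    // SAFETY: the debug assertion above documents the expected range; in
    // release builds an out-of-range value would be truncated by `as`.
    n as u32
}
```

Because the function is `#[must_use]`, calling `batch_count_u32(n);` and discarding the result triggers a compiler warning.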

Changed

Test decomposition: split tests/stde.rs (1630 lines, 76 tests) into 5 focused files: stde_core, stde_stats, stde_pipeline, stde_higher_order, stde_dense. All 76 tests preserved.
Removed ROADMAP.md — all phases (0–5) complete.

26 Feb 01:52

gvonness-apolitical

v0.4.0

764925c

v0.4.0

Changed

Internal Architecture

BytecodeTape decomposition: split 2,689-line monolithic bytecode_tape.rs into a directory module with 10 focused submodules (forward.rs, reverse.rs, tangent.rs, jacobian.rs, sparse.rs, optimize.rs, taylor.rs, parallel.rs, serde_support.rs, thread_local.rs). Zero public API changes; benchmarks confirm no performance impact.
Deduplicated reverse sweep in gradient_with_buf() and sparse_jacobian_par() — both now call shared reverse_sweep_core() instead of inlining the loop. gradient_with_buf gains the zero-adjoint skip optimization it was previously missing.
Bumped nalgebra dependency from 0.33 to 0.34
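
As a rough illustration of a shared reverse sweep with a zero-adjoint skip, the sketch below uses a toy tape. The `Op` enum and the function names are assumptions made for this example and do not mirror the crate's `reverse_sweep_core()`.

```rust
/// Toy tape entry; operands are indices of earlier entries.
#[derive(Clone, Copy)]
enum Op {
    Input,
    Add(usize, usize),
    Mul(usize, usize),
    Sin(usize),
}

/// Forward pass: evaluates every tape entry in order.
fn forward(ops: &[Op], inputs: &[f64]) -> Vec<f64> {
    let mut vals: Vec<f64> = Vec::with_capacity(ops.len());
    let mut next_input = 0;
    for op in ops {
        let v = match *op {
            Op::Input => {
                let v = inputs[next_input];
                next_input += 1;
                v
            }
            Op::Add(a, b) => vals[a] + vals[b],
            Op::Mul(a, b) => vals[a] * vals[b],
            Op::Sin(a) => vals[a].sin(),
        };
        vals.push(v);
    }
    vals
}

/// Reverse pass: propagates adjoints from `output` back to the inputs.
fn reverse_sweep(ops: &[Op], vals: &[f64], output: usize) -> Vec<f64> {
    let mut adj = vec![0.0_f64; ops.len()];
    adj[output] = 1.0;
    for i in (0..ops.len()).rev() {
        let a_i = adj[i];
        // Zero-adjoint skip: this entry cannot contribute to any operand.
        if a_i == 0.0 {
            continue;
        }
        match ops[i] {
            Op::Input => {}
            Op::Add(a, b) => {
                adj[a] += a_i;
                adj[b] += a_i;
            }
            Op::Mul(a, b) => {
                adj[a] += a_i * vals[b];
                adj[b] += a_i * vals[a];
            }
            Op::Sin(a) => adj[a] += a_i * vals[a].cos(),
        }
    }
    adj
}
```

For the tape `[Input, Input, Mul(0, 1), Sin(0), Add(2, 3)]`, which encodes f(x, y) = x·y + sin(x), the adjoints at indices 0 and 1 are the partial derivatives y + cos(x) and x.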

Fixed

Corrected opcode variant count in documentation (44 variants, not 38/43)
Fixed CONTRIBUTING.md MSRV reference (1.93, not 1.80)

25 Feb 14:25

github-actions

v0.3.0

98020ac

v0.3.0

Added

Differential Operator Evaluation (`diffop` feature)

diffop::mixed_partial(tape, x, orders) — compute any mixed partial derivative via jet coefficient extraction
diffop::hessian(tape, x) — full Hessian via jet extraction (cross-validated against tape.hessian())
MultiIndex — specify which mixed partial to compute (e.g., [2, 0, 1] = ∂³u/∂x₀²∂x₂)
JetPlan::plan(n, indices) — precompute slot assignments and extraction prefactors; reuse across evaluation points
diffop::eval_dyn(plan, tape, x) — evaluate a plan at a new point using TaylorDyn
Pushforward grouping: multi-indices with different active variable sets get separate forward passes to avoid slot contamination
Prime window sliding for collision-free slot assignment up to high derivative orders
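
One way to read "jet coefficient extraction" is sketched below with plain truncated polynomials. The slot choice (2 for x, 3 for y), the test function, and all names are assumptions made for this example; the crate's slot assignment and prefactor logic may differ. Substituting x + t² and y + t³ into f and expanding gives a series in which the coefficient of t⁵ can only come from the multi-index [1, 1], since 2a + 3b = 5 has no other non-negative solution. That coefficient, multiplied by 1!·1!, is ∂²f/∂x∂y.

```rust
const DEG: usize = 5;
/// Truncated polynomial in t: coefficient k multiplies t^k.
type Jet = [f64; DEG + 1];

/// Jet for an input variable: value plus a unit perturbation at t^slot.
fn jet_var(x: f64, slot: usize) -> Jet {
    let mut j = [0.0; DEG + 1];
    j[0] = x;
    j[slot] += 1.0;
    j
}

fn jet_add(a: &Jet, b: &Jet) -> Jet {
    let mut out = [0.0; DEG + 1];
    for k in 0..=DEG {
        out[k] = a[k] + b[k];
    }
    out
}

/// Cauchy product, truncated at degree DEG.
fn jet_mul(a: &Jet, b: &Jet) -> Jet {
    let mut out = [0.0; DEG + 1];
    for i in 0..=DEG {
        for j in 0..=(DEG - i) {
            out[i + j] += a[i] * b[j];
        }
    }
    out
}

/// d²f/dx dy for the illustrative function f(x, y) = x²·y + x·y².
fn mixed_xy(x: f64, y: f64) -> f64 {
    let jx = jet_var(x, 2); // x occupies slot 2
    let jy = jet_var(y, 3); // y occupies slot 3
    let x2y = jet_mul(&jet_mul(&jx, &jx), &jy);
    let xy2 = jet_mul(&jx, &jet_mul(&jy, &jy));
    let f = jet_add(&x2y, &xy2);
    // Target exponent 1*2 + 1*3 = 5; prefactor 1! * 1! = 1.
    f[5]
}
```

For this f the exact mixed partial is 2x + 2y, so `mixed_xy(1.5, 2.0)` can be checked against 7.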

25 Feb 01:02

gvonness-apolitical

v0.2.0

9e19af9

v0.2.0

Added

Bytecode Tape (Graph-Mode AD)

BytecodeTape SoA graph-mode AD with opcode dispatch and tape optimization (CSE, DCE, constant folding)
BReverse<F> tape-recording reverse-mode variable
record() / record_multi() to build tapes from closures
Hessian computation via forward-over-reverse (hessian, hvp)
DualVec<F, N> batched forward-mode with N tangent lanes for vectorized Hessians (hessian_vec)
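
To make "batched forward-mode with N tangent lanes" concrete, here is a minimal sketch. The type `Lanes` is a hypothetical stand-in written for this example, not the crate's `DualVec<F, N>`; it carries one value and N tangents so a single evaluation yields N directional derivatives.

```rust
use std::ops::{Add, Mul};

/// A value with N tangent lanes (illustrative stand-in, f64 only).
#[derive(Clone, Copy, Debug)]
struct Lanes<const N: usize> {
    re: f64,
    eps: [f64; N],
}

impl<const N: usize> Lanes<N> {
    /// Input variable whose tangent is the unit vector for `lane`.
    fn seeded(value: f64, lane: usize) -> Self {
        let mut eps = [0.0; N];
        eps[lane] = 1.0;
        Self { re: value, eps }
    }
}

impl<const N: usize> Add for Lanes<N> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let mut eps = [0.0; N];
        for i in 0..N {
            eps[i] = self.eps[i] + rhs.eps[i];
        }
        Self { re: self.re + rhs.re, eps }
    }
}

impl<const N: usize> Mul for Lanes<N> {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        let mut eps = [0.0; N];
        for i in 0..N {
            // Product rule applied lane by lane.
            eps[i] = self.eps[i] * rhs.re + self.re * rhs.eps[i];
        }
        Self { re: self.re * rhs.re, eps }
    }
}

/// Value and full gradient of f(x, y) = x·y + x in one forward pass.
fn value_and_gradient(x: f64, y: f64) -> (f64, [f64; 2]) {
    let xs = Lanes::<2>::seeded(x, 0);
    let ys = Lanes::<2>::seeded(y, 1);
    let f = xs * ys + xs;
    (f.re, f.eps)
}
```

With two lanes the whole gradient (y + 1, x) comes out of one pass; a vectorized Hessian applies the same idea to tangent directions pushed through a reverse sweep.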

Sparse Derivatives

Sparsity pattern detection via bitset propagation
Graph coloring: greedy distance-2 for Jacobians, star bicoloring for Hessians
sparse_jacobian, sparse_hessian, sparse_hessian_vec
CSR storage (CsrPattern, JacobianSparsityPattern, SparsityPattern)
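
The greedy column coloring used for sparse Jacobians can be sketched as follows. This is a generic illustration under the assumption that the sparsity pattern is given as a list of column indices per row; the function name is hypothetical and the crate's data structures (CSR, bitsets) are not reproduced. Two columns receive different colors whenever they share a row, which is the distance-2 condition on the bipartite row/column graph.

```rust
/// Greedy coloring of Jacobian columns. `rows[r]` lists the columns with a
/// structural nonzero in row `r`. Returns one color index per column.
fn color_columns(rows: &[Vec<usize>], n_cols: usize) -> Vec<usize> {
    // Invert the pattern: for each column, the rows it touches.
    let mut col_rows: Vec<Vec<usize>> = vec![Vec::new(); n_cols];
    for (r, cols) in rows.iter().enumerate() {
        for &c in cols {
            col_rows[c].push(r);
        }
    }

    const UNCOLORED: usize = usize::MAX;
    let mut color = vec![UNCOLORED; n_cols];
    for c in 0..n_cols {
        // Mark the colors already used by columns sharing a row with `c`.
        let mut forbidden = vec![false; n_cols];
        for &r in &col_rows[c] {
            for &other in &rows[r] {
                if other != c && color[other] != UNCOLORED {
                    forbidden[color[other]] = true;
                }
            }
        }
        // At most n_cols - 1 colors can be forbidden, so one is always free.
        color[c] = forbidden
            .iter()
            .position(|&used| !used)
            .expect("a free color always exists");
    }
    color
}
```

Columns with the same color can be seeded together in one forward pass, so a tridiagonal 4×4 pattern needs 3 passes instead of 4.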

Taylor Mode AD

Taylor<F, K> const-generic Taylor coefficients with Cauchy product propagation
TaylorDyn<F> arena-based dynamic Taylor (runtime degree)
taylor_grad / taylor_grad_with_buf — reverse-over-Taylor for gradient + HVP + higher-order adjoints
ode_taylor_step / ode_taylor_step_with_buf — ODE Taylor series integration via coefficient bootstrapping
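
Coefficient bootstrapping for an ODE Taylor step can be shown on a scalar example. The ODE y' = y² and the function names below are chosen for this sketch and are not the crate's API. Each new coefficient follows from the Cauchy product of the coefficients already known: y_{k+1} = (Σ_{i=0..k} y_i · y_{k-i}) / (k + 1).

```rust
/// Taylor coefficients of the solution of y' = y² with y(0) = y0,
/// up to and including degree `order`.
fn taylor_coeffs_y_squared(y0: f64, order: usize) -> Vec<f64> {
    let mut y = vec![y0];
    for k in 0..order {
        // k-th Taylor coefficient of y², via the Cauchy product.
        let f_k: f64 = (0..=k).map(|i| y[i] * y[k - i]).sum();
        // Matching powers of t in y' = y² gives (k + 1) * y_{k+1} = f_k.
        y.push(f_k / (k as f64 + 1.0));
    }
    y
}

/// Evaluates the truncated series at step size `h` with Horner's rule.
fn taylor_step(coeffs: &[f64], h: f64) -> f64 {
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * h + c)
}
```

With y0 = 1 the exact solution is 1/(1 - t), whose coefficients are all 1, so the output is easy to check by hand.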

Stochastic Taylor Derivative Estimators (STDE)

laplacian — Hutchinson trace estimator for Laplacian approximation
hessian_diagonal — exact Hessian diagonal via coordinate basis
directional_derivatives — batched second-order directional derivatives
laplacian_with_stats — Welford's online variance tracking
laplacian_with_control — diagonal control variate variance reduction
Estimator trait generalizing per-direction sample computation (Laplacian, GradientSquaredNorm)
estimate / estimate_weighted generic pipeline
Hutchinson divergence estimator for vector fields via Dual<F> forward mode
Hutch++ (Meyer et al. 2021) trace estimator with O(1/S²) variance via sketch + residual decomposition
Importance-weighted estimation (West's 1979 algorithm)
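
The combination of a Hutchinson-style trace estimate with Welford's online variance can be sketched as follows. All names are hypothetical. For testability the sketch uses an explicit matrix and enumerates every ±1 direction instead of sampling at random; an STDE-style estimator would instead draw random directions and obtain vᵀHv from second-order Taylor coefficients.

```rust
/// Welford's online mean/variance accumulator.
#[derive(Default)]
struct RunningStats {
    n: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the mean
}

impl RunningStats {
    fn push(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Unbiased sample variance; `None` with fewer than two samples.
    fn sample_variance(&self) -> Option<f64> {
        (self.n > 1).then(|| self.m2 / (self.n - 1) as f64)
    }
}

/// Quadratic form vᵀ A v for a dense square matrix.
fn quad_form(a: &[Vec<f64>], v: &[f64]) -> f64 {
    let mut s = 0.0;
    for i in 0..v.len() {
        for j in 0..v.len() {
            s += v[i] * a[i][j] * v[j];
        }
    }
    s
}

/// Accumulates vᵀ A v over all 2^n sign vectors v in {-1, +1}^n.
/// The mean of these samples equals trace(A).
fn hutchinson_all_signs(a: &[Vec<f64>]) -> RunningStats {
    let n = a.len();
    let mut stats = RunningStats::default();
    for mask in 0u32..(1u32 << n) {
        let v: Vec<f64> = (0..n)
            .map(|i| if (mask >> i) & 1 == 1 { 1.0 } else { -1.0 })
            .collect();
        stats.push(quad_form(a, &v));
    }
    stats
}
```

For the matrix [[2, 1], [1, 3]] the four samples are 7, 3, 3, 7, so the mean recovers the trace 5 and the sample variance is 16/3.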

Cross-Country Elimination

jacobian_cross_country — Markowitz vertex elimination on linearized computational graph

Custom Operations

eval_dual / partials_dual default methods on CustomOp<F> for correct second-order derivatives (HVP, Hessian) through custom ops

Nonsmooth AD

forward_nonsmooth — branch tracking and kink detection for abs/min/max/signum/floor/ceil/round/trunc
clarke_jacobian — Clarke generalized Jacobian via limiting Jacobian enumeration
has_nontrivial_subdifferential() — two-tier classification: all 8 nonsmooth ops tracked for proximity detection; only abs/min/max enumerated in Clarke Jacobian
KinkEntry, NonsmoothInfo, ClarkeError types

Laurent Series

Laurent<F, K> — singularity analysis with pole tracking, flows through BytecodeTape::forward_tangent

Checkpointing

grad_checkpointed — binomial Revolve checkpointing
grad_checkpointed_online — periodic thinning for unknown step count
grad_checkpointed_disk — disk-backed for large state vectors
grad_checkpointed_with_hints — user-controlled checkpoint placement
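
The trade-off behind checkpointing can be shown with a deliberately simple uniform-stride scheme. This is not binomial Revolve and not the crate's API; the step function and all names are assumptions for the sketch. Only every `stride`-th state is stored on the forward pass, and each segment is recomputed when the reverse pass reaches it.

```rust
/// Illustrative time step and its derivative.
fn step(x: f64) -> f64 {
    x + 0.1 * x.sin()
}
fn step_deriv(x: f64) -> f64 {
    1.0 + 0.1 * x.cos()
}

/// Reference: stores every intermediate state. Returns d x_T / d x_0.
fn grad_full(x0: f64, steps: usize) -> f64 {
    let mut states = Vec::with_capacity(steps);
    let mut x = x0;
    for _ in 0..steps {
        states.push(x);
        x = step(x);
    }
    states.iter().rev().fold(1.0, |adj, &s| adj * step_deriv(s))
}

/// Stores one state per `stride` steps and recomputes the rest on demand.
fn grad_strided(x0: f64, steps: usize, stride: usize) -> f64 {
    assert!(stride > 0, "stride must be positive");
    // Forward pass: keep only the state at the start of each segment.
    let mut checkpoints = Vec::new();
    let mut x = x0;
    for t in 0..steps {
        if t % stride == 0 {
            checkpoints.push(x);
        }
        x = step(x);
    }
    // Reverse pass: walk segments last to first, recomputing their states.
    let mut adj = 1.0;
    for (seg, &start) in checkpoints.iter().enumerate().rev() {
        let begin = seg * stride;
        let end = (begin + stride).min(steps);
        let mut segment = Vec::with_capacity(end - begin);
        let mut s = start;
        for _ in begin..end {
            segment.push(s);
            s = step(s);
        }
        for &state in segment.iter().rev() {
            adj *= step_deriv(state);
        }
    }
    adj
}
```

Peak memory drops from `steps` states to roughly `steps / stride + stride`, at the cost of one extra forward evaluation per step.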

GPU Acceleration

wgpu backend: batched forward, gradient, sparse Jacobian, HVP, sparse Hessian (f32, Metal/Vulkan/DX12)
CUDA backend: same operations with f32 + f64 support (NVRTC runtime compilation)
GpuBackend trait unifying wgpu and CUDA backends behind a common interface

Composable Mode Nesting

Type-level AD composition: Dual<BReverse<f64>>, Taylor<BReverse<f64>, K>, DualVec<BReverse<f64>, N>
composed_hvp convenience function for forward-over-reverse HVP
BReverse<Dual<f64>> reverse-wrapping-forward composition via BtapeThreadLocal impls for Dual<f32> and Dual<f64>
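
Type-level composition can be illustrated with forward-over-forward nesting, which is simpler than the forward-over-reverse combinations listed above. `MiniDual` is a hypothetical generic dual number written for this sketch, not the crate's `Dual<F>`. Nesting it inside itself yields a second derivative.

```rust
use std::ops::{Add, Mul};

/// Generic dual number: `re` is the value, `eps` the tangent.
#[derive(Clone, Copy)]
struct MiniDual<T> {
    re: T,
    eps: T,
}

impl<T: Copy + Add<Output = T>> Add for MiniDual<T> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        MiniDual { re: self.re + rhs.re, eps: self.eps + rhs.eps }
    }
}

impl<T: Copy + Add<Output = T> + Mul<Output = T>> Mul for MiniDual<T> {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        MiniDual {
            re: self.re * rhs.re,
            // Product rule.
            eps: self.re * rhs.eps + self.eps * rhs.re,
        }
    }
}

/// Any function written generically works for plain and nested numbers.
fn cube<T: Copy + Mul<Output = T>>(x: T) -> T {
    x * x * x
}

/// Second derivative of x³ via MiniDual<MiniDual<f64>>.
fn second_derivative_of_cube(x: f64) -> f64 {
    // Inner level carries d/dx; outer level differentiates once more.
    let inner = MiniDual { re: x, eps: 1.0 };
    let outer = MiniDual { re: inner, eps: MiniDual { re: 1.0, eps: 0.0 } };
    cube(outer).eps.eps
}
```

The `.eps.eps` component holds the second derivative, so the result at x = 2 is 6x = 12.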

Serialization

serde support for BytecodeTape, Laurent<F, K>, KinkEntry, NonsmoothInfo, ClarkeError
JSON and bincode roundtrip support

Linear Algebra Integrations

faer_support: HVP, sparse Hessian, dense/sparse solvers (LU, Cholesky)
nalgebra_support: gradient, Hessian, Jacobian with nalgebra types
ndarray_support: HVP, sparse Hessian, sparse Jacobian with ndarray types

Optimization Solvers (`echidna-optim`)

L-BFGS solver with two-loop recursion
Newton solver with Cholesky factorization
Trust-region solver with Steihaug-Toint CG
Armijo line search
Implicit differentiation: implicit_tangent, implicit_adjoint, implicit_jacobian, implicit_hvp, implicit_hessian
Piggyback differentiation: tangent, adjoint, and interleaved forward-adjoint modes
Sparse implicit differentiation via faer sparse LU (sparse-implicit feature)
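
As a sketch of the Armijo line search listed above, the following one-dimensional backtracking routine shows the sufficient-decrease test. The signature and parameter names are assumptions for this example and do not reflect the `echidna-optim` API.

```rust
/// Backtracking line search with the Armijo sufficient-decrease condition
/// f(x + a*d) <= f(x) + c1 * a * f'(x) * d, starting from a = 1.
/// Returns `None` if `dir` is not a descent direction or no step is found.
fn armijo_step(
    f: impl Fn(f64) -> f64,
    x: f64,
    grad: f64,
    dir: f64,
    c1: f64,
    shrink: f64,
) -> Option<f64> {
    let f0 = f(x);
    let slope = grad * dir; // directional derivative along `dir`
    if slope >= 0.0 {
        return None;
    }
    let mut alpha = 1.0;
    for _ in 0..50 {
        if f(x + alpha * dir) <= f0 + c1 * alpha * slope {
            return Some(alpha);
        }
        alpha *= shrink;
    }
    None
}
```

For f(x) = x² at x = 1 with the steepest-descent direction -2, the full step overshoots to x = -1 and is rejected; halving once lands on the minimizer, so the accepted step is 0.5.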

Benchmarking

Criterion benchmarks for Taylor mode, STDE, cross-country, sparse derivatives, nonsmooth
Comparison benchmarks against num-dual and ad-trait (forward + reverse gradient)
Correctness cross-check tests verifying ad-trait gradient agreement with echidna
CI regression detection via criterion-compare-action

Changed

Tape optimization: algebraic simplification at recording time (identity, absorbing, powi patterns)
Tape optimization: targeted multi-output DCE (dead_code_elimination_for_outputs)
Thread-local Adept tape pooling — grad()/vjp() reuse cleared tapes via thread-local pool instead of per-call allocation
Signed::signum() for BReverse<F> now records OpCode::Signum to tape (was returning a constant)
MSRV raised from 1.80 to 1.93
WelfordAccumulator struct extracted, deduplicating Welford's algorithm across 4 STDE functions
cuda_err helper extracted, replacing 72 inline .map_err closures in CUDA backend
create_tape_bind_group method extracted, replacing 4 duplicated bind group blocks in wgpu backend
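
Algebraic simplification at recording time, the first item in this list, might look like the sketch below. The `Node` and `Recorder` types are hypothetical and far smaller than a real tape. Identity, absorbing, and powi patterns are checked before a node is appended, so trivial operations never reach the tape.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Node {
    Const(f64),
    Var(usize),
    Mul(usize, usize),
    Powi(usize, i32),
}

/// Minimal recorder: nodes are appended and referred to by index.
struct Recorder {
    nodes: Vec<Node>,
}

impl Recorder {
    fn push(&mut self, n: Node) -> usize {
        self.nodes.push(n);
        self.nodes.len() - 1
    }

    fn is_const(&self, id: usize, c: f64) -> bool {
        self.nodes[id] == Node::Const(c)
    }

    fn mul(&mut self, a: usize, b: usize) -> usize {
        // Identity: 1 * x = x * 1 = x.
        if self.is_const(a, 1.0) {
            return b;
        }
        if self.is_const(b, 1.0) {
            return a;
        }
        // Absorbing: 0 * x = 0. This ignores NaN and infinity operands,
        // a trade-off a real implementation would need to decide on.
        if self.is_const(a, 0.0) {
            return a;
        }
        if self.is_const(b, 0.0) {
            return b;
        }
        self.push(Node::Mul(a, b))
    }

    fn powi(&mut self, a: usize, n: i32) -> usize {
        match n {
            0 => self.push(Node::Const(1.0)), // x^0 = 1
            1 => a,                           // x^1 = x
            _ => self.push(Node::Powi(a, n)),
        }
    }
}
```

Recording `x * 1` or `x.powi(1)` returns the index of `x` itself, so the tape length is unchanged.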
