Optimization #17

Open
Balinus wants to merge 3 commits into master from optimization

Balinus commented Mar 25, 2026

The first optimization pass includes the following modifications:

Changes made

  1. src/distributions/Power.jl — Direct closed-form math
    Replaced delegation through Beta(κ, 1) wrapper with direct formulas: cdf = x^κ, logpdf = log(κ) + (κ-1)log(x), quantile = p^(1/κ), rand = u^(1/κ). Also added a specialized logcdf = κ·log(x).

  2. src/distributions/TruncatedNormal.jl — Cached underlying distribution
    The Truncated(Normal(...)) wrapper was being recreated on every pdf/cdf/quantile/rand call. Now precomputed once at construction and stored in a _dist field.

  3. src/distributions/TruncatedBeta.jl — Same caching strategy
    The LocationScale(Truncated(Beta(...))) triple-nested wrapper is now built once at construction.

  4. src/distributions/ExtendedGeneralizedPareto.jl — @inline hints + logcdf
    Added @inline on hot methods and a specialized logcdf that avoids log(cdf(...)).

  5. src/parameterestimation.jl — Allocation-free hot loop

    - Replaced sum(logpdf.(pd, y⁺)), which allocates a temporary array, with an explicit @inbounds loop.
    - Avoided constructing the EGP in the uncensored path; the loop uses V and G directly.
    - Replaced count(y .< threshold), which allocates a BitArray, with count(v -> v < threshold, y).
    - Used expm1(ν) for better numerical accuracy.
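The allocation-related changes in item 5 can be illustrated with a minimal sketch. It is not the code from src/parameterestimation.jl: `toy_logpdf`, `loglik_broadcast`, `loglik_loop`, and `count_below` are hypothetical names, and an exponential log-density stands in for the EGP log-density.

```julia
# Hypothetical stand-in for a per-observation log-density (exponential with the
# given rate). The package's EGP log-density is more involved.
toy_logpdf(x, rate) = log(rate) - rate * x

# Allocating version: broadcasting builds a temporary array before summing.
loglik_broadcast(y, rate) = sum(toy_logpdf.(y, rate))

# Allocation-free version: accumulate into a scalar inside an explicit loop.
function loglik_loop(y::AbstractVector{<:Real}, rate::Real)
    acc = zero(float(eltype(y)))
    @inbounds for i in eachindex(y)
        acc += toy_logpdf(y[i], rate)
    end
    return acc
end

# The predicate form of `count` avoids materializing the BitArray from `y .< threshold`.
count_below(y, threshold) = count(v -> v < threshold, y)

y = [0.1, 0.5, 1.2, 2.0, 3.3]
println(loglik_loop(y, 2.0) ≈ loglik_broadcast(y, 2.0))
println(count_below(y, 1.0))

# expm1(ν) keeps precision for small ν, where exp(ν) - 1 loses digits to cancellation.
println(expm1(1e-10), " vs ", exp(1e-10) - 1)
```

Both likelihood versions return the same value up to floating-point rounding; the loop only removes the intermediate array.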

| Category | Benchmark | Speedup |
| --- | --- | --- |
| quantile | EGP_Power_vector | 94% faster |
| rand | EGP_Power_1000 | 95% faster |
| quantile | EGP_Power_scalar | 86% faster |
| rand | EGP_Power_single | 89% faster |
| cdf | EGP_Power_scalar | 78% faster |
| cdf | EGP_TNormal_scalar | 74% faster |
| fit_mle | EGP_TNormal_n500 | 74% faster |
| cdf | EGP_Power_vector | 72% faster |
| rand | EGP_TNormal_single | 71% faster |
| pdf | TBeta variants | 60-63% faster |
| fit_mle | Power variants | 57-59% faster |
| workflow | pcp_fit_and_quantile | 56% faster (2.1s → 0.9s) |
| logpdf | all variants | 29-58% faster |
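Most of the Power entries above follow from the closed forms listed in item 1. As a rough sketch of those formulas (assuming support on (0, 1), as for Beta(κ, 1)), the following uses a hypothetical `PowerSketch` type and `power_*` function names; the package's actual type and method names may differ.

```julia
using Random

# Hypothetical standalone type, for illustration only.
struct PowerSketch{T<:Real}
    κ::T
end

# cdf = x^κ on (0, 1), clamped to 0 and 1 outside the support.
function power_cdf(d::PowerSketch, x::Real)
    x <= 0 && return zero(float(x))
    x >= 1 && return one(float(x))
    return x^(d.κ)
end

# logcdf = κ·log(x), avoiding log(cdf(...)); assumes 0 < x <= 1.
power_logcdf(d::PowerSketch, x::Real) = d.κ * log(x)

# logpdf = log(κ) + (κ - 1)·log(x); assumes 0 < x < 1.
power_logpdf(d::PowerSketch, x::Real) = log(d.κ) + (d.κ - 1) * log(x)

# quantile = p^(1/κ); assumes 0 <= p <= 1.
power_quantile(d::PowerSketch, p::Real) = p^(1 / d.κ)

# Inverse-transform sampling: u^(1/κ) with u uniform on [0, 1).
power_rand(rng::AbstractRNG, d::PowerSketch) = rand(rng)^(1 / d.κ)

d = PowerSketch(2.5)
println(power_quantile(d, power_cdf(d, 0.3)))   # round trip recovers x up to rounding
println(power_rand(MersenneTwister(1), d))
```

Each function is a single power or logarithm, so no Beta wrapper has to be constructed or dispatched through on every call.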

Construction of TNormal/TBeta is slower because they now precompute the cached distribution, but this one-time cost is amortized by the speedups on all subsequent operations.
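The trade-off between construction cost and per-call cost can be sketched without any dependencies. The example below is an analogy rather than the PR's code: the PR stores a Truncated(Normal(...)) object in a _dist field, whereas this hypothetical `TruncatedLogisticSketch` caches the two truncation constants of a logistic distribution, chosen only because its cdf has a closed form in Base.

```julia
# Closed-form cdf of the logistic distribution (the stand-in base distribution).
logistic_cdf(x, μ, s) = 1 / (1 + exp(-(x - μ) / s))

# Hypothetical type: the last two fields are computed once at construction.
struct TruncatedLogisticSketch{T<:AbstractFloat}
    μ::T
    s::T
    lower::T
    upper::T
    _Fa::T     # cached F(lower)
    _mass::T   # cached F(upper) - F(lower)
end

function TruncatedLogisticSketch(μ::Real, s::Real, lower::Real, upper::Real)
    μ, s, lower, upper = promote(float(μ), float(s), float(lower), float(upper))
    Fa = logistic_cdf(lower, μ, s)
    mass = logistic_cdf(upper, μ, s) - Fa
    return TruncatedLogisticSketch(μ, s, lower, upper, Fa, mass)
end

# Each call reuses the cached constants instead of recomputing them.
function sketch_cdf(d::TruncatedLogisticSketch, x::Real)
    x <= d.lower && return zero(d.μ)
    x >= d.upper && return one(d.μ)
    return (logistic_cdf(x, d.μ, d.s) - d._Fa) / d._mass
end

d = TruncatedLogisticSketch(0, 1, -1, 2)
println(sketch_cdf(d, 0.0))
```

The constructor does slightly more work, and every later call skips it, which mirrors the benchmark pattern described above.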

The second optimization pass (MVector) includes the following changes:

Changes made

  1. src/parameterestimation.jl — MVector + fused GP computation

    - MVector{3} for Optim.jl: The optimizer's initial point is now MVector(ν₀, ϕ₀, ξ₀) instead of [ν₀, ϕ₀, ξ₀]. The fixed size lets the compiler unroll simplex operations and avoids dynamic-size array overhead. (SVector cannot be used here because Optim v1.13.3 calls fill!, which requires a mutable array.)
    - _gp_cdf_logpdf fused helper: Both cdf(GP, x) and logpdf(GP, x) internally compute z = 1 + ξ·x/σ and log(z). The new helper computes both in a single pass, eliminating the redundant computation per data point per optimizer iteration.
    - log(σ) = ϕ identity: Since σ = exp(ϕ), computing log(σ) is skipped entirely.
    - Allocations dropped from 1,420 → 588 (58% fewer) and memory from 34.2 KiB → 19.2 KiB (44% less).

  2. src/distributions/TruncatedNormal.jl & TruncatedBeta.jl — added logcdf delegates

  3. src/ExtendedExtremes.jl — added StaticArrays dependency
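The fused helper and the log(σ) = ϕ identity from item 1 can be sketched as follows. This is a hedged illustration using the standard generalized Pareto formulas with zero location, not the PR's `_gp_cdf_logpdf`. The name `gp_cdf_logpdf_sketch` and its signature are assumptions, and the input is assumed to lie inside the support (x ≥ 0 and 1 + ξ·x/σ > 0).

```julia
# Hypothetical fused helper: returns (cdf, logpdf) of a generalized Pareto
# distribution with scale σ = exp(ϕ) and shape ξ, sharing z and log(z).
function gp_cdf_logpdf_sketch(x::Real, ϕ::Real, ξ::Real)
    σ = exp(ϕ)
    if iszero(ξ)
        # Exponential limit: cdf = 1 - exp(-x/σ), logpdf = -log(σ) - x/σ.
        t = x / σ
        return -expm1(-t), -ϕ - t
    end
    z = 1 + ξ * x / σ
    logz = log(z)                          # computed once, used twice
    cdf = -expm1(-logz / ξ)                # equals 1 - z^(-1/ξ)
    logpdf = -ϕ - (1 / ξ + 1) * logz       # log(σ) replaced by ϕ
    return cdf, logpdf
end

c, lp = gp_cdf_logpdf_sketch(1.5, log(2.0), 0.2)
println((c, lp))
```

Sharing `logz` between the two results is what removes the duplicated work per data point; the identity log(σ) = ϕ removes one further logarithm.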

Results vs previous round (additional speedups)

| Benchmark | Additional improvement |
| --- | --- |
| fit_mle EGP_TNormal_n500 | 47% faster |
| fit_mle EGP_Power_censored_n500 | 44% faster |
| fit_mle EGP_Power_n2000 | 43% faster |
| fit_mle EGP_Power_n500 | 42% faster |
| workflow pcp_fit_and_quantile | 41% faster |
| logpdf EGP_TNormal_vector | 20% faster |

Cumulative results vs original code

| Benchmark | Total speedup |
| --- | --- |
| fit_mle EGP_TNormal_n500 | 86% faster (77ms → 11ms) |
| fit_mle EGP_Power_n500 | 76% faster (66ms → 16ms) |
| fit_mle EGP_Power_censored | 77% faster (50ms → 12ms) |
| workflow pcp_fit_and_quantile | 74% faster (2.1s → 0.55s) |
| rand EGP_Power_1000 | 95% faster |
| quantile EGP_Power_vector | 94% faster |
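The cumulative figures are consistent with reading "X% faster" as an X% reduction in runtime, so that successive reductions multiply rather than add. A small check under that assumption (the helper name `combined_reduction` is hypothetical):

```julia
# Total runtime reduction after two successive reductions, each given as a fraction.
combined_reduction(first, second) = 1 - (1 - first) * (1 - second)

# fit_mle EGP_TNormal_n500: 74% in the first pass, a further 47% in the second.
println(round(combined_reduction(0.74, 0.47); digits=2))

# workflow pcp_fit_and_quantile: 56% in the first pass, a further 41% in the second.
println(round(combined_reduction(0.56, 0.41); digits=2))

# Cross-check against the reported timings.
println(round(1 - 11 / 77; digits=2))
println(round(1 - 0.55 / 2.1; digits=2))
```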

Balinus requested a review from jojal5 on March 25, 2026, 17:35