diff --git a/docs/source/algorithms.md b/docs/source/algorithms.md
index bd8837b9a..b55b37ca2 100644
--- a/docs/source/algorithms.md
+++ b/docs/source/algorithms.md
@@ -3977,573 +3977,374 @@ and hence imprecise.\
`AXP (AX-platform)` - Very slow and not recommended.
```{eval-rst}
-.. dropdown:: nevergrad_pso
+.. dropdown:: nevergrad_pso
+
+ **How to use this algorithm:**
.. code-block::
- "nevergrad_pso"
-
- Minimize a scalar function using the Particle Swarm Optimization algorithm.
-
- The Particle Swarm Optimization algorithm was originally proposed by :cite:`Kennedy1995`.The
- implementation in Nevergrad is based on :cite:`Zambrano2013`.
-
- PSO solves an optimization problem by evolving a swarm of particles (candidate solutions) across the
- search space. Each particle adjusts its position based on its own experience (cognitive component)
- and the experiences of its neighbors or the swarm (social component), using velocity updates. The
- algorithm iteratively guides the swarm toward promising regions of the search space.
-
- - **transform** (str): The transform used to map from PSO optimization space to real space. Options:
- - "arctan" (default)
- - "identity"
- - "gaussian"
- - **population\_size** (int): The number of particles in the swarm.
- - **n\_cores** (int): The number of CPU cores to use for parallel computation.
- - **seed** (int, optional): Random seed for reproducibility.
- - **stopping\_maxfun** (int, optional): Maximum number of function evaluations.
- - **inertia** (float):
- Inertia weight ω. Controls the influence of a particle's previous velocity. Must be less than 1 to
- avoid divergence. Default is 0.7213475204444817.
- - **cognitive** (float):
- Cognitive coefficient :math:`\phi_p`. Controls the influence of a particle’s own best known
- position. Typical values: 1.0 to 3.0. Default is 1.1931471805599454.
- - **social** (float):
- Social coefficient. Denoted by :math:`\phi_g`. Controls the influence of the swarm’s best known
- position. Typical values: 1.0 to 3.0. Default is 1.1931471805599454.
- - **quasi\_opp\_init** (bool): Whether to use quasi-opposition initialization. Default is False.
- - **speed\_quasi\_opp\_init** (bool):
- Whether to apply quasi-opposition initialization to speed. Default is False.
- - **special\_speed\_quasi\_opp\_init** (bool):
- Whether to use special quasi-opposition initialization for speed. Default is False.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_pso(stopping_maxfun=1_000, ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_pso",
+            algo_options={"stopping_maxfun": 1_000, ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradPSO
+
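+    As a rough sketch in standard PSO notation (the Nevergrad implementation may
+    differ in details), each particle updates its velocity and position as
+
+    .. math::
+
+        v \leftarrow \omega v + \phi_p r_p (p_{best} - x) + \phi_g r_g (g_{best} - x),
+        \qquad x \leftarrow x + v,
+
+    where :math:`r_p, r_g \sim U(0, 1)` and :math:`\omega`, :math:`\phi_p` and
+    :math:`\phi_g` correspond to the ``inertia``, ``cognitive`` and ``social``
+    options.
+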
```
```{eval-rst}
-.. dropdown:: nevergrad_cmaes
+.. dropdown:: nevergrad_cmaes
+
+ **How to use this algorithm:**
.. code-block::
- "nevergrad_cmaes"
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_cmaes(stopping_maxfun=1_000, ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_cmaes",
+            algo_options={"stopping_maxfun": 1_000, ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradCMAES
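+
+    As a rough sketch in standard CMA-ES notation (the actual update rules are more
+    involved), each generation samples candidates from a multivariate normal
+    distribution and recombines the best of them into a new mean:
+
+    .. math::
+
+        x_k \sim m + \sigma \, \mathcal{N}(0, C), \qquad
+        m \leftarrow \sum_{i=1}^{\mu} w_i \, x_{i:\lambda},
+
+    where :math:`m` is the mean, :math:`\sigma` the step size, :math:`C` the adapted
+    covariance matrix, and :math:`x_{i:\lambda}` the :math:`i`-th best of the
+    :math:`\lambda` sampled candidates.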
- Minimize a scalar function using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
- algorithm.
-
- The CMA-ES (Covariance Matrix Adaptation Evolution Strategy) is a state-of-the-art evolutionary
- algorithm designed for difficult non-linear, non-convex, black-box optimization problems in
- continuous domains. It is typically applied to unconstrained or bounded optimization problems with
- dimensionality between 3 and 100. CMA-ES adapts a multivariate normal distribution to approximate
- the shape of the objective function. It estimates a positive-definite covariance matrix, akin to the
- inverse Hessian in convex-quadratic problems, but without requiring derivatives or their
- approximation. Original paper can be accessed at `cma `_. This
- implementation is a python wrapper over the original code `pycma `_.
-
- - **scale**: Scale of the search.
- - **elitist**:
- Whether to switch to elitist mode (also known as (μ,λ)-CMA-ES). In elitist mode, the best point in
- the population is always retained.
- - **population\_size**: Population size.
- - **diagonal**: Use the diagonal version of CMA, which is more efficient for high-dimensional problems.
- - **high\_speed**: Use a metamodel for recommendation to speed up optimization.
- - **fast\_cmaes**:
- Use the fast CMA-ES implementation. Cannot be used with diagonal=True. Produces equivalent results
- and is preferable for high dimensions or when objective function evaluations are fast.
- - **random\_init**: If True, initialize the optimizer with random parameters.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **step\_size\_adaptive**:
- Whether to adapt the step size. Can be a boolean or a string specifying the adaptation strategy.
- - **CSA\_dampfac**: Damping factor for step size adaptation.
- - **CMA\_dampsvec\_fade**: Damping rate for step size adaptation.
- - **CSA\_squared**: Whether to use squared step sizes in updates.
- - **CMA\_on**: Learning rate for the covariance matrix update.
- - **CMA\_rankone**: Multiplier for the rank-one update learning rate of the covariance matrix.
- - **CMA\_rankmu**: Multiplier for the rank-mu update learning rate of the covariance matrix.
- - **CMA\_cmean**: Learning rate for the mean update.
- - **CMA\_diagonal\_decoding**: Learning rate for the diagonal update.
- - **num\_parents**: Number of parents (μ) for recombination.
- - **CMA\_active**: Whether to use negative updates for the covariance matrix.
- - **CMA\_mirrormethod**: Strategy for mirror sampling. Possible values are:
- - **0**: Unconditional mirroring
- - **1**: Selective mirroring
- - **2**: Selective mirroring with delay (default)
- - **CMA\_const\_trace**: How to normalize the trace of the covariance matrix. Valid values are:
- - False: No normalization
- - True: Normalize to 1
- - "arithm": Arithmetic mean normalization
- - "geom": Geometric mean normalization
- - "aeig": Arithmetic mean of eigenvalues
- - "geig": Geometric mean of eigenvalues
- - **CMA\_diagonal**:
- Number of iterations to use diagonal covariance matrix before switching to full matrix. If False,
- always use full matrix.
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **stopping\_maxiter**: Maximum number of iterations before termination.
- - **stopping\_timeout**: Maximum time in seconds before termination.
- - **stopping\_cov\_mat\_cond**: Maximum condition number of the covariance matrix before termination.
- - **convergence\_ftol\_abs**: Absolute tolerance on function value changes for convergence.
- - **convergence\_ftol\_rel**: Relative tolerance on function value changes for convergence.
- - **convergence\_xtol\_abs**: Absolute tolerance on parameter changes for convergence.
- - **convergence\_iter\_noimprove**: Number of iterations without improvement before termination.
- - **invariant\_path**: Whether evolution path (pc) should be invariant to transformations.
- - **eval\_final\_mean**: Whether to evaluate the final mean solution.
- - **seed**: Seed used by the internal random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
```
```{eval-rst}
.. dropdown:: nevergrad_oneplusone
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_oneplusone"
-
- Minimize a scalar function using the One Plus One Evolutionary algorithm from Nevergrad.
-
- THe One Plus One evolutionary algorithm iterates to find a set of parameters that minimizes the loss
- function. It does this by perturbing, or mutating, the parameters from the last iteration (the
- parent). If the new (child) parameters yield a better result, then the child becomes the new parent
- whose parameters are perturbed, perhaps more aggressively. If the parent yields a better result, it
- remains the parent and the next perturbation is less aggressive. Originally proposed by
- :cite:`Rechenberg1973`. The implementation in Nevergrad is based on the one-fifth adaptation rule,
- going back to :cite:`Schumer1968.
-
- - **noise\_handling**: Method for handling the noise, can be
- - "random": A random point is reevaluated regularly using the one-fifth adaptation rule.
- - "optimistic": The best optimistic point is reevaluated regularly, embracing optimism in the face of uncertainty.
- - A float coefficient can be provided to tune the regularity of these reevaluations (default is 0.05). Eg: with 0.05, each evaluation has a 5% chance (i.e., 1 in 20) of being repeated (i.e., the same candidate solution is reevaluated to better estimate its performance). (Default: `None`).
- - **n\_cores**: Number of cores to use.
-
- stopping.maxfun: Maximum number of function evaluations.
- - **mutation**: Type of mutation to apply. Available options are (Default: `"gaussian"`).
- - "gaussian": Standard mutation by adding a Gaussian random variable (with progressive widening) to the best pessimistic point.
- - "cauchy": Same as Gaussian but using a Cauchy distribution.
- - "discrete": Mutates a randomly drawn variable (mutation occurs with probability 1/d in d dimensions, hence ~1 variable per mutation).
- - "discreteBSO": Follows brainstorm optimization by gradually decreasing mutation rate from 1 to 1/d.
- - "fastga": Fast Genetic Algorithm mutations from the current best.
- - "doublefastga": Double-FastGA mutations from the current best :cite:`doerr2017`.
- - "rls": Randomized Local Search — mutates one and only one variable.
- - "portfolio": Random number of mutated bits, known as uniform mixing :cite:`dang2016`.
- - "lengler": Mutation rate is a function of dimension and iteration index.
- - "lengler{2|3|half|fourth}": Variants of the Lengler mutation rate adaptation.
- - **sparse**: Whether to apply random mutations that set variables to zero. Default is `False`.
- - **smoother**: Whether to suggest smooth mutations. Default is `False`.
- - **annealing**:
- Annealing schedule to apply to mutation amplitude or temperature-based control. Options are:
- - "none": No annealing is applied.
- - "Exp0.9": Exponential decay with rate 0.9.
- - "Exp0.99": Exponential decay with rate 0.99.
- - "Exp0.9Auto": Exponential decay with rate 0.9, auto-scaled based on problem horizon.
- - "Lin100.0": Linear decay from 1 to 0 over 100 iterations.
- - "Lin1.0": Linear decay from 1 to 0 over 1 iteration.
- - "LinAuto": Linearly decaying annealing automatically scaled to the problem horizon. Default is `"none"`.
- - **super\_radii**:
- Whether to apply extended radii beyond standard bounds for candidate generation, enabling broader
- exploration. Default is `False`.
- - **roulette\_size**:
- Size of the roulette wheel used for selection in the evolutionary process. Affects the sampling
- diversity from past candidates. (Default: `64`)
- - **antismooth**:
- Degree of anti-smoothing applied to prevent premature convergence in smooth landscapes. This alters
- the landscape by penalizing overly smooth improvements. (Default: `4`)
- - **crossover**: Whether to include a genetic crossover step every other iteration. Default is `False`.
- - **crossover\_type**:
- Method used for genetic crossover between individuals in the population. Available options (Default: `"none"`):
- - "none": No crossover is applied.
- - "rand": Randomized selection of crossover point.
- - "max": Crossover at the point with maximum fitness gain.
- - "min": Crossover at the point with minimum fitness gain.
- - "onepoint": One-point crossover, splitting the genome at a single random point.
- - "twopoint": Two-point crossover, splitting the genome at two points and exchanging the middle section.
- - **tabu\_length**:
- Length of the tabu list used to prevent revisiting recently evaluated candidates in local search
- strategies. Helps in escaping local minima. (Default: `1000`)
- - **rotation**:
- Whether to apply rotational transformations to the search space, promoting invariance to axis-
- aligned structures and enhancing search performance in rotated coordinate systems. (Default:
- `False`)
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_oneplusone(stopping_maxfun=1_000, ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_oneplusone",
+            algo_options={"stopping_maxfun": 1_000, ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradOnePlusOne
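+
+    A minimal sketch of the one-fifth success rule behind the step-size control
+    (Nevergrad uses its own variants of this adaptation): after each mutation the
+    step size :math:`\sigma` is updated as
+
+    .. math::
+
+        \sigma \leftarrow
+        \begin{cases}
+            \sigma \, c^{+} & \text{if the child improved on the parent,} \\
+            \sigma \, c^{-} & \text{otherwise,}
+        \end{cases}
+
+    with :math:`c^{+} > 1 > c^{-}` chosen so that roughly one fifth of the mutations
+    succeed in the long run.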
```
```{eval-rst}
.. dropdown:: nevergrad_de
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_de"
-
- Minimize a scalar function using the Differential Evolution optimizer from Nevergrad.
-
- Differential Evolution is typically used for continuous optimization. It uses differences between
- points in the population for performing mutations in fruitful directions; it is therefore a kind of
- covariance adaptation without any explicit covariance, making it very fast in high dimensions.
-
- - **initialization**:
- Algorithm/distribution used for initialization. Can be one of: "parametrization" (uses
- parametrization's sample method), "LHS" (Latin Hypercube Sampling), "QR" (Quasi-Random), "QO"
- (Quasi-Orthogonal), or "SO" (Sobol sequence).
- - **scale**: Scale of random component of updates. Can be a float or a string.
- - **recommendation**: Criterion for selecting the best point to recommend.
- - **Options**: "pessimistic", "optimistic", "mean", or "noisy".
- - **crossover**: Crossover rate or strategy. Can be:
- - float: Fixed crossover rate
- - "dimension": 1/dimension
- - "random": Random uniform rate per iteration
- - "onepoint": One-point crossover
- - "twopoints": Two-points crossover
- - "rotated_twopoints": Rotated two-points crossover
- - "parametrization": Use parametrization's recombine method
- - **F1**: Differential weight #1 (scaling factor).
- - **F2**: Differential weight #2 (scaling factor).
- - **popsize**: Population size. Can be an integer or one of:
- - "standard": max(num_workers, 30)
- - "dimension": max(num_workers, 30, dimension + 1)
- - "large": max(num_workers, 30, 7 * dimension)
- - **high\_speed**: If True, uses a metamodel for recommendations to speed up optimization.
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_de(population_size="large", ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_de",
+            algo_options={"population_size": "large", ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradDifferentialEvolution
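+
+    As a rough sketch of the classic DE/rand/1 mutation (the optimizer exposes
+    several variants through its options), a mutant for each population member is
+    formed from three distinct, randomly chosen members and then combined with the
+    current member by crossover:
+
+    .. math::
+
+        v_i = x_{r_1} + F \, (x_{r_2} - x_{r_3}),
+
+    where :math:`F` is the differential weight (scaling factor) and
+    :math:`r_1, r_2, r_3` are distinct random indices.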
```
```{eval-rst}
-.. dropdown:: nevergrad_bo
+.. dropdown:: nevergrad_bo
+
+ .. note::
+
+        Using this optimizer requires the ``bayes-optim`` package to be installed as well.
+        This can be done with ``pip install bayes-optim``.
+
+ **How to use this algorithm:**
+
+ .. code-block::
+
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_bo(stopping_maxfun=1_000, ...)
+        )
+
+ or
.. code-block::
- "nevergrad_bo"
-
- Minimize a scalar function using the Bayes Optim algorithm. BO and PCA-BO algorithms from the
- `bayes_optim `_ package PCA-BO (Principal
- Component Analysis for Bayesian Optimization) is a dimensionality reduction technique for black-box
- optimization. It applies PCA to the input space before performing Bayesian optimization, improving
- efficiency in high dimensions by focusing on directions of greatest variance. This helps concentrate
- search in informative subspaces and reduce sample complexity. :cite:`bayesoptimimpl`.
-
- - **init\_budget**: Number of initialization algorithm steps.
- - **pca**: Whether to use the PCA transformation, defining PCA-BO rather than standard BO.
- - **n\_components**:
- Number of principal axes in feature space representing directions of maximum variance in the data.
- Represents the percentage of explained variance (e.g., 0.95 means 95% variance retained).
- - **prop\_doe\_factor**:
- Percentage of the initial budget used for DoE, potentially overriding `init_budget`. For
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+ om.minimize(
+ ...,
+ algorithm="nevergrad_bo",
+ algo_options={"stopping_maxfun": 1_000, ...}
+ )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradBayesOptim
```
```{eval-rst}
.. dropdown:: nevergrad_emna
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_emna"
-
- Minimize a scalar function using the Estimation of Multivariate Normal Algorithm.
-
- Estimation of Multivariate Normal Algorithm (EMNA), a distribution-based evolutionary algorithm that
- models the search space using a multivariate Gaussian. EMNA learns the full covariance matrix of the
- Gaussian sampling distribution, resulting in a cubic time complexity w.r.t. each sampling. It is
- highly recommended to first attempt other more advanced optimization methods for LBO. See
- :cite:`emnaimpl`. This algorithm is quite efficient in a parallel setting, i.e. when the population
- size is large.
-
- - **isotropic**:
- If True, uses an isotropic (identity covariance) Gaussian. If False, uses a separable (diagonal
- covariance) Gaussian for greater flexibility in anisotropic landscapes.
- - **noise\_handling**:
- If True, returns the best individual found. If False (recommended for noisy problems), returns the
- average of the final population to reduce noise.
- - **population\_size\_adaptation**:
- If True, the population size is adjusted automatically based on the optimization landscape and noise
- level.
- - **initial\_popsize**: Initial population size. Default: 4 x dimension..
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_emna(noise_handling=False, ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_emna",
+            algo_options={"noise_handling": False, ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradEMNA
```
```{eval-rst}
.. dropdown:: nevergrad_cga
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_cga"
-
- Minimize a scalar function using the Compact Genetic Algorithm.
-
- The Compact Genetic Algorithm (cGA) is a memory-efficient genetic algorithm that represents the
- population as a probability vector over gene values. It simulates the order-one behavior of a simple
- GA with uniform crossover, updating probabilities instead of maintaining an explicit population. cGA
- processes each gene independently and is well-suited for large or constrained environments. For
- details see :cite:`cgaimpl`.
-
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_cga(stopping_maxfun=10_000)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_cga",
+            algo_options={"stopping_maxfun": 10_000}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradCGA
```
```{eval-rst}
.. dropdown:: nevergrad_eda
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_eda"
-
- Minimize a scalar function using the Estimation of distribution algorithm.
-
- Estimation of Distribution Algorithms (EDAs) optimize by building and sampling a probabilistic model
- of promising solutions. Instead of using traditional variation operators like crossover or mutation,
- EDAs update a distribution based on selected individuals and sample new candidates from it. This
- allows efficient exploration of complex or noisy search spaces. In short, EDAs typically do not
- directly evolve populations of search points but build probabilistic models of promising solutions
- by repeatedly sampling and selecting points from the underlying search space. Refer :cite:`edaimpl`.
-
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_eda(stopping_maxfun=10_000)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_eda",
+            algo_options={"stopping_maxfun": 10_000}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradEDA
```
```{eval-rst}
.. dropdown:: nevergrad_tbpsa
+ **How to use this algorithm:**
+
+ .. code-block::
+
+ import optimagic as om
+ om.minimize(
+ ...,
+ algorithm=om.algos.nevergrad_tbpsa(noise_handling=False, ...)
+ )
+
+ or
+
.. code-block::
- "nevergrad_tbpsa"
-
- Minimize a scalar function using the Test-based population size adaptation algorithm.
-
- TBPSA adapts population size based on fitness trend detection using linear regression. If no
- significant improvement is found (via hypothesis testing), the population size is increased to
- improve robustness in noisy settings. This method performs the best in many noisy optimization
- problems, even in large dimensions. For more details, refer :cite:`tbpsaimpl`
-
- - **noise\_handling**:
- If True, returns the best individual seen so far. If False (recommended for noisy problems), returns
- the average of the final population to reduce the effect of noise.
- - **initial\_popsize**: Initial population size. If not specified, defaults to 4 x dimension.
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+ om.minimize(
+ ...,
+ algorithm="nevergrad_tbpsa",
+ algo_options={"noise_handling": False, ...}
+ )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradTBPSA
```
```{eval-rst}
.. dropdown:: nevergrad_randomsearch
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_randomsearch"
-
- Minimize a scalar function using the Random Search algorithm.
-
- This is a one-shot optimization method, provides random suggestions.
-
- - **middle\_point**:
- Enforces that the first suggested point (ask) is the zero vector. i.e we add (0,0,...,0) as a first
- point.
- - **opposition\_mode**: Symmetrizes exploration with respect to the center.
- - "opposite": enables full symmetry by always evaluating mirrored points.
- - "quasi": applies randomized symmetry (less strict, more exploratory).
- - None: disables any symmetric mirroring in the sampling process.
- - **sampler**:
- - "parametrization": uses the default sample() method of the parametrization, which samples uniformly within bounds or from a Gaussian.
- - "gaussian": samples from a standard Gaussian distribution.
- - "cauchy": uses a Cauchy distribution instead of Gaussian.
- - **scale**: Scalar used to multiply suggested point values, or a string mode:
- - "random": uses a randomized pattern for the scale.
- - "auto": sigma = (1 + log(budget)) / (4 * log(dimension)); adjusts scale based on problem size.
- - "autotune": sigma = sqrt(log(budget) / dimension); alternative auto-scaling based on budget and dimensionality.
- - **recommendation\_rule**: Specifies how the final recommendation is chosen.
- - "average_of_best": returns the average of top-performing candidates.
- - "pessimistic": selects the pessimistic best (default);
- - "average_of_exp_best": uses an exponential moving average of the best points.
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_randomsearch(opposition_mode="quasi", ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_randomsearch",
+            algo_options={"opposition_mode": "quasi", ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradRandomSearch
```
```{eval-rst}
.. dropdown:: nevergrad_samplingsearch
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_samplingsearch"
-
- Minimize a scalar function using SamplingSearch.
-
- This is a one-shot optimization method, but better than random search by ensuring more uniformity.
-
- - **sampler**: Choice of the low-discrepancy sampler used for initial points.
- - "Halton": deterministic, well-spaced sequences
- - "Hammersley": similar to Halton but more uniform in low dimension
- - "LHS": Latin Hypercube Sampling; ensures coverage along each axis
- - **scrambled**:
- If True, Adds scrambling to the search; much better in high dimension and rarely worse than the
- original search.
- - **middle\_point**:
- If True, the first suggested point is the zero vector. Useful for initializing at the center of the
- search space.
- - **cauchy**:
- If True, uses the inverse Cauchy distribution instead of Gaussian when projecting samples to real-
- valued space (especially when no box bounds exist).
- - **scale**: A float multiplier or "random".
- - float: directly scales all generated points
- - "random": uses a randomized scaling pattern for increased diversity
- - **rescaled**: If True or a specific mode, rescales the sampling pattern.
- - Ensures coverage of boundaries and may apply adaptive scaling
- - Useful when original scale is too narrow or biased
- - **recommendation\_rule**: How the final recommendation is chosen.
- - "average_of_best": mean of the best-performing points
- - "pessimistic": selects the point with best worst-case value (default)
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided. Notes
- -----
- - Halton is a low quality sampling method when the dimension is high; it is usually better to use Halton with scrambling.
- - When the budget is known in advance, it is also better to replace Halton by Hammersley.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_samplingsearch(sampler="Hammersley", scrambled=True)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_samplingsearch",
+            algo_options={"sampler": "Hammersley", "scrambled": True}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradSamplingSearch
```
```{eval-rst}
.. dropdown:: nevergrad_NGOpt
+ **How to use this algorithm:**
+
+ .. code-block::
+
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_NGOpt(optimizer="NGOptRW", ...)
+        )
+
+ or
+
.. code-block::
- "nevergrad_NGOpt"
-
- Minimize a scalar function using a Meta Optimizer from Nevergrad. Each meta optimizer combines
- multiples optimizers to solve a problem.
-
- - **optimizer**: One of
- - NGOpt
- - NGOpt4
- - NGOpt8
- - NGOpt10
- - NGOpt12
- - NGOpt13
- - NGOpt14
- - NGOpt15
- - NGOpt16
- - NGOpt21
- - NGOpt36
- - NGOpt38
- - NGOpt39
- - NGOptRW
- - NGOptF
- - NGOptF2
- - NGOptF3
- - NGOptF5
- - NgIoh2
- - NgIoh3
- - NgIoh4
- - NgIoh5
- - NgIoh6
- - NgIoh7
- - NgIoh8
- - NgIoh9
- - NgIoh10
- - NgIoh11
- - NgIoh12
- - NgIoh13
- - NgIoh14
- - NgIoh15
- - NgIoh16
- - NgIoh17
- - NgIoh18
- - NgIoh19
- - NgIoh20
- - NgIoh21
- - NgIoh12b
- - NgIoh13b
- - NgIoh14b
- - NgIoh15b
- - NgIohRW2
- - NgIohTuned
- - NgDS
- - NgDS2
- - NGDSRW
- - NGO
- - CSEC
- - CSEC10
- - CSEC11
- - Wiz
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+ om.minimize(
+ ...,
+ algorithm="nevergrad_NGOpt",
+ algo_options={"optimizer": "NGOptRW", ...}
+ )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradNGOpt
```
```{eval-rst}
.. dropdown:: nevergrad_meta
+ **How to use this algorithm:**
+
.. code-block::
- "nevergrad_meta"
-
- Minimize a scalar function using a Meta Optimizer from Nevergrad. Utilizes a combination of local
- and global optimizers to find the best solution. Local optimizers like BFGS are wrappers over scipy
- implementations. Each meta optimizer combines multiples optimizers to solve a problem.
-
- - **optimizer**: One of
- - MultiBFGSPlus
- - LogMultiBFGSPlus
- - SqrtMultiBFGSPlus
- - MultiCobylaPlus
- - MultiSQPPlus
- - BFGSCMAPlus
- - LogBFGSCMAPlus
- - SqrtBFGSCMAPlus
- - SQPCMAPlus
- - LogSQPCMAPlus
- - SqrtSQPCMAPlus
- - MultiBFGS
- - LogMultiBFGS
- - SqrtMultiBFGS
- - MultiCobyla
- - ForceMultiCobyla
- - MultiSQP
- - BFGSCMA
- - LogBFGSCMA
- - SqrtBFGSCMA
- - SQPCMA
- - LogSQPCMA
- - SqrtSQPCMA
- - FSQPCMA
- - F2SQPCMA
- - F3SQPCMA
- - MultiDiscrete
- - CMandAS2
- - CMandAS3
- - MetaCMA
- - CMA
- - PCEDA
- - MPCEDA
- - MEDA
- - NoisyBandit
- - Shiwa
- - Carola3
- - **stopping\_maxfun**: Maximum number of function evaluations before termination.
- - **n\_cores**: Number of cores to use for parallel function evaluation.
- - **seed**: Seed for the random number generator for reproducibility.
- - **sigma**:
- Standard deviation for sampling initial population from N(0, σ²) in case bounds are not provided.
+        import optimagic as om
+        om.minimize(
+            ...,
+            algorithm=om.algos.nevergrad_meta(optimizer="BFGSCMAPlus", ...)
+        )
+
+ or
+
+ .. code-block::
+
+        om.minimize(
+            ...,
+            algorithm="nevergrad_meta",
+            algo_options={"optimizer": "BFGSCMAPlus", ...}
+        )
+
+ **Description and available options:**
+
+ .. autoclass:: optimagic.optimizers.nevergrad_optimizers.NevergradMeta
```
## Bayesian Optimization
@@ -4552,6 +4353,8 @@ We wrap the
[BayesianOptimization](https://github.com/bayesian-optimization/BayesianOptimization)
package. To use it, you need to have
[bayesian-optimization](https://pypi.org/project/bayesian-optimization/) installed.
+Note: This optimizer requires `bayesian-optimization > 2.0.0` to be installed, which is
+incompatible with `nevergrad > 1.0.3`.
```{eval-rst}
.. dropdown:: bayes_opt
@@ -4627,80 +4430,6 @@ package. To use it, you need to have
- **n_restarts** (int): Number of times to restart the optimizer. Default is 1.
```
-```{eval-rst}
-.. dropdown:: nevergrad_oneplusone
-
- .. code-block::
-
- "nevergrad_oneplusone"
-
- Minimize a scalar function using the One Plus One Evolutionary algorithm from Nevergrad.
-
- THe One Plus One evolutionary algorithm iterates to find a set of parameters that minimizes the loss
- function. It does this by perturbing, or mutating, the parameters from the last iteration (the
- parent). If the new (child) parameters yield a better result, then the child becomes the new parent
- whose parameters are perturbed, perhaps more aggressively. If the parent yields a better result, it
- remains the parent and the next perturbation is less aggressive. Originally proposed by
- :cite:`Rechenberg1973`. The implementation in Nevergrad is based on the one-fifth adaptation rule,
- going back to :cite:`Schumer1968.
-
- - **noise\_handling**: Method for handling the noise, can be
- - "random": A random point is reevaluated regularly using the one-fifth adaptation rule.
- - "optimistic": The best optimistic point is reevaluated regularly, embracing optimism in the face of uncertainty.
- - A float coefficient can be provided to tune the regularity of these reevaluations (default is 0.05). Eg: with 0.05, each evaluation has a 5% chance (i.e., 1 in 20) of being repeated (i.e., the same candidate solution is reevaluated to better estimate its performance). (Default: `None`).
- - **n\_cores**: Number of cores to use.
-
- - **stopping.maxfun**: Maximum number of function evaluations.
- - **mutation**: Type of mutation to apply. Available options are (Default: `"gaussian"`).
- - "gaussian": Standard mutation by adding a Gaussian random variable (with progressive widening) to the best pessimistic point.
- - "cauchy": Same as Gaussian but using a Cauchy distribution.
- - "discrete": Mutates a randomly drawn variable (mutation occurs with probability 1/d in d dimensions, hence ~1 variable per mutation).
- - "discreteBSO": Follows brainstorm optimization by gradually decreasing mutation rate from 1 to 1/d.
- - "fastga": Fast Genetic Algorithm mutations from the current best.
- - "doublefastga": Double-FastGA mutations from the current best :cite:`doerr2017`.
- - "rls": Randomized Local Search — mutates one and only one variable.
- - "portfolio": Random number of mutated bits, known as uniform mixing :cite:`dang2016`.
- - "lengler": Mutation rate is a function of dimension and iteration index.
- - "lengler{2|3|half|fourth}": Variants of the Lengler mutation rate adaptation.
- - **sparse**: Whether to apply random mutations that set variables to zero. Default is `False`.
- - **smoother**: Whether to suggest smooth mutations. Default is `False`.
- - **annealing**:
- Annealing schedule to apply to mutation amplitude or temperature-based control. Options are:
- - "none": No annealing is applied.
- - "Exp0.9": Exponential decay with rate 0.9.
- - "Exp0.99": Exponential decay with rate 0.99.
- - "Exp0.9Auto": Exponential decay with rate 0.9, auto-scaled based on problem horizon.
- - "Lin100.0": Linear decay from 1 to 0 over 100 iterations.
- - "Lin1.0": Linear decay from 1 to 0 over 1 iteration.
- - "LinAuto": Linearly decaying annealing automatically scaled to the problem horizon. Default is `"none"`.
- - **super\_radii**:
- Whether to apply extended radii beyond standard bounds for candidate generation, enabling broader
- exploration. Default is `False`.
- - **roulette\_size**:
- Size of the roulette wheel used for selection in the evolutionary process. Affects the sampling
- diversity from past candidates. (Default: `64`)
- - **antismooth**:
- Degree of anti-smoothing applied to prevent premature convergence in smooth landscapes. This alters
- the landscape by penalizing overly smooth improvements. (Default: `4`)
- - **crossover**: Whether to include a genetic crossover step every other iteration. Default is `False`.
- - **crossover\_type**:
- Method used for genetic crossover between individuals in the population. Available options (Default: `"none"`):
- - "none": No crossover is applied.
- - "rand": Randomized selection of crossover point.
- - "max": Crossover at the point with maximum fitness gain.
- - "min": Crossover at the point with minimum fitness gain.
- - "onepoint": One-point crossover, splitting the genome at a single random point.
- - "twopoint": Two-point crossover, splitting the genome at two points and exchanging the middle section.
- - **tabu\_length**:
- Length of the tabu list used to prevent revisiting recently evaluated candidates in local search
- strategies. Helps in escaping local minima. (Default: `1000`)
- - **rotation**:
- Whether to apply rotational transformations to the search space, promoting invariance to axis-
- aligned structures and enhancing search performance in rotated coordinate systems. (Default:
- `False`)
- - **seed**: Seed for the random number generator for reproducibility.
-```
-
## References
```{eval-rst}
diff --git a/docs/source/how_to/how_to_start_parameters.md b/docs/source/how_to/how_to_start_parameters.md
index fc5a031e9..0c13ba6bf 100644
--- a/docs/source/how_to/how_to_start_parameters.md
+++ b/docs/source/how_to/how_to_start_parameters.md
@@ -14,125 +14,120 @@ advantages and drawbacks of each of them.
Again, we use the simple `sphere` function you know from other tutorials as an example.
```{eval-rst}
-.. tabbed:: Array
- A frequent choice of ``params`` is a one-dimensional numpy array. This is
- because one-dimensional numpy arrays are all that is supported by most optimizer
- libraries.
+.. tab-set::
+ .. tab-item:: Array
- In our opinion, it is rarely a good choice to represent parameters as flat numpy arrays
- and then access individual parameters or sclices by positions. The only exception
- are simple optimization problems with very-fast-to-evaluate criterion functions where
- any overhead must be avoided.
+ A frequent choice of ``params`` is a one-dimensional numpy array. This is
+ because one-dimensional numpy arrays are all that is supported by most optimizer
+ libraries.
- If you still want to use one-dimensional numpy arrays, here is how:
+ In our opinion, it is rarely a good choice to represent parameters as flat numpy arrays
+      and then access individual parameters or slices by position. The only exceptions
+ are simple optimization problems with very-fast-to-evaluate criterion functions where
+ any overhead must be avoided.
- .. code-block:: python
+ If you still want to use one-dimensional numpy arrays, here is how:
- import optimagic as om
+ .. code-block:: python
+ import optimagic as om
- def sphere(params):
- return params @ params
+ def sphere(params):
+ return params @ params
- om.minimize(
- fun=sphere,
- params=np.arange(3),
- algorithm="scipy_lbfgsb",
- )
-```
+ om.minimize(
+ fun=sphere,
+ params=np.arange(3),
+ algorithm="scipy_lbfgsb",
+ )
-```{eval-rst}
-.. tabbed:: DataFrame
+ .. tab-item:: DataFrame
- Originally, pandas DataFrames were the mandatory format for ``params`` in optimagic.
- They are still highly recommended and have a few special features. For example,
- they allow to bundle information on start parameters and bounds together into one
- data structure.
+ Originally, pandas DataFrames were the mandatory format for ``params`` in optimagic.
+ They are still highly recommended and have a few special features. For example,
+      they allow you to bundle information on start parameters and bounds together into one
+ data structure.
- Let's look at an example where we do that:
+ Let's look at an example where we do that:
- .. code-block:: python
+ .. code-block:: python
- def sphere(params):
- return (params["value"] ** 2).sum()
+ def sphere(params):
+ return (params["value"] ** 2).sum()
- params = pd.DataFrame(
- data={"value": [1, 2, 3], "lower_bound": [-np.inf, 1.5, 0]},
- index=["a", "b", "c"],
- )
+ params = pd.DataFrame(
+ data={"value": [1, 2, 3], "lower_bound": [-np.inf, 1.5, 0]},
+ index=["a", "b", "c"],
+ )
- om.minimize(
- fun=sphere,
- params=params,
- algorithm="scipy_lbfgsb",
- )
+ om.minimize(
+ fun=sphere,
+ params=params,
+ algorithm="scipy_lbfgsb",
+ )
- DataFrames have many advantages:
+ DataFrames have many advantages:
- - It is easy to select single parameters or groups of parameters or work with
- the entire parameter vector. Especially, if you use a well designed MultiIndex.
- - It is very easy to produce publication quality LaTeX tables from them.
- - If you have nested models, you can easily update the parameter vector of a larger
- model with the values from a smaller one (e.g. to get good start parameters).
- - You can bundle information on bounds and values in one place.
- - It is easy to compare two params vectors for equality.
+ - It is easy to select single parameters or groups of parameters or work with
+        the entire parameter vector, especially if you use a well-designed MultiIndex.
+ - It is very easy to produce publication quality LaTeX tables from them.
+      - If you have nested models, you can easily update the parameter vector of a larger
+        model with the values from a smaller one, e.g. to get good start parameters (see
+        the sketch after this list).
+ - You can bundle information on bounds and values in one place.
+ - It is easy to compare two params vectors for equality.
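+
+      For the nested-model case, a minimal sketch (``large_params`` and
+      ``small_estimates`` are hypothetical names, and pandas is assumed to be
+      imported as ``pd``) could look like this:
+
+      .. code-block:: python
+
+          large_params = pd.DataFrame(
+              {"value": [0.0, 0.0, 0.0]}, index=["a", "b", "c"]
+          )
+          small_estimates = pd.Series([1.2, 3.4], index=["a", "c"], name="value")
+          # align on the shared index and overwrite matching start values in place
+          large_params.update(small_estimates)
+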
- If you are sure you won't have bounds on your parameter, you can also use a
- pandas.Series instead of a pandas.DataFrame.
+ If you are sure you won't have bounds on your parameter, you can also use a
+ pandas.Series instead of a pandas.DataFrame.
- A drawback of DataFrames is that they are not JAX compatible. Another one is that
- they are a bit slower than numpy arrays.
+ A drawback of DataFrames is that they are not JAX compatible. Another one is that
+ they are a bit slower than numpy arrays.
-```
+ .. tab-item:: Dict
-```{eval-rst}
-.. tabbed:: Dict
+ ``params`` can also be a (nested) dictionary containing all of the above and more.
- ``params`` can also be a (nested) dictionary containing all of the above and more.
+ .. code-block:: python
- .. code-block:: python
+ def sphere(params):
+ return params["a"] ** 2 + params["b"] ** 2 + (params["c"] ** 2).sum()
- def sphere(params):
- return params["a"] ** 2 + params["b"] ** 2 + (params["c"] ** 2).sum()
+ res = om.minimize(
+ fun=sphere,
+ params={"a": 0, "b": 1, "c": pd.Series([2, 3, 4])},
+ algorithm="scipy_neldermead",
+ )
- res = om.minimize(
- fun=sphere,
- params={"a": 0, "b": 1, "c": pd.Series([2, 3, 4])},
- algorithm="scipy_neldermead",
- )
+      Dictionaries of arrays are ideal if you want to do vectorized computations with
+ groups of parameters. They are also a good choice if you calculate derivatives
+ with JAX.
- Dictionarys of arrays are ideal if you want to do vectorized computations with
- groups of parameters. They are also a good choice if you calculate derivatives
- with JAX.
+ While optimagic won't stop you, don't go too far! Having parameters in very deeply
+ nested dictionaries makes it hard to visualize results and/or even to compare two
+ estimation results.
- While optimagic won't stop you, don't go too far! Having parameters in very deeply
- nested dictionaries makes it hard to visualize results and/or even to compare two
- estimation results.
-```
+ .. tab-item:: Scalar
-```{eval-rst}
-.. tabbed:: Scalar
+ If you have a one-dimensional optimization problem, the natural way to represent
+ your params is a float:
- If you have a one-dimensional optimization problem, the natural way to represent
- your params is a float:
+ .. code-block:: python
- .. code-block:: python
+ def sphere(params):
+ return params**2
- def sphere(params):
- return params**2
+ om.minimize(
+ fun=sphere,
+ params=3,
+ algorithm="scipy_lbfgsb",
+ )
- om.minimize(
- fun=sphere,
- params=3,
- algorithm="scipy_lbfgsb",
- )
```
diff --git a/docs/source/refs.bib b/docs/source/refs.bib
index f8005d2e9..298b813ff 100644
--- a/docs/source/refs.bib
+++ b/docs/source/refs.bib
@@ -964,8 +964,8 @@ @inproceedings{tbpsaimpl
year = {2016},
month = {09},
pages = {},
-title = {Evolution under Strong Noise: A Self-Adaptive Evolution Strategy Can Reach the Lower Performance Bound - the pcCMSA-ES},
-volume = {9921},
+title = {Evolution under Strong Noise: A Self-Adaptive Evolution Strategy Can Reach the Lower Performance Bound - the pcCMSA-ES},
+booktitle = {Parallel Problem Solving from Nature -- PPSN XIV},
+volume = {9921},
isbn = {9783319458229},
doi = {10.1007/978-3-319-45823-6_3}
}
@@ -1037,6 +1037,7 @@ @book{emnaimpl
pages = {},
title = {Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation},
isbn = {9781461356042},
+publisher = {Springer},
journal = {Genetic algorithms and evolutionary computation ; 2},
doi = {10.1007/978-1-4615-1539-5}
}
diff --git a/pyproject.toml b/pyproject.toml
index c74752252..58730bd0f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -380,6 +380,7 @@ module = [
"pdbp",
"iminuit",
"nevergrad",
+ "nevergrad.optimization.base.ConfiguredOptimizer",
"yaml",
]
ignore_missing_imports = true
diff --git a/src/optimagic/config.py b/src/optimagic/config.py
index ce6cd4d60..3171a4195 100644
--- a/src/optimagic/config.py
+++ b/src/optimagic/config.py
@@ -38,8 +38,11 @@ def _is_installed(module_name: str) -> bool:
IS_NUMBA_INSTALLED = _is_installed("numba")
IS_IMINUIT_INSTALLED = _is_installed("iminuit")
IS_NEVERGRAD_INSTALLED = _is_installed("nevergrad")
-IS_BAYESOPT_INSTALLED = _is_installed("bayes_opt")
-
+IS_BAYESOPTIM_INSTALLED = _is_installed("bayes_optim")
+IS_BAYESOPT_INSTALLED_AND_VERSION_NEWER_THAN_2 = (
+ _is_installed("bayes_opt")
+ and importlib.metadata.version("bayesian_optimization") > "2.0.0"
+)
# ======================================================================================
# Check if pandas version is newer or equal to version 2.1.0
diff --git a/src/optimagic/optimizers/bayesian_optimizer.py b/src/optimagic/optimizers/bayesian_optimizer.py
index 3de716a7f..93337f586 100644
--- a/src/optimagic/optimizers/bayesian_optimizer.py
+++ b/src/optimagic/optimizers/bayesian_optimizer.py
@@ -10,7 +10,7 @@
from scipy.optimize import NonlinearConstraint
from optimagic import mark
-from optimagic.config import IS_BAYESOPT_INSTALLED
+from optimagic.config import IS_BAYESOPT_INSTALLED_AND_VERSION_NEWER_THAN_2
from optimagic.exceptions import NotInstalledError
from optimagic.optimization.algo_options import N_RESTARTS
from optimagic.optimization.algorithm import Algorithm, InternalOptimizeResult
@@ -35,7 +35,7 @@
@mark.minimizer(
name="bayes_opt",
solver_type=AggregationLevel.SCALAR,
- is_available=IS_BAYESOPT_INSTALLED,
+ is_available=IS_BAYESOPT_INSTALLED_AND_VERSION_NEWER_THAN_2,
is_global=True,
needs_jac=False,
needs_hess=False,
@@ -72,7 +72,7 @@ class BayesOpt(Algorithm):
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
) -> InternalOptimizeResult:
- if not IS_BAYESOPT_INSTALLED:
+ if not IS_BAYESOPT_INSTALLED_AND_VERSION_NEWER_THAN_2:
raise NotInstalledError(
"To use the 'bayes_opt' optimizer you need to install bayes_opt. "
"Use 'pip install bayesian-optimization'. "
diff --git a/src/optimagic/optimizers/nevergrad_optimizers.py b/src/optimagic/optimizers/nevergrad_optimizers.py
index 16166b0a9..258b5e39a 100644
--- a/src/optimagic/optimizers/nevergrad_optimizers.py
+++ b/src/optimagic/optimizers/nevergrad_optimizers.py
@@ -1,5 +1,7 @@
"""Implement optimizers from the nevergrad package."""
+from __future__ import annotations
+
import math
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any, Literal
@@ -8,7 +10,7 @@
from numpy.typing import NDArray
from optimagic import mark
-from optimagic.config import IS_NEVERGRAD_INSTALLED
+from optimagic.config import IS_BAYESOPTIM_INSTALLED, IS_NEVERGRAD_INSTALLED
from optimagic.exceptions import NotInstalledError
from optimagic.optimization.algo_options import (
CONVERGENCE_FTOL_ABS,
@@ -30,7 +32,7 @@
)
if TYPE_CHECKING:
- import nevergrad as ng
+ from nevergrad.optimization.base import ConfiguredOptimizer
NEVERGRAD_NOT_INSTALLED_ERROR = (
@@ -58,18 +60,84 @@
)
@dataclass(frozen=True)
class NevergradPSO(Algorithm):
+ """Minimize a scalar function using the Particle Swarm Optimization algorithm.
+
+ The Particle Swarm Optimization algorithm was originally proposed by
+    :cite:`Kennedy1995`. The implementation in Nevergrad is based on
+ :cite:`Zambrano2013`.
+
+ PSO solves an optimization problem by evolving a swarm of particles
+ (candidate solutions) across the search space. Each particle adjusts its position
+ based on its own experience (cognitive component) and the experiences
+ of its neighbors or the swarm (social component), using velocity updates. The
+ algorithm iteratively guides the swarm toward promising regions of the search
+ space.
+
+ """
+
transform: Literal["arctan", "gaussian", "identity"] = "arctan"
+ """The transform used to map from PSO optimization space to real space."""
+
population_size: int | None = None
+ """The number of particles in the swarm."""
+
n_cores: int = 1
+ """The number of CPU cores to use for parallel computation."""
+
seed: int | None = None
+ """Random seed for reproducibility."""
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations."""
+
inertia: float = 0.5 / math.log(2.0)
+ r"""Inertia weight ω.
+
+ Controls the influence of a particle's previous velocity. Must be less than 1 to
+ avoid divergence.
+
+ """
+
cognitive: float = 0.5 + math.log(2.0)
+ r"""Cognitive coefficient :math:`\phi_p`.
+
+ Controls the influence of a particle's own best known position. Typical values: 1.0
+ to 3.0.
+
+ """
+
social: float = 0.5 + math.log(2.0)
+ r"""Social coefficient.
+
+ Denoted by :math:`\phi_g`. Controls the influence of the swarm's best known
+ position. Typical values: 1.0 to 3.0.
+
+ """
+
quasi_opp_init: bool = False
+ """Whether to use quasi-opposition initialization.
+
+ Default is False.
+
+ """
+
speed_quasi_opp_init: bool = False
+ """Whether to apply quasi-opposition initialization to speed.
+
+ Default is False.
+
+ """
+
special_speed_quasi_opp_init: bool = False
+ """Whether to use special quasi-opposition initialization for speed.
+
+ Default is False.
+
+ """
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²) in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -121,40 +189,154 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradCMAES(Algorithm):
+ """Minimize a scalar function using the Covariance Matrix Adaptation Evolution
+ Strategy (CMA-ES) algorithm.
+
+ The CMA-ES is a state-of-the-art evolutionary algorithm for difficult non-linear,
+ non-convex, black-box optimization problems in continuous domains. It is typically
+ applied to unconstrained or bounded problems with dimensionality between 3 and 100.
+ CMA-ES adapts a multivariate normal distribution to approximate the objective
+ function's shape by estimating a positive-definite covariance matrix, akin to the
+ inverse Hessian in convex-quadratic problems, but without requiring derivatives.
+
+    The original paper can be accessed at `cma <https://cma-es.github.io/>`_.
+    This implementation is a Python wrapper over the original code
+    `pycma <https://cma-es.github.io/>`_.
+
+ """
+
scale: NonNegativeFloat = 1.0
+ """Scale of the search."""
+
elitist: bool = False
+ """Whether to switch to elitist mode (also known as (μ,λ)-CMA-ES).
+
+ In elitist mode, the best point in the population is always retained.
+
+ """
+
population_size: int | None = None
+ """Population size."""
+
diagonal: bool = False
+ """Use the diagonal version of CMA, which is more efficient for high-dimensional
+ problems."""
+
high_speed: bool = False
+ """Use a metamodel for recommendation to speed up optimization."""
+
fast_cmaes: bool = False
+ """Use the fast CMA-ES implementation.
+
+ Cannot be used with diagonal=True. Produces equivalent results and is preferable for
+ high dimensions or when objective function evaluations are fast.
+
+ """
+
random_init: bool = False
+ """If True, initialize the optimizer with random parameters."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
step_size_adaptive: bool | str = True
+ """Whether to adapt the step size.
+
+ Can be a boolean or a string specifying the adaptation strategy.
+
+ """
+
CSA_dampfac: PositiveFloat = 1.0
+ """Damping factor for step size adaptation."""
+
CMA_dampsvec_fade: PositiveFloat = 0.1
+ """Damping rate for step size adaptation."""
+
CSA_squared: bool = False
+ """Whether to use squared step sizes in updates."""
+
CMA_on: float = 1.0
+ """Learning rate for the covariance matrix update."""
+
CMA_rankone: float = 1.0
+ """Multiplier for the rank-one update learning rate of the covariance matrix."""
+
CMA_rankmu: float = 1.0
+ """Multiplier for the rank-mu update learning rate of the covariance matrix."""
+
CMA_cmean: float = 1.0
+ """Learning rate for the mean update."""
+
CMA_diagonal_decoding: float = 0.0
+ """Learning rate for the diagonal update."""
+
num_parents: int | None = None
+ """Number of parents (μ) for recombination."""
+
CMA_active: bool = True
+ """Whether to use negative updates for the covariance matrix."""
+
CMA_mirrormethod: Literal[0, 1, 2] = 2
+ """Strategy for mirror sampling.
+
+    0: Unconditional mirroring, 1: Selective mirroring, 2: Selective mirroring with
+    delay (default).
+
+ """
+
CMA_const_trace: bool | Literal["arithm", "geom", "aeig", "geig"] = False
+ """How to normalize the trace of the covariance matrix.
+
+ False: No normalization,
+ True: Normalize to 1. Other options: 'arithm', 'geom', 'aeig', 'geig'.
+
+ """
+
CMA_diagonal: int | bool = False
+ """Number of iterations to use diagonal covariance matrix before switching to full
+ matrix.
+
+ If False, always use full matrix.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
stopping_maxiter: PositiveInt = STOPPING_MAXITER
+ """Maximum number of iterations before termination."""
+
stopping_maxtime: PositiveFloat = float("inf")
+ """Maximum time in seconds before termination."""
+
stopping_cov_mat_cond: NonNegativeFloat = 1e14
+ """Maximum condition number of the covariance matrix before termination."""
+
convergence_ftol_abs: NonNegativeFloat = CONVERGENCE_FTOL_ABS
+ """Absolute tolerance on function value changes for convergence."""
+
convergence_ftol_rel: NonNegativeFloat = CONVERGENCE_FTOL_REL
+ """Relative tolerance on function value changes for convergence."""
+
convergence_xtol_abs: NonNegativeFloat = CONVERGENCE_XTOL_ABS
+ """Absolute tolerance on parameter changes for convergence."""
+
convergence_iter_noimprove: PositiveInt | None = None
+ """Number of iterations without improvement before termination."""
+
invariant_path: bool = False
+ """Whether evolution path (pc) should be invariant to transformations."""
+
eval_final_mean: bool = True
+ """Whether to evaluate the final mean solution."""
+
seed: int | None = None
+ """Seed used by the internal random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -231,11 +413,34 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradOnePlusOne(Algorithm):
+ """Minimize a scalar function using the One-Plus-One Evolutionary algorithm.
+
+ The One-Plus-One evolutionary algorithm iterates to find a set of parameters
+ that minimizes the loss function. It does this by perturbing, or mutating,
+ the parameters from the last iteration (the parent). If the new (child)
+ parameters yield a better result, the child becomes the new parent whose
+ parameters are perturbed, perhaps more aggressively. If the parent yields a
+ better result, it remains the parent and the next perturbation is less
+ aggressive.
+
+ Originally proposed by :cite:`Rechenberg1973`. The implementation in
+ Nevergrad is based on the one-fifth adaptation rule from :cite:`Schumer1968`.
+
+ """
+
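+ # Illustrative sketch only (not the code path used by nevergrad): a (1+1)
+ # loop with a one-fifth success rule, assuming a NumPy objective `fun`, a
+ # start vector `x`, and a generator `rng = np.random.default_rng()`.
+ #
+ #     sigma, fx = 1.0, fun(x)
+ #     for _ in range(1_000):
+ #         child = x + sigma * rng.standard_normal(x.size)
+ #         fc = fun(child)
+ #         if fc < fx:              # success: adopt the child, enlarge the step
+ #             x, fx, sigma = child, fc, sigma * 2.0
+ #         else:                    # failure: keep the parent, shrink the step
+ #             sigma *= 2.0 ** -0.25
+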
noise_handling: (
Literal["random", "optimistic"]
| tuple[Literal["random", "optimistic"], float]
| None
) = None
+ """Method for handling noise.
+
+ 'random' reevaluates a random point, while 'optimistic' reevaluates the best
+ optimistic point. A float coefficient can be provided to tune the regularity of
+ these reevaluations.
+
+ """
+
mutation: Literal[
"gaussian",
"cauchy",
@@ -261,27 +466,75 @@ class NevergradOnePlusOne(Algorithm):
"biglognormal",
"hugelognormal",
] = "gaussian"
+ """Type of mutation to apply.
+
+ 'gaussian' is the default. Other options include 'cauchy', 'discrete', 'fastga',
+ 'rls', and 'portfolio'.
+
+ """
+
annealing: (
Literal[
"none", "Exp0.9", "Exp0.99", "Exp0.9Auto", "Lin100.0", "Lin1.0", "LinAuto"
]
| None
) = None
+ """Annealing schedule for mutation amplitude.
+
+ Can be 'none', exponential (e.g., 'Exp0.9'), or linear (e.g., 'Lin100.0').
+
+ """
+
sparse: bool = False
+ """Whether to apply random mutations that set variables to zero."""
+
super_radii: bool = False
+ """Whether to apply extended radii beyond standard bounds for candidate generation,
+ enabling broader exploration."""
+
smoother: bool = False
+ """Whether to suggest smooth mutations."""
+
roulette_size: PositiveInt = 64
+ """Size of the roulette wheel used for selection, affecting sampling diversity from
+ past candidates."""
+
antismooth: NonNegativeInt = 4
+ """Degree of anti-smoothing to prevent premature convergence by penalizing overly
+ smooth improvements."""
+
crossover: bool = False
+ """Whether to include a genetic crossover step every other iteration."""
+
crossover_type: (
Literal["none", "rand", "max", "min", "onepoint", "twopoint"] | None
) = None
+ """Method for genetic crossover.
+
+ Options include 'rand', 'onepoint', and 'twopoint'.
+
+ """
+
tabu_length: NonNegativeInt = 1000
+ """Length of the tabu list to prevent revisiting recent candidates and help escape
+ local minima."""
+
rotation: bool = False
+ """Whether to apply rotational transformations to the search space to enhance search
+ performance."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel computation."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)if bounds are not
+ provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -336,13 +589,32 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradDifferentialEvolution(Algorithm):
+ """Minimize a scalar function using the Differential Evolution optimizer.
+
+ Differential Evolution is typically used for continuous optimization. It uses
+ differences between points in the population for performing mutations in fruitful
+ directions. It is a kind of covariance adaptation without any explicit covariance,
+ making it very fast in high dimensions.
+
+ """
+
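+ # Illustrative sketch only (assumes a 2-D NumPy array `population`, a target
+ # row `target`, its dimension `dim`, and `rng = np.random.default_rng()`):
+ #
+ #     a, b, c = population[rng.choice(len(population), size=3, replace=False)]
+ #     mutant = a + 0.8 * (b - c)                # differential weight F1 = 0.8
+ #     mask = rng.random(dim) < 0.5              # binomial crossover, rate 0.5
+ #     trial = np.where(mask, mutant, target)
+ #     # `trial` replaces `target` in the next generation if it scores better.
+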
initialization: Literal["parametrization", "LHS", "QR", "QO", "SO"] = (
"parametrization"
)
+ """Algorithm for initialization.
+
+ 'LHS' is Latin Hypercube Sampling, 'QR' is Quasi-Random.
+
+ """
+
scale: float | str = 1.0
+ """Scale of random component of updates."""
+
recommendation: Literal["pessimistic", "optimistic", "mean", "noisy"] = (
"pessimistic"
)
+ """Criterion for selecting the best point to recommend."""
+
crossover: (
float
| Literal[
@@ -354,14 +626,41 @@ class NevergradDifferentialEvolution(Algorithm):
"parametrization",
]
) = 0.5
+ """Crossover rate or strategy.
+
+ Can be a float, 'dimension' (1/dim), 'random', 'onepoint', or 'twopoints'.
+
+ """
+
F1: PositiveFloat = 0.8
+ """Differential weight #1 (scaling factor)."""
+
F2: PositiveFloat = 0.8
+ """Differential weight #2 (scaling factor)."""
+
population_size: int | Literal["standard", "dimension", "large"] = "standard"
+ """Population size.
+
+ Can be an integer or a string like 'standard', 'dimension', or 'large' to set it
+ automatically.
+
+ """
+
high_speed: bool = False
+ """If True, uses a metamodel for recommendations to speed up optimization."""
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)if bounds are not
+ provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -371,7 +670,10 @@ def _solve_internal_problem(
import nevergrad as ng
+ # The nevergrad implementation has `popsize` but we use `population_size`
+ # for consistency.
configured_optimizer = ng.optimizers.DifferentialEvolution(
+ initialization=self.initialization,
scale=self.scale,
recommendation=self.recommendation,
crossover=self.crossover,
@@ -397,7 +699,7 @@ def _solve_internal_problem(
@mark.minimizer(
name="nevergrad_bo",
solver_type=AggregationLevel.SCALAR,
- is_available=IS_NEVERGRAD_INSTALLED,
+ is_available=IS_NEVERGRAD_INSTALLED and IS_BAYESOPTIM_INSTALLED,
is_global=True,
needs_jac=False,
needs_hess=False,
@@ -411,14 +713,43 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradBayesOptim(Algorithm):
+ """Minimize a scalar function using the Bayesian Optimization (BO) algorithm.
+
+ This wrapper uses the BO and PCA-BO algorithms from the `bayes_optim` package
+ :cite:`bayesoptimimpl`. PCA-BO (Principal Component Analysis for Bayesian
+ Optimization) is a dimensionality reduction technique for black-box
+ optimization. It applies PCA to the input space before performing Bayesian
+ optimization, improving efficiency in high dimensions by focusing on
+ directions of greatest variance.
+
+ """
+
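+ # Rough sketch of the PCA-BO idea only; the real work is delegated to the
+ # `bayes_optim` package. scikit-learn is used here purely for illustration,
+ # with `evaluated_points` as a hypothetical (n_points, dim) array.
+ #
+ #     from sklearn.decomposition import PCA
+ #     pca = PCA(n_components=0.95).fit(evaluated_points)  # keep 95% of variance
+ #     z = pca.transform(evaluated_points)                  # reduced search space
+ #     # ... fit a surrogate and maximize the acquisition function in z-space,
+ #     # then map the proposal back via pca.inverse_transform(...).
+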
init_budget: int | None = None
+ """Number of initialization algorithm steps."""
+
pca: bool = False
+ """Whether to use the PCA transformation, defining PCA-BO rather than standard
+ BO."""
+
n_components: NonNegativeFloat = 0.95
+ """Number of principal axes, representing the percentage of explained variance
+ (e.g., 0.95 means 95% variance retained)."""
+
prop_doe_factor: NonNegativeFloat | None = 1
+ """Percentage of the initial budget used for Design of Experiments (DoE)."""
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
- sigma: int | None = None
+ sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -465,14 +796,54 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradEMNA(Algorithm):
+ """Minimize a scalar function using the Estimation of Multivariate Normal Algorithm.
+
+ EMNA is a distribution-based evolutionary algorithm that models the search
+ space using a multivariate Gaussian. It learns the full covariance matrix, so
+ each sampling step has cubic cost in the problem dimension. It is efficient in
+ parallel settings, but other methods should usually be considered first.
+ See :cite:`emnaimpl`.
+
+ """
+
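+ # Illustrative sketch only (assumes arrays `population` and `fitness`, and
+ # `rng = np.random.default_rng()`): fit a full-covariance Gaussian to the
+ # best half and resample the population from it.
+ #
+ #     elites = population[np.argsort(fitness)[: len(population) // 2]]
+ #     mean = elites.mean(axis=0)
+ #     cov = np.cov(elites, rowvar=False)
+ #     population = rng.multivariate_normal(mean, cov, size=len(population))
+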
isotropic: bool = True
+ """If True, uses an isotropic (identity covariance) Gaussian.
+
+ If False, uses a separable (diagonal covariance) Gaussian.
+
+ """
+
noise_handling: bool = True
+ """If True, returns the best individual found.
+
+ If False (recommended for noisy problems), returns the average of the final
+ population.
+
+ """
+
population_size_adaptation: bool = False
+ """If True, the population size is adjusted automatically based on the optimization
+ landscape and noise level."""
+
initial_popsize: int | None = None
+ """Initial population size.
+
+ Defaults to 4 times the problem dimension.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -482,6 +853,8 @@ def _solve_internal_problem(
import nevergrad as ng
+ # The nevergrad implementation has `naive` but we use `noise_handling`
+ # for clarity. naive=True -> returns best point; naive=False -> returns mean.
configured_optimizer = ng.optimizers.EMNA(
isotropic=self.isotropic,
naive=self.noise_handling,
@@ -519,10 +892,27 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradCGA(Algorithm):
+ """Minimize a scalar function using the Compact Genetic Algorithm.
+
+ The Compact Genetic Algorithm (cGA) is a memory-efficient genetic algorithm
+ that represents the population as a probability vector over gene values. It
+ simulates the behavior of a simple GA with uniform crossover by updating
+ probabilities instead of maintaining an explicit population. See :cite:`cgaimpl`.
+
+ """
+
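+ # Illustrative sketch only, for a binary encoding: `p` is the probability
+ # vector, `a` and `b` are two sampled candidates, and `virtual_popsize` is a
+ # hypothetical constant controlling the update strength.
+ #
+ #     winner, loser = (a, b) if fun(a) < fun(b) else (b, a)
+ #     p = np.clip(p + (winner - loser) / virtual_popsize, 0.0, 1.0)
+ #     a = (rng.random(p.size) < p).astype(float)   # sample the next candidates
+ #     b = (rng.random(p.size) < p).astype(float)
+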
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -564,10 +954,28 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradEDA(Algorithm):
+ """Minimize a scalar function using the Estimation of Distribution Algorithm.
+
+ Estimation of Distribution Algorithms (EDAs) optimize by building and sampling
+ a probabilistic model of promising solutions. Instead of using traditional
+ variation operators like crossover or mutation, EDAs update a distribution
+ based on selected individuals and sample new candidates from it.
+ Refer to :cite:`edaimpl`.
+
+ """
+
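+ # Illustrative sketch only (a UMDA-style variant with independent Gaussian
+ # marginals; assumes arrays `population` and `fitness` and a generator `rng`):
+ #
+ #     top = population[np.argsort(fitness)[: len(population) // 4]]
+ #     mu, sd = top.mean(axis=0), top.std(axis=0)   # fit the model to the elite
+ #     population = rng.normal(mu, sd, size=population.shape)  # sample new candidates
+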
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -609,12 +1017,43 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradTBPSA(Algorithm):
+ """Minimize a scalar function using the Test-based Population Size Adaptation
+ algorithm.
+
+ TBPSA adapts population size based on fitness trend detection using linear
+ regression. If no significant improvement is found (via hypothesis testing),
+ the population size is increased to improve robustness, making it effective
+ for noisy optimization problems. For more details, refer to :cite:`tbpsaimpl`.
+
+ """
+
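+ # Illustrative sketch of the adaptation test only (assumes a 1-D array
+ # `recent_best` of best fitness values and a current `popsize`):
+ #
+ #     from scipy import stats
+ #     res = stats.linregress(np.arange(len(recent_best)), recent_best)
+ #     if not (res.slope < 0 and res.pvalue < 0.05):  # no significant improvement
+ #         popsize *= 2                               # average out more noise
+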
noise_handling: bool = True
+ """If True, returns the best individual.
+
+ If False (recommended for noisy problems), returns the average of the final
+ population to reduce noise.
+
+ """
+
initial_popsize: int | None = None
+ """Initial population size.
+
+ If not specified, defaults to 4 times the problem dimension.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -624,6 +1063,8 @@ def _solve_internal_problem(
import nevergrad as ng
+ # The nevergrad implementation has `naive` but we use `noise_handling`
+ # for clarity. naive=True -> returns best point; naive=False -> returns mean.
configured_optimizer = ng.optimizers.ParametrizedTBPSA(
naive=self.noise_handling,
initial_popsize=self.initial_popsize,
@@ -659,16 +1100,52 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradRandomSearch(Algorithm):
+ """Minimize a scalar function using the Random Search algorithm.
+
+ This is a one-shot optimization method that provides random suggestions and serves
+ as a simple baseline for other optimizers.
+
+ """
+
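+ # Illustrative sketch only (assumes a NumPy objective `fun`, a dimension
+ # `dim`, a `budget`, and `rng = np.random.default_rng()`):
+ #
+ #     best_x, best_f = None, float("inf")
+ #     for _ in range(budget):
+ #         x = rng.standard_normal(dim)    # 'gaussian' sampler with scale 1.0
+ #         fx = fun(x)
+ #         if fx < best_f:
+ #             best_x, best_f = x, fx
+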
middle_point: bool = False
+ """Enforces that the first suggested point is the zero vector."""
+
opposition_mode: Literal["opposite", "quasi"] | None = None
+ """Symmetrizes exploration with respect to the center.
+
+ 'opposite' enables full symmetry, while 'quasi' applies randomized symmetry.
+
+ """
+
sampler: Literal["parametrization", "gaussian", "cauchy"] = "parametrization"
+ """The probability distribution for sampling points.
+
+ 'gaussian' and 'cauchy' are available alternatives.
+
+ """
+
scale: PositiveFloat | Literal["random", "auto", "autotune"] = "auto"
+ """Scalar used to multiply suggested point values.
+
+ Can be a float or a string for auto-scaling ('random', 'auto', 'autotune').
+
+ """
+
recommendation_rule: Literal[
"average_of_best", "pessimistic", "average_of_exp_best"
] = "pessimistic"
+ """Specifies how the final recommendation is chosen, e.g., 'pessimistic' (default)
+ or 'average_of_best'."""
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -717,17 +1194,60 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradSamplingSearch(Algorithm):
+ """Minimize a scalar function using SamplingSearch.
+
+ This is a one-shot optimization method that typically improves on random search by
+ using low-discrepancy sequences to cover the search space more uniformly. Using
+ 'Hammersley' as the sampler is recommended when the budget is known in advance, and
+ setting `scrambled=True` is recommended in high dimensions.
+
+ """
+
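+ # Illustrative sketch only (uses scipy purely for illustration; assumes a
+ # NumPy objective `fun`, a dimension `dim`, and a known `budget`):
+ #
+ #     from scipy.stats import norm, qmc
+ #     u = qmc.Halton(d=dim, scramble=True).random(n=budget)  # points in [0, 1)^dim
+ #     xs = norm.ppf(u)                                        # project to real space
+ #     best = min(xs, key=fun)
+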
sampler: Literal["Halton", "LHS", "Hammersley"] = "Halton"
+ """Choice of the low-discrepancy sampler used for generating points.
+
+ 'LHS' is Latin Hypercube Sampling.
+
+ """
+
scrambled: bool = False
+ """If True, adds scrambling to the search sequence, which is highly recommended for
+ high-dimensional problems."""
+
middle_point: bool = False
+ """If True, the first suggested point is the zero vector, useful for initializing at
+ the center of the search space."""
+
cauchy: bool = False
+ """If True, uses the inverse Cauchy distribution instead of Gaussian when projecting
+ samples to a real-valued space."""
+
scale: bool | NonNegativeFloat = 1.0
+ """A float multiplier to scale all generated points."""
+
rescaled: bool = False
+ """If True, rescales the sampling pattern to ensure better coverage of the
+ boundaries."""
+
recommendation_rule: Literal["average_of_best", "pessimistic"] = "pessimistic"
+ """How the final recommendation is chosen.
+
+ 'pessimistic' is the default.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -753,7 +1273,7 @@ def _solve_internal_problem(
configured_optimizer=configured_optimizer,
stopping_maxfun=self.stopping_maxfun,
n_cores=self.n_cores,
- seed=None,
+ seed=self.seed,
sigma=self.sigma,
nonlinear_constraints=problem.nonlinear_constraints,
)
@@ -777,6 +1297,14 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradNGOpt(Algorithm):
+ """Minimize a scalar function using a Meta Optimizer from Nevergrad.
+
+ These are meta-optimizers that intelligently combine multiple different
+ optimization algorithms to solve a problem. The specific portfolio of
+ optimizers can be selected via the `optimizer` parameter.
+
+ """
+
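+ # Hypothetical usage sketch (illustrative values; assumes this class is
+ # registered as `nevergrad_ngopt` on `om.algos`, analogous to the other
+ # nevergrad optimizers in this PR):
+ #
+ #     import optimagic as om
+ #     res = om.minimize(
+ #         fun=lambda x: x @ x,
+ #         params=np.arange(3, dtype=float),
+ #         algorithm=om.algos.nevergrad_ngopt(optimizer="Wiz", stopping_maxfun=1_000),
+ #     )
+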
optimizer: Literal[
"NGOpt",
"NGOpt4",
@@ -831,10 +1359,24 @@ class NevergradNGOpt(Algorithm):
"CSEC11",
"Wiz",
] = "NGOpt"
+ """The specific Nevergrad meta-optimizer to use.
+
+ Each option is a portfolio of different algorithms.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²)in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -877,6 +1419,14 @@ def _solve_internal_problem(
)
@dataclass(frozen=True)
class NevergradMeta(Algorithm):
+ """Minimize a scalar function using a Meta Optimizer from Nevergrad.
+
+ This algorithm utilizes a combination of local and global optimizers to find
+ the best solution. The specific portfolio of optimizers can be selected via
+ the `optimizer` parameter.
+
+ """
+
optimizer: Literal[
"MultiBFGSPlus",
"LogMultiBFGSPlus",
@@ -916,10 +1466,24 @@ class NevergradMeta(Algorithm):
"Shiwa",
"Carola3",
] = "Shiwa"
+ """The specific Nevergrad meta-optimizer to use.
+
+ Each option is a portfolio of different local and global algorithms.
+
+ """
+
stopping_maxfun: PositiveInt = STOPPING_MAXFUN_GLOBAL
+ """Maximum number of function evaluations before termination."""
+
n_cores: PositiveInt = 1
+ """Number of cores to use for parallel function evaluation."""
+
seed: int | None = None
+ """Seed for the random number generator for reproducibility."""
+
sigma: float | None = None
+ """Standard deviation for sampling initial population from N(0, σ²) in case bounds
+ are not provided."""
def _solve_internal_problem(
self, problem: InternalOptimizationProblem, x0: NDArray[np.float64]
@@ -949,7 +1513,7 @@ def _nevergrad_internal(
problem: InternalOptimizationProblem,
x0: NDArray[np.float64],
n_cores: int,
- configured_optimizer: "ng.optimization.base.ConfiguredOptimizer",
+ configured_optimizer: ConfiguredOptimizer,
stopping_maxfun: int,
seed: int | None,
sigma: float | None,