
Safer bayesian optimization #341

Open
roussel-ryan wants to merge 8 commits into main from safe_bo

Conversation


@roussel-ryan roussel-ryan commented Jul 3, 2025

This pull request introduces support for nonlinear inequality constraints in numerical optimization of the acquisition function. The changes include enhancements to the LBFGSOptimizer and GridOptimizer classes, updates to acquisition functions, and new testing capabilities. These additions enable using the probability of feasibility as a nonlinear constraint when optimizing the acquisition function.

Currently, generating initial points and optimizing the acquisition function is relatively slow for LBFGSOptimizer, so this functionality is not recommended for high-dimensional parameter spaces.

Enhancements to Numerical Optimization:

  • Support for nonlinear inequality constraints in LBFGSOptimizer: Added logic to handle nonlinear constraints, including a random initial condition generator for sampling feasible points. If candidate generation fails, the optimizer falls back to random valid samples. (xopt/numerical_optimizer.py) [1] [2]
  • Support for nonlinear inequality constraints in GridOptimizer: Integrated constraint handling by filtering grid points based on feasibility. Raises an error if no feasible points are found. (xopt/numerical_optimizer.py)
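The GridOptimizer-side filtering described above can be sketched roughly as follows. This is a hypothetical helper, not the actual xopt implementation; it assumes the BoTorch convention that a constraint callable returns a value >= 0 where the point is feasible:

```python
import torch

def filter_feasible(grid_pts, constraints):
    # Keep only grid points where every constraint c(x) >= 0.
    mask = torch.ones(grid_pts.shape[0], dtype=torch.bool)
    for c in constraints:
        mask &= c(grid_pts) >= 0
    if not mask.any():
        # Mirrors the PR behavior: error out if no feasible points exist.
        raise RuntimeError("no feasible points found on the grid")
    return grid_pts[mask]

# 11 x 11 grid on [0, 1]^2, constrained to x0 + x1 <= 0.55
grid = torch.stack(
    torch.meshgrid(
        torch.linspace(0, 1, 11), torch.linspace(0, 1, 11), indexing="ij"
    ),
    dim=-1,
).reshape(-1, 2)
feasible = filter_feasible(grid, [lambda x: 0.55 - x.sum(dim=-1)])
```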

Updates to Bayesian Generator:

  • Feasibility tolerance parameter: Added a new feasibility_tolerance field to BayesianGenerator to enable constrained acquisition function optimization based on predicted probability of feasibility. (xopt/generators/bayesian/bayesian_generator.py)
  • Log feasibility acquisition function: Introduced _get_log_feasibility method to calculate feasibility constraints for optimization. (xopt/generators/bayesian/bayesian_generator.py)
  • Enhanced candidate proposal: Modified propose_candidates to include nonlinear inequality constraints when feasibility_tolerance is set. (xopt/generators/bayesian/bayesian_generator.py)
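The feasibility_tolerance mechanism above can be sketched as follows; the names log_pof and feasibility_constraint are hypothetical stand-ins (the real code uses a model-based qLogProbabilityOfFeasibility), but they show how a tolerance on the probability of feasibility becomes a nonlinear inequality constraint c(x) >= 0:

```python
import math
import torch

feasibility_tolerance = 0.5  # assumed value for illustration

def log_pof(x):
    # Stand-in for the model's log probability of feasibility.
    return torch.log(torch.sigmoid(-x.sum(dim=-1)))

def feasibility_constraint(x):
    # c(x) >= 0  <=>  PF(x) >= feasibility_tolerance
    return log_pof(x) - math.log(feasibility_tolerance)

# At the origin PF = 0.5, i.e. exactly on the tolerance boundary.
c0 = feasibility_constraint(torch.zeros(1, 2))
```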

Testing Improvements:

  • Tests for nonlinear constraints: Added unit tests for LBFGSOptimizer and GridOptimizer to validate handling of nonlinear constraints, including edge cases where no feasible points exist. (xopt/tests/test_numerical_optimizer.py) [1] [2]
  • Random initial condition generator tests: Verified functionality of the random initial condition generator with nonlinear constraints. (xopt/tests/test_numerical_optimizer.py)

@roussel-ryan roussel-ryan requested a review from nikitakuklev July 3, 2025 19:52
@roussel-ryan roussel-ryan changed the title Safe bayesian optimization Safer bayesian optimization Jul 3, 2025

@nikitakuklev nikitakuklev left a comment


To my understanding, what this PR does is

  1. compute the constraints-only acquisition function multiplier, a.k.a. the 'probability of feasibility'
  2. instead of multiplying the base acquisition function as in regular constrained BO, it then uses (1) as a nonlinear inequality constraint on the inputs

Question 1:
Are you sure that the same effect cannot be accomplished with a sharper edge function for constraints, multiplying the base acquisition the usual way?

Question 2:
Doesn't this effectively double-penalize the acquisition function if constraints are applied both to the base acquisition and as inequalities? Can the base acquisition function be used directly in this case?

Question 3:
For better performance, is there a neat way to cache the constraint call results to avoid evaluating them twice? It seems that would require some surgery on the acquisition functions to store the result of apply_constraints, so probably not worth it.

Overall though, no objections. Once this PR is merged, I will be adding variance limits/constraints. We found them very useful to ensure sampling near known locations for machines like the Booster, where absolutely no bad shots can be permitted during global BO. So far I have treated them as just additional fatmoid-clipped constraints with special processing, and will explore your inequality approach.


# Apply nonlinear constraints -- remove f_value point where constraints(x) does not satisfy the constraints
if nonlinear_inequality_constraints is not None:
mask = torch.ones(f_values.shape, dtype=torch.bool, device=f_values.device)

@nikitakuklev nikitakuklev Jul 8, 2025


always force to cpu to be same as mesh_pts (or send all tensors to specific device)?
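The reviewer's device suggestion, sketched with stand-in CPU tensors (mesh_pts and f_values here are placeholders for the variables in the diff): build the mask on the same device as the mesh points rather than inheriting the device of f_values.

```python
import torch

mesh_pts = torch.rand(5, 2)   # stand-in grid points (CPU in this sketch)
f_values = torch.rand(5)      # stand-in function values

# Create the mask on mesh_pts' device so later indexing cannot
# fail with a cross-device mismatch.
mask = torch.ones(f_values.shape, dtype=torch.bool, device=mesh_pts.device)
```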

logger.debug("getting random initial conditions")
start = time.time()
lower, upper = bounds[0], bounds[1]
rand = torch.rand(

In BoTorch's gen_batch_initial_conditions, they use Sobol quasi-random sampling - should we copy that approach?
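The suggested approach can be sketched with torch's built-in Sobol engine; sobol_initial_conditions is a hypothetical helper name, scaling quasi-random unit-cube draws into the optimization box:

```python
import torch

def sobol_initial_conditions(bounds, n):
    # bounds: [2, ndim] tensor with lower/upper rows
    engine = torch.quasirandom.SobolEngine(
        dimension=bounds.shape[1], scramble=True
    )
    samples = engine.draw(n)  # quasi-random points in [0, 1]^ndim
    return bounds[0] + (bounds[1] - bounds[0]) * samples

bounds = torch.tensor([[0.0, -1.0], [2.0, 1.0]])
pts = sobol_initial_conditions(bounds, 16)
```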

assert candidates.shape == torch.Size([ncandidate, ndim])

# test nonlinear constraints
def constraint1(X):

these are linear
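An illustration of the reviewer's point (constraint names here are illustrative, not the test's actual functions): a constraint like 1 - X.sum() is affine in X, with a constant gradient, while a genuinely nonlinear inequality constraint involves a nonlinear map of X:

```python
import torch

def linear_constraint(X):
    # Affine in X: gradient w.r.t. X is constant.
    return 1.0 - X.sum(dim=-1)

def nonlinear_constraint(X):
    # Genuinely nonlinear: >= 0 inside the unit sphere.
    return 1.0 - (X ** 2).sum(dim=-1)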

function: Callable,
bounds: Tensor,
n_candidates: int = 1,
nonlinear_inequality_constraints: (list[tuple[Callable, bool]] | None) = None,

add to docstring

A tensor specifying the bounds for the optimization. It must have the shape [2, ndim].
n_candidates : int, optional
The number of candidates to generate (default is 1).
nonlinear_inequality_constraints : Optional[list[Callable]]

wrong docstring


warnings.warn(
"Nonlinear inequality constraints are provided for LBFGS numerical optimization, "
"using a random initial condition generator which may take a long time to sample enough points.",

Supplying nonlinear_inequality_constraints also switches algo to SLSQP (see here). That might explain why convergence slows down.
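A standalone SciPy illustration of this note (not the BoTorch internals themselves): SLSQP supports nonlinear inequality constraints, which L-BFGS-B does not, typically at extra cost per iteration.

```python
import numpy as np
from scipy.optimize import minimize

# Minimize x0^2 + x1^2 subject to x0 + x1 >= 1 inside [-1, 1]^2;
# the optimum sits on the constraint boundary at (0.5, 0.5).
res = minimize(
    fun=lambda x: float((x ** 2).sum()),
    x0=np.array([0.8, 0.8]),
    method="SLSQP",
    bounds=[(-1.0, 1.0), (-1.0, 1.0)],
    constraints=[{"type": "ineq", "fun": lambda x: x[0] + x[1] - 1.0}],
)
```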


sampler = self._get_sampler(model)

log_feasibility = qLogProbabilityOfFeasibility(

can analytic version be used for better perf?

@roussel-ryan (Collaborator, Author) replied:

probably, will need to check if it can be used for a given model

@roussel-ryan

Thanks for the comments @nikitakuklev. In its current implementation, it actually both weights the acquisition function with the feasibility multiplier and restricts the numerical optimization of the acquisition function with a nonlinear feasibility constraint. Now that I think of it, this might be redundant and adds computational complexity.
To address your questions:

  1. My concern is maintaining differentiability + preventing vanishing gradients. If you have an alternative function to try, I'm all for it.
  2. You are correct that it is a double penalty, which could be removed to improve computational efficiency.
  3. Yes, we will have to see if this can be used in some way.

To clarify: you add a constraint to acquisition function optimization that measures the model uncertainty? So in this case the acquisition function will only be optimized in regions where the model has a high degree of confidence?

I will take a look at making changes to improve the computational efficiency based on your suggestions in the next few weeks. If you want it sooner, we can identify a subset of the changes you proposed and merge these features over two PRs.


nikitakuklev commented Jul 9, 2025

Good point on the gradients - making the cutoff sharper = changing eta for the sigmoid; no new functions come to mind. One could try min-clipping the feasibility to something like 1e-6 - that is yet another transformation, however. Log space does help a bit here.
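The min-clipping idea above in one line (a sketch, with clipped_log_pof as a hypothetical name): floor the probability of feasibility before taking the log, so the result stays finite and the gradient never fully vanishes as PF approaches 0.

```python
import torch

def clipped_log_pof(pof, floor=1e-6):
    # Clamp PF away from zero before the log transform.
    return torch.log(pof.clamp(min=floor))

vals = clipped_log_pof(torch.tensor([0.0, 0.5, 1.0]))
```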

Thinking more, the smooth constraint acquisition function penalty helps ensure the search stays away from the edge, whereas the hard inequality limit will result in all samples being right along the edge (assuming true optimum is in that direction). Here is a 'highly advanced drawing' - black is objective, green the constraint, black dashes is where constraint is violated, red the constraint acq multiplier, blue the sample with constrained acq, purple the sample with inequality only.

For noise constraints, yes, a constraint on the variance of the MC samples - this formulates tasks like "optimize in areas where efficiency uncertainty < 0.1%". Setting a higher min_noise_prior helps to control the model fit. A second version works in normalized space, using output_transform.stdvs to convert. A third version uses the trained kernel noise distribution to compute the threshold, making it a relative 'better-known regions' bias. I need to look more into which noise prior is best, since BoTorch changed defaults recently. (Edit: the noise constraint can be applied on the objective and/or constraint models.)

Coming back to the PR, since the features are optional and interesting, let's merge as is after the minor comments are fixed. After noise constraints PR, we'll do some 'violation count' benchmarking to determine the recommended defaults for safety-critical problems.
