73 changes: 71 additions & 2 deletions bayes_opt/bayesian_optimization.py
@@ -175,6 +175,75 @@ def res(self) -> list[dict[str, Any]]:
"""
return self._space.res()

def predict(
self, params: dict[str, Any] | list[dict[str, Any]], return_std=False, return_cov=False, fit_gp=True
) -> tuple[float | NDArray[Float], float | NDArray[Float]]:
"""Predict the target function value at given parameters.

Parameters
----------
params: dict or list
The parameters where the prediction is made.

return_std: bool, optional(default=True)
If True, the standard deviation of the prediction is returned.

return_cov: bool, optional(default=False)
If True, the covariance of the prediction is returned.

fit_gp: bool, optional(default=True)
If True, the internal Gaussian Process model is fitted before
making the prediction.

Returns
-------
mean: float or np.ndarray
The predicted mean of the target function at the given parameters.

std_or_cov: float or np.ndarray
The predicted standard deviation or covariance of the target function
at the given parameters.
"""
if isinstance(params, list):
# convert list of dicts to 2D array
params_array = np.array([self._space.params_to_array(p) for p in params])
single_param = False
elif isinstance(params, dict):
params_array = self._space.params_to_array(params).reshape(1, -1)
single_param = True

if fit_gp:
if len(self._space) == 0:
msg = (
"The Gaussian Process model cannot be fitted with zero observations. To use predict(), "
"without fitting the GP, set fit_gp=False. The predictions will then be made using the "
"GP prior."
)
raise RuntimeError(msg)
self.acquisition_function._fit_gp(self._gp, self._space)

res = self._gp.predict(params_array, return_std=return_std, return_cov=return_cov)

if return_std or return_cov:
mean, std_or_cov = res
else:
mean = res

if not single_param and mean.ndim == 0:
mean = np.atleast_1d(mean)
# ruff complains when nesting conditionals, so this three-way split is necessary
if not single_param and (return_std or return_cov) and std_or_cov.ndim == 0:
std_or_cov = np.atleast_1d(std_or_cov)

if single_param and mean.ndim > 0:
mean = mean[0]
if single_param and (return_std or return_cov) and std_or_cov.ndim > 0:
std_or_cov = std_or_cov[0]

if return_std or return_cov:
return mean, std_or_cov
return mean
Comment on lines +178 to +245
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

predict(): tighten input validation, clarify typing/docs, and guard std vs cov

A few points here:

  • Invalid params types lead to UnboundLocalError.
    If params is neither dict nor list, params_array / single_param are never set and you’ll get an unhelpful runtime error. Consider explicitly validating and raising a TypeError:

    ```diff
             if isinstance(params, list):
                 # convert list of dicts to 2D array
                 params_array = np.array([self._space.params_to_array(p) for p in params])
                 single_param = False
             elif isinstance(params, dict):
                 params_array = self._space.params_to_array(params).reshape(1, -1)
                 single_param = True
    +        else:
    +            msg = "params must be a dict or a list of dicts."
    +            raise TypeError(msg)
    ```

    
  • Return type annotation doesn’t match behavior.
    The function sometimes returns just mean (float / NDArray) and sometimes a (mean, std_or_cov) tuple. The annotation -> tuple[...] is misleading for callers and type checkers. Consider something like:

    ```python
    def predict(...) -> float | NDArray[Float] | tuple[float | NDArray[Float], float | NDArray[Float]]:
        ...
    ```

    or using typing.overload for the main common combinations.

  • Docstring default for return_std is wrong.
    The signature uses return_std=False, but the docstring says optional(default=True). That will confuse users. Please fix the docstring to optional(default=False).

  • return_std and return_cov should be mutually exclusive.
    GaussianProcessRegressor.predict only allows one of return_std or return_cov to be True at a time; otherwise it raises.
    It would be clearer to check upfront and raise a user-facing ValueError:

    ```diff
    +        if return_std and return_cov:
    +            msg = "At most one of return_std and return_cov can be True."
    +            raise ValueError(msg)
             if fit_gp:
                 if len(self._space) == 0:
                     ...
    ```

    
  • Shape handling vs. list-of-one semantics.
    For params given as a single dict you correctly return scalars (or scalar-like) when possible via single_param=True. For params as a list with a single element (e.g. [{"p1": 0, "p2": 0}]), single_param is False so mean/std_or_cov stay 1D arrays. Several new tests call predict([point]) and then do scalar-style checks like np.isclose(means, 0, ...) or stds < 0.02, which will raise ValueError if means/stds are arrays. You probably want either:

    • API choice A (simpler): Always treat list inputs as “multi-point”, even of length 1, and adjust the tests to index or call .item(), or
    • API choice B (more user-friendly): If params is a list of length 1, treat it like a single param (single_param=True) so predict([point]) behaves like predict(point) in terms of scalar vs array output.

    Whichever you decide, aligning implementation, tests, and docstring explicitly around “single dict vs list, including length-1 lists” will avoid confusing shape bugs.

  • Minor: handling of single_param with return_cov.
    When return_cov=True and params is a single dict, std_or_cov starts as a (1, 1) covariance matrix and becomes (1,) after std_or_cov = std_or_cov[0]. If you intend a scalar variance for a single point, consider also unwrapping the remaining dimension to a float for the single-parameter case, or documenting that you return a 1‑element array for covariance.


I need to verify the concerns in the review comment against the actual codebase. Let me gather information:


---

🏁 Script executed:

```shell
#!/bin/bash
# Find the predict() method definition in full to see if validation exists
rg -n -A 80 "def predict\(" bayes_opt/bayesian_optimization.py | head -120
```

Length of output: 3434


🏁 Script executed:

```shell
#!/bin/bash
# Check the full type annotation for the predict method
rg -n "def predict" bayes_opt/bayesian_optimization.py -A 5 --type py
```

Length of output: 30


🏁 Script executed:

```shell
#!/bin/bash
# Find test files that call predict() to see edge cases and assertions
fd -e "test_*.py" -o -e "*_test.py" | xargs rg -l "\.predict\(" 2>/dev/null | head -5
```

Length of output: 279


🏁 Script executed:

```shell
#!/bin/bash
# Search for any validation of return_std and return_cov together
rg -n "return_std.*return_cov|return_cov.*return_std" bayes_opt/bayesian_optimization.py
```

Length of output: 462


🏁 Script executed:

```shell
#!/bin/bash
# Look for the else clause handling invalid params types
rg -n -B 5 -A 5 "else:" bayes_opt/bayesian_optimization.py | grep -A 10 "params"
```

Length of output: 779


I need to check the test file and verify the GP behavior:


🏁 Script executed:

```shell
#!/bin/bash
# Find predict() calls in test file with single list elements or list inputs
rg -n "\.predict\(" tests/test_bayesian_optimization.py -B 2 -A 3 | head -80
```

Length of output: 3252


🏁 Script executed:

```shell
#!/bin/bash
# Look for assertions after predict() calls to see what output shape is expected
rg -n "\.predict\(\[" tests/test_bayesian_optimization.py -A 5 | head -60
```

Length of output: 1468


🏁 Script executed:

```shell
#!/bin/bash
# Check what happens when calling sklearn GP predict with both return_std and return_cov
grep -r "return_std.*return_cov\|return_cov.*return_std" . --include="*.py" | head -10
```

Length of output: 660


🏁 Script executed:

```shell
#!/bin/bash
# Search for any documentation or error handling about mutual exclusivity in GP wrapper
rg -n "return_std\|return_cov" bayes_opt/ --type py -B 2 -A 2 | grep -E "(ValueError|TypeError|raise|either|mutual|exclusive)" -C 3
```

Length of output: 0


🏁 Script executed:

```shell
#!/bin/bash
# Check if GaussianProcessRegressor docs mention std/cov constraints
rg -n "class.*GaussianProcessRegressor" bayes_opt/ --type py -A 30
```

Length of output: 0


I need to verify the mutual exclusivity behavior with sklearn's GaussianProcessRegressor:


🌐 Web query:

scikit-learn GaussianProcessRegressor predict method return_std return_cov mutual exclusive

💡 Result:

Yes — they are mutually exclusive. The docs state the predict method may return the mean and optionally either the standard deviation (return_std=True) or the joint covariance (return_cov=True), and "at most one of the two can be requested." [1][2]

Sources:

  • scikit-learn GaussianProcessRegressor.predict (stable docs). [1]
  • scikit-learn GaussianProcessRegressor.predict (v1.3 docs). [2]
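The documented constraint is easy to confirm directly (assuming scikit-learn is installed, which bayes_opt already requires). An unfitted regressor suffices, because `GaussianProcessRegressor.predict` performs this check before any prediction:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

gpr = GaussianProcessRegressor()  # unfitted: predict() falls back to the prior
X = np.array([[0.0], [1.0]])

# Requesting both uncertainty representations at once is rejected up front.
try:
    gpr.predict(X, return_std=True, return_cov=True)
except RuntimeError as exc:
    print(type(exc).__name__)  # RuntimeError
```

Note that sklearn raises RuntimeError here, so a wrapper that wants a ValueError for its users must validate before delegating.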

predict(): fix input validation, return type annotation, docstring, and std/cov mutual exclusivity; resolve shape handling ambiguity

Several issues require fixes:

  • Invalid params types lead to UnboundLocalError.
    If params is neither dict nor list, params_array and single_param are never set. Add an else clause:

    else:
        msg = "params must be a dict or a list of dicts."
        raise TypeError(msg)
  • Return type annotation is incorrect.
    Line 180 declares -> tuple[...] but line 245 returns just mean (non-tuple) when return_std and return_cov are both False. Fix to:

    def predict(...) -> float | NDArray[Float] | tuple[float | NDArray[Float], float | NDArray[Float]]:
  • Docstring default for return_std contradicts signature.
    Line 188 says optional(default=True) but line 179 has return_std=False. Fix docstring to optional(default=False).

  • return_std and return_cov are mutually exclusive but not validated.
    sklearn's GaussianProcessRegressor allows at most one of return_std or return_cov to be requested. Add upfront validation:

    if return_std and return_cov:
        msg = "At most one of return_std and return_cov can be True."
        raise ValueError(msg)
  • Shape handling for single-element lists is ambiguous.
    Tests at lines 628–642 call predict([single_point]) with scalar assertions like np.isclose(means, 0) and stds < 0.02, which fail if the output is an array. Yet lines 664–667 treat the same call as returning an array with len(means) == 1 and means[0] indexing. Clarify whether predict([point]) should:

    • Always return arrays (adjust tests at 628–642 to index or .item()), or
    • Return scalars when the list has one element (adjust implementation to match).
      Align implementation, docstring, and tests once decided.
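To make API choice A concrete, here is a toy model of the proposed semantics. This is a hypothetical helper, not the library code: a zero-mean, unit-std GP prior stands in for the fitted model. Dict input yields scalars, while list input (even of length 1) yields 1-D arrays.

```python
import numpy as np


def predict_shapes(params, return_std=False, return_cov=False):
    """Toy sketch of API choice A: dicts -> scalars, lists -> 1-D arrays."""
    if return_std and return_cov:
        msg = "At most one of return_std and return_cov can be True."
        raise ValueError(msg)
    if isinstance(params, dict):
        n, single = 1, True
    elif isinstance(params, list):
        n, single = len(params), False
    else:
        msg = "params must be a dict or a list of dicts."
        raise TypeError(msg)
    mean = np.zeros(n)  # GP prior mean
    std = np.ones(n)    # GP prior standard deviation
    if single:
        mean, std = float(mean[0]), float(std[0])
    return (mean, std) if return_std or return_cov else mean


print(type(predict_shapes({"x": 0.0})))                        # scalar for a dict
print(predict_shapes([{"x": 0.0}], return_std=True)[0].shape)  # (1,) for a length-1 list
```

Under these semantics, the tests that call `predict([point])` would index the result (or call `.item()`) rather than compare it as a scalar.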
🤖 Prompt for AI Agents
In bayes_opt/bayesian_optimization.py around lines 178–245, the predict() method
lacks validation for params types (causing UnboundLocalError), has an incorrect
return type annotation, a wrong docstring default for return_std, doesn't
enforce mutual exclusivity of return_std and return_cov, and handles
single-element list shapes ambiguously; fix by (1) adding an else branch that
raises TypeError if params is neither dict nor list[dict], (2) updating the
function signature return annotation to allow either a scalar/NDArray or a tuple
of scalar/NDArray, (3) correcting the docstring to state return_std default is
False, (4) validating up-front that not both return_std and return_cov are True
and raising ValueError if they are, and (5) making shape semantics
deterministic: treat dict input as returning scalars and any list input (even
length 1) as returning arrays (adjust the existing shape-conversion code
accordingly so single-element lists remain 1D arrays and single dict inputs are
reduced to scalars), then update tests/docstrings to match this behavior.


def register(
self, params: ParamsType, target: float, constraint_value: float | NDArray[Float] | None = None
) -> None:
@@ -303,8 +372,8 @@ def maximize(self, init_points: int = 5, n_iter: int = 25) -> None:
probe based on the acquisition function. This means that the GP may
not be fitted on all points registered to the target space when the
method completes. If you intend to use the GP model after the
-            optimization routine, make sure to fit it manually, e.g. by calling
-            ``optimizer._gp.fit(optimizer.space.params, optimizer.space.target)``.
+            optimization routine, make sure to call predict() with fit_gp=True.

"""
# Log optimization start
self.logger.log_optimization_start(self._space.keys)