Add .predict function to optimizer
#593
base: master
Conversation
Codecov Report: ❌ patch coverage

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master     #593      +/-   ##
==========================================
- Coverage   97.89%   97.77%    -0.12%
==========================================
  Files          10       10
  Lines        1185     1212       +27
==========================================
+ Hits         1160     1185       +25
- Misses         25       27        +2
==========================================
```
@coderabbitai full review

✅ Actions performed: Full review triggered.
Walkthrough

A new `predict()` method is added to the optimizer, returning predictions for given parameters with optional standard deviation or covariance outputs and an optional GP fitting step; the `maximize()` documentation is updated to reference it.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant BO as BayesianOptimization
    participant GP as GaussianProcess
    User->>BO: predict(params, fit_gp=True)
    alt fit_gp is True
        BO->>BO: Check if observations exist
        alt No observations
            BO-->>User: RuntimeError
        else Observations exist
            BO->>GP: Fit GP via _fit_gp()
            GP-->>BO: GP fitted
        end
    end
    BO->>BO: Normalize params to 2D array
    BO->>GP: Predict mean (and std/cov if requested)
    GP-->>BO: Predictions returned
    BO->>BO: Denormalize output shape
    alt return_std or return_cov
        BO-->>User: (mean, uncertainty)
    else Neither requested
        BO-->>User: mean only
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/test_bayesian_optimization.py (2)
19-64: Custom triangle parameter and area helper: clarify intent and edge conditions
**Triangle sampling vs. `perimeter` and `bounds`.**
`FixedPerimeterTriangleParameter.random_sample` draws Dirichlet samples scaled by `self.perimeter` and only enforces `self.bounds`. For the current tests (`perimeter=1.0`, bounds `[0, 1]` for each side) this is fine, but for inconsistent combinations (e.g. `perimeter > 3 * upper_bound`) the `while` loop could run indefinitely because no samples satisfy the bounds. If this helper is ever reused with different parameters, consider either:

- Asserting a compatible relationship between `perimeter` and `bounds` at construction time, or
- Adding a max-tries safeguard with a clear error message (a hypothetical sketch follows below).
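A hypothetical sketch of the max-tries idea, written as a standalone helper; the `max_tries` argument and this exact signature are not part of the PR:

```python
import numpy as np


def sample_triangle_sides(perimeter, bounds, random_state=None, max_tries=1000):
    """Rejection-sample triangle sides summing to `perimeter` within `bounds` (illustrative only)."""
    rng = np.random.default_rng(random_state)
    for _ in range(max_tries):
        sides = rng.dirichlet(np.ones(3)) * perimeter  # sides always sum to the perimeter
        if np.all((sides >= bounds[:, 0]) & (sides <= bounds[:, 1])):
            return sides
    msg = f"No sample satisfying the bounds was found within {max_tries} tries."
    raise RuntimeError(msg)
```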
**Heron's formula vs. the current `area_of_triangle` implementation.**
Standard Heron's formula uses the semiperimeter `s = (a + b + c) / 2`, whereas here `s` is the full perimeter and the comment explicitly says `# perimeter`. This function is only used as a test objective, so correctness up to a constant factor may be acceptable, but the name `area_of_triangle` is misleading as-is. You might want to either:

- Implement the usual formula:

```diff
-s = np.sum(sides, axis=-1)  # perimeter
+s = 0.5 * np.sum(sides, axis=-1)  # semiperimeter
 return np.sqrt(s * (s - a) * (s - b) * (s - c))
```
- Or rename/comment this as a synthetic triangle-like objective to avoid confusion.
537-590: Custom-parameter save/load test: avoid debug prints and private GP calls
The `pbounds` debug prints (Lines 546–548) look like leftover debugging and add noise to test output. They can safely be removed without losing coverage.

Forcing a GP fit via `optimizer._gp.fit(...)` (Lines 557–558) reaches into a private attribute. With the new public `predict(..., fit_gp=True)` hook, you could instead trigger a GP fit in a more idiomatic way (e.g., call `optimizer.predict(...)` once, as sketched below) or rely on the existing logic that fits the GP inside `maximize()`. If the extra fit is really needed for this scenario, a brief comment explaining why would help future maintainers.
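A minimal sketch of the more idiomatic alternative, assuming the test already has a registered `optimizer` and some parameter dict `some_params` at hand (both names are placeholders):

```python
# Fit the GP through the public API instead of calling optimizer._gp.fit(...) directly.
_ = optimizer.predict(some_params, fit_gp=True)
```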
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- bayes_opt/bayesian_optimization.py (2 hunks)
- tests/test_bayesian_optimization.py (2 hunks)
🔇 Additional comments (1)
bayes_opt/bayesian_optimization.py (1)
369-376: maximize() warning update looks good

The updated warning that points users to `predict(..., fit_gp=True)` after `maximize()` is accurate given the new API and clarifies GP fitting expectations without changing behavior.
From bayes_opt/bayesian_optimization.py, the new `predict` method as diffed:

```python
def predict(
    self, params: dict[str, Any] | list[dict[str, Any]], return_std=False, return_cov=False, fit_gp=True
) -> tuple[float | NDArray[Float], float | NDArray[Float]]:
    """Predict the target function value at given parameters.

    Parameters
    ---------
    params: dict or list
        The parameters where the prediction is made.

    return_std: bool, optional(default=True)
        If True, the standard deviation of the prediction is returned.

    return_cov: bool, optional(default=False)
        If True, the covariance of the prediction is returned.

    fit_gp: bool, optional(default=True)
        If True, the internal Gaussian Process model is fitted before
        making the prediction.

    Returns
    -------
    mean: float or np.ndarray
        The predicted mean of the target function at the given parameters.

    std_or_cov: float or np.ndarray
        The predicted standard deviation or covariance of the target function
        at the given parameters.
    """
    if isinstance(params, list):
        # convert list of dicts to 2D array
        params_array = np.array([self._space.params_to_array(p) for p in params])
        single_param = False
    elif isinstance(params, dict):
        params_array = self._space.params_to_array(params).reshape(1, -1)
        single_param = True

    if fit_gp:
        if len(self._space) == 0:
            msg = (
                "The Gaussian Process model cannot be fitted with zero observations. To use predict(), "
                "without fitting the GP, set fit_gp=False. The predictions will then be made using the "
                "GP prior."
            )
            raise RuntimeError(msg)
        self.acquisition_function._fit_gp(self._gp, self._space)

    res = self._gp.predict(params_array, return_std=return_std, return_cov=return_cov)

    if return_std or return_cov:
        mean, std_or_cov = res
    else:
        mean = res

    if not single_param and mean.ndim == 0:
        mean = np.atleast_1d(mean)
    # ruff complains when nesting conditionals, so this three-way split is necessary
    if not single_param and (return_std or return_cov) and std_or_cov.ndim == 0:
        std_or_cov = np.atleast_1d(std_or_cov)

    if single_param and mean.ndim > 0:
        mean = mean[0]
    if single_param and (return_std or return_cov) and std_or_cov.ndim > 0:
        std_or_cov = std_or_cov[0]

    if return_std or return_cov:
        return mean, std_or_cov
    return mean
```
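For orientation, a short usage sketch of the method as diffed above; it assumes an optimizer with registered observations, and the parameter names follow this PR's tests while the values are illustrative:

```python
# Single dict in, scalar posterior mean out.
mean = optimizer.predict({"p1": 2.5, "p2": 3.0})

# List of dicts in, arrays out; return_std adds the posterior standard deviations.
means, stds = optimizer.predict(
    [{"p1": 2.5, "p2": 3.0}, {"p1": 7.0, "p2": 8.0}], return_std=True
)

# fit_gp=False skips refitting and predicts from the current (possibly prior) GP.
prior_mean = optimizer.predict({"p1": 2.5, "p2": 3.0}, fit_gp=False)
```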
🧩 Analysis chain
predict(): tighten input validation, clarify typing/docs, and guard std vs cov

A few points here:

- **Invalid `params` types lead to `UnboundLocalError`.**
  If `params` is neither `dict` nor `list`, `params_array`/`single_param` are never set and you'll get an unhelpful runtime error. Consider explicitly validating and raising a `TypeError`:

  ```diff
   if isinstance(params, list):
       # convert list of dicts to 2D array
       params_array = np.array([self._space.params_to_array(p) for p in params])
       single_param = False
   elif isinstance(params, dict):
       params_array = self._space.params_to_array(params).reshape(1, -1)
       single_param = True
  +else:
  +    msg = "params must be a dict or a list of dicts."
  +    raise TypeError(msg)
  ```

- **Return type annotation doesn't match behavior.**
  The function sometimes returns just `mean` (float / `NDArray`) and sometimes a `(mean, std_or_cov)` tuple. The annotation `-> tuple[...]` is misleading for callers and type checkers. Consider something like:

  ```python
  def predict(...) -> float | NDArray[Float] | tuple[float | NDArray[Float], float | NDArray[Float]]: ...
  ```

  or using `typing.overload` for the main common combinations.

- **Docstring default for `return_std` is wrong.**
  The signature uses `return_std=False`, but the docstring says `optional(default=True)`. That will confuse users. Please fix the docstring to `optional(default=False)`.

- **`return_std` and `return_cov` should be mutually exclusive.**
  `GaussianProcessRegressor.predict` only allows one of `return_std` or `return_cov` to be `True` at a time; otherwise it raises. It would be clearer to check upfront and raise a user-facing `ValueError`:

  ```diff
  +if return_std and return_cov:
  +    msg = "At most one of return_std and return_cov can be True."
  +    raise ValueError(msg)
   if fit_gp:
       if len(self._space) == 0:
           ...
  ```

- **Shape handling vs. list-of-one semantics.**
  For `params` given as a single dict you correctly return scalars (or scalar-like) when possible via `single_param=True`. For `params` as a list with a single element (e.g. `[{"p1": 0, "p2": 0}]`), `single_param` is `False` so `mean`/`std_or_cov` stay 1D arrays. Several new tests call `predict([point])` and then do scalar-style checks like `np.isclose(means, 0, ...)` or `stds < 0.02`, which will raise `ValueError` if `means`/`stds` are arrays. You probably want either:
  - API choice A (simpler): Always treat list inputs as "multi-point", even of length 1, and adjust the tests to index or call `.item()`, or
  - API choice B (more user-friendly): If `params` is a list of length 1, treat it like a single param (`single_param=True`) so `predict([point])` behaves like `predict(point)` in terms of scalar vs array output (a sketch follows below).

  Whichever you decide, aligning implementation, tests, and docstring explicitly around "single dict vs list, including length-1 lists" will avoid confusing shape bugs.

- **Minor: handling of `single_param` with `return_cov`.**
  When `return_cov=True` and `params` is a single dict, `std_or_cov` starts as a `(1, 1)` covariance matrix and becomes `(1,)` after `std_or_cov = std_or_cov[0]`. If you intend a scalar variance for a single point, consider also unwrapping the remaining dimension to a float for the single-parameter case, or documenting that you return a 1-element array for covariance.
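A minimal sketch of API choice B, purely illustrative and not part of the PR:

```python
# Normalize a length-1 list to the single-dict case before the existing branching,
# so predict([point]) and predict(point) return the same scalar-shaped output.
if isinstance(params, list) and len(params) == 1:
    params = params[0]
```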
I need to verify the concerns in the review comment against the actual codebase. Let me gather information:
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the predict() method definition in full to see if validation exists
rg -n -A 80 "def predict\(" bayes_opt/bayesian_optimization.py | head -120
```

Length of output: 3434

🏁 Script executed:

```shell
#!/bin/bash
# Check the full type annotation for the predict method
rg -n "def predict" bayes_opt/bayesian_optimization.py -A 5 --type python
```

Length of output: 30

🏁 Script executed:

```shell
#!/bin/bash
# Find test files that call predict() to see edge cases and assertions
fd -e "test_*.py" -o -e "*_test.py" | xargs rg -l "\.predict\(" 2>/dev/null | head -5
```

Length of output: 279

🏁 Script executed:

```shell
#!/bin/bash
# Search for any validation of return_std and return_cov together
rg -n "return_std.*return_cov|return_cov.*return_std" bayes_opt/bayesian_optimization.py
```

Length of output: 462

🏁 Script executed:

```shell
#!/bin/bash
# Look for the else clause handling invalid params types
rg -n -B 5 -A 5 "else:" bayes_opt/bayesian_optimization.py | grep -A 10 "params"
```

Length of output: 779

I need to check the test file and verify the GP behavior:

🏁 Script executed:

```shell
#!/bin/bash
# Find predict() calls in test file with single list elements or list inputs
rg -n "\.predict\(" tests/test_bayesian_optimization.py -B 2 -A 3 | head -80
```

Length of output: 3252

🏁 Script executed:

```shell
#!/bin/bash
# Look for assertions after predict() calls to see what output shape is expected
rg -n "\.predict\(\[" tests/test_bayesian_optimization.py -A 5 | head -60
```

Length of output: 1468

🏁 Script executed:

```shell
#!/bin/bash
# Check what happens when calling sklearn GP predict with both return_std and return_cov
grep -r "return_std.*return_cov\|return_cov.*return_std" . --include="*.py" | head -10
```

Length of output: 660

🏁 Script executed:

```shell
#!/bin/bash
# Search for any documentation or error handling about mutual exclusivity in GP wrapper
rg -n "return_std\|return_cov" bayes_opt/ --type py -B 2 -A 2 | grep -E "(ValueError|TypeError|raise|either|mutual|exclusive)" -C 3
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Check if GaussianProcessRegressor docs mention std/cov constraints
rg -n "class.*GaussianProcessRegressor" bayes_opt/ --type py -A 30
```

Length of output: 0
I need to verify the mutual exclusivity behavior with sklearn's GaussianProcessRegressor:
🌐 Web query:
scikit-learn GaussianProcessRegressor predict method return_std return_cov mutual exclusive
💡 Result:
Yes — they are mutually exclusive. The docs state the predict method may return the mean and optionally either the standard deviation (return_std=True) or the joint covariance (return_cov=True), and "at most one of the two can be requested." [1][2]
Sources:
- scikit-learn GaussianProcessRegressor.predict (stable docs). [1]
- scikit-learn GaussianProcessRegressor.predict (v1.3 docs). [2]
predict(): fix input validation, return type annotation, docstring, and std/cov mutual exclusivity; resolve shape handling ambiguity

Several issues require fixes:

- **Invalid `params` types lead to `UnboundLocalError`.**
  If `params` is neither `dict` nor `list`, `params_array` and `single_param` are never set. Add an `else` clause:

  ```python
  else:
      msg = "params must be a dict or a list of dicts."
      raise TypeError(msg)
  ```

- **Return type annotation is incorrect.**
  Line 180 declares `-> tuple[...]` but line 245 returns just `mean` (non-tuple) when `return_std` and `return_cov` are both `False`. Fix to:

  ```python
  def predict(...) -> float | NDArray[Float] | tuple[float | NDArray[Float], float | NDArray[Float]]:
  ```

- **Docstring default for `return_std` contradicts signature.**
  Line 188 says `optional(default=True)` but line 179 has `return_std=False`. Fix docstring to `optional(default=False)`.

- **`return_std` and `return_cov` are mutually exclusive but not validated.**
  sklearn's GaussianProcessRegressor allows at most one of return_std or return_cov to be requested. Add upfront validation:

  ```python
  if return_std and return_cov:
      msg = "At most one of return_std and return_cov can be True."
      raise ValueError(msg)
  ```

- **Shape handling for single-element lists is ambiguous.**
  Tests at lines 628–642 call `predict([single_point])` with scalar assertions like `np.isclose(means, 0)` and `stds < 0.02`, which fail if the output is an array. Yet lines 664–667 treat the same call as returning an array with `len(means) == 1` and `means[0]` indexing. Clarify whether `predict([point])` should:
  - Always return arrays (adjust tests at 628–642 to index or `.item()`), or
  - Return scalars when the list has one element (adjust implementation to match).

  Align implementation, docstring, and tests once decided. A consolidated sketch of the suggested validation appears below, after the agent prompt.
🤖 Prompt for AI Agents
In bayes_opt/bayesian_optimization.py around lines 178–245, the predict() method
lacks validation for params types (causing UnboundLocalError), has an incorrect
return type annotation, a wrong docstring default for return_std, doesn't
enforce mutual exclusivity of return_std and return_cov, and handles
single-element list shapes ambiguously; fix by (1) adding an else branch that
raises TypeError if params is neither dict nor list[dict], (2) updating the
function signature return annotation to allow either a scalar/NDArray or a tuple
of scalar/NDArray, (3) correcting the docstring to state return_std default is
False, (4) validating up-front that not both return_std and return_cov are True
and raising ValueError if they are, and (5) making shape semantics
deterministic: treat dict input as returning scalars and any list input (even
length 1) as returning arrays (adjust the existing shape-conversion code
accordingly so single-element lists remain 1D arrays and single dict inputs are
reduced to scalars), then update tests/docstrings to match this behavior.
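A consolidated sketch of the validation suggested in the review above, combining the `TypeError` and `ValueError` checks with the method's existing branching; it is a method-body fragment for illustration, not the committed implementation:

```python
    # Illustrative prologue for predict(); assumes the surrounding method from the diff.
    if return_std and return_cov:
        msg = "At most one of return_std and return_cov can be True."
        raise ValueError(msg)

    if isinstance(params, list):
        # convert list of dicts to 2D array
        params_array = np.array([self._space.params_to_array(p) for p in params])
        single_param = False
    elif isinstance(params, dict):
        params_array = self._space.params_to_array(params).reshape(1, -1)
        single_param = True
    else:
        msg = "params must be a dict or a list of dicts."
        raise TypeError(msg)
```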
From tests/test_bayesian_optimization.py, the new tests as diffed:

```python
def test_predict():
    """Test the predict method of the optimizer."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)

    # Register some points
    optimizer.register(params={"p1": 1, "p2": 2}, target=3)
    optimizer.register(params={"p1": 4, "p2": 5}, target=9)
    optimizer.register(params={"p1": 7, "p2": 8}, target=15)

    # Points to predict
    test_points = [{"p1": 2, "p2": 3}, {"p1": 5, "p2": 6}, {"p1": 8, "p2": 9}]

    # Perform predictions
    means = optimizer.predict(test_points)

    # Check that means have correct length
    assert len(means) == len(test_points)

    # Also test with return_std=True to get std
    means, stds = optimizer.predict(test_points, return_std=True)
    assert len(means) == len(test_points)
    assert len(stds) == len(test_points)


def test_predict_example():
    """Test the predict method with known outputs."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)

    # Register some points
    optimizer.register(params={"p1": 0, "p2": 0}, target=0)
    optimizer.register(params={"p1": 10, "p2": 10}, target=20)

    # Point to predict
    test_point = {"p1": 0, "p2": 0}

    # Perform prediction
    means = optimizer.predict([test_point])
    assert np.isclose(means, 0, atol=1e-3)

    # Test with return_std=True
    means, stds = optimizer.predict([test_point], return_std=True)
    assert np.isclose(means, 0, atol=1e-3)
    assert stds < 0.02  # std should be small but not tiny due to GP uncertainty

    test_point = {"p1": 10, "p2": 10}
    means = optimizer.predict([test_point])
    assert np.isclose(means, 20, atol=1e-3)

    means, stds = optimizer.predict([test_point], return_std=True)
    assert np.isclose(means, 20, atol=1e-3)
    assert stds < 0.02


def test_predict_no_fit():
    """Test the predict method when GP has not been fitted."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)

    # Perform prediction with fit_gp=True should raise error when no data
    with pytest.raises(RuntimeError):
        optimizer.predict({"p1": 5, "p2": 5}, fit_gp=True)

    # Predict without fitting GP using single dict - get scalar mean by default
    mean = optimizer.predict({"p1": 5, "p2": 5}, fit_gp=False)
    # Since GP is not fitted, mean should be close to 0
    assert np.isclose(mean, 0, atol=1e-4)

    # Get std when not fitting GP
    mean, std = optimizer.predict({"p1": 5, "p2": 5}, fit_gp=False, return_std=True)
    # Since GP is not fitted, std should be large
    assert std > 1e-2

    # Test with list - returns array
    means = optimizer.predict([{"p1": 5, "p2": 5}], fit_gp=False)
    # With a list, even single point returns array
    assert len(means) == 1
    assert np.isclose(means[0], 0, atol=1e-4)


def test_predict_return_cov():
    """Test the predict method with return_cov=True."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)

    optimizer.register(params={"p1": 1, "p2": 2}, target=3)
    optimizer.register(params={"p1": 4, "p2": 5}, target=9)

    test_points = [{"p1": 2, "p2": 3}, {"p1": 5, "p2": 6}]

    means, cov = optimizer.predict(test_points, return_cov=True)

    assert len(means) == len(test_points)
    assert cov.shape == (len(test_points), len(test_points))


def test_predict_integer_params():
    """Test the predict method with integer parameters."""
    int_pbounds = {"p1": (0, 10, int), "p2": (0, 10, int)}
    optimizer = BayesianOptimization(f=target_func, pbounds=int_pbounds, random_state=1, verbose=0)
    optimizer.register(params={"p1": 1, "p2": 2}, target=3)
    optimizer.register(params={"p1": 4, "p2": 5}, target=9)
    test_points = [{"p1": 2, "p2": 3}, {"p1": 5, "p2": 6}]
    means = optimizer.predict(test_points)
    assert len(means) == len(test_points)

    # Test with return_std
    means, stds = optimizer.predict(test_points, return_std=True)
    assert len(means) == len(test_points)
    assert len(stds) == len(test_points)

    float_points = [{"p1": 2.7, "p2": 3.3}, {"p1": 5.9, "p2": 6.1}]
    means_float = optimizer.predict(float_points)
    assert len(means_float) == len(float_points)

    means_float, stds_float = optimizer.predict(float_points, return_std=True)
    assert len(means_float) == len(float_points)
    assert len(stds_float) == len(float_points)

    # Check that rounding float inputs gives similar predictions as integer inputs
    for i in range(len(test_points)):
        rounded_point = {k: round(v) for k, v in float_points[i].items()}
        mean_rounded = optimizer.predict([rounded_point])
        assert np.isclose(means_float[i], mean_rounded, atol=1e-1)

    # Also check with std
    for i in range(len(test_points)):
        rounded_point = {k: round(v) for k, v in float_points[i].items()}
        mean_rounded, std_rounded = optimizer.predict([rounded_point], return_std=True)
        assert np.isclose(means_float[i], mean_rounded, atol=1e-1)
        assert np.isclose(stds_float[i], std_rounded, atol=1e-1)


def test_predict_categorical_params():
    """Test the predict method with categorical parameters."""

    def cat_target_func(param1: str, param2: str) -> float:
        value_map = {"low": 1.0, "medium": 2.0, "high": 3.0}
        return value_map[param1] + value_map[param2]

    cat_pbounds = {"param1": ["low", "medium", "high"], "param2": ["low", "medium", "high"]}

    optimizer = BayesianOptimization(f=cat_target_func, pbounds=cat_pbounds, random_state=1, verbose=0)

    optimizer.register(params={"param1": "low", "param2": "low"}, target=2.0)
    optimizer.register(params={"param1": "high", "param2": "high"}, target=6.0)

    test_points = [{"param1": "medium", "param2": "medium"}, {"param1": "low", "param2": "high"}]

    means = optimizer.predict(test_points)

    assert len(means) == len(test_points)
    assert np.isclose(means[0], 4.0, atol=1.0)
    assert np.isclose(means[1], 4.0, atol=1.0)

    # Test with return_std
    means, stds = optimizer.predict(test_points, return_std=True)
    assert len(means) == len(test_points)
    assert len(stds) == len(test_points)
    assert stds[0] > 0.0
    assert stds[1] > 0.0


def test_predict_no_points_registered():
    """Test the predict method when no points have been registered."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)

    test_points = [{"p1": 2, "p2": 3}, {"p1": 5, "p2": 6}]

    means = optimizer.predict(test_points, fit_gp=False)

    assert len(means) == len(test_points)
    for mean in means:
        assert np.isclose(mean, 0, atol=1e-4)

    # Test with return_std to get uncertainty
    means, stds = optimizer.predict(test_points, fit_gp=False, return_std=True)
    assert len(means) == len(test_points)
    assert len(stds) == len(test_points)
    for std in stds:
        assert std > 1e-2


def test_predict_custom_parameter():
    """Test the predict method with a custom parameter type."""

    param = FixedPerimeterTriangleParameter(
        name="sides", bounds=np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]), perimeter=1.0
    )
    pbounds = {"sides": param}

    optimizer = BayesianOptimization(f=area_of_triangle, pbounds=pbounds, random_state=1, verbose=0)

    optimizer.register(
        params={"sides": np.array([0.3, 0.4, 0.3])}, target=area_of_triangle(np.array([0.3, 0.4, 0.3]))
    )
    optimizer.register(
        params={"sides": np.array([0.2, 0.5, 0.3])}, target=area_of_triangle(np.array([0.2, 0.5, 0.3]))
    )

    test_points = [{"sides": np.array([0.25, 0.5, 0.25])}, {"sides": np.array([0.4, 0.4, 0.2])}]

    means = optimizer.predict(test_points)

    assert len(means) == len(test_points)
    for i, point in enumerate(test_points):
        expected_area = area_of_triangle(point["sides"])
        assert np.isclose(means[i], expected_area, atol=0.1)

    # Test with return_std
    means, stds = optimizer.predict(test_points, return_std=True)
    assert len(means) == len(test_points)
    assert len(stds) == len(test_points)
    for i in range(len(test_points)):
        assert stds[i] > 0.0
```
🧩 Analysis chain
Prediction tests largely cover the new API, but shape assumptions need to match implementation

Overall these tests do a good job exercising:

- Prior vs posterior behavior (`fit_gp=True/False`, zero vs nonzero observations).
- `return_std` and `return_cov` paths.
- Integer, categorical, and custom parameter handling.
- Behavior when no points are registered.

A few focused issues/suggestions:

- **Scalar vs array ambiguity when calling `predict([point])`.**
  Several tests treat `predict([point])` as returning scalars:
  - `test_predict_example`: `means = optimizer.predict([test_point])` followed by `assert np.isclose(means, 0, ...)`, and later `means, stds = optimizer.predict([test_point], return_std=True)` with `assert stds < 0.02`.
  - `test_predict_integer_params`: inside loops, `mean_rounded = optimizer.predict([rounded_point])` and `mean_rounded, std_rounded = optimizer.predict([rounded_point], return_std=True)` are both compared via `np.isclose` to scalar values.

  With the current implementation of `predict`, a list input (even of length 1) yields a 1D array for `mean`/`std`, so these assertions will hit the "ambiguous truth value of array" error or silently assume array-scalar behavior that doesn't match the docstring. You should either:
  - Update tests to treat `predict([point])` as returning arrays (`means[0]`, `stds[0]`, or `.item()`), or
  - Adjust `predict` so that a list of length 1 is treated as a single point (see earlier comment on `single_param`) and document that behavior.

- **Mutual exclusivity of `return_std` and `return_cov`.**
  There are no tests that exercise the invalid case where both flags are `True`. Once `predict` enforces this (see previous file comment), consider adding a small negative test, e.g.:

  ```python
  with pytest.raises(ValueError):
      optimizer.predict(test_points, return_std=True, return_cov=True)
  ```

  to lock in the contract.

- **Integer, categorical, and custom parameter tests look solid but are a bit brittle.**
  - `test_predict_example` and several others assert that posterior means at training points are very close to the observed targets and that std is `< 0.02`. This is reasonable with the current `GaussianProcessRegressor` defaults (Matern kernel, `alpha=1e-6`, `normalize_y=True`), but could become brittle if GP hyperparameters or scikit-learn defaults change. Loosening tolerances slightly or asserting monotonic/relative behavior instead of exact thresholds would make the suite more robust over time.
  - `test_predict_custom_parameter` uses absolute closeness `atol=0.1` between the GP mean and `area_of_triangle` with only two training points. That's probably okay, but again somewhat tied to current kernel behavior; a brief comment about this being a smoke test (not strict accuracy validation) would clarify intent.

- **No test for `fit_gp=False` with existing observations.**
  You test `fit_gp=True` with data, and `fit_gp=False` with no data (prior-only). It may be worth adding a small test where you register a few points, call `predict(..., fit_gp=True)` once to fit, and then call `predict(..., fit_gp=False)` again to ensure that using the "already-fitted GP without re-fitting" path behaves as expected. This would exercise the "don't touch GP state" branch explicitly; a sketch of such a test follows below.
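A minimal sketch of such a test, reusing the module's existing `target_func`/`PBOUNDS` fixtures; the test name and expected near-equality are illustrative, not part of the PR:

```python
def test_predict_reuses_fitted_gp():
    """Sketch: fit once via fit_gp=True, then predict again without refitting."""
    optimizer = BayesianOptimization(f=target_func, pbounds=PBOUNDS, random_state=1, verbose=0)
    optimizer.register(params={"p1": 1, "p2": 2}, target=3)
    optimizer.register(params={"p1": 4, "p2": 5}, target=9)

    point = {"p1": 2, "p2": 3}
    fitted = optimizer.predict(point, fit_gp=True)    # fits the GP once
    reused = optimizer.predict(point, fit_gp=False)   # reuses the already-fitted GP
    assert np.isclose(fitted, reused)
```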
Overall, once the shape expectations for predict([point]) are aligned with the implementation, this test suite will provide good coverage for the new .predict API and its interaction with non-float parameters.
I'll verify the review comment claims by examining the actual implementation of the predict method.
🏁 Script executed:

```shell
#!/bin/bash
# Find the predict method implementation
fd -t f -e py | xargs rg -l "def predict" | head -20
```

Length of output: 84

🏁 Script executed:

```shell
#!/bin/bash
# Search for predict method with context to understand implementation
rg -n -A 50 "def predict\(" --type py | head -150
```

Length of output: 10940

🏁 Script executed:

```shell
#!/bin/bash
# Look for the BayesianOptimization class and predict method
ast-grep --pattern $'class BayesianOptimization {
  $$$
  predict($_, $_) {
    $$$
  }
  $$$
}'
```

Length of output: 0

I need to see the complete predict method implementation to verify the return value handling.

🏁 Script executed:

```shell
#!/bin/bash
# Get complete predict method implementation
rg -n -A 100 "def predict\(" bayes_opt/bayesian_optimization.py | head -200
```

Length of output: 4585

🏁 Script executed:

```shell
#!/bin/bash
# Check if there's validation for return_std and return_cov together
rg -n "return_std.*return_cov\|return_cov.*return_std" bayes_opt/bayesian_optimization.py
```

Length of output: 0

Now let me examine the actual test code to confirm the specific assertions mentioned in the review:

🏁 Script executed:

```shell
#!/bin/bash
# Check test_predict_example assertions - lines around 615-640
sed -n '615,640p' tests/test_bayesian_optimization.py
```

Length of output: 921

🏁 Script executed:

```shell
#!/bin/bash
# Check test_predict_integer_params - lines around 690-720
sed -n '690,720p' tests/test_bayesian_optimization.py
```

Length of output: 1406

🏁 Script executed:

```shell
#!/bin/bash
# Check if tests actually pass or have issues with array comparisons
rg -n "predict\(\[" tests/test_bayesian_optimization.py | head -20
```

Length of output: 461

🏁 Script executed:

```shell
#!/bin/bash
# Look for fit_gp=False patterns in tests to see if that case is covered
rg -n "fit_gp=False" tests/test_bayesian_optimization.py
```

Length of output: 370
**Fix array/scalar mismatches in predict() assertions: list inputs return arrays, not scalars.**

Several test assertions incorrectly treat `predict([point])` output as scalars when the implementation returns arrays for list inputs:

- `test_predict_example` (lines 628, 632, 637, 640): `means = optimizer.predict([test_point])` followed by `assert np.isclose(means, 0, ...)` and `assert stds < 0.02` will fail or produce misleading results since `predict()` returns 1D arrays for list inputs.
- `test_predict_integer_params` (lines 711, 717): inside the loop, `mean_rounded = optimizer.predict([rounded_point])` and subsequent assertions comparing against scalars have the same issue.

Update these to either extract the scalar (e.g., `means[0]` or `.item()`) or wrap the point in a dict instead: `optimizer.predict(rounded_point)` returns a scalar directly.

Additionally, add a test for the case where `return_std=True` and `return_cov=True` are both passed to verify mutual exclusivity is enforced once the implementation validates this. Include a test where points are registered, `predict(..., fit_gp=True)` is called to fit the model, and then `predict(..., fit_gp=False)` is called to exercise the "use already-fitted GP" path without re-fitting. A sketch of these adjustments follows below.
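A sketch of the suggested adjustments, assuming the test module's existing imports (`numpy as np`, `pytest`) and an `optimizer`/`test_point` already in scope; the mutual-exclusivity check does not exist in the PR yet, so that part is aspirational:

```python
# Index into the length-1 array returned for list inputs...
means = optimizer.predict([test_point])
assert np.isclose(means[0], 0, atol=1e-3)

# ...or pass a plain dict to get a scalar directly.
mean = optimizer.predict(test_point)
assert np.isclose(mean, 0, atol=1e-3)

# Negative test for the (to-be-added) mutual-exclusivity validation.
with pytest.raises(ValueError):
    optimizer.predict([test_point], return_std=True, return_cov=True)
```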
🤖 Prompt for AI Agents
In tests/test_bayesian_optimization.py around lines 592 to 803, several
assertions treat the output of optimizer.predict([point]) as scalars but the
implementation returns 1D arrays for list inputs; change those assertions to
index the first element (e.g., means[0] or stds[0] or .item()) or call predict
with a single dict (predict(point)) when a scalar is expected (update
test_predict_example lines ~628/632/637/640 and test_predict_integer_params
loops around ~711/717 accordingly). Also add a small test verifying that passing
both return_std=True and return_cov=True raises or is rejected (mutual
exclusivity), and add a test that registers points then calls predict(...,
fit_gp=True) followed by predict(..., fit_gp=False) to exercise the "use
already-fitted GP" path. Ensure all assertions compare scalars to scalars and
array outputs are asserted on length/shape before indexing.
This introduces a user-facing .predict() method to the optimizer to simplify making predictions with the GP. I think that with the support for optimization over non-float parameters, calling `.predict()` on the GP now requires a bit too much parameter transformation and mangling. Rather than forcing users to handle this complexity themselves, this change provides an interface that handles all necessary conversions internally.

Summary by CodeRabbit

- New Features
  - `predict()` method to generate predictions for specified parameters with optional standard deviation or covariance outputs.
- Documentation
  - `maximize()` documentation updated to reference the new prediction capability.
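For readers of this PR, a quick usage sketch of the proposed interface, assuming the method lands as diffed above; the objective, bounds, and registered points here are made up for illustration:

```python
from bayes_opt import BayesianOptimization

# Toy objective and bounds, purely illustrative.
optimizer = BayesianOptimization(
    f=lambda p1, p2: -(p1 - 2) ** 2 - (p2 - 3) ** 2,
    pbounds={"p1": (0, 10), "p2": (0, 10)},
    random_state=1,
    verbose=0,
)
optimizer.register(params={"p1": 2, "p2": 3}, target=0)
optimizer.register(params={"p1": 8, "p2": 1}, target=-40)

# Single dict: scalar mean (and std); list of dicts: arrays.
mean, std = optimizer.predict({"p1": 3, "p2": 3}, return_std=True)
means = optimizer.predict([{"p1": 1, "p2": 2}, {"p1": 5, "p2": 5}])
```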