Conversation
Pull request overview
This PR introduces a sketch of an amortized Bayesian Optimal Experimental Design (BOED) generator that uses a pre-trained neural network for experimental design decisions. The implementation extends the xopt Generator class and loads a TorchScript model to generate candidate experimental points based on observed data. The PR also adds .DS_Store to .gitignore.
- Adds new `AmortizedBOEDGenerator` class that integrates pre-trained neural networks for BOED (a sketch of the class shape is shown below)
- Includes example/test code in a `__main__` block demonstrating usage with a plateau function
- Updates `.gitignore` to exclude macOS system files
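For orientation, here is a minimal sketch of what such a generator might look like. It assumes xopt's pydantic-based `Generator` base class; the `model_path`/`device` fields and the TorchScript loading are taken from the diff below, while the rest (the `name` attribute, defaults, the placeholder `generate` body) is illustrative:

```python
from typing import Any, Optional

import torch
from xopt.generator import Generator


class AmortizedBOEDGenerator(Generator):
    """Propose experimental designs with a pre-trained (amortized) BOED network."""

    name = "amortized_boed"
    model_path: str              # path to a TorchScript model on disk
    device: str = "cpu"          # where to run inference
    model: Optional[Any] = None  # populated after pydantic initialization

    def model_post_init(self, __context):
        """Load the TorchScript model once pydantic initialization is complete."""
        self.model = torch.jit.load(self.model_path, map_location=self.device)
        self.model.eval()

    def generate(self, n_candidates: int = 1) -> list[dict]:
        """Return a list of {variable_name: value} dicts proposed by the network."""
        raise NotImplementedError  # see the review comments below
```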
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| bax_algorithms/amortized_boed.py | New generator class implementing amortized BOED with neural network inference and example usage code |
| .gitignore | Adds .DS_Store to ignore macOS system files |
```python
self.model = torch.jit.load(self.model_path, map_location=self.device)
self.model.eval()

def generate(self, n_candidates: int = 1) -> list[dict]:
```
The generate method lacks a docstring. This is a critical public method that should be documented with its parameters, return values, and behavior, especially since the return format is non-trivial (as indicated by the comment on line 38).
Suggested change:

```python
def generate(self, n_candidates: int = 1) -> list[dict]:
    """
    Generate candidate experimental designs using the pre-trained neural network model.

    Parameters
    ----------
    n_candidates : int, optional
        Number of candidate designs to generate (default is 1).

    Returns
    -------
    list of dict
        A list of dictionaries, each containing variable names as keys and the
        generated candidate values. Example: [{'x': value1}, {'x': value2}, ...]

    Notes
    -----
    - Currently, only a single candidate is produced per call to the model. To
      generate multiple candidates, the method calls the model multiple times or
      may require modification of the model.
    - The output dictionary structure is based on the VOCS variable names.
    """
```
bax_algorithms/amortized_boed.py (Outdated)

```python
def model_post_init(self, __context):
    """Called after Pydantic initialization to load the model."""
    super().__init__()
```
The super().__init__() call in model_post_init is incorrect. Pydantic's model_post_init is a hook that runs after the Pydantic model initialization is complete, so calling super().__init__() here would re-initialize the parent class after Pydantic has already set up the instance. This line should be removed. The correct pattern is to only perform post-initialization logic here without calling super().__init__().
Suggested change:

```diff
-    super().__init__()
```
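With that line removed, the hook reduces to the loading logic alone. A sketch, reusing the `model_path`/`device` fields from the diff above:

```python
def model_post_init(self, __context):
    """Called after Pydantic initialization to load the model."""
    # No super().__init__() here: by the time this hook runs, pydantic has
    # already constructed the instance. Only do post-initialization work.
    self.model = torch.jit.load(self.model_path, map_location=self.device)
    self.model.eval()
```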
bax_algorithms/amortized_boed.py (Outdated)

```python
# Not sure what the correct dictionary structure is here?
# return [{'records': c} for c in candidates]
return [
    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
```
The loop iterates over range(n_candidates) but always accesses candidates[i].item(), which assumes candidates has at least n_candidates elements. However, based on the TODO comment on line 36, the model only produces a single candidate. This will cause an IndexError when n_candidates > 1. The logic needs to either check the size of candidates or only iterate over the actual number of candidates returned by the model.
Suggested change:

```diff
-    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
+    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(len(candidates))
```
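Alternatively, if the caller's `n_candidates` should be honored rather than silently truncated, the model could be queried once per requested candidate. A sketch under the assumption that the network is invoked on the observed data (`observed_data` is a hypothetical stand-in for however the model is actually called):

```python
candidates = []
with torch.no_grad():
    for _ in range(n_candidates):
        # hypothetical call; the real model may condition on observed data
        candidates.append(self.model(observed_data).squeeze())
return [{self.vocs.variable_names[0]: c.item()} for c in candidates]
```

Note that if the network is deterministic this would return `n_candidates` identical points, which is consistent with the comment's observation that the model itself may need modification.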
bax_algorithms/amortized_boed.py (Outdated)

```python
# Not sure what the correct dictionary structure is here?
# return [{'records': c} for c in candidates]
return [
    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
```
The return value assumes all variables can be accessed via self.vocs.variable_names[0], which only works for single-variable problems. For multi-variable optimization problems, this will only populate one variable and ignore the rest. The code should map all variables from the candidates, not just the first one.
Suggested change:

```diff
-    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
+    {var: candidates[i][j].item() for j, var in enumerate(self.vocs.variable_names)} for i in range(n_candidates)
```
bax_algorithms/amortized_boed.py (Outdated)

```python
if __name__ == "__main__":
    import matplotlib.pyplot as plt
    import torch
```
The torch import on line 47 is redundant since torch is already imported at the top of the file (line 1). This duplicate import should be removed.
Suggested change:

```diff
-    import torch
```
```python
init_point = X.vocs.grid_inputs(1)
# Replace with a fixed initial point for reproducibility
init_point["x"][0] = 100
X.evaluate_data(init_point)
```
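For context, after seeding the data the example presumably alternates generation and evaluation. A sketch of such a driver loop, assuming `X` is an `Xopt` object as in the excerpt (the step count is illustrative):

```python
# Illustrative only: Xopt.step() performs one generate/evaluate cycle.
for _ in range(10):
    X.step()
print(X.data)  # accumulated evaluation history
```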
You can also call `X.grid_evaluate()` to do the same thing if you want.
That is what I did originally; this is just to match how the model was trained. For the time being I've fixed the first measurement to be at T=100. That's something that is easy to change so that the model can handle random first measurements.