
Sketch of generator. #4

Open
sambklein wants to merge 5 commits into xopt-org:main from sambklein:boed

Conversation

@sambklein

No description provided.

Copilot AI review requested due to automatic review settings December 4, 2025 21:49

Copilot AI left a comment


Pull request overview

This PR introduces a sketch of an amortized Bayesian Optimal Experimental Design (BOED) generator that uses a pre-trained neural network for experimental design decisions. The implementation extends the xopt Generator class and loads a TorchScript model to generate candidate experimental points based on observed data. The PR also adds .DS_Store to .gitignore. A minimal sketch of the class shape this describes follows the change list below.

  • Adds new AmortizedBOEDGenerator class that integrates pre-trained neural networks for BOED
  • Includes example/test code in __main__ block demonstrating usage with a plateau function
  • Updates .gitignore to exclude macOS system files
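
For orientation, here is a minimal sketch of the generator shape described above. It assumes xopt's Generator base class is importable from xopt.generator and that the policy network consumes the observed variables and objectives; the field names (model_path, device) and the column selection are illustrative, not necessarily the PR's exact code.

import torch
from xopt.generator import Generator


class AmortizedBOEDGenerator(Generator):
    """Propose experimental designs with a pre-trained TorchScript policy network."""

    model_path: str      # path to the TorchScript model (field name assumed)
    device: str = "cpu"  # where to run inference

    def model_post_init(self, __context):
        """Load the scripted policy once Pydantic has finished building the instance."""
        self.model = torch.jit.load(self.model_path, map_location=self.device)
        self.model.eval()

    def generate(self, n_candidates: int = 1) -> list[dict]:
        """Map the observed data through the policy and return VOCS-keyed candidates."""
        # The columns and ordering the policy expects are assumptions.
        history = torch.tensor(
            self.data[self.vocs.variable_names + self.vocs.objective_names].to_numpy(),
            dtype=torch.float32,
        )
        with torch.no_grad():
            candidates = self.model(history)
        # One dictionary per returned candidate, keyed by the VOCS variable names.
        return [
            {var: candidates[i][j].item() for j, var in enumerate(self.vocs.variable_names)}
            for i in range(len(candidates))
        ]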

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 6 comments.

Files reviewed:
  • bax_algorithms/amortized_boed.py: new generator class implementing amortized BOED with neural network inference and example usage code
  • .gitignore: adds .DS_Store to ignore macOS system files


    self.model = torch.jit.load(self.model_path, map_location=self.device)
    self.model.eval()

def generate(self, n_candidates: int = 1) -> list[dict]:

Copilot AI Dec 4, 2025


The generate method lacks a docstring. This is a critical public method that should be documented with its parameters, return values, and behavior, especially since the return format is non-trivial (as indicated by the comment on line 38).

Suggested change
def generate(self, n_candidates: int = 1) -> list[dict]:
    """
    Generate candidate experimental designs using the pre-trained neural network model.

    Parameters
    ----------
    n_candidates : int, optional
        Number of candidate designs to generate (default is 1).

    Returns
    -------
    list of dict
        A list of dictionaries, each containing variable names as keys and the generated candidate values.
        Example: [{'x': value1}, {'x': value2}, ...]

    Notes
    -----
    - Currently, only a single candidate is produced per call to the model. To generate multiple candidates,
      the method calls the model multiple times or may require modification of the model.
    - The output dictionary structure is based on the VOCS variable names.
    """


def model_post_init(self, __context):
    """Called after Pydantic initialization to load the model."""
    super().__init__()

Copilot AI Dec 4, 2025


The super().__init__() call in model_post_init is incorrect. Pydantic's model_post_init is a hook that runs after the Pydantic model initialization is complete, so calling super().__init__() here would re-initialize the parent class after Pydantic has already set up the instance. This line should be removed. The correct pattern is to only perform post-initialization logic here without calling super().__init__().

Suggested change (delete this line)
super().__init__()
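
Combined with the model-loading lines shown earlier in the diff, the corrected hook would look roughly like this (a sketch):

def model_post_init(self, __context):
    """Called after Pydantic initialization to load the model."""
    # No super().__init__() here: Pydantic has already built the instance.
    self.model = torch.jit.load(self.model_path, map_location=self.device)
    self.model.eval()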

# Not sure what the correct dictionary structure is here?
# return [{'records': c} for c in candidates]
return [
    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)

Copilot AI Dec 4, 2025


The loop iterates over range(n_candidates) but always accesses candidates[i].item(), which assumes candidates has at least n_candidates elements. However, based on the TODO comment on line 36, the model only produces a single candidate. This will cause an IndexError when n_candidates > 1. The logic needs to either check the size of candidates or only iterate over the actual number of candidates returned by the model.

Suggested change
{self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
{self.vocs.variable_names[0]: candidates[i].item()} for i in range(len(candidates))
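
A more defensive variant (a sketch; the error message wording is illustrative) would validate the request before indexing:

if n_candidates > len(candidates):
    raise ValueError(
        f"model returned {len(candidates)} candidate(s) but {n_candidates} were requested"
    )
return [
    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
]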

# Not sure what the correct dictionary structure is here?
# return [{'records': c} for c in candidates]
return [
    {self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)

Copilot AI Dec 4, 2025


The return value assumes all variables can be accessed via self.vocs.variable_names[0], which only works for single-variable problems. For multi-variable optimization problems, this will only populate one variable and ignore the rest. The code should map all variables from the candidates, not just the first one.

Suggested change
{self.vocs.variable_names[0]: candidates[i].item()} for i in range(n_candidates)
{var: candidates[i][j].item() for j, var in enumerate(self.vocs.variable_names)} for i in range(n_candidates)
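
For illustration, here is the suggested comprehension run on stand-alone data (variable names and values are made up; a plain list stands in for self.vocs.variable_names):

import torch

variable_names = ["x1", "x2"]            # stand-in for self.vocs.variable_names
candidates = torch.tensor([[1.0, 2.0],   # stand-in for the model output,
                           [3.0, 4.0]])  # shape (n_candidates, n_variables)

result = [
    {var: candidates[i][j].item() for j, var in enumerate(variable_names)}
    for i in range(len(candidates))
]
print(result)  # [{'x1': 1.0, 'x2': 2.0}, {'x1': 3.0, 'x2': 4.0}]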


if __name__ == "__main__":
    import matplotlib.pyplot as plt
    import torch

Copilot AI Dec 4, 2025


The torch import on line 47 is redundant since torch is already imported at the top of the file (line 1). This duplicate import should be removed.

Suggested change (delete this line)
import torch

init_point = X.vocs.grid_inputs(1)
# Replace with a fixed initial point for reproducibility
init_point["x"][0] = 100
X.evaluate_data(init_point)
Contributor


you can also call X.grid_evaluate() to do the same thing if you want

Author


That is what I did originally; this is just to match how the model was trained. For the time being I've fixed the first measurement to be at T=100. That's something that is easy to change so that the model can handle random first measurements.
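
For reference, a short sketch contrasting the fixed start used here with a random first measurement (this assumes xopt's VOCS exposes random_inputs; the grid_evaluate shorthand from the comment above is left out because its signature is not shown in this thread):

# Fixed first measurement, matching how the model was trained:
init_point = X.vocs.grid_inputs(1)
init_point["x"][0] = 100
X.evaluate_data(init_point)

# Random first measurement, if the model is later trained to handle it:
init_point = X.vocs.random_inputs(1)
X.evaluate_data(init_point)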

Contributor

@roussel-ryan left a comment


see comment

