Skip to content

Add external python predictor#1898

Open
danielwolff1 wants to merge 14 commits into4C-multiphysics:mainfrom
danielwolff1:external-python-predictor
Open

Add external python predictor#1898
danielwolff1 wants to merge 14 commits into4C-multiphysics:mainfrom
danielwolff1:external-python-predictor

Conversation

@danielwolff1
Copy link
Copy Markdown
Contributor

Description and Context

With this PR, I'm proposing a new predictor class which allows to defer the actually implementation of the predictor to python. We need this feature as we want to study the behavior of Neural Network-based structural predictors.

Implementation details

I implemented this feature similarly to what has been done in the context of a python surrogate model for rough surface contact laws as merged in #1735. The general idea of the implementation are:

  • Only static cases are supported so far, i.e., load stepping is fine, but no dynamics.
  • 4C execution might happen in parallel, yet the computations in python only happen in a single interpreter, i.e., in a single process.
  • Process with id 0 communicates with python.
  • The logic for communicating with python via pybind11 functionality is completely encapsulated in an Implementation class so that it does not interfere with the actual predictor.
  • As for my personal application case I need to evaluate the neural network on the computational mesh, I need a lightweight representation of the discretization in python. Since this is a special requirement of my own research, the feature is implemented in a way that the communication of the mesh happens in an optional setup step, which will simply be skipped, if the python file does not provide a setup() method. That way the user can easily control whether to have this once-per-simulation overhead of communicating the mesh or whether they only need the "regular" information that also the C++ predictors see.

Testing

I have implemented two integration tests where I test the two major different control flows, i.e., once the python script has a setup method and the mesh needs to be communicated and once not. In both cases, the python script implements the simple logic of a ConstDis predictor and the corresponding values in the RESULTS DESCRIPTION section have been generated by running the same input scripts however with PREDICT = ConstDis.

I'm still planning to look into unit tests for this feature but the general implementation already works and won't change anymore, so I already wanted to put it out for review.

I appreciate any comments and insights for potential improvements 😊

@danielwolff1 danielwolff1 self-assigned this Mar 27, 2026
@danielwolff1 danielwolff1 added the status: under review Issues undergoing review during a pull request label Mar 27, 2026
Copy link
Copy Markdown
Member

@mayrmt mayrmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielwolff1 I left some comments and raised some questions, that I could not answer for myself right away. For me, it looks like this is going into the right direction.

Comment on lines +236 to +237
// Initialize interpreter once per process.
static pybind11::scoped_interpreter py_guard;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned that only rank 0 should interact with Python. So far, this is not checked here. Just an observation. I am not sure yet, if an actual check is required.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this point! Took me a while to investigate it and find a solution.

In the first draft of the implementation, communication was indeed restricted to rank 0, i.e., only rank 0 would call the python interpreter, however, pybind11 data types were still used on other processes. Yet, in order to allow this, python needed to be initialized on all processes (which in itself was fine, as it didn't result in weird/undefined behavior during runtime, but as you rightfully pointed out, the implementation was not strictly following the documentation, as indeed a python interpreter was initialized on each process, thus creating an overhead as ultimately only rank 0 meaningfully used it later on).

With commit 4ba3d3c I revised the implementation so that it now strictly follows the self-imposed "only rank 0 communicates with python"-rule. That is, initialization of the python interpreter now only happens on rank 0. All processes participate in communication, but only rank 0 interacts with python and only rank 0 actually touches pybind11 data types, meaning all conversion between 4C-internal and python-compatible data types happen on rank 0.

Overall, the implementation should be much cleaner now and some of the bigger functions (like build_mesh_snapshot()) have now been split up in two, namely gathering data from other processes in a first step and converting the data from C++ to python-compatible data types in a second step.

Comment on lines +5 to +18
STRUCTURAL DYNAMIC:
INT_STRATEGY: "Standard"
DYNAMICTYPE: "Statics"
RESTARTEVERY: 2
TIMESTEP: 0.5
NUMSTEP: 2
MAXTIME: 1
PREDICT: "PythonSurrogate"
PYTHON_PREDICTOR_FILE: "./structure_new_python_surrogate_predictor_constdis.py"
TOLDISP: 1e-12
TOLRES: 1e-12
LINEAR_SOLVER: 1
NLNSOL: "fullnewton"
MAXITER: 20
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This performs a full Newton solve, so the final displacement values in the result tests will be independent of the outcome of the predictor. Hence, this test might not test the outcome of the predictor, just that the code path is not broken.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought that's what the integration tests are for and I wanted to provide a unit test for testing the actual functionality of the predictor. However, this is not as straightforward as I initially thought: In my opinion it would make more sense to test solid::predict::PythonWrapper::Implementation because this class provides the actual functionality, rather than solid::predict::PythonWrapper itself, which is only a wrapper. However, the Implementation class is currently private within PythonWrapper, which makes it very inconvenient to test.

So I guess I will adapt this test like we discussed and modify the newton iteration such that I can actually "see" the influence of my predictor, unless you have a better idea? 🤔

Comment on lines +5 to +18
STRUCTURAL DYNAMIC:
INT_STRATEGY: "Standard"
DYNAMICTYPE: "Statics"
RESTARTEVERY: 2
TIMESTEP: 0.5
NUMSTEP: 2
MAXTIME: 1
PREDICT: "PythonSurrogate"
PYTHON_PREDICTOR_FILE: "./structure_new_python_surrogate_predictor_constdis_with_setup.py"
TOLDISP: 1e-12
TOLRES: 1e-12
LINEAR_SOLVER: 1
NLNSOL: "fullnewton"
MAXITER: 20
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This performs a full Newton solve, so the final displacement values in the result tests will be independent of the outcome of the predictor. Hence, this test might not test the outcome of the predictor, just that the code path is not broken.

danielwolff1 and others added 2 commits March 29, 2026 12:05
Co-authored-by: Matthias Mayr <mayrmt@users.noreply.github.com>
Signed-off-by: Daniel Wolff <39346676+danielwolff1@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

@danielwolff1 danielwolff1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fast review, I appreciate the input! I still have to look into some of the points you mentioned and will do so next week probably.

Comment on lines +288 to +302
if (is_rank_zero(global_state_ptr->get_dis_n()->get_comm()))
{
// make it safe to touch python objects for the rest of this function
pybind11::gil_scoped_acquire gil;
// invoke the python predictor step in the python script
pybind11::object result = compute_(state);
// update 4C's internal state based on the result from the python script
apply_prediction(result, gathered, global_state_ptr);
}
else
{
// all other processes enter apply_prediction and wait for python to be finished.
// Once python is done, the results will be communicated back to each processor.
apply_prediction(pybind11::none(), gathered, global_state_ptr);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. I've thought about it the last couple of days and my explanation is:
I think it would also work if I rewrote it like this:

// create an empty result object (necessary for processes with non-zero ID)
pybind11::object result = pybind11::none();
// Only process with rank 0 communicates with python
if (is_rank_zero(global_state_ptr->get_dis_n()->get_comm()))
{
    // make it safe to touch python objects for the rest of this function
    pybind11::gil_scoped_acquire gil;
    // invoke the python predictor step in the python script
    result = compute_(state);
    // update 4C's internal state based on the result from the python script
    apply_prediction(result, gathered, global_state_ptr);
}
// explicitly synchronise all processes, i.e., wait for process 0 to be done with python communication
Core::Communication::barrier(global_state_ptr->get_discret()->get_comm());
// Once processor 0 is done with python, the results will be communicated back to each processor.
apply_prediction(result, gathered, global_state_ptr);

Right now, the communication happening in apply_prediction, i.e., the scattering of the results that process 0 received from python involves an implicit barrier, i.e., the current implementation let's all processes already entering the receiving state, but of course they will only receive the data, once processor 0 starts scattering them, i.e., after python communication is finished. So the communication in apply_prediction involves an implicit barrier.

Does that make sense or do you disagree?

If you prefer, I can adapt the implementation to make the synchronisation more explicit.

const int node_id = node->id();

// copy the node's coordinates into a simple STL container
std::vector<double> coords(3, 0.0);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but then I would also need to adapt the type of the container local_node_coords which is currently declared as std::map<int, std::vector<double>> because Core::LinAlg::gather() is already overloaded for a container of type std::map<int, std::vector<double>> (see here) which simplifies the communication. But I can also check if gather would work for a container of type std::map<int, std::array<double, 3>>, at least there seems to be a more general templated version available here.

Comment on lines +5 to +18
STRUCTURAL DYNAMIC:
INT_STRATEGY: "Standard"
DYNAMICTYPE: "Statics"
RESTARTEVERY: 2
TIMESTEP: 0.5
NUMSTEP: 2
MAXTIME: 1
PREDICT: "PythonSurrogate"
PYTHON_PREDICTOR_FILE: "./structure_new_python_surrogate_predictor_constdis.py"
TOLDISP: 1e-12
TOLRES: 1e-12
LINEAR_SOLVER: 1
NLNSOL: "fullnewton"
MAXITER: 20
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought that's what the integration tests are for and I wanted to provide a unit test for testing the actual functionality of the predictor. However, this is not as straightforward as I initially thought: In my opinion it would make more sense to test solid::predict::PythonWrapper::Implementation because this class provides the actual functionality, rather than solid::predict::PythonWrapper itself, which is only a wrapper. However, the Implementation class is currently private within PythonWrapper, which makes it very inconvenient to test.

So I guess I will adapt this test like we discussed and modify the newton iteration such that I can actually "see" the influence of my predictor, unless you have a better idea? 🤔

Comment on lines +480 to +491
pybind11::array_t<double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
const Core::LinAlg::Vector<double>& source) const
{
// create a new NumPy array
pybind11::array_t<double> target(source.local_length());
auto target_view = target.mutable_unchecked<1>();
const double* source_view = source.get_values();
// copy data from vector into the NumPy array
for (int i = 0; i < source.local_length(); ++i) target_view(i) = source_view[i];

return target;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid to copy the data? If I'm not mistaken, we could just return a view to the data

Suggested change
pybind11::array_t<double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
const Core::LinAlg::Vector<double>& source) const
{
// create a new NumPy array
pybind11::array_t<double> target(source.local_length());
auto target_view = target.mutable_unchecked<1>();
const double* source_view = source.get_values();
// copy data from vector into the NumPy array
for (int i = 0; i < source.local_length(); ++i) target_view(i) = source_view[i];
return target;
}
pybind11::array_t<const double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
const Core::LinAlg::Vector<double>& source) const
{
return py::array_t<const double>(
{source.local_length()},
{sizeof(double)},
source.get_values()
)
}

This might even be general enough so that we can move this into a helper function close to our Core::LinAlg::Vector.

Comment on lines +493 to +494
void Solid::Predict::PythonWrapper::Implementation::copy_numpy_to_vector(
const pybind11::array_t<double>& source, Core::LinAlg::Vector<double>& target) const
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also do it the other way around, i.e., directly writing into the data of our Core::LinAlg::Vector (same as above, just non-const)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: under review Issues undergoing review during a pull request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants