Add external python predictor #1898
danielwolff1 wants to merge 14 commits into 4C-multiphysics:main
Conversation
mayrmt
left a comment
@danielwolff1 I left some comments and raised some questions that I could not answer for myself right away. To me, it looks like this is going in the right direction.
src/structure_new/src/predict/4C_structure_new_predict_python_surrogate.cpp (two resolved, outdated comment threads)
```cpp
// Initialize interpreter once per process.
static pybind11::scoped_interpreter py_guard;
```
You mentioned that only rank 0 should interact with Python. So far, this is not checked here. Just an observation; I am not sure yet whether an actual check is required.
Thanks for raising this point! It took me a while to investigate and find a solution.

In the first draft of the implementation, communication was indeed restricted to rank 0, i.e., only rank 0 called the python interpreter. However, pybind11 data types were still used on the other processes, and to allow this, python had to be initialized on all processes. That in itself was fine, as it did not cause weird/undefined behavior at runtime, but as you rightfully pointed out, the implementation did not strictly follow the documentation: a python interpreter was initialized on every process, creating overhead, since ultimately only rank 0 meaningfully used it.

With commit 4ba3d3c I revised the implementation so that it now strictly follows the self-imposed rule "only rank 0 communicates with python". That is, the python interpreter is now only initialized on rank 0. All processes participate in the communication, but only rank 0 interacts with python, and only rank 0 touches pybind11 data types, meaning all conversions between 4C-internal and python-compatible data types happen on rank 0.

Overall, the implementation should be much cleaner now, and some of the bigger functions (like build_mesh_snapshot()) have been split in two: gathering data from the other processes in a first step, and converting the data from C++ to python-compatible data types in a second step.
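Not part of the PR, but the gather-then-convert split can be illustrated with a plain-Python sketch. Here the MPI gather is faked with a list of per-rank dicts, and the function names `gather_node_coords`/`convert_to_numpy` are made up for illustration; they are not the actual 4C API:

```python
import numpy as np

# Step 1: gather per-rank data on rank 0 (faked here with a list of
# per-rank {node_id: coords} dicts instead of an MPI gather).
def gather_node_coords(per_rank_maps):
    gathered = {}
    for rank_map in per_rank_maps:
        gathered.update(rank_map)
    return gathered

# Step 2: only rank 0 converts the gathered data to python/numpy types.
def convert_to_numpy(gathered):
    node_ids = sorted(gathered)
    coords = np.array([gathered[nid] for nid in node_ids])
    return node_ids, coords

per_rank = [{0: [0.0, 0.0, 0.0], 1: [1.0, 0.0, 0.0]},
            {2: [0.0, 1.0, 0.0]}]
node_ids, coords = convert_to_numpy(gather_node_coords(per_rank))
print(node_ids)      # [0, 1, 2]
print(coords.shape)  # (3, 3)
```

Splitting the steps this way keeps all pybind11/numpy conversions out of the communication code, which is what makes the rank-0-only rule easy to enforce.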
src/structure_new/src/predict/4C_structure_new_predict_python_surrogate.cpp (three resolved, outdated comment threads)
```yaml
STRUCTURAL DYNAMIC:
  INT_STRATEGY: "Standard"
  DYNAMICTYPE: "Statics"
  RESTARTEVERY: 2
  TIMESTEP: 0.5
  NUMSTEP: 2
  MAXTIME: 1
  PREDICT: "PythonSurrogate"
  PYTHON_PREDICTOR_FILE: "./structure_new_python_surrogate_predictor_constdis.py"
  TOLDISP: 1e-12
  TOLRES: 1e-12
  LINEAR_SOLVER: 1
  NLNSOL: "fullnewton"
  MAXITER: 20
```
This performs a full Newton solve, so the final displacement values in the result tests will be independent of the outcome of the predictor. Hence, this test might not test the outcome of the predictor, just that the code path is not broken.
Yeah, I thought that's what the integration tests are for, and I wanted to provide a unit test for the actual functionality of the predictor. However, this is not as straightforward as I initially thought: in my opinion it would make more sense to test solid::predict::PythonWrapper::Implementation, because this class provides the actual functionality, rather than solid::predict::PythonWrapper itself, which is only a wrapper. However, the Implementation class is currently private within PythonWrapper, which makes it very inconvenient to test.
So I guess I will adapt this test like we discussed and modify the Newton iteration such that I can actually "see" the influence of my predictor, unless you have a better idea? 🤔
```yaml
STRUCTURAL DYNAMIC:
  INT_STRATEGY: "Standard"
  DYNAMICTYPE: "Statics"
  RESTARTEVERY: 2
  TIMESTEP: 0.5
  NUMSTEP: 2
  MAXTIME: 1
  PREDICT: "PythonSurrogate"
  PYTHON_PREDICTOR_FILE: "./structure_new_python_surrogate_predictor_constdis_with_setup.py"
  TOLDISP: 1e-12
  TOLRES: 1e-12
  LINEAR_SOLVER: 1
  NLNSOL: "fullnewton"
  MAXITER: 20
```
This performs a full Newton solve, so the final displacement values in the result tests will be independent of the outcome of the predictor. Hence, this test might not test the outcome of the predictor, just that the code path is not broken.
Co-authored-by: Matthias Mayr <mayrmt@users.noreply.github.com>
Signed-off-by: Daniel Wolff <39346676+danielwolff1@users.noreply.github.com>
danielwolff1
left a comment
Thanks for the fast review, I appreciate the input! I still have to look into some of the points you mentioned and will probably do so next week.
```cpp
if (is_rank_zero(global_state_ptr->get_dis_n()->get_comm()))
{
  // make it safe to touch python objects for the rest of this function
  pybind11::gil_scoped_acquire gil;
  // invoke the python predictor step in the python script
  pybind11::object result = compute_(state);
  // update 4C's internal state based on the result from the python script
  apply_prediction(result, gathered, global_state_ptr);
}
else
{
  // all other processes enter apply_prediction and wait for python to be finished.
  // Once python is done, the results will be communicated back to each processor.
  apply_prediction(pybind11::none(), gathered, global_state_ptr);
}
```
This is a good point. I've thought about it over the last couple of days; here is my take.
I think it would also work if I rewrote it like this:

```cpp
// create an empty result object (necessary for processes with non-zero rank)
pybind11::object result = pybind11::none();
// only the process with rank 0 communicates with python
if (is_rank_zero(global_state_ptr->get_dis_n()->get_comm()))
{
  // make it safe to touch python objects for the rest of this function
  pybind11::gil_scoped_acquire gil;
  // invoke the python predictor step in the python script
  result = compute_(state);
}
// explicitly synchronise all processes, i.e., wait for process 0 to be done with python
Core::Communication::barrier(global_state_ptr->get_discret()->get_comm());
// once process 0 is done with python, the results are communicated back to each process
apply_prediction(result, gathered, global_state_ptr);
```

Right now, however, the communication happening in apply_prediction, i.e., the scattering of the results that process 0 received from python, already involves an implicit barrier: the current implementation lets all processes enter the receiving state right away, but of course they will only receive the data once process 0 starts scattering it, i.e., after the python communication is finished. So the explicit barrier would be redundant.

Does that make sense or do you disagree?
If you prefer, I can adapt the implementation to make the synchronisation more explicit.
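To illustrate why a blocking receive already synchronises the processes, here is a small stand-alone Python sketch that emulates the scatter with threads and queues. No MPI is involved; the thread for "rank 0" and the queues are purely illustrative stand-ins:

```python
import queue
import threading
import time

# "Rank 0" produces the prediction; the other "ranks" block on a receive.
# The blocking get() acts as the implicit barrier described above: no rank
# can proceed until rank 0 has finished and scattered the data.
def run(n_ranks=3):
    channels = [queue.Queue() for _ in range(n_ranks - 1)]
    received = {}

    def rank0():
        time.sleep(0.05)          # stand-in for the python predictor call
        result = [1.0, 2.0, 3.0]  # stand-in prediction
        for ch in channels:       # "scatter" to the other ranks
            ch.put(result)
        received[0] = result

    def other(rank, ch):
        received[rank] = ch.get()  # blocks until rank 0 scatters

    threads = [threading.Thread(target=rank0)]
    threads += [threading.Thread(target=other, args=(r + 1, ch))
                for r, ch in enumerate(channels)]
    for t in threads: t.start()
    for t in threads: t.join()
    return received

print(run())
```

Every non-zero rank ends up with the result even though no explicit barrier was issued, because the receive itself cannot complete before rank 0 sends.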
```cpp
const int node_id = node->id();

// copy the node's coordinates into a simple STL container
std::vector<double> coords(3, 0.0);
```
True, but then I would also need to adapt the type of the container local_node_coords, which is currently declared as std::map<int, std::vector<double>>, because Core::LinAlg::gather() is already overloaded for containers of type std::map<int, std::vector<double>> (see here), which simplifies the communication. But I can check whether gather would also work for a container of type std::map<int, std::array<double, 3>>; at least there seems to be a more general templated version available here.
src/structure_new/src/predict/4C_structure_new_predict_python_surrogate.cpp (resolved, outdated comment thread)
```cpp
pybind11::array_t<double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
    const Core::LinAlg::Vector<double>& source) const
{
  // create a new NumPy array
  pybind11::array_t<double> target(source.local_length());
  auto target_view = target.mutable_unchecked<1>();
  const double* source_view = source.get_values();
  // copy data from vector into the NumPy array
  for (int i = 0; i < source.local_length(); ++i) target_view(i) = source_view[i];

  return target;
}
```
Can we avoid copying the data? If I'm not mistaken, we could just return a view of the data:
Suggested change:

```diff
-pybind11::array_t<double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
-    const Core::LinAlg::Vector<double>& source) const
-{
-  // create a new NumPy array
-  pybind11::array_t<double> target(source.local_length());
-  auto target_view = target.mutable_unchecked<1>();
-  const double* source_view = source.get_values();
-  // copy data from vector into the NumPy array
-  for (int i = 0; i < source.local_length(); ++i) target_view(i) = source_view[i];
-  return target;
-}
+pybind11::array_t<const double> Solid::Predict::PythonWrapper::Implementation::copy_vector_to_numpy(
+    const Core::LinAlg::Vector<double>& source) const
+{
+  return pybind11::array_t<const double>(
+      {source.local_length()}, {sizeof(double)}, source.get_values());
+}
```
This might even be general enough so that we can move this into a helper function close to our Core::LinAlg::Vector.
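The copy-vs-view distinction the suggestion is aiming at can be demonstrated with plain numpy (conceptual only; this is not the pybind11 code path):

```python
import numpy as np

source = np.arange(5, dtype=np.float64)

copy = source.copy()  # analogous to the loop-based copy_vector_to_numpy
view = source[:]      # zero-copy view of the same buffer

source[0] = 42.0
print(copy[0])   # 0.0  -> the copy is unaffected
print(view[0])   # 42.0 -> the view reflects the change
print(np.shares_memory(source, view))  # True
print(np.shares_memory(source, copy))  # False
```

One caveat for the pybind11 side: if I remember the pybind11 docs correctly, an `array_t` constructed from a raw pointer only avoids the copy when a `base` handle (an owner object keeping the C++ data alive) is also passed; without one, pybind11 makes its own copy of the buffer. That would be worth double-checking before relying on view semantics here.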
```cpp
void Solid::Predict::PythonWrapper::Implementation::copy_numpy_to_vector(
    const pybind11::array_t<double>& source, Core::LinAlg::Vector<double>& target) const
```
You could also do it the other way around, i.e., directly writing into the data of our Core::LinAlg::Vector (same as above, just non-const)
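The reverse direction can likewise be sketched in plain numpy: wrap the existing buffer and let python write into it directly. Again purely conceptual; `np.frombuffer` here stands in for wrapping the vector's `get_values()` pointer:

```python
import numpy as np

target = np.zeros(4)  # stand-in for the Core::LinAlg::Vector's data
# wrap the SAME memory in a second array (no copy is made)
view = np.frombuffer(target.data, dtype=np.float64)

view[:] = [1.0, 2.0, 3.0, 4.0]  # "python" writes directly into the buffer
print(target)  # [1. 2. 3. 4.]
```

The same lifetime caveat as above applies: the wrapping array must not outlive the C++ vector that owns the memory.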
Description and Context
With this PR, I'm proposing a new predictor class which allows deferring the actual implementation of the predictor to python. We need this feature as we want to study the behavior of neural-network-based structural predictors.
Implementation details
I implemented this feature similarly to what has been done in the context of a python surrogate model for rough surface contact laws as merged in #1735. The general ideas of the implementation are:

- 4C execution might happen in parallel, yet the computations in python only happen in a single interpreter, i.e., in a single process.
- pybind11 functionality is completely encapsulated in an Implementation class so that it does not interfere with the actual predictor.
- The mesh is only communicated to python if the user's python script provides a setup() method. That way the user can easily control whether to have this once-per-simulation overhead of communicating the mesh, or whether they only need the "regular" information that also the C++ predictors see.

Testing
I have implemented two integration tests where I test the two major control flows, i.e., once where the python script has a setup method and the mesh needs to be communicated, and once where it does not. In both cases, the python script implements the simple logic of a ConstDis predictor, and the corresponding values in the RESULTS DESCRIPTION section have been generated by running the same input scripts, however with PREDICT = ConstDis.
I'm still planning to look into unit tests for this feature, but the general implementation already works and won't change anymore, so I already wanted to put it out for review.
I appreciate any comments and insights for potential improvements 😊