
Split stepper config into train and inference parts #753

Closed
mcgibbon wants to merge 1 commit into main from refactor/split_stepper_config

Conversation

@mcgibbon
Contributor

Short description of why the PR is needed and how it satisfies those requirements, in sentence form.

Changes:

  • symbol (e.g. fme.core.my_function) or script and concise description of changes or added feature

  • Can group multiple related symbols on a single bullet

  • Tests added

  • If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated

Resolves # (delete if none)

Member

@jpdunc23 left a comment


As I understand it, the motivation for this change is to improve the separation between two aspects of Stepper configuration:

  • C1. Configuration needed for Stepper forward stepping.
  • C2. Configuration needed for Stepper.train_on_batch.

Note that C1 is a subset of C2.

These changes make it clearer which StepperConfig attributes are in C1.
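To make the C1 ⊂ C2 relationship concrete, here is a minimal sketch of nested configs; the attribute names (in_names, out_names, normalization, loss, n_forward_steps) are hypothetical stand-ins, not the real StepperConfig fields:

```python
from dataclasses import dataclass

# Hypothetical sketch: field names are illustrative, not the actual
# fme StepperConfig attributes.

@dataclass
class InferenceStepperConfig:
    # C1: only what forward stepping needs
    in_names: list
    out_names: list
    normalization: str = "standard"

@dataclass
class StepperConfig:
    # C2: training needs everything in C1 plus training-only settings,
    # so the C1 config is nested inside the C2 config (C1 is a subset of C2)
    inference: InferenceStepperConfig
    loss: str = "mse"
    n_forward_steps: int = 1
```

Under this layout, inference-only code can accept an InferenceStepperConfig and never see training settings, e.g. `StepperConfig(inference=InferenceStepperConfig(["T"], ["T"])).inference`.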

However, I don't think they accomplish the goal of "improving separation" in a way that can be practically leveraged elsewhere (e.g., in fme.coupled). For that reason, I'm not sure these changes justify the effort that will be required to update our training configs.

Aside from that, I have a couple of specific concerns:

  • I'd prefer to avoid additional config nesting.
  • The term "inference" could get (further) overloaded since we heavily use "inference" when referring to a specific type of metric (e.g., "inference/time_mean/rmse") which comes from an "inference" job.

@mcgibbon
Contributor Author

However, I don't think they accomplish the goal of "improving separation" in a way that can be practically leveraged elsewhere (e.g., in fme.coupled).

I believe it can be leveraged in fme.coupled. For example, here's a proof of concept which pretends to split ace's stepper into train and inference steppers, and then refactors the coupled stepper to do the same: #754

This greatly simplifies the coupled training stepper, keeping only the attributes related to training and putting all of the "shuffle data back and forth between two sub-objects" logic into the inference stepper.
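The shape of that split can be sketched as follows; this is a toy illustration of the idea in #754, not the actual fme classes, and the names InferenceStepper, TrainStepper, and train_on_batch are assumptions:

```python
class InferenceStepper:
    """Holds only what forward stepping needs (hypothetical sketch)."""

    def __init__(self, weights):
        self.weights = weights

    def step(self, state):
        # stand-in for one forward step of the model
        return [w * s for w, s in zip(self.weights, state)]


class TrainStepper:
    """Holds training-only concerns; delegates all stepping."""

    def __init__(self, inference: InferenceStepper, loss_fn):
        self.inference = inference
        self.loss_fn = loss_fn

    def train_on_batch(self, state, target):
        prediction = self.inference.step(state)
        return self.loss_fn(prediction, target)
```

After training, only the InferenceStepper needs to be saved and loaded, so inference code is never exposed to the training attributes.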

@mcgibbon
Contributor Author

mcgibbon commented Jan 23, 2026

I'd prefer to avoid additional config nesting.

Is there a motivation for this preference? We usually prefer config nesting over flattening, to better separate concerns in the config and have it match the object hierarchy. If the configs are getting too deeply nested in the coupled mode, we could expose a flatter user config in the coupled code and internally convert it to the ace nested versions.

Or maybe we can set it up so that the InferenceStepperConfig lives at the same level as TrainStepperConfig, and have TrainStepperConfig.build take in an InferenceStepper as argument.
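That second option could look like the following sketch, where the two configs sit at the same level (no nesting) and the train config's build receives an already-built inference stepper; all names and fields here are hypothetical:

```python
from dataclasses import dataclass


class InferenceStepper:
    """Forward stepping only (illustrative)."""

    def __init__(self, scale: float):
        self.scale = scale

    def step(self, x: float) -> float:
        return x * self.scale


class TrainStepper:
    """Training concerns only; wraps an inference stepper."""

    def __init__(self, stepper: InferenceStepper, learning_rate: float):
        self.stepper = stepper
        self.learning_rate = learning_rate


@dataclass
class InferenceStepperConfig:
    scale: float

    def build(self) -> InferenceStepper:
        return InferenceStepper(self.scale)


@dataclass
class TrainStepperConfig:
    learning_rate: float

    def build(self, stepper: InferenceStepper) -> TrainStepper:
        # no config nesting: the inference stepper is built separately
        # and passed in as an argument
        return TrainStepper(stepper, self.learning_rate)
```

A flatter coupled-mode user config could then be converted internally into these two sibling configs rather than one deeply nested one.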

@mcgibbon
Contributor Author

mcgibbon commented Jan 23, 2026

As I understand it, the motivation for this change is to improve the separation between two aspects of Stepper configuration

That's more a side-effect of the design of the planned change than the motivation. The configuration is not the primary change, though it's what I started with because of backwards compatibility concerns.

The motivation is that the Stepper currently carries many settings, methods, and data that are no longer needed once training is done, yet the inference code is still exposed to them. This makes it harder, for example, to keep backwards compatibility for inference, and it also makes the Stepper code harder to understand because so much is going on. The idea is that the Stepper currently has two large, separate concerns, so it should be split into two objects.

The configuration change is just a side-effect of the fact that we want our configuration hierarchy to reflect our objects. While it's not great having a bunch of train and inference stuff at the same level of config, that's not the main issue. This new config hierarchy reflects the objects we'd like to refactor the current Stepper into because of these other concerns.

@mcgibbon mcgibbon closed this Jan 23, 2026