[feat] add `keep_terminal` flag to support keeping terminal observation by adzcai · Pull Request #6 · EdanToledo/Stoa

adzcai · 2025-10-10T16:10:38Z

See discussion in EdanToledo/Stoix#181. The issue is how to handle episode boundaries. Suppose base_env.step enters a state with timestep.done() == True (either terminated or truncated). The question is what an auto-reset wrapper should return:

1. Return the state and observation from base_env.reset and keep the other timestep properties from base_env.step. For proper bootstrapping in algorithms, though, this requires doubling the number of critic evaluations.
2. Return everything from base_env.step unchanged, then return the result of base_env.reset, with a dummy reward and discount, on the next call to wrapped_env.step. This might require masking out any losses computed from the policy's actions in the final state.
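To make the two behaviors concrete, here is a minimal pure-Python sketch. It is not the code in this PR and not Stoa's API: `TimeStep`, `CountingEnv`, `AutoReset`, and `rollout` are hypothetical stand-ins, and the dummy reward and discount values are chosen arbitrarily. Only the `keep_terminal` flag name is taken from the PR title.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TimeStep:
    """Hypothetical timestep container (not Stoa's TimeStep)."""
    observation: int
    reward: float
    discount: float
    last: bool

    def done(self) -> bool:
        return self.last


class CountingEnv:
    """Toy base env: the state counts up from 0 and the episode ends at `horizon`."""

    def __init__(self, horizon=3):
        self.horizon = horizon

    def reset(self):
        return 0, TimeStep(observation=0, reward=0.0, discount=1.0, last=False)

    def step(self, state, action):
        state = state + 1
        done = state >= self.horizon
        return state, TimeStep(state, 1.0, 0.0 if done else 1.0, done)


class AutoReset:
    """Toy auto-reset wrapper illustrating both options."""

    def __init__(self, env, keep_terminal):
        self.env = env
        self.keep_terminal = keep_terminal

    def reset(self):
        state, ts = self.env.reset()
        return (state, False), ts  # wrapped state = (base state, needs_reset)

    def step(self, wrapped_state, action):
        state, needs_reset = wrapped_state
        if needs_reset:
            # Option 2, second half: ignore the action and emit the reset
            # timestep with a dummy reward and discount (arbitrary values here).
            state, ts = self.env.reset()
            return (state, False), replace(ts, reward=0.0, discount=1.0)
        state, ts = self.env.step(state, action)
        if not ts.done():
            return (state, False), ts
        if self.keep_terminal:
            # Option 2, first half: pass the terminal timestep through untouched.
            return (state, True), ts
        # Option 1: swap in the reset state and observation, keep reward/discount/done.
        reset_state, reset_ts = self.env.reset()
        return (reset_state, False), replace(ts, observation=reset_ts.observation)


def rollout(env, num_steps):
    """Collect (observation, done) pairs, starting with the initial reset."""
    state, ts = env.reset()
    out = [(ts.observation, ts.done())]
    for _ in range(num_steps):
        state, ts = env.step(state, action=0)
        out.append((ts.observation, ts.done()))
    return out


option_one = rollout(AutoReset(CountingEnv(horizon=3), keep_terminal=False), 5)
option_two = rollout(AutoReset(CountingEnv(horizon=3), keep_terminal=True), 5)
```

In this toy run, option 1 never exposes the terminal observation `3` in the observation stream (the done step already carries the reset observation `0`), while option 2 shows `3` on the done step and spends one extra wrapped step on the reset.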

Both of these are valid choices, and we should let the user decide which they prefer. For example, in settings where evaluating the critic involves some form of search, option two would incur half as many critic invocations as option one.
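One reading of the critic-count claim, sketched under stated assumptions: with option one the true next observation at a boundary is not in the observation stream, so it is assumed here to be stored on the side (`true_next_obs`, a hypothetical name) and scored in a second critic pass, giving 2T evaluations for T steps. With option two a single pass over T+1 observations suffices, at the cost of masking the reset step. `CountingCritic` and both `td_errors_*` helpers are illustrative, not part of Stoa or Stoix.

```python
class CountingCritic:
    """Stand-in value function that records how many observations it scores."""

    def __init__(self):
        self.calls = 0

    def __call__(self, observations):
        self.calls += len(observations)
        return [0.5 * o for o in observations]  # arbitrary toy values


def td_errors_option_one(critic, obs, true_next_obs, rewards, discounts, gamma=0.99):
    """One-step TD errors when next observations are stored separately."""
    values = critic(obs)                  # T evaluations
    next_values = critic(true_next_obs)   # another T evaluations
    return [r + gamma * d * nv - v
            for r, d, nv, v in zip(rewards, discounts, next_values, values)]


def td_errors_option_two(critic, obs, rewards, discounts, is_reset_step, gamma=0.99):
    """One-step TD errors from a single pass; `obs` has T + 1 entries."""
    values = critic(obs)                  # T + 1 evaluations
    errors = []
    for t in range(len(rewards)):
        delta = rewards[t] + gamma * discounts[t] * values[t + 1] - values[t]
        # The action taken in the terminal state is ignored by the wrapper,
        # so the corresponding error is masked out.
        errors.append(0.0 if is_reset_step[t] else delta)
    return errors


# Five wrapped steps of a toy env whose episodes terminate after three steps.
critic_one = CountingCritic()
errors_one = td_errors_option_one(
    critic_one,
    obs=[0, 1, 2, 0, 1],
    true_next_obs=[1, 2, 3, 1, 2],
    rewards=[1.0, 1.0, 1.0, 1.0, 1.0],
    discounts=[1.0, 1.0, 0.0, 1.0, 1.0],
)

critic_two = CountingCritic()
errors_two = td_errors_option_two(
    critic_two,
    obs=[0, 1, 2, 3, 0, 1],
    rewards=[1.0, 1.0, 1.0, 0.0, 1.0],
    discounts=[1.0, 1.0, 0.0, 1.0, 1.0],
    is_reset_step=[False, False, False, True, False],
)
```

Here option one scores 10 observations and option two scores 6; the ratio approaches one half as the rollout gets longer. The trade-off is that option two yields one fewer useful transition per episode boundary.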

This PR also fixes #5 by re-implementing the optimistic auto reset wrapper.

EdanToledo · 2025-10-13T17:52:06Z

I have seen this, I'm just unable to review this week. Will try to handle it ASAP.

EdanToledo · 2025-10-26T10:52:34Z

So, I've been thinking about this more and I'm very much in favor of making this a choice. I will try to review the PR this week.

feat: add keep_terminal flag to support keeping terminal state

b3556d5

adzcai wants to merge 1 commit into EdanToledo:main from adzcai:feat/keep-terminal-flag
