simplify supervision propagation; unhandled events always cause failure #773

Open: wants to merge 2 commits into base: main

Conversation

mariusae (Member) commented Aug 6, 2025

Summary:
Currently, (local) supervision can propagate an event up to a parent without also killing the intermediate actor that left it unhandled.

This is 1) wrong; and 2) complicated.

Instead, we treat an unhandled supervision event as an actor failure, and then reduce the propagation paths to one: that of an actor failing.

In order to retain accurate attribution, we add a "caused_by" field to the actor supervision events.

Differential Revision: D79702385
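
A minimal sketch of the single propagation path described above, written as a self-contained toy model. Every name here (`SupervisionEvent`, `Tree`, `Outcome`, `handles_supervision`, the `reason` string) is a hypothetical stand-in chosen for illustration, not the project's actual types; only the idea of a `caused_by` link on the event comes from the description.

```rust
use std::collections::HashMap;

/// Hypothetical supervision event. `caused_by` links to the event that
/// triggered this failure, so the original culprit stays attributable.
#[derive(Debug, Clone, PartialEq)]
pub struct SupervisionEvent {
    pub actor_id: String,
    pub reason: String,
    pub caused_by: Option<Box<SupervisionEvent>>,
}

impl SupervisionEvent {
    /// Follow the `caused_by` chain back to the first failure.
    pub fn root_cause(&self) -> &SupervisionEvent {
        let mut event = self;
        while let Some(cause) = event.caused_by.as_deref() {
            event = cause;
        }
        event
    }
}

/// Hypothetical actor record for the toy model.
struct Actor {
    parent: Option<String>,
    handles_supervision: bool,
    failed: bool,
}

/// Where propagation stopped.
#[derive(Debug)]
pub enum Outcome {
    /// Some ancestor handled the event; that ancestor keeps running.
    Handled { by: String, event: SupervisionEvent },
    /// No ancestor handled it; the event escaped past the root.
    Unhandled(SupervisionEvent),
}

#[derive(Default)]
pub struct Tree {
    actors: HashMap<String, Actor>,
}

impl Tree {
    pub fn spawn(&mut self, id: &str, parent: Option<&str>, handles_supervision: bool) {
        let actor = Actor {
            parent: parent.map(str::to_string),
            handles_supervision,
            failed: false,
        };
        self.actors.insert(id.to_string(), actor);
    }

    pub fn is_failed(&self, id: &str) -> bool {
        self.actors.get(id).map_or(false, |a| a.failed)
    }

    /// The only propagation path: an actor fails, and its parent either
    /// handles the resulting event or fails in turn.
    pub fn fail(&mut self, id: &str, reason: &str) -> Outcome {
        let mut event = SupervisionEvent {
            actor_id: id.to_string(),
            reason: reason.to_string(),
            caused_by: None,
        };
        loop {
            // Mark the actor named by the current event as failed.
            let parent = {
                let actor = self.actors.get_mut(&event.actor_id).expect("unknown actor");
                actor.failed = true;
                actor.parent.clone()
            };
            let parent_id = match parent {
                Some(p) => p,
                None => return Outcome::Unhandled(event),
            };
            let handles = self
                .actors
                .get(&parent_id)
                .expect("unknown parent")
                .handles_supervision;
            if handles {
                return Outcome::Handled { by: parent_id, event };
            }
            // Unhandled: the parent itself fails, and the new event records
            // the child's event as its cause.
            event = SupervisionEvent {
                actor_id: parent_id,
                reason: "unhandled supervision event".to_string(),
                caused_by: Some(Box::new(event)),
            };
        }
    }
}
```

In this model, failing a leaf whose parent does not handle supervision marks both the leaf and the parent as failed, and the event that reaches the grandparent names the parent while `root_cause()` still points at the leaf.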

meta-cla bot added the CLA Signed label Aug 6, 2025
facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D79702385

Summary:
Adds Proc::instance() which returns an actor instance and its corresponding handler. This allows the user to create a regular actor without any message handlers. The returned `Instance` provides all the normal capabilities, including sending and receiving messages, being able to spawn and manage child actors, etc.

This is the foundation for a kind of "script mode" actor.

Differential Revision: D79685752
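
To illustrate what a handler-less, "script mode" actor might look like, here is a toy sketch built only on the Rust standard library. It does not use or reproduce the real `Proc::instance()` signature: the `Proc`, `Instance`, and `Handle` types below, their methods, and the `String` message type are all assumptions made for the example.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for a proc; not the project's real `Proc`.
pub struct Proc {
    name: String,
}

/// Hypothetical handle used by others to reach the instance.
pub struct Handle {
    tx: Sender<String>,
}

/// Hypothetical actor instance with a mailbox but no message handlers:
/// the owning code drives it imperatively, like a script.
pub struct Instance {
    id: String,
    rx: Receiver<String>,
    children: Vec<String>,
}

impl Proc {
    pub fn new(name: &str) -> Self {
        Proc { name: name.to_string() }
    }

    /// Create an instance and the handle that delivers messages to it.
    pub fn instance(&self, name: &str) -> (Instance, Handle) {
        let (tx, rx) = channel();
        let instance = Instance {
            id: format!("{}/{}", self.name, name),
            rx,
            children: Vec::new(),
        };
        (instance, Handle { tx })
    }
}

impl Handle {
    /// Returns false if the instance has been dropped.
    pub fn send(&self, msg: &str) -> bool {
        self.tx.send(msg.to_string()).is_ok()
    }
}

impl Instance {
    /// Non-blocking receive; `None` when the mailbox is empty.
    pub fn recv(&self) -> Option<String> {
        self.rx.try_recv().ok()
    }

    /// Record a child under this instance and return the child's id.
    pub fn spawn_child(&mut self, name: &str) -> String {
        let child_id = format!("{}/{}", self.id, name);
        self.children.push(child_id.clone());
        child_id
    }
}
```

The point of the sketch is the shape of the API: the caller receives both ends, sends through the handle, and polls the instance directly instead of registering handlers.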
mariusae added a commit to mariusae/monarch-1 that referenced this pull request Aug 6, 2025
…re (meta-pytorch#773)

mariusae added a commit to mariusae/monarch-1 that referenced this pull request Aug 6, 2025
…re (meta-pytorch#773)

Reviewed By: shayne-fletcher

Differential Revision: D79702385
mariusae added a commit to mariusae/monarch-1 that referenced this pull request Aug 6, 2025
…re (meta-pytorch#773)

Summary:
Pull Request resolved: meta-pytorch#773

Reviewed By: shayne-fletcher

Differential Revision: D79702385

dulinriley pushed a commit to dulinriley/monarch that referenced this pull request Aug 6, 2025
Summary:
Pull Request resolved: meta-pytorch#644

1.10.1 (March 5th, 2025)

Fixed
- Fix memory leak when using to_vec with Bytes::from_owner (meta-pytorch#773)

1.10.0 (February 3rd, 2025)

Added
- Add feature to support platforms without atomic CAS (meta-pytorch#467)
- try_get_* methods for Buf trait (meta-pytorch#753)
- Implement Buf::chunks_vectored for Take (meta-pytorch#617)
- Implement Buf::chunks_vectored for VecDeque<u8> (meta-pytorch#708)

Fixed
- Remove incorrect guarantee for chunks_vectored (meta-pytorch#754)
- Ensure that tests pass under panic=abort (meta-pytorch#749)

Reviewed By: cjlongoria

Differential Revision: D78948561

fbshipit-source-id: bb755e25088f9d77f0d5a9ae3027cadc51ea586a
Labels: CLA Signed, fb-exported
2 participants