Skip to content

Raise missing std::invalid_argument#837

Open
pentschev wants to merge 5 commits intorapidsai:mainfrom
pentschev:irrecoverable-fatal
Open

Raise missing std::invalid_argument#837
pentschev wants to merge 5 commits intorapidsai:mainfrom
pentschev:irrecoverable-fatal

Conversation

@pentschev
Copy link
Member

@pentschev pentschev commented Feb 2, 2026

Certain cases are missing checks on argument inputs, raise proper std::invalid_argument.

@pentschev pentschev requested a review from a team as a code owner February 2, 2026 12:49
@pentschev pentschev self-assigned this Feb 2, 2026
@pentschev pentschev added bug Something isn't working non-breaking Introduces a non-breaking change labels Feb 2, 2026
@madsbk
Copy link
Member

madsbk commented Feb 2, 2026

Hmm, I am not sure this is a good idea. I see your point that you cannot recover the input buffer immediately once the std::unique_ptr has been moved, but you can still recover at a higher level, for example by retrying the workflow.

Also, making this fatal via std::terminate prevents a more user-friendly error report in Python, since it bypasses our usual exception translation and error handling.

@pentschev
Copy link
Member Author

We don't currently have any error recovery AFAIK, which means if any of those fail we don't have a practical way at least today for any of that, thus a regular exception is just currently treated as a potentially recoverable issue which is at best misleading. However, at least in the present state none of those exceptions are supposed to happen, if they do a fatal error seems like the right choice today. With that said, I'm not opposed to us handling those issues at some point in the future in a way we can safely recover, but that's not the case today, so a fatal error is IMO the only correct choice short-term.

@madsbk
Copy link
Member

madsbk commented Feb 2, 2026

Just because we currently cannot recover does not mean we should resort to std::terminate. In my opinion, std::terminate should be reserved for cases where it is truly required, for example violations of noexcept such as throwing from a destructor.

In all other cases, a regular exception still accurately describes what happened. It keeps the door open for future recovery mechanisms and allows us to surface a meaningful, user-friendly error in Python.

@pentschev
Copy link
Member Author

I don't see how this closes the door for a future recovery mechanism, we can always remove the fatal case and resolve it by simply doing what we need to actually recover. I don't disagree a regular exception allows you to describe what happened, however, that also requires some internal knowledge of implementation details that are not currently "publicly known". For example, you get an exception which is expected to be an indication of something that went wrong but should have handled gracefully, and then you look for how you could appropriately handle that gracefully, today that isn't the case for the situations being increased to fatal in this PR, if that condition was hit it means something went wrong that cannot be fixed from the user's perspective and is also not a bug in RapidsMPF, it is a known irrecoverable limitation. IOW, this fatal error now means something went wrong in a way that should never have happened and cannot be excepted to be handled in any way, so we fail loudly that there's nothing we could have done. IMO, none of this is really different than the noexcept throwing in a destructor, that is known it should have also never happened.

@wence-
Copy link
Contributor

wence- commented Feb 3, 2026

I, like Mads, have some concerns with this approach. Ideally we never std::terminate (it's not an error condition that python is happy with, for one).

If we're worried about ctors throwing and leaving the unique_ptrs they consumed dead, then we could accept std::unique_ptr<foo>&& and only move after we've checked the invariants.

However, I really would rather we avoid std::terminate. So although we've consumed things in an irrecoverable way, I would rather not tear down the calling process.

@pentschev
Copy link
Member Author

I still think the whole point has been missed here, the intent is to fail as loud as possible to a case that we, the developers, have not handled correctly, and there's no way we could today recover from, so it's a bad case just as dereferencing a nullptr or writing out-of-bounds represent, and it should cause an effect analogous to a segfault. There's no point in simply raising an exception back to the user with an error that essentially says something like "msg must be ready" when in reality it means "this is a RapidsMPF bug, you should have never come here and there's nothing you could have done to prevent it, oh well tough luck...".

However, since there's so much pushback I won't insist, I have made changes to just raise an exception which is still IMO misleading. I have also left some other checks/doc changes that I believe were missing and should be non-controversial.

@pentschev pentschev changed the title Raise fatal on error with std::unique_ptr passed as argument Raise missing std::invalid_argument Feb 4, 2026
@madsbk
Copy link
Member

madsbk commented Feb 4, 2026

@pentschev, one middle ground could be a dedicated exception type for internal logic errors (bugs in RapidsMPF). That would let us distinguish between user-facing failures and internal invariants being violated, without resorting to std::terminate. Maybe internal_error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants