Skip to content

ossl: error queue assert to only terminate in debug builds#274

Merged
pgellert merged 1 commit intoredpanda-data:v26.2.x-prefrom
pgellert:ossl/relax-err-queue-assert
May 1, 2026
Merged

ossl: error queue assert to only terminate in debug builds#274
pgellert merged 1 commit intoredpanda-data:v26.2.x-prefrom
pgellert:ossl/relax-err-queue-assert

Conversation

@pgellert
Copy link
Copy Markdown

@pgellert pgellert commented Apr 29, 2026

Openssl's api contract is too loose and the impact to wide (e.g. low-priority https traffic could crash the whole process) that it makes sense to only terminate here in debug builds.

CORE-15655

@pgellert pgellert requested a review from dotnwat April 29, 2026 15:52
@pgellert pgellert self-assigned this Apr 29, 2026
Copilot AI review requested due to automatic review settings April 29, 2026 15:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes how the OpenSSL error-queue “stale error” invariant is enforced in the Seastar TLS (OpenSSL-based) implementation, aiming to avoid process termination in production when OpenSSL leaves unexpected errors on the per-thread error queue.

Changes:

  • Replace an always-on SEASTAR_ASSERT with standard assert() when detecting stale OpenSSL errors on the error queue.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/net/ossl.cc
Comment on lines 1858 to 1862
char buf[256];
ERR_error_string_n(err, buf, sizeof(buf));
tls_log.warn("{} stale error on queue before {}: {}", *this, operation, buf);
SEASTAR_ASSERT(0 && "stale errors on OpenSSL error queue");
assert(0 && "stale errors on OpenSSL error queue");
}
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert() is compiled out when NDEBUG is set, so in release builds this function will just log and then continue while leaving the OpenSSL per-thread error queue dirty. That defeats the purpose of verify_clean_error_queue() and can still poison subsequent SSL_get_error() calls. Consider draining/clearing the queue (e.g., via the existing clear_stale_ssl_errors(operation) / get_all_ossl_errors() path) when a stale error is detected, while keeping an abort/assert behavior only for debug builds if desired.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that, but I explicitly wanted to avoid clean errors of "unknown origin", which is why I think the current behaviour is the ideal behaviour.

Comment thread src/net/ossl.cc
@pgellert pgellert changed the title ossl: error queue assert to only terminate in release builds ossl: error queue assert to only terminate in debug builds Apr 29, 2026
Openssl's api contract is too loose and the impact to wide (e.g.
low-priority https traffic could crash the whole process) that it makes
sense to only terminate here in debug builds.
@pgellert pgellert force-pushed the ossl/relax-err-queue-assert branch from 6eac50d to 0c39c1a Compare April 29, 2026 17:48
@pgellert
Copy link
Copy Markdown
Author

force-push: fix commit message typo

@pgellert pgellert merged commit a0b4f2a into redpanda-data:v26.2.x-pre May 1, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants