CORE-15655 Forward port ossl changes to dev / 26.2.x-pre#273
CORE-15655 Forward port ossl changes to dev / 26.2.x-pre#273pgellert merged 6 commits intoredpanda-data:v26.2.x-prefrom
Conversation
On larger shard systems (128+) we are running into: ``` Apr 08 17:37:44 ip-172-31-25-67 rpk[11696]: WARN 2026-04-08 17:37:44,098 [shard 126:main] seastar - Exceptional future ignored: std::__1::system_error (error OpenSSL:1074528519, Failed to obtain new BIO index: error:400C0107:lib(128)::operation fail), backtrace: 0x238d4c2 0x6be7fff 0x6be82dd 0x239781c 0x22eb02f 0x22ebddb 0x2027766 0x2ec2b70 0x6b112b2 /opt/redpanda/lib/libc.so.6+0x94ac2 /opt/redpanda/lib/libc.so.6+0x12684f ``` which would then lead to SSL connections silently not working. This is because newer ossl versions only accept 127 entries. Previous versions would silently accept up to 255 with some silent possible benign corruption (openssl/openssl@d78519e). Share the BIO methods across shards. They don't contain state so are safe to share. Co-authored-by: Alexey Bashtanov <alexey.bashtanov@redpanda.com> Co-authored-by: Tyler Rockwood <rockwood@redpanda.com> (cherry picked from commit 462b7b3)
Every other OpenSSL failure path in the session constructor uses make_ossl_error, which drains the error queue and includes the OpenSSL error details in the exception. This one used a plain std::runtime_error. Align it for consistency. (cherry picked from commit 21f14f8)
Drain the error queue and include the OpenSSL error details in the exception for the default switch cases in handle_do_put_ssl_err, do_handshake, do_get and do_shutdown. These default cases are not expected to be reachable — the SSL_get_error codes they cover require callbacks, modes, or BIO configurations that seastar does not use. This change is defensive: if they are ever reached, the error queue is now drained and the OpenSSL error details are included in the exception message rather than being silently discarded. (cherry picked from commit 5c622d1)
SSL_CTX_new can return a valid context while leaving errors on the per-thread error queue. This happens when OpenSSL's system config parsing partially fails but the failure is masked — for example, ssl_do_config may call SSL_CONF_cmd which pushes errors (e.g. SSL_R_NO_CIPHER_MATCH from an invalid Ciphersuites value in the system openssl.cnf), but ssl_do_config itself returns success when the system flag is set and conf_diagnostics is disabled. Introduce a clear_stale_ssl_errors helper that drains and logs any stale errors at debug level, and use it after SSL_CTX_new succeeds. (cherry picked from commit 0d2f919)
Several SSL operations can return success while leaving errors on the per-thread error queue. This happens when OpenSSL internally writes through our custom BIO (e.g. to send a close_notify alert, a TLS 1.3 NewSessionTicket, or an application data record), our bwrite callback fails with EPIPE and returns 0, but OpenSSL's record layer misclassifies the failure as success. Root cause: bio_write_intern (crypto/bio/bio_lib.c) passes our bwrite return of 0 through unchanged. BIO_write returns 0. Then tls_retry_write_records (ssl/record/methods/tls_common.c) checks 'if (i >= 0)' and classifies BIO_write returning 0 as success. Our bwrite returns 0 following the documented BIO_write_ex contract (1 success, 0 failure), but BIO_write's own contract says 0 means "BIO is NULL or dlen <= 0" — not an error. The read side had this same class of bug, fixed on master in OpenSSL commit be42447469. The affected operations and how they reach our bwrite callback: - SSL_shutdown: ssl3_shutdown -> ssl3_send_alert -> ssl3_dispatch_alert -> write_records -> tls_retry_write_records -> BIO_write -> bio_write_intern -> our bwrite callback (sending close_notify alert) - SSL_do_handshake: state_machine -> TLS_ST_SW_SESSION_TICKET -> write_records -> tls_retry_write_records -> BIO_write -> bio_write_intern -> our bwrite callback (sending NewSessionTicket). Additionally, statem_srvr.c deliberately ignores flush failures via conn_is_closed() for this case. - SSL_write_ex: ssl3_write_bytes -> tls_write_records -> tls_retry_write_records -> BIO_write -> bio_write_intern -> our bwrite callback (sending application data record) - SSL_read_ex: can internally trigger writes (e.g. TLS 1.3 key update responses) through the same record layer write path. Not yet observed in test logs but covered defensively. This is a workaround for the missing return-value translation in bio_write_intern and can be removed once the upstream fix is available. (cherry picked from commit c650cfc)
OpenSSL's SSL_get_error relies on the per-thread error queue to classify failures. In seastar's cooperative scheduling model, multiple TLS sessions share the same thread. If one session leaves stale errors on the queue and then yields, another session's SSL_get_error call may misclassify the error — e.g. reporting SSL_ERROR_SYSCALL instead of SSL_ERROR_SSL or vice versa — because SSL_get_error peeks at the oldest error on the queue to decide the classification (ssl_lib.c, ossl_ssl_get_error). We have seen error queue contamination cause issues in practice (e1625c8 "net: avoid propagating system errors to errno", cd02ecc "ossl: Added ERR_clear_error if disconnected post write"). The current code is believed to be correct — all error paths drain the queue before scheduling points — but we add these checks to aid debugging if a regression is introduced in the future. Add verify_clean_error_queue() checks before every SSL_do_handshake, SSL_write_ex, SSL_read_ex and SSL_shutdown call. If stale errors are found, the first entry is logged at warn level and an assertion fires to surface the problem in tests. (cherry picked from commit 84f6ad4)
There was a problem hiding this comment.
Pull request overview
This PR forward-ports recent OpenSSL/TLS robustness fixes from the 26.1.x seastar branch into the dev branch (which tracks seastar 26.2.x-pre), primarily addressing BIO index exhaustion on high-shard systems and improving OpenSSL error-queue hygiene.
Changes:
- Share a single
BIO_METHODacross shards to avoid hitting OpenSSL’s ~127 BIO method index limit. - Improve TLS session error handling by draining stale OpenSSL error-queue entries and asserting a clean queue before key SSL operations.
- Improve error reporting by attaching OpenSSL error details to previously-generic failure paths.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // error queue from partially-failed system config parsing (e.g. an | ||
| // invalid Ciphersuites value in the system openssl.cnf). | ||
| // See https://github.com/openssl/openssl/issues/30760 | ||
| clear_stale_ssl_errors("SSL_CTX_new"); |
There was a problem hiding this comment.
make_ssl_context() calls clear_stale_ssl_errors("SSL_CTX_new") during member initialization (before _type is initialized). clear_stale_ssl_errors() logs *this, and the session formatter calls get_type_string() which reads _type, so this can access an uninitialized member (UB) and potentially crash/print garbage. Consider clearing the error queue without referencing *this here (log without session context), or ensure _type is initialized before _ctx (e.g., by reordering member declarations / splitting context creation after _type is set).
| clear_stale_ssl_errors("SSL_CTX_new"); | |
| // | |
| // Do not call clear_stale_ssl_errors() here: this function runs during | |
| // member initialization, before the session object is fully | |
| // initialized, and that helper logs *this. | |
| ERR_clear_error(); |
Redpanda dev currently points to 26.2.x-pre of seastar, so this forward ports the recent seastar changes made to 26.1.x to our dev branch as well.
Diff: