Update seastar for v25.3.x by BenPope · Pull Request #228 · redpanda-data/seastar

BenPope · 2025-08-06T13:05:20Z

Notes for reviewer:

Many commits reordered compared to previous branches to colocate similar changes and fixups
All the "old" OpenSSL work was ripped out, and replaced with tls: Introduce OpenSSL scylladb/seastar#2569
- Moved some stuff after these commits to make future rebases easier, and keep the changes more in sync

Seastar http server implementation supports multiple listeners. It may be required for the handler logic to know which listener the connection is coming from. Added listener_idx field to `httpd::request` to allow handler recognize listener. Signed-off-by: Michal Maslanka <michal@vectorized.io>

Since an exception carries some text for the response body text, the raising site might like to specify the content type if it's e.g. json. Signed-off-by: John Spray <jcs@vectorized.io>

This enables throwing a base_exception from a json request handler with a json payload inside it. Signed-off-by: John Spray <jcs@vectorized.io>

Signed-off-by: John Spray <jcs@vectorized.io>

Prior to this patch seastar only exposes one global metrics::impl::impl object which holds all metric related data for one application. This patch changes the implementation details such that multiple metrics::impl::impl objects can exist for any given application. Said objects are stored into a map on each shard and created dinamically whenever requested. A metrics::impl::impl is identified by an integer handle that acts as the key for the storage map. Implementation note: in order to avoid issues caused by the ordering of static thread_local objects I had to declare the storage in reactor.cc. (cherry picked from commit 585a8af)

This patch extends the metrics internal apis to use a specific metrics::impl::impl object identified by its integer handle. (cherry picked from commit 6ee4af7)

Add a public method to metric_groups_impl that exposes the handle of the internal implementation it is using. This is required in order for the metric_groups class to be able to reset itself to the configured implementation handle.

This patch extends the metrics user facing apis to use a specific metrics::impl::impl object identified by its integer handle. Note that the constructor of 'metric_groups' is marked explicit in this patch and updates two call sites where the constructor was used implicitly.

This patch removes two subsequent calls to `get_local_impl` and reuses the returned handle in that scope.

This patch extends the user facing prometheus apis allowing the user to specify the internal metrics implementation to be used through a handle. Additionally, 'add_prometheus_routes' now takes an argument that specifies the route on which to advertise the metrics. This enables different metrics "namespaces" to be served by different endpoints in isolation. (cherry picked from commit 6189522)

This patch extends the scollectd apis with the ability to select the internal metrics implementation to be used by providing a handle. (cherry picked from commit d4331d1)

This patch adds a 'get_skip_when_empy' getter to the 'registered_metric' class. It is used by follow-up patches in order to replicate metrics.

This patch adds private methods to the 'metrics::impl' class that deal with the creation of replicated metrics. They will be used to build the public api in future commits.

This patch adds private helpers to 'metrics::impl' that deal with the removal of replicated metric families from their destintation implementation. These methods will be used in subsequent commits to manage the lifetime of replicated metrics.

This patch adds a public method to the 'metrics::impl' class: 'set_metric_families_to_replicate'. When this method is called the families that match any of the specifications will be replicated on the specified destinations.

This patch extends the metric registration and unregistration processes to make them aware of metric replication. In the case of metric registration, if the new metric belongs to a family that matches one of the replication specs, then a replicated metric is created accordingly. For unregistration of a metric, the replicated metric is unregistered too if one exists.

This patch exposes a method in the public interface of the metrics module ('replicate_metric_families'), which enables metric replication internally for the requested metric families.

Extends the metrics api to allow changing the aggregation labels of a metrics family. Otherwise one had to un-register every single metric instance in a metric family and then re-register with the changed aggregation labels. For metric families with thousands of instances (e.g.: histograms with lots of different labels) this is quite expensive. With this change we avoid the full reconstruction of the metrics family and all its metrics. Only the work associated with marking the metrics `dirty()` is needed then.

Copilot

Pull Request Overview

Updates the Seastar library to v25.3.x with significant cryptographic infrastructure changes and comprehensive test infrastructure improvements.

Key changes:

Replaces OpenSSL implementation with a new TLS provider abstraction supporting both OpenSSL and GnuTLS
Adds CPU profiler functionality with comprehensive testing and signal handling
Introduces metric family replication capabilities and Prometheus integration testing

Reviewed Changes

Copilot reviewed 71 out of 71 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/unit/tls_test.cc	Updates TLS tests to support dual crypto provider architecture with OpenSSL and GnuTLS branches
tests/unit/stall_detector_test_utilities.hh	Extracts reusable stall detector testing utilities to shared header
tests/unit/stall_detector_test.cc	Refactors to use shared utilities and removes code duplication
tests/unit/prometheus_test.cc	Adds comprehensive testing for Prometheus metrics HTTP endpoint
tests/unit/metric_family_replication_test.cc	Introduces testing for metrics replication across handles
tests/unit/cpu_profiler_test.cc	Comprehensive CPU profiler testing with signal handling and timing validation
tests/unit/cpu_profiler_alloc_test.cc	Tests CPU profiler signal handler memory allocation safety
tests/unit/CMakeLists.txt	Updates build configuration for new tests and certificate generation
tests/perf/linux_perf_event.cc	Removes assertion on perf event read result
src/websocket/common.cc	Adds dual crypto provider support for WebSocket SHA1 and base64 operations
src/util/backtrace.cc	Adds guarded backtrace function with signal safety
src/seastar.cc	Updates includes for new TLS implementation structure
src/rpc/rpc.cc	Fixes buffer construction to use explicit constructor
src/net/tls.cc	Major refactor splitting TLS implementation with provider abstraction
src/net/tls-impl.hh	New header defining TLS implementation interfaces and common structures
src/net/tls-impl.cc	Implementation of shared TLS functionality and credential management
src/net/posix-stack.cc	Fixes namespace qualification for format function calls

Comments suppressed due to low confidence (1)

tests/unit/stall_detector_test_utilities.hh:77

[nitpick] The function name 'spin_user_hires' is ambiguous. Consider renaming to 'spin_user_high_resolution' to clarify that it uses high resolution clock for user-space spinning.

[[maybe_unused]] void spin_user_hires(std::chrono::duration<double> how_much) {

We simply ignore premature termination error as it is GnuTLS specific. Signed-off-by: Michał Maślanka <michal@redpanda.com>

sigev_notify_thread_id is defined as a macro by musl libc, FreeBSD, and the Linux kernel. The macro is for accessing the corresponding field in sigevent. see https://sourceware.org/bugzilla/show_bug.cgi?id=27417 (cherry picked from commit bbe1af3) tdowns note: this is a re-pick of the above as the function moved so the original was not applied

Created tls-impl.cc and tls-impl.h which contains common structures and definitions that are not dependent on the underlying TLS mechanism. These changes set the stage for implementing other TLS providers. Signed-off-by: Michael Boquard <michael@redpanda.com>

This commit adds support for using OpenSSL, instead of GnuTLS, as the TLS provider within Seastar. To support this change, the configure script has been updated to allow users to select which cryptographic provider should be used by supply `--crypto-provider` and specificying either `OpenSSL` or `GnuTLS`. The OpenSSL implementation mirrors the GnuTLS implementation. Instead of using callbacks, a custom BIO was created to handle moving data on/off of the OpenSSL SSL session into the Seastar TLS session data sinks. When compiled for OpenSSL, the `certificate_credentials::set_priority_string` method is compiled out and replaced with the following: * `set_cipher_string` * `set_ciphersuites` * `enable_server_precedence` * `set_minimum_tls_version` * `set_maximum_tls_version` These methods are specific to OpenSSL. The github actions have been updated to run the full suite of tests against both cryptographic providers. `src/net/tcp.hh` and `src/websocket/server.cc` have been updated to use OpenSSL instead of GnuTLS, depending upon the build configuration. Signed-off-by: Michael Boquard <michael@redpanda.com>

Added pretty-print capabilities to seastar::tls::session for OpenSSL and added a number of log statements that may be helpful if debugging the implementation. Signed-off-by: Michael Boquard <michael@redpanda.com>

Prior to this commit, in certain situations, the put() and get() methods would 'drop into' handshake() if they determined that a re-negotiation was occurring. This was unecessary as OpenSSL provides helpful error codes like SSL_WANT_READ and SSL_WANT_WRITE which indicates if OpenSSL needed to send/receive non-user data to handle re-negotiations. This may have also led to a rare situation where the put() method would assert because the _output_pending future had not resolved. This change properly handles the SSL_WANT_READ and SSL_WANT_WRITE conditions and ensures that the _output_pending future resolves. Signed-off-by: Michael Boquard <michael@redpanda.com>

More recent versions of OpenSSL requrire CA certificates to have CA:true Signed-off-by: Michael Boquard <michael@redpanda.com>

Now handling situations where the get() call doesn't throw but does return an empty buffer indicating EOF. Signed-off-by: Michael Boquard <michael@redpanda.com>

TLS renegotiation was removed in TLSv1.3 due to issues around security. This change makes that the default behavior, however a method is exposed to re-enable it if desired. This was only added to OpenSSL as GnuTLS does not provide a means to fully disable renegotiation. Signed-off-by: Michael Boquard <michael@redpanda.com>

Added a dn_format flag to switch between legacy and RFC2253 format. Legacy format matches that originally returned by the GnuTLS api `gnutls_x509_crt_get_dn` which returns the DN in the format of `C=US,ST=London,L=London,O=Redpanda Data,OU=Core,CN=id`. RFC2253 format, effectively reverses the order: `CN=id,OU=Core,O=Redpanda Data,L=London,ST=London,C=US`. Signed-off-by: Michael Boquard <michael@redpanda.com>

We need to build with OpenSSL for the TLS tests to work on our fork.

Seastar has a C++20 modules variation of the build in GHA, but our changes have left it quite broken as we don't use modules. Disable this build for now so we can get the build green and protect the seastar branch.

This commit introduces an alternative reload callback prototype which includes the credentials that were reloaded. In service of that, also adds `reloadable_credentials_base::as_certificate_credentials` to produce a const-ref to `certificate_credentials`, suitable for exposure through the new callback. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

This commit introduces `struct cert_info` which encapsulates useful information about TLS certificates, currently: - The certificate's serial number (as assigned by the signing CA) - The certificate's expiration time Additionally, this commit adds public API on `tls::certificate_credentials` to access this information for loaded certificates AND the loaded trust list. The motivation for this change is to give applications the ability to monitor and report on certificate and CA lifetimes on demand, the idea being that an application may cache serial numbers and expiries when credentials are reloaded, serving up a summary of those details until the next reload time. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

Signed-off-by: Ben Pope <ben@redpanda.com>

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

GnuTLS provides serial numbers not as a legible string of hex digits but as a byte array corresponding to an integer value up to 20 octets wide. The cert_info interface should reflect this. To that end, this commit updates the type of `cert_info::serial` to an non-null-terminated basic_sstring of uint8_t. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

Passed up to application code as a tls::blob (basic_string_view<char>). Also provide the ability to fetch this blob directly via credentials_builder::get_trust_file_blob() Updates unit tests to accommodate the new reload callback signature. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

char_traits<unsigned> is deprecated in LLVM 18, slated for removal in LLVM 19. Therefore remove this non-standard usage from cert_info. Extracting the x509 serial number is a relatively infrequent operation and the result is always <= 20B, so the cost of using a vector here should be marginal. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

We disable interception of _Unwind_RaiseException when ASAN is enabled since it can produce a large number of stack-related false positives when it is enabled and exceptions are thrown. Fixes scylladb#2892.

Redpanda builds seastar with depcrecations as errors. However, some of the newly deprecated functions in Seastar are used heavily in RP. Hence this commit temporarily comments those annotations out until their usage is removed in RP.

Pure non-functional refactor to split the long line into three. (cherry picked from commit a0155ba)

Switch the io-queue cost function over to use max(IOPS cost, TP cost) away from sum(IOPS cost, TP cost). In its form today the io-queue fails to reach the advertised throughput/IOPS numbers from io-properties (see [seastar#2771](scylladb#2771) for details). The reason for this is easy to understand when looking at a disk like the ones from the m7gd AWS series. They (approximately) achieve full throughput as well as full IOPS at 4k IOP size. This means that for cost modelling for a single IO operation it would be enough to just look at one dimension, either throughput or IOPS but not both. The current cost function which is always additive in IOP and throughput cost however costs operation way too high. This leads to the aforementioned problem where the io-queue underperforms. Taking 4k IOPS on m7gd as an example we only get around 60% of the disk performance. From various testing the max cost function seems to model disks better than the sum cost function and avoids the problem mentioned above. Note this doesn't change the "outer" cost model of how the costs of individual operations are summed together. That also isn't modelled entirely right for more modern disks but is pessimistically fine and not the focus of this patch. The problem can be easily shown using the following io-tester config: ``` - name: highprio shards: [0] type: randwrite shard_info: parallelism: 32 reqsize: 4kB|16kB|128kB shares: 1000 think_time: 0 - name: lowprio shards: [0] type: randwrite shard_info: parallelism: 32 reqsize: 4kB|16kB|128kB shares: 100 think_time: 0 ``` On m7gd.4xl (all numbers in MB/s): - NOIP: Not using io-properties - IP: Using io-properties - MCF: Using io-properties with the max cost function | | 4k | | | 16k | | | 128k | | | | | high prio | low prio | total | high prio | low prio | total | high prio | low prio | total | | NOIP | 303 | 287 | 591 | 299 | 292 | 591 | 295 | 295 | 590 | | IP | 262 | 28 | 290 | 432 | 44 | 476 | 528 | 53 | 581 | | MCF | 476 | 56 | 532 | 543 | 55 | 598 | 544 | 55 | 599 | We see that at low IOP size using no io-properties we do get the full throughput but no scheduling (bad), using io-properties we do get scheduling but low throughput and with the max cost function we get both. More data and extensive discussion can be found here: - scylladb#2801 (comment) - scylladb#2840 (comment) - https://docs.google.com/spreadsheets/d/1u_9tRxePkq-OYFiba8xz4JY54P2VJ-JQBvhuBv04tmo These tests also include latency tests in priority scenarios and different kind of disks like EBS. The linked PR also includes a discussion about a different approach using a weighted cost function. Note that in general latency is often a function of throughput as the io-queue arbitrarily keeping things in the queue induces latency. This is especially true for real world usecases like RP where write amplification is generally fairly high but at the same time can be absorbed at the cost of latency. (cherry picked from commit 9bf2094)

BenPope · 2025-08-10T09:40:53Z

Changes in force-push:

Added two most recent commits to v25.3.x-pre
Removed 3 commits that were already upstream
Deflake cpu_profiler test again (move from 5 allowed dropped_samples to 10)
Actually fixup the fixups

mmaslankaprv and others added 19 commits August 6, 2025 13:55

http: enable specifying a content type on exceptions

e667a1b

Since an exception carries some text for the response body text, the raising site might like to specify the content type if it's e.g. json. Signed-off-by: John Spray <jcs@vectorized.io>

http: don't jsonize exception if has content type

ef5cda8

This enables throwing a base_exception from a json request handler with a json payload inside it. Signed-off-by: John Spray <jcs@vectorized.io>

http: use base_exception content type in non-json errors

842e408

Signed-off-by: John Spray <jcs@vectorized.io>

metrics: expose metric impl handle to internal api

3ac5220

This patch extends the metrics internal apis to use a specific metrics::impl::impl object identified by its integer handle. (cherry picked from commit 6ee4af7)

metrics: expose handle in metric_groups_impl

5db62b1

Add a public method to metric_groups_impl that exposes the handle of the internal implementation it is using. This is required in order for the metric_groups class to be able to reset itself to the configured implementation handle.

metrics: Use handle for impl object

c808b5b

This patch removes two subsequent calls to `get_local_impl` and reuses the returned handle in that scope.

scollectd: select internal metrics implementation

d793184

This patch extends the scollectd apis with the ability to select the internal metrics implementation to be used by providing a handle. (cherry picked from commit d4331d1)

metrics: Expose 'skip_when_empty' for metrics

74842fa

This patch adds a 'get_skip_when_empy' getter to the 'registered_metric' class. It is used by follow-up patches in order to replicate metrics.

metrics: add helpers for creation of replicas

6bf71b3

This patch adds private methods to the 'metrics::impl' class that deal with the creation of replicated metrics. They will be used to build the public api in future commits.

metrics: add metric replication internal interface

2b21b5e

This patch adds a public method to the 'metrics::impl' class: 'set_metric_families_to_replicate'. When this method is called the families that match any of the specifications will be replicated on the specified destinations.

metrics: public family replication interface

a66876a

This patch exposes a method in the public interface of the metrics module ('replicate_metric_families'), which enables metric replication internally for the requested metric families.

tests: add metrics replication unit tests

3373d1e

BenPope requested a review from Copilot August 6, 2025 13:05

This comment was marked as outdated.

Sign in to view

BenPope mentioned this pull request Aug 6, 2025

v25.3.x update BenPope/seastar#1

Closed

BenPope requested review from a team, Copilot, michael-redpanda and pgellert and removed request for a team August 6, 2025 13:07

Copilot AI reviewed Aug 6, 2025

View reviewed changes

Comment thread tests/unit/cpu_profiler_test.cc

Comment thread tests/unit/cpu_profiler_test.cc

Comment thread tests/unit/tls_test.cc

Comment thread src/net/tls-impl.cc

BenPope mentioned this pull request Aug 6, 2025

Update seastar v25.3.x redpanda-data/redpanda#27115

Merged

7 tasks

BenPope force-pushed the v25.3.x-update branch from 8a6b7a3 to 58ce918 Compare August 6, 2025 13:17

mmaslankaprv and others added 24 commits August 10, 2025 10:37

http/client: do not depend on gnutls for error handling

2bdd229

We simply ignore premature termination error as it is GnuTLS specific. Signed-off-by: Michał Maślanka <michal@redpanda.com>

tls: Create tls-impl

6ad5d14

Created tls-impl.cc and tls-impl.h which contains common structures and definitions that are not dependent on the underlying TLS mechanism. These changes set the stage for implementing other TLS providers. Signed-off-by: Michael Boquard <michael@redpanda.com>

tls: Added tls_logger and log statements

f69d6e0

Added pretty-print capabilities to seastar::tls::session for OpenSSL and added a number of log statements that may be helpful if debugging the implementation. Signed-off-by: Michael Boquard <michael@redpanda.com>

test/tls: Using CA extension for generating CA

26c1e82

More recent versions of OpenSSL requrire CA certificates to have CA:true Signed-off-by: Michael Boquard <michael@redpanda.com>

tls/test: Updated test_reload_certificates

6f69012

Now handling situations where the get() call doesn't throw but does return an empty buffer indicating EOF. Signed-off-by: Michael Boquard <michael@redpanda.com>

ossl: use raw pointer instead of iterator

3aaa5fd

GHA-alpine: build with OpenSSL

d5ef099

We need to build with OpenSSL for the TLS tests to work on our fork.

GHA: disable modules build

3dd1fae

Seastar has a C++20 modules variation of the build in GHA, but our changes have left it quite broken as we don't use modules. Disable this build for now so we can get the build green and protect the seastar branch.

net/tls(ossl): Introduce cert_info and accessors

d621e9e

Signed-off-by: Ben Pope <ben@redpanda.com>

tls_test: Add tests for new reload callback and cert_info accessors

8391f9a

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>

disable exception interception when ASAN enabled

93ae4f7

We disable interception of _Unwind_RaiseException when ASAN is enabled since it can produce a large number of stack-related false positives when it is enabled and exceptions are thrown. Fixes scylladb#2892.

io-queue: Refactor cost function

f6062fb

Pure non-functional refactor to split the long line into three. (cherry picked from commit a0155ba)

BenPope dismissed travisdowns’s stale review via 68c3a6f August 10, 2025 09:37

BenPope force-pushed the v25.3.x-update branch from 58ce918 to 68c3a6f Compare August 10, 2025 09:37

BenPope requested a review from pgellert August 10, 2025 09:47

pgellert approved these changes Aug 11, 2025

View reviewed changes

piyushredpanda merged commit 1fea24e into redpanda-data:v25.3.x Aug 11, 2025
31 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update seastar for v25.3.x#228

Update seastar for v25.3.x#228
piyushredpanda merged 86 commits intoredpanda-data:v25.3.xfrom
BenPope:v25.3.x-update

BenPope commented Aug 6, 2025 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

BenPope commented Aug 10, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

16 participants

Conversation

BenPope commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

BenPope commented Aug 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

16 participants

BenPope commented Aug 6, 2025 •

edited

Loading

BenPope commented Aug 10, 2025 •

edited

Loading