Skip to content

v25.3.x update#1

Closed
BenPope wants to merge 95 commits intov25.3.xfrom
v25.3.x-update
Closed

v25.3.x update#1
BenPope wants to merge 95 commits intov25.3.xfrom
v25.3.x-update

Conversation

@BenPope
Copy link
Copy Markdown
Owner

@BenPope BenPope commented Aug 4, 2025

Notes for reviewer:

  • Many commits reordered compared to previous branches to colocate similar changes and fixups
  • All the "old" OpenSSL work was ripped out, and replaced with tls: Introduce OpenSSL scylladb/seastar#2569
    • Moved some stuff after these commits to make future rebases easier, and keep the changes more in sync

@BenPope BenPope force-pushed the v25.3.x-update branch 4 times, most recently from 302d575 to b8fa9be Compare August 6, 2025 10:09
@BenPope BenPope changed the title V25.3.x update v25.3.x update Aug 6, 2025
@BenPope BenPope removed the request for review from michael-redpanda August 6, 2025 12:52
mmaslankaprv and others added 22 commits August 6, 2025 13:55
Seastar http server implementation supports multiple listeners. It may
be required for the handler logic to know which listener the connection
is coming from. Added listener_idx field to `httpd::request` to allow
handler recognize listener.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
Since an exception carries some text for the response
body text, the raising site might like to specify
the content type if it's e.g. json.

Signed-off-by: John Spray <jcs@vectorized.io>
This enables throwing a base_exception from
a json request handler with a json payload
inside it.

Signed-off-by: John Spray <jcs@vectorized.io>
Signed-off-by: John Spray <jcs@vectorized.io>
Prior to this patch seastar only exposes one global metrics::impl::impl
object which holds all metric related data for one application.

This patch changes the implementation details such that multiple
metrics::impl::impl objects can exist for any given application.
Said objects are stored into a map on each shard and created
dinamically whenever requested. A metrics::impl::impl is identified
by an integer handle that acts as the key for the storage map.

Implementation note: in order to avoid issues caused by the ordering
of static thread_local objects I had to declare the storage in
reactor.cc.

(cherry picked from commit 585a8af)
This patch extends the metrics internal apis to use a specific
metrics::impl::impl object identified by its integer handle.

(cherry picked from commit 6ee4af7)
Add a public method to metric_groups_impl that exposes the handle
of the internal implementation it is using. This is required in order
for the metric_groups class to be able to reset itself to the configured
implementation handle.
This patch extends the metrics user facing apis to use a specific
metrics::impl::impl object identified by its integer handle.

Note that the constructor of 'metric_groups' is marked explicit
in this patch and updates two call sites where the constructor was used
implicitly.
This patch removes two subsequent calls to `get_local_impl` and reuses
the returned handle in that scope.
This patch extends the user facing prometheus apis allowing the user to
specify the internal metrics implementation to be used through a handle.
Additionally, 'add_prometheus_routes' now takes an argument that
specifies the route on which to advertise the metrics. This enables
different metrics "namespaces" to be served by different endpoints in
isolation.

(cherry picked from commit 6189522)
This patch extends the scollectd apis with the ability to select the
internal metrics implementation to be used by providing a handle.

(cherry picked from commit d4331d1)
This patch adds a 'get_skip_when_empy' getter to the 'registered_metric'
class. It is used by follow-up patches in order to replicate metrics.
This patch adds private methods to the 'metrics::impl' class that deal
with the creation of replicated metrics. They will be used to build the
public api in future commits.
This patch adds private helpers to 'metrics::impl' that deal with the
removal of replicated metric families from their destintation
implementation. These methods will be used in subsequent commits to
manage the lifetime of replicated metrics.
This patch adds a public method to the 'metrics::impl' class:
'set_metric_families_to_replicate'. When this method is called
the families that match any of the specifications will be replicated
on the specified destinations.
This patch extends the metric registration and unregistration processes
to make them aware of metric replication.

In the case of metric registration, if the new metric belongs to a
family that matches one of the replication specs, then a replicated
metric is created accordingly.

For unregistration of a metric, the replicated metric is unregistered
too if one exists.
This patch exposes a method in the public interface of the metrics
module ('replicate_metric_families'), which enables metric replication
internally for the requested metric families.
Extends the metrics api to allow changing the aggregation labels of a
metrics family.

Otherwise one had to un-register every single metric instance in a
metric family and then re-register with the changed aggregation labels.

For metric families with thousands of instances (e.g.: histograms with
lots of different labels) this is quite expensive.

With this change we avoid the full reconstruction of the metrics family
and all its metrics. Only the work associated with marking the metrics
`dirty()` is needed then.
Avoid a string copy when writing the metric value and the metric type to
the prometheus text-based response.
Adds a test to verify that the metrics returned with the prometheus text
format are formatted as expected.
Fixes up an earlier commit which should have been a refactor commit but
had an unintended change to the way `double`'s were formatted on the
prometheus response.

Note that now `fmt::print` is used instead of the originally used
`std::to_string`. This is because `std::to_string` is going to change
its formatting behaviour in C++26, so we prefer to use a more
deterministic formatter here.

Fixes up 864c209
BenPope and others added 27 commits August 6, 2025 13:57
Signed-off-by: Ben Pope <ben@redpanda.com>
This commit adds support for using OpenSSL, instead of GnuTLS, as the
TLS provider within Seastar.  To support this change, the configure
script has been updated to allow users to select which cryptographic
provider should be used by supply `--crypto-provider` and specificying
either `OpenSSL` or `GnuTLS`.

The OpenSSL implementation mirrors the GnuTLS implementation.  Instead
of using callbacks, a custom BIO was created to handle moving data
on/off of the OpenSSL SSL session into the Seastar TLS session data
sinks.

When compiled for OpenSSL, the
`certificate_credentials::set_priority_string` method is compiled out and
replaced with the following:

* `set_cipher_string`
* `set_ciphersuites`
* `enable_server_precedence`
* `set_minimum_tls_version`
* `set_maximum_tls_version`

These methods are specific to OpenSSL.

The github actions have been updated to run the full suite of tests
against both cryptographic providers.

`src/net/tcp.hh` and `src/websocket/server.cc` have been updated to use
OpenSSL instead of GnuTLS, depending upon the build configuration.

Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Added pretty-print capabilities to seastar::tls::session for OpenSSL and
added a number of log statements that may be helpful if debugging the
implementation.

Signed-off-by: Michael Boquard <michael@redpanda.com>
Prior to this commit, in certain situations, the put() and get() methods
would 'drop into' handshake() if they determined that a re-negotiation
was occurring.  This was unecessary as OpenSSL provides helpful error
codes like SSL_WANT_READ and SSL_WANT_WRITE which indicates if OpenSSL
needed to send/receive non-user data to handle re-negotiations.

This may have also led to a rare situation where the put() method would
assert because the _output_pending future had not resolved.  This change
properly handles the SSL_WANT_READ and SSL_WANT_WRITE conditions and
ensures that the _output_pending future resolves.

Signed-off-by: Michael Boquard <michael@redpanda.com>
More recent versions of OpenSSL requrire CA certificates to have CA:true

Signed-off-by: Michael Boquard <michael@redpanda.com>
Now handling situations where the get() call doesn't throw but does
return an empty buffer indicating EOF.

Signed-off-by: Michael Boquard <michael@redpanda.com>
TLS renegotiation was removed in TLSv1.3 due to issues around security.
This change makes that the default behavior, however a method is exposed
to re-enable it if desired.  This was only added to OpenSSL as GnuTLS
does not provide a means to fully disable renegotiation.

Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Added a dn_format flag to switch between legacy and RFC2253 format.
Legacy format matches that originally returned by the GnuTLS api
`gnutls_x509_crt_get_dn` which returns the DN in the format of
`C=US,ST=London,L=London,O=Redpanda Data,OU=Core,CN=id`.  RFC2253
format, effectively reverses the order: `CN=id,OU=Core,O=Redpanda
Data,L=London,ST=London,C=US`.

Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
We need to build with OpenSSL for the TLS tests to work on our fork.
Signed-off-by: Ben Pope <ben@redpanda.com>
Seastar has a C++20 modules variation of the build in GHA, but our
changes have left it quite broken as we don't use modules.

Disable this build for now so we can get the build green and protect
the seastar branch.
This commit introduces an alternative reload callback prototype which
includes the credentials that were reloaded.

In service of that, also adds `reloadable_credentials_base::as_certificate_credentials`
to produce a const-ref to `certificate_credentials`, suitable for exposure
through the new callback.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
This commit introduces `struct cert_info` which encapsulates useful
information about TLS certificates, currently:

- The certificate's serial number (as assigned by the signing CA)
- The certificate's expiration time

Additionally, this commit adds public API on `tls::certificate_credentials`
to access this information for loaded certificates AND the loaded trust
list.

The motivation for this change is to give applications the ability to
monitor and report on certificate and CA lifetimes on demand, the idea
being that an application may cache serial numbers and expiries when
credentials are reloaded, serving up a summary of those details until
the next reload time.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
`future::get0()` was deprecated. so let's use `future::get()` instead.
this addresses following build failure:

```
FAILED: tests/unit/CMakeFiles/test_unit_tls.dir/tls_test.cc.o
/usr/bin/g++-13 -DBOOST_ALL_DYN_LINK -DBOOST_NO_CXX98_FUNCTION_BASE -DFMT_SHARED -DSEASTAR_API_LEVEL=7 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT -DSEASTAR_HAS_MEMBARRIER -DSEASTAR_HAVE_ASAN_FIBER_SUPPORT -DSEASTAR_HAVE_HWLOC -DSEASTAR_HAVE_NUMA -DSEASTAR_HAVE_SYSTEMTAP_SDT -DSEASTAR_HAVE_URING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TESTING_WITH_NETWORKING=1 -DSEASTAR_TYPE_ERASE_MORE -I/home/circleci/project/tests/unit -I/home/circleci/project/src -I/home/circleci/project/include -I/home/circleci/project/build/debug/gen/include -I/home/circleci/project/build/debug/gen/src -g -std=gnu++20 -U_FORTIFY_SOURCE -Wno-maybe-uninitialized -Wno-error=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -UNDEBUG -Wall -Werror -Wimplicit-fallthrough -Wdeprecated -Wno-error=deprecated -Wno-error=stringop-overflow -Wno-error=array-bounds -gz -MD -MT tests/unit/CMakeFiles/test_unit_tls.dir/tls_test.cc.o -MF tests/unit/CMakeFiles/test_unit_tls.dir/tls_test.cc.o.d -o tests/unit/CMakeFiles/test_unit_tls.dir/tls_test.cc.o -c /home/circleci/project/tests/unit/tls_test.cc
In file included from /usr/include/boost/test/test_tools.hpp:45,
                 from /usr/include/boost/test/unit_test.hpp:18,
                 from /home/circleci/project/include/seastar/testing/seastar_test.hh:27,
                 from /home/circleci/project/include/seastar/testing/test_case.hh:31,
                 from /home/circleci/project/tests/unit/tls_test.cc:40:
/home/circleci/project/tests/unit/tls_test.cc: In member function 'void test_tls13_session_tickets::do_run_test_case() const':
/home/circleci/project/tests/unit/tls_test.cc:1541:61: error: 'seastar::future<T>::get0_return_type seastar::future<T>::get0() [with T = bool; get0_return_type = bool]' is deprecated: Use get() instead [-Werror=deprecated-declarations]
 1541 |         BOOST_REQUIRE(!tls::check_session_is_resumed(c).get0()); // no resume data
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
In file included from /home/circleci/project/include/seastar/core/do_with.hh:25,
                 from /home/circleci/project/tests/unit/tls_test.cc:25:
/home/circleci/project/include/seastar/core/future.hh:1376:22: note: declared here
 1376 |     get0_return_type get0() {
      |                      ^~~~
/home/circleci/project/tests/unit/tls_test.cc:1544:57: error: 'seastar::future<T>::get0_return_type seastar::future<T>::get0() [with T = std::vector<unsigned char>; get0_return_type = std::vector<unsigned char>]' is deprecated: Use get() instead [-Werror=deprecated-declarations]
 1544 |         sess_data = tls::get_session_resume_data(c).get0();
      |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/home/circleci/project/include/seastar/core/future.hh:1376:22: note: declared here
 1376 |     get0_return_type get0() {
      |                      ^~~~
/home/circleci/project/tests/unit/tls_test.cc:1584:29: error: 'seastar::future<T>::get0_return_type seastar::future<T>::get0() [with T = bool; get0_return_type = bool]' is deprecated: Use get() instead [-Werror=deprecated-declarations]
 1584 |         BOOST_REQUIRE(f.get0()); // Should work
      |                       ~~~~~~^~
/home/circleci/project/include/seastar/core/future.hh:1376:22: note: declared here
 1376 |     get0_return_type get0() {
      |                      ^~~~
cc1plus: all warnings being treated as errors
```

Refs c224fe0
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
GnuTLS provides serial numbers not as a legible string of
hex digits but as a byte array corresponding to an integer
value up to 20 octets wide. The cert_info interface should
reflect this.

To that end, this commit updates the type of `cert_info::serial`
to an non-null-terminated basic_sstring of uint8_t.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Passed up to application code as a tls::blob (basic_string_view<char>).

Also provide the ability to fetch this blob directly via
credentials_builder::get_trust_file_blob()

Updates unit tests to accommodate the new reload callback signature.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
char_traits<unsigned> is deprecated in LLVM 18, slated for
removal in LLVM 19. Therefore remove this non-standard usage
from cert_info.

Extracting the x509 serial number is a relatively infrequent
operation and the result is always <= 20B, so the cost of using
a vector here should be marginal.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
We disable interception of _Unwind_RaiseException when ASAN is enabled
since it can produce a large number of stack-related false positives
when it is enabled and exceptions are thrown.

Fixes scylladb#2892.
Redpanda builds seastar with depcrecations as errors. However, some of
the newly deprecated functions in Seastar are used heavily in RP. Hence
this commit temporarily comments those annotations out until their usage
is removed in RP.
@BenPope
Copy link
Copy Markdown
Owner Author

BenPope commented Aug 6, 2025

Moved to redpanda-data#228

@BenPope BenPope closed this Aug 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.