[CDRIVER-6048] Add New Time and Duration Functionality #2074

Open
wants to merge 61 commits into
base: master
from

vector-of-bool
Contributor

@vector-of-bool vector-of-bool commented Jul 24, 2025

Refer: CDRIVER-6048

Background

This is a preliminary work for CSOT, to give us a stable foundation for working with time going forward.

The C driver has had some duration/time types (mcd_time_point and mcd_duration), but they weren't widely used in the codebase and could use some TLC. This PR description will only present a high-level overview, as much of the implementation explanation has been added as commentary within the source files themselves.

This PR also updates one core internal submodule to use the new time APIs, as a test-run of deploying them into the codebase in preparation for CSOT.

Fortunately, the vast majority of code additions in this changeset are explanatory comments.

New Time APIs

The new time types are similar to the previous mcd_ time types:

  • struct mlib_duration represents a (possibly negative) difference between two points in time, with a fixed resolution (currently it has microsecond granularity). This is meant to replace any plain integers in the codebase that represent durations as a count of units. Using plain integers is problematic since the associated time unit cannot be statically enforced, and doing arithmetic on plain integers is inherently unsafe. The mlib_duration arithmetic is fully well-defined to use saturating arithmetic, so attempting to make a "too large" duration value will simply clamp to the nearest representable duration. While this is potentially problematic in extreme conditions, it only appears when juggling durations of hundreds of millennia, which reconnects Russia+Alaska and deletes Japan.
  • struct mlib_time_point represents a fixed point-in-time. Currently the implementation is written against the program's monotonic clock. It is encoded as an mlib_duration relative to the unspecified monotonic clock epoch. Because the epoch is unspecified, this type cannot be reliably converted to any "IRL" time point. (I tried writing an lldb pretty-printer to do this for debugging purposes, but it turns out to be very non-trivial and platform-dependent.)
    • The main API surface for interacting with the underlying clock is mlib_now() which simply returns a time-point for the moment that the function is called.
    • The mlib_now() function actually replaces the implementation of bson_get_monotonic_time function, which incidentally fixes CDRIVER-4526, since mlib_now uses the Win32 fine-grained monotonic clock.
    • Adding support for different clocks would be possible, but it would require the struct to carry a clock identifier and would make things significantly more complicated: comparing or subtracting time points from different clocks requires knowing how the clock epochs are related, which, as mentioned, is non-trivial enough that no one really wants to do it. In C++, for example, there is no stdlib conversion between monotonic clock time points and "IRL" clock time points.
  • mlib_timer represents a deadline. This simply stores an "expires-at" mlib_time_point. The main reason for this being a separate type is to provide distinct types and functions specific to manipulating and inspecting deadlines.
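
To illustrate the saturating behavior described above, here is a minimal sketch. It is not the mlib implementation: `demo_duration`, `demo_duration_add`, and `demo_seconds` are hypothetical names, and the only thing carried over from the description is the assumption of a signed 64-bit count of microseconds.

```c
#include <stdint.h>

// Hypothetical stand-in for a duration type: a signed count of microseconds.
typedef struct demo_duration {
   int64_t us;
} demo_duration;

// Add two durations, clamping to the representable range instead of overflowing.
static inline demo_duration
demo_duration_add (demo_duration a, demo_duration b)
{
   demo_duration r;
   if (b.us > 0 && a.us > INT64_MAX - b.us) {
      r.us = INT64_MAX; // Sum is too large: clamp to the largest duration
   } else if (b.us < 0 && a.us < INT64_MIN - b.us) {
      r.us = INT64_MIN; // Sum is too small: clamp to the smallest duration
   } else {
      r.us = a.us + b.us;
   }
   return r;
}

// Create a duration from a count of seconds, clamping if the count is too large.
static inline demo_duration
demo_seconds (int64_t n)
{
   const int64_t us_per_sec = 1000000;
   demo_duration r;
   if (n > INT64_MAX / us_per_sec) {
      r.us = INT64_MAX;
   } else if (n < INT64_MIN / us_per_sec) {
      r.us = INT64_MIN;
   } else {
      r.us = n * us_per_sec;
   }
   return r;
}
```

With this representation, clamping only kicks in beyond roughly ±292,000 years, since `INT64_MAX` microseconds is about 9.2 × 10^12 seconds.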

Creating and Manipulating Durations

The initial duration API was extremely verbose, and looked like:

int foo(int a, int b) {
  mlib_duration d = mlib_duration_add(mlib_milliseconds(a), mlib_seconds(b));
}

While this works, it's incredibly tedious to write. Instead, an mlib_duration() function-like macro was added, which can be used with two arguments to create a duration from a unit count, or with three arguments to do duration arithmetic, and which supports nested argument lists:

int foo(int a, int b) {
  mlib_duration d = mlib_duration((a, ms), plus, (b, sec));
}

See the doc comments in mlib/duration.h for an explanation. The macro trickery isn't pretty, but it is heavily commented for future maintainers.
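
For readers unfamiliar with this style of macro, the following is a rough sketch of the general technique (dispatching on argument count, then token-pasting the unit and operator names). It is a simplified assumption of how such a macro could be built, not the code in mlib/duration.h: every `demo_`/`DEMO_` name is hypothetical, the operands of the three-argument form must be parenthesized `(count, unit)` pairs, the arithmetic does not saturate, and a standard-conforming preprocessor is assumed.

```c
#include <stdint.h>

typedef struct demo_duration {
   int64_t us; // Count of microseconds
} demo_duration;

// Unit constructors. The suffix after "demo_from_" is the unit token given to the macro.
static inline demo_duration demo_from_us (int64_t n) { return (demo_duration){n}; }
static inline demo_duration demo_from_ms (int64_t n) { return (demo_duration){n * 1000}; }
static inline demo_duration demo_from_sec (int64_t n) { return (demo_duration){n * 1000000}; }

// Operators. The suffix after "demo_op_" is the operator token given to the macro.
// (Plain arithmetic for brevity; this sketch does not saturate.)
static inline demo_duration
demo_op_plus (demo_duration a, demo_duration b)
{
   return (demo_duration){a.us + b.us};
}
static inline demo_duration
demo_op_minus (demo_duration a, demo_duration b)
{
   return (demo_duration){a.us - b.us};
}

// (count, unit) expands to a call of demo_from_<unit>
#define DEMO_DUR_UNIT(Count, Unit) demo_from_##Unit (Count)
// Two-argument form: create a duration from a unit count
#define DEMO_DUR_ARGC_2(Count, Unit) DEMO_DUR_UNIT (Count, Unit)
// Three-argument form: each operand is a parenthesized (count, unit) pair, so
// writing DEMO_DUR_UNIT in front of it forms a macro invocation.
#define DEMO_DUR_ARGC_3(Lhs, Op, Rhs) demo_op_##Op (DEMO_DUR_UNIT Lhs, DEMO_DUR_UNIT Rhs)

// Pick the fourth argument. The caller's arguments shift the candidate names,
// so two arguments select DEMO_DUR_ARGC_2 and three select DEMO_DUR_ARGC_3.
#define DEMO_SELECT_4TH(_1, _2, _3, Name, ...) Name
#define demo_duration_of(...) \
   DEMO_SELECT_4TH (__VA_ARGS__, DEMO_DUR_ARGC_3, DEMO_DUR_ARGC_2, ~) (__VA_ARGS__)
```

Under these assumptions, `demo_duration_of ((a, ms), plus, (b, sec))` expands to `demo_op_plus (demo_from_ms (a), demo_from_sec (b))`. A macro cannot expand itself recursively, which is why the nested pairs go through a separately named helper here.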

Usage in mongoc-async-cmd

The mongoc-async-cmd component was chosen arbitrarily as a test bed for the new APIs. Unfortunately, the code in there was heavily under-commented and used non-descriptive struct member names, so a lot of effort was spent just deducing what it is actually trying to do with the dozen int64_t values it was using for timeouts/timestamps/deadlines.

A heavy amount of comments and API renames have been applied to the module, to ensure that we're doing what we expect, and so future refactors won't need to spend a whole 3+ days trying to deduce what everything means. This doesn't give me a great outlook on future CSOT refactor work, depending on whether all time-handling code is similarly obtuse or I just happened to pick a module that was particularly problematic.

Additionally, my refactor of mongoc-async-cmd code led to the discovery that command deadlines can be violated (up to 2x or possibly greater) because of a hack that was added for CDRIVER-1571. Since this is a core module underlying all command execution, refactors will be required to prevent this if we want a successful CSOT implementation. An inline function doc-comment explains the problem.

Review Order

Initially, there was an attempt to make isolated commits in logical order, but it somewhat broke down towards the end. Instead, it is recommended to review the final changes in the following order:

  1. mlib/platform.h is added to just #include some common platform headers "correctly", since that's actually non-trivial.
  2. mlib/duration.h is the basis of all time functions.
  3. mlib/time_point.h builds upon duration to create points-in-time.
  4. mlib/timer.h is a simpler file that adds deadline-specific functions.
  5. The mongoc-async and mongoc-async-cmd files, which refactor to use the time types.
  6. Everything else: There are several other minor changes across the codebase.

This is intended to replace `mcd_duration` as a more
capable duration type.
This uses preprocessor trickery to support more
terse conversion and arithmetic of duration types.
The following changes are made, in order of significance:

1. The `mongoc_stream_poll` function declaration now says the units on its
   timeout.
2. Add a `_mongoc_stream_poll_internal` that takes a deadline timer rather than
   a timeout duration.
3. Move async-command typedefs into the async-command header.
4. async command creation now takes typed durations instead of integers
5. Async command internally uses timers, time points, and durations,
   rather than juggling `int64_t` for everything.
6. Heavy comments in async-cmd files are added all over the place to explain
   what is going on.
@vector-of-bool vector-of-bool requested a review from kevinAlbs July 24, 2025 22:33
@kevinAlbs kevinAlbs requested a review from eramongodb July 29, 2025 14:27
Contributor

@eramongodb eramongodb left a comment

Very nice improvements. Initial feedback, more pending.

Comment on lines 51 to 52
#define MONGOC_SECURE_CHANNEL_ENABLED() @MONGOC_ENABLE_SSL_SECURE_CHANNEL@
#if MONGOC_SECURE_CHANNEL_ENABLED() != 1
Contributor

These changes might be premature potential API breaking changes (e.g. note, here). Consider reverting them for now and deferring public header function-macro improvements to a separate ticket/PR.

Contributor Author

I'm not sure if it would be a breaking change, as the func-macro takes the same value from the @replacement@ that is used for the obj-macro (note the func-macro has a different name, so the obj-macro is still available).

The reason I added these drive-by changes is that some changes cause a "dead-write" warning to spookily appear from the scan-build task in mongoc-async-cmd.c, and it was easier to rewrite the offending block as a ?: than to dance around the conditional compilation, so I added these macros.

Contributor

It may be a breaking change due to users who may be providing their own mongoc-config.h header file (as a whole), not merely setting the MONGOC_ENABLE_SSL_SECURE_CHANNEL macro (like we would prefer them to do instead), per the first linked note:

The internal preprocessor symbol HAVE_STRINGS_H has been renamed BSON_HAVE_STRINGS_H. If you maintain a handwritten bson-config.h you must rename this symbol.

Then again, I suppose this note suggests changing these config macros does not constitute an API breaking change (1.10.0 release notes)? More API ambiguity that needs to be addressed at some point. 😅

Collaborator

users who may be providing their own mongoc-config.h header file (as a whole)

I hope most users do not. I expect their copies need to be updated when new macros are added. I see a semi-relevant result on GitHub code search (from 7 years ago):

# Rather than run automake and autoconf, etc we simply copy the file in
$(CWD)/mongo-c-driver/src/mongoc/mongoc-config.h: $(CWD)/mongoc-config.h
	@cp $< $@~ && mv $@~ $@

More API ambiguity that needs to be addressed at some point. 😅

Agreed. Erring towards caution: consider defining the MONGOC_SECURE_CHANNEL_ENABLED() function macros in a private header (maybe common-config.h.in?). That may side-step the issue until we have a better defined API policy (CDRIVER-5705).

Contributor Author

I've reverted these changes for now. I'll defer to CDRIVER-6064

Collaborator

@kevinAlbs kevinAlbs left a comment

The added comments on the async APIs are very much appreciated.

#define _mlibCommaIfParens(...) ,

/**
* @brief Expands to `1` if the given macro argument is a parenthesized group of
* tokens, otherwize `0`
Collaborator

Suggested change
* tokens, otherwize `0`
* tokens, otherwise `0`

// Feature detection
#ifdef __has_include
#if __has_include(<features.h>)
#include <features.h>
Collaborator

Is features.h needed? If no, consider removing. https://man7.org/linux/man-pages/man7/feature_test_macros.7.html notes:

Note: applications do not need to directly include <features.h>; indeed, doing so is actively discouraged.

Contributor Author

Yeah, it looks like it all builds fine without this header. Can't recall why I initially thought it was necessary.

ret.time_since_monotonic_start = mlib_duration_from_timespec (ts);
return ret;
#elif defined(_WIN32)
// Win32 APIs for the high-performance monotonic counter. These APIs never fail after Windows XP
Collaborator

Suggest using mlib_check to verify the returns of QueryPerformanceFrequency and QueryPerformanceCounter are successful.

Contributor Author

I feel safe omitting these checks, as even the standard library implementation does it.

// Number of microseconds beyond the last whole second:
const int64_t subsecond_us = ((ticks % ticks_per_second) * one_million) / ticks_per_second;
mlib_time_point ret;
ret.time_since_monotonic_start = mlib_duration ((whole_seconds_1m, us), plus, (subsecond_us, us));
Collaborator

Suggested change
ret.time_since_monotonic_start = mlib_duration ((whole_seconds_1m, us), plus, (subsecond_us, us));
ret.time_since_monotonic_start = mlib_duration ((whole_seconds, sec), plus, (subsecond_us, us));

Suggest removing whole_seconds_1m to simplify.
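
For context on the arithmetic under discussion, the sketch below shows the split into whole seconds and a sub-second remainder as a standalone function. The function name `demo_ticks_to_us` is hypothetical and the code is an illustration, not the code in the PR; it assumes a non-negative tick count and a positive tick frequency. The presumed motivation for the split is that multiplying the full tick count by one million could overflow, whereas the remainder is always smaller than `ticks_per_second`.

```c
#include <stdint.h>

// Convert a monotonic tick count to microseconds.
// Assumes ticks >= 0 and ticks_per_second > 0. Does not saturate.
static inline int64_t
demo_ticks_to_us (int64_t ticks, int64_t ticks_per_second)
{
   const int64_t one_million = 1000000;
   // Number of whole seconds in the tick count:
   const int64_t whole_seconds = ticks / ticks_per_second;
   // Number of microseconds beyond the last whole second:
   const int64_t subsecond_us = ((ticks % ticks_per_second) * one_million) / ticks_per_second;
   return whole_seconds * one_million + subsecond_us;
}
```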

* @brief Timer types and functions
* @date 2025-04-18
*
* This file contains APIs for creating fixed-deadline timer object that represent
Collaborator

Suggested change
* This file contains APIs for creating fixed-deadline timer object that represent
* This file contains APIs for creating fixed-deadline timer object that represents

MONGOC_ASYNC_CMD_SETUP,
// The command has no stream and needs to connect to a peer
MONGOC_ASYNC_CMD_PENDING_CONNECT,
// The command has connected and has a stream, but needs to run stream setup
Collaborator

Suggest noting setup can be the TLS handshake (which I think is the only setup currently done).

Suggested change
// The command has connected and has a stream, but needs to run stream setup
// The command has connected and has a stream, but needs to run stream setup (e.g. TLS handshake)

@@ -2214,7 +2216,7 @@ test_mongoc_client_descriptions_pooled (void *unused)
/* wait for background thread to discover all members */
start = bson_get_monotonic_time ();
do {
_mongoc_usleep (1000);
mlib_sleep_for (1, ms);
/* Windows IPv4 tasks may take longer to connect since connection to the
* first address returned by getaddrinfo may be IPv6, and failure to
* connect may take a couple seconds. See CDRIVER-3639. */
Collaborator

Minor: but consider also updating the below check with mlib_timer:

mlib_timer tm = mlib_expires_after (3, sec);
// ...
if (mlib_timer_is_expired (tm)) {
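
As a rough illustration of the deadline pattern in this suggestion, here is a small sketch of a timer that stores only an "expires-at" time. It is not mlib's `mlib_timer`: all `demo_` names are hypothetical, and to keep it portable the current monotonic time is passed in explicitly as a non-negative microsecond count instead of being read from a clock.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical deadline type: stores only the "expires-at" time, in
// microseconds on some monotonic clock. Times are assumed to be non-negative.
typedef struct demo_timer {
   int64_t expires_at_us;
} demo_timer;

// Create a timer that expires `timeout_us` after `now_us`, clamping on overflow.
static inline demo_timer
demo_expires_after (int64_t now_us, int64_t timeout_us)
{
   demo_timer t;
   if (timeout_us > 0 && now_us > INT64_MAX - timeout_us) {
      t.expires_at_us = INT64_MAX;
   } else {
      t.expires_at_us = now_us + timeout_us;
   }
   return t;
}

// True once the current time has reached the deadline.
static inline bool
demo_timer_is_expired (demo_timer t, int64_t now_us)
{
   return now_us >= t.expires_at_us;
}

// Time left until the deadline, or zero if it has already passed.
static inline int64_t
demo_timer_remaining_us (demo_timer t, int64_t now_us)
{
   return demo_timer_is_expired (t, now_us) ? 0 : t.expires_at_us - now_us;
}
```

Computing the deadline once and comparing against it on each loop iteration avoids re-deriving the elapsed time by hand with raw integers.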

if (div.is_signed) {
a._rep /= div.bits.as_signed;
} else {
a._rep /= div.bits.as_unsigned;
Contributor

Suggested change
a._rep /= div.bits.as_unsigned;
a._rep = (int64_t) ((uint64_t) a._rep / div.bits.as_unsigned);

To address the following GCC warning:

warning: conversion to ‘long unsigned int’ from ‘mlib_duration_rep_t’ {aka ‘long int’} may change the sign of the result [-Wsign-conversion]
  200 |          a._rep /= div.bits.as_unsigned;
      |                 ^~

@@ -30,15 +30,13 @@
#include <mlib/platform.h>

// Check for POSIX clock functions functions
#define mlib_have_posix_clocks() 0
Contributor

Suggested change
#define mlib_have_posix_clocks() 0
#undef mlib_have_posix_clocks
#define mlib_have_posix_clocks() 0

Safe-guard against pre-existing definitions (again?).

if ((dur._rep < 0) != (fac < 0)) {
uintmax_t bits;
if ((mlib_mul) (&bits, true, true, (uintmax_t) dur._rep, fac.is_signed, fac.bits.as_unsigned)) {
if ((dur._rep < 0) != (fac.is_signed && fac.bits.as_signed < 0)) {
Contributor

Perhaps there should be a three-way mlib_compare() and/or an mlib_is_negative(). (OK with deferring to another PR.)

if ((dur._rep < 0) != (fac.is_signed && fac.bits.as_signed < 0)) {

if ((dur._rep < 0) != mlib_is_negative(fac)) {

if ((div.is_signed && div.bits.as_signed == -1) && ...) {

if (mlib_compare(div, -1) == 0 && ...) {

Comment on lines +272 to +273
// Don't attempt to cancel a comman in the error state, as it will already have
// a waiting completion.
Contributor

Suggested change
// Don't attempt to cancel a comman in the error state, as it will already have
// a waiting completion.
// Don't cancel a command in the error state, as it already has a waiting completion.

Wording tweak suggestion + perhaps replace "waiting completion" (new term not found elsewhere) with a different description?

? 0
// Otherwise, use the deadline
: mlib_milliseconds_count (mlib_timer_remaining (deadline));
if (mongoc_stream_tls_handshake (tls_stream, host, remain_ms, &retry_events, error)) {
Contributor

@eramongodb eramongodb Aug 14, 2025

warning: conversion from ‘mlib_duration_rep_t’ {aka ‘long int’} to ‘int32_t’ {aka ‘int’} may change value [-Wconversion]
   92 |    if (mongoc_stream_tls_handshake (tls_stream, host, remain_ms, &retry_events, error)) {
      |                                                       ^~~~~~~~~

Looks like the need for this cast ended up moving here, where the public API mongoc_stream_tls_handshake unfortunately requires an int32_t. Suggest restoring the check (clamp?) + explicit cast to int32_t here.
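
One way the suggested clamp plus explicit cast could look is sketched below. The helper name `demo_clamp_to_int32` is hypothetical and is not part of mlib or this PR; it only shows the narrowing step from a 64-bit millisecond count to the `int32_t` that the public API expects.

```c
#include <stdint.h>

// Narrow a 64-bit value to int32_t, clamping to the int32_t range so the
// explicit cast can never change the value unexpectedly.
static inline int32_t
demo_clamp_to_int32 (int64_t value)
{
   if (value > INT32_MAX) {
      return INT32_MAX;
   }
   if (value < INT32_MIN) {
      return INT32_MIN;
   }
   return (int32_t) value;
}
```

At the call site, the result of `mlib_milliseconds_count (mlib_timer_remaining (deadline))` would be passed through such a helper before being handed to `mongoc_stream_tls_handshake`.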

Comment on lines +176 to +177
#define MLIB_ARGC_PICK(Prefix, ...) MLIB_ARGC_PASTE (Prefix, __VA_ARGS__) (__VA_ARGS__)
#define MLIB_ARGC_PASTE(Prefix, ...) MLIB_PASTE_3 (Prefix, _argc_, MLIB_ARG_COUNT (__VA_ARGS__))
Contributor

@eramongodb eramongodb Aug 14, 2025

It looks like there may be something about this change (or related) which MSVC doesn't like:

warning C4003: not enough arguments for function-like macro invocation '_mlib_foreach_urange_argc_3'
error C2059: syntax error: 'constant'

Note: ignore the mingw task failure which is incorrectly using MSVC, to be fixed by #2087.
