Semantic preservation requirements for encode/decode in the presence of pre-release semantics #57

@maennchen

Problem Statement

This issue asks for clarification of the semantic guarantees provided by vers when encoding and decoding ecosystem-native version ranges.

Several ecosystems define range operators whose semantics are not reducible to simple order intervals over version precedence. An example is discussed in #49 (Elixir’s ~> operator), but the question is broader and not limited to that ecosystem.

The core question is:

What level of semantic preservation is vers expected to guarantee under encode/decode?

Expected Semantic Property

For a given ecosystem E, let:

  • range be a native version range expression
  • matches_E(range, v) be the ecosystem’s native matching predicate

Then the natural expectation of encode/decode is semantic equivalence:

encoded = encode_vers(range)
decoded = decode_vers(encoded)

For all versions v in the ecosystem domain:

  matches_E(range, v) ==
  matches_E(decoded, v)

Important clarifications:

  • decoded does not need to be syntactically identical to range
  • The constraint structure may differ
  • What must be preserved is the membership predicate

This must hold:

  • for all currently published versions
  • for any future versions
  • without enumerating the version space

In other words, encode/decode must preserve the meaning of the range, not merely its apparent bounds.
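The property above can be probed, though not proven, by sampling. The following is a minimal sketch under stated assumptions: `find_counterexample` and the toy integer-based ecosystem are hypothetical names introduced for illustration, not part of vers or any library. A returned version refutes equivalence; returning `None` only means no divergence was found in the sample, which is exactly why the requirement says "without enumerating the version space".

```python
from typing import Any, Callable, Iterable, Optional


def find_counterexample(
    native_range: Any,
    encode_vers: Callable[[Any], str],
    decode_vers: Callable[[str], Any],
    matches: Callable[[Any, Any], bool],
    versions: Iterable[Any],
) -> Optional[Any]:
    """Return the first sampled version on which the round trip changes
    the membership predicate, or None if the sample shows no divergence."""
    decoded = decode_vers(encode_vers(native_range))
    for v in versions:
        if matches(native_range, v) != matches(decoded, v):
            return v
    return None


# Toy ecosystem (an assumption for illustration): versions are integers and
# a range is (low, high, high_is_inclusive).
def toy_matches(r, v):
    low, high, inclusive = r
    return low <= v < high or (inclusive and v == high)


def faithful_encode(r):
    return f"{r[0]}|{r[1]}|{int(r[2])}"


def faithful_decode(s):
    low, high, inclusive = s.split("|")
    return (int(low), int(high), bool(int(inclusive)))


# A lossy codec that keeps the bounds but drops the inclusivity flag.
def lossy_encode(r):
    return f"{r[0]}|{r[1]}"


def lossy_decode(s):
    low, high = s.split("|")
    return (int(low), int(high), False)


toy_range = (1, 3, True)
faithful_result = find_counterexample(
    toy_range, faithful_encode, faithful_decode, toy_matches, range(6)
)
lossy_result = find_counterexample(
    toy_range, lossy_encode, lossy_decode, toy_matches, range(6)
)
```

The lossy codec preserves the apparent bounds (1 and 3) yet changes which versions match, which is the distinction between structure and meaning drawn above.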

Why This Is Non-Trivial

For ecosystems using Semantic Versioning:

  • Versions are totally ordered by precedence
  • Pre-release versions have lower precedence than the corresponding release
  • There are infinitely many possible pre-release identifiers
  • Between two release bounds, infinitely many future versions may appear

Example ordering:

1.0.0-alpha < 1.0.0-beta < 1.0.0 < 1.0.1

Crucially:

  • Under pure precedence ordering, < 1.0.0 includes all 1.0.0-* pre-releases
  • New pre-releases can always be inserted before a release
  • The version space is unbounded and extensible, so it cannot be exhaustively enumerated

This makes semantic equivalence dependent on how pre-releases are handled.
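To make the ordering concrete, here is a rough sketch of a sort key that follows the Semantic Versioning precedence rules (numeric identifiers compare numerically and rank below alphanumeric ones, a pre-release ranks below its release, build metadata is ignored). The function name `precedence_key` is an assumption for illustration, and the sketch does no input validation.

```python
def precedence_key(version: str):
    """Build a tuple that sorts in SemVer precedence order.

    Assumes a well-formed MAJOR.MINOR.PATCH[-prerelease][+build] string.
    """
    # Build metadata does not affect precedence, so drop it first.
    core, _, _build = version.partition("+")
    core, _, pre = core.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    # Numeric identifiers sort before alphanumeric ones; a shorter
    # identifier list that is a prefix of a longer one sorts first,
    # which Python's tuple comparison already gives us.
    pre_key = tuple(
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in pre.split(".")
    ) if pre else ()
    # 0 for a pre-release, 1 for a release: pre-releases come first.
    return (major, minor, patch, 0 if pre else 1, pre_key)


ordered = sorted(
    ["1.0.1", "1.0.0", "1.0.0-beta", "1.0.0-alpha"], key=precedence_key
)

# Every pre-release of 1.0.0, including ones not yet published,
# falls below 1.0.0 in this order.
below_release = precedence_key("1.0.0-rc.1") < precedence_key("1.0.0")
```

Under this key, `ordered` reproduces the example ordering above, and any `1.0.0-*` string satisfies an order-only `< 1.0.0` constraint.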

Limitations of Pure Order-Based Constraints

Many ecosystems define operators whose semantics depend not only on ordering but also on structural properties of the version, such as:

  • Whether the candidate version is a pre-release
  • Whether the requirement explicitly mentions a pre-release
  • Whether the upper bound’s pre-releases should be excluded
  • Ecosystem-level pre-release admission rules

Such semantics cannot always be expressed as a simple union of <, <=, >, >=, = constraints over the precedence order alone.

If vers is strictly order-theoretic, then certain native operators may not be representable without semantic loss.

If semantic loss occurs, encode/decode no longer preserves the native matching predicate.
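The gap can be illustrated with a small sketch. The admission rule used here is a simplified, hypothetical one (a pre-release candidate is admitted only if some bound with the same major.minor.patch is itself a pre-release); it is loosely modelled on rules some ecosystems apply and is not claimed to be the exact behavior of any of them. All function names are assumptions for illustration.

```python
import operator

OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge, "=": operator.eq,
}


def key(version: str):
    """SemVer-style precedence key: (release tuple, is_release, pre key)."""
    core, _, pre = version.partition("+")[0].partition("-")
    release = tuple(int(part) for part in core.split("."))
    pre_key = tuple(
        (0, int(i), "") if i.isdigit() else (1, 0, i) for i in pre.split(".")
    ) if pre else ()
    return (release, 0 if pre else 1, pre_key)


def order_matches(bounds, v):
    """Pure order-theoretic matching: every bound is a precedence comparison."""
    return all(OPS[op](key(v), key(bound)) for op, bound in bounds)


def native_matches(bounds, v):
    """Order matching plus the hypothetical pre-release admission rule."""
    release, is_release, _ = key(v)
    if not is_release:
        admitted = any(
            key(bound)[0] == release and not key(bound)[1]
            for _, bound in bounds
        )
        if not admitted:
            return False
    return order_matches(bounds, v)


bounds = [(">=", "1.0.0"), ("<", "2.0.0")]

# A release inside the interval: both predicates agree.
agree = order_matches(bounds, "1.5.0") == native_matches(bounds, "1.5.0")

# A pre-release of the upper bound: order alone admits it, the rule rejects it.
order_says = order_matches(bounds, "2.0.0-rc.1")
native_says = native_matches(bounds, "2.0.0-rc.1")
```

Here the same bounds yield different answers for `2.0.0-rc.1`, so a representation that keeps only the comparisons cannot reproduce `native_matches` after a round trip.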

Specification Question

The specification should clarify:

  1. Is vers required to preserve full ecosystem matching semantics under encode/decode?
  2. Or is vers intentionally limited to order-based precedence constraints?
  3. If some ecosystem-specific semantics cannot be represented, is that considered acceptable?

Clarifying this determines whether encode/decode equivalence is:

  • A semantic guarantee, or
  • A best-effort structural approximation.

Why This Matters

If semantic preservation is not guaranteed:

  • A range may match future versions that the original ecosystem would reject
  • SBOM tooling and vulnerability matching may yield different results after translation
  • Divergence may only become visible when new versions are released
