feat(runtime-expressions): improve ABNF grammar clarity #454
frankkilcommins wants to merge 1 commit into OAI:v1.1-dev
| "firstName": "{$inputs.customer#/firstName}", | ||
| "lastName": "{$inputs.customer#/lastName}", | ||
| "dateOfBirth": "{$inputs.customer#/dateOfBirth}", | ||
| "postalCode": "{$inputs.customer#/postalCode}" |
It would be great to include more complex and diverse examples demonstrating the application of ABNF syntax.
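As one possible illustration of the kind of example being asked for, here is a minimal Python sketch of how an expression such as $inputs.customer#/firstName could be evaluated: the part before # names an input, and the part after it is an RFC 6901 JSON Pointer into that input's value. The function names and the sample customer data are hypothetical and are not taken from the PR or the spec.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against an already-parsed JSON value."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    current = document
    for raw_token in pointer.split("/")[1:]:
        # Unescape in the order RFC 6901 requires: "~1" first, then "~0".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def evaluate_inputs_expression(expression, inputs):
    """Evaluate '$inputs.<name>' or '$inputs.<name>#<json-pointer>' (hypothetical helper)."""
    prefix = "$inputs."
    if not expression.startswith(prefix):
        raise ValueError(f"not an $inputs expression: {expression!r}")
    name, _, pointer = expression[len(prefix):].partition("#")
    return resolve_pointer(inputs[name], pointer)


# Made-up workflow inputs, for illustration only.
inputs = {"customer": {"firstName": "Ada", "lastName": "Lovelace"}}
first_name = evaluate_inputs_expression("$inputs.customer#/firstName", inputs)
whole_customer = evaluate_inputs_expression("$inputs.customer", inputs)
```

Without a pointer, the sketch returns the whole input value, which mirrors how the empty JSON Pointer refers to the entire document.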
| component-name = identifier | ||
| ; Identifier rule | ||
| identifier = 1*( ALPHA / DIGIT / "." / "-" / "_" ) |
The PR's identifier = 1*( ALPHA / DIGIT / "." / "-" / "_" ) is used for all IDs (stepId, workflowId, sourceDescriptionName, component keys, input/output names). But the spec defines two different patterns:
- stepId, workflowId, sourceDescriptionName: SHOULD match [A-Za-z0-9_\-]+ (no dot)
- Components keys: MUST match ^[a-zA-Z0-9\.\-_]+$ (with dot)
A single shared identifier rule conflates these: it allows dots in step/workflow IDs where the spec says they shouldn't appear, and that restriction is only a SHOULD-level recommendation anyway. Separate rules would be more faithful to the spec's intent.
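To make the difference concrete, here is a small Python sketch that translates the two spec patterns and the PR's shared identifier rule into regular expressions and checks a name against each. The constant and function names, and the sample names, are illustrative assumptions and not part of the PR.

```python
import re

# Recommended pattern for stepId, workflowId, sourceDescriptionName (SHOULD, no dot).
STRICT_ID = re.compile(r"[A-Za-z0-9_\-]+")
# Required pattern for Components Object keys (MUST, dot allowed).
COMPONENT_KEY = re.compile(r"[a-zA-Z0-9\.\-_]+")
# The PR's single shared rule: 1*( ALPHA / DIGIT / "." / "-" / "_" ).
SHARED_IDENTIFIER = re.compile(r"[A-Za-z0-9._\-]+")


def check(name):
    """Report which of the three patterns accept the whole name."""
    return {
        "shared": SHARED_IDENTIFIER.fullmatch(name) is not None,
        "strict_id": STRICT_ID.fullmatch(name) is not None,
        "component_key": COMPONENT_KEY.fullmatch(name) is not None,
    }


dotted = check("login.step")   # hypothetical stepId containing a dot
plain = check("loginStep")     # hypothetical stepId without a dot
```

Under these assumptions, a dotted name passes the shared rule and the component-key pattern but fails the recommended ID pattern, which is the conflation described above.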
| field-name = identifier | ||
| ; Source descriptions expressions | ||
| source-reference = source-name "." reference-id |
source-reference is too restrictive with identifier
The proposed grammar uses:
source-reference = source-name "." reference-id
reference-id = identifier
The <reference> part can be an operationId from an OpenAPI description or a workflowId from an Arazzo document. OpenAPI does not constrain operationId to any specific character set; it's just a string. This means operationIds like get/pets, get pets, or create-user@v2 are technically valid in OpenAPI but would be rejected by the identifier rule.
I'd suggest using a less restrictive rule for reference-id, something like 1*CHAR (any character except { and }), to avoid rejecting valid OpenAPI operationIds.
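A rough Python sketch of the difference, assuming source-name follows the dot-free recommended pattern and approximating 1*CHAR as "one or more characters other than { and }". Both regular expressions and the petstore source name are assumptions made for illustration; they are not the grammar from the PR.

```python
import re

# Assumed pattern for source-name: the dot-free recommended ID pattern.
SOURCE_NAME = r"[A-Za-z0-9_\-]+"

# reference-id = identifier, as proposed in the PR.
STRICT_REFERENCE = re.compile(rf"({SOURCE_NAME})\.([A-Za-z0-9._\-]+)")
# reference-id = 1*CHAR, approximated here as any character except '{' and '}'.
LOOSE_REFERENCE = re.compile(rf"({SOURCE_NAME})\.([^{{}}]+)")


def parse_source_reference(text, pattern):
    """Return (source_name, reference_id), or None if the text is rejected."""
    match = pattern.fullmatch(text)
    return match.groups() if match else None


operation_ids = ["get/pets", "get pets", "create-user@v2"]
strict_results = [
    parse_source_reference(f"petstore.{op}", STRICT_REFERENCE) for op in operation_ids
]
loose_results = [
    parse_source_reference(f"petstore.{op}", LOOSE_REFERENCE) for op in operation_ids
]
```

With these patterns, all three operationIds are rejected by the identifier-based rule and accepted by the looser one, while a plain operationId such as listPets is accepted by both.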
| "$statusCode" / | ||
| "$request." source / | ||
| "$response." source / | ||
| "$inputs." input-reference / |
Plural for consistency with $steps or $workflows:
- "$inputs." input-reference /
+ "$inputs." inputs-reference /
| "$request." source / | ||
| "$response." source / | ||
| "$inputs." input-reference / | ||
| "$outputs." output-reference / |
- "$outputs." output-reference /
+ "$outputs." outputs-reference /
| ; JSON Pointer (RFC 6901) | ||
| json-pointer = *( "/" reference-token ) | ||
| reference-token = *( unescaped / escaped ) | ||
| unescaped = %x00-2E / %x30-7D / %x7F-10FFFF |
unescaped in json-pointer still includes { and }, which breaks embedded expression parsing
The PR correctly excludes { and } from the CHAR rule for unambiguous embedded expression parsing, but the unescaped rule in json-pointer still uses %x30-7D, which includes both { (%x7B) and } (%x7D).
This means an embedded expression like {$request.body#/status} or {$steps.someStepId.outputs.pets#/0/id} cannot be reliably parsed: the json-pointer's unescaped rule consumes the closing }, so the parser cannot determine where the expression ends.
The fix is to change unescaped from:
unescaped = %x00-2E / %x30-7D / %x7F-10FFFF
to:
unescaped = %x00-2E / %x30-7A / %x7C / %x7F-10FFFF
; %x2F ('/'), %x7E ('~'), %x7B ('{'), %x7D ('}') are excluded
This is a minor deviation from RFC 6901, but { and } in JSON Pointer reference tokens are extremely rare in practice, and without this fix the expression-string grammar cannot work correctly for any expression containing a json-pointer.
We validated this in our ABNF parser implementation at https://github.com/swaggerexpert/arazzo-runtime-expression. After making this change, all expressions with json-pointers work correctly in both standalone and embedded contexts.
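The effect can be reproduced with a small Python sketch that approximates the json-pointer rule as a regular expression, once with the RFC 6901 unescaped range and once with { and } removed. This is only an illustration of the greedy-match problem, not the ABNF parser referenced above; the names and the sample text are assumptions.

```python
import re

ESCAPED = r"~[01]"
# unescaped as in RFC 6901: everything except '/' (%x2F) and '~' (%x7E).
RFC6901_UNESCAPED = r"[\x00-\x2E\x30-\x7D\x7F-\U0010FFFF]"
# unescaped with '{' (%x7B) and '}' (%x7D) additionally excluded.
BRACE_FREE_UNESCAPED = r"[\x00-\x2E\x30-\x7A\x7C\x7F-\U0010FFFF]"


def pointer_pattern(unescaped):
    """Build a regex for json-pointer = *( "/" reference-token )."""
    return re.compile(f"(?:/(?:{unescaped}|{ESCAPED})*)*")


def match_pointer(text, unescaped):
    """Greedily match a json-pointer starting right after the first '#'."""
    start = text.index("#") + 1
    return pointer_pattern(unescaped).match(text, start).group(0)


embedded = "{$request.body#/status} was returned"
greedy = match_pointer(embedded, RFC6901_UNESCAPED)
bounded = match_pointer(embedded, BRACE_FREE_UNESCAPED)
```

With the RFC 6901 range, the match runs through the closing } and into the surrounding text; with the brace-free range, it stops at /status, so the end of the embedded expression can be found.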
Restructure the ABNF grammar to use explicit, typed reference rules in the primary grammar instead of relying on secondary grammars with two-pass parsing. This improves grammar clarity and aligns with the proposed spec changes in OAI/Arazzo-Specification#454.
Key changes:
- Add $self expression support
- Add $inputs/$outputs JSON Pointer support (e.g., $inputs.customer#/firstName)
- Inline all secondary grammars into the primary grammar
- Extract shared identifier and identifier-strict rules
- Adapt json-pointer to exclude { and } from unescaped for unambiguous embedded expression parsing, fixing the body expression extract limitation
- Require explicit component types (parameters/successActions/failureActions)
- Update README with current grammar and examples
Resolves: OAI/Arazzo-Specification#424, OAI/Arazzo-Specification#425, OAI/Arazzo-Specification#426, OAI/Arazzo-Specification#428
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| ; Matches recommended pattern [A-Za-z0-9_\-]+ from spec | ||
| ; Legacy 'name' rule (retained for query/path references) | ||
| name = *( CHAR ) |
name isn't legacy. It's the correct rule for query and path references because query parameter names and path parameter names are user-defined and can contain any valid character.
Implementation Verification
I implemented the proposed grammar changes in my ABNF parser at swaggerexpert/arazzo-runtime-expression#116 to verify the grammar is correct and parseable. All 152 tests pass. Below are the findings from the implementation.
Issue:
$components now requires explicit component type (parameters/successActions/failureActions). Generic components.name pattern removed. Note: This was already semantically invalid per spec.
fixes: #424
fixes: #425
fixes: #426
fixes: #428
fixes: #437
resolves: #427