feat: support sourceDescriptions in operationId/operationPath by displague · Pull Request #73 · jentic/arazzo-engine

displague · 2025-08-28T14:05:01Z

I have multiple OpenAPI specs in my Arazzo spec and found that arazzo-runner was confused by the sourceDescriptions references.

arazzo-runner - ERROR - Error executing step foo: Operation bar not found in source descriptions

This PR adds support for those. Feel free to close this (AI slop?) and implement it directly, or let me know if there are specific criteria to make this mergeable and I'll accommodate.

Relevant snippets from the spec used in prompting:

example given in https://spec.openapis.org/arazzo/latest.html#arazzo-specification-object-example
https://spec.openapis.org/arazzo/latest.html#fixed-fields-3
- operationId | string | The name of an existing, resolvable operation, as defined with a unique operationId and existing within one of the sourceDescriptions. The referenced operation will be invoked by this workflow step. If multiple (non arazzo type) sourceDescriptions are defined, then the operationId MUST be specified using a Runtime Expression (e.g., $sourceDescriptions.<name>.<operationId>) to avoid ambiguity or potential clashes. This field is mutually exclusive of the operationPath and workflowId fields respectively.
- operationPath | string | A reference to a Source Description Object combined with a JSON Pointer to reference an operation. This field is mutually exclusive of the operationId and workflowId fields respectively. The operation being referenced MUST be described within one of the sourceDescriptions descriptions. A Runtime Expression syntax MUST be used to identify the source description document. If the referenced operation has an operationId defined then the operationId SHOULD be preferred over the operationPath.
https://spec.openapis.org/arazzo/latest.html#examples

Runtime expressions preserve the type of the referenced value. Expressions can be embedded into string values by surrounding the expression with {} curly braces.
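To make the embedding rule concrete, here is a minimal sketch of expanding braced `$sourceDescriptions.<name>.<field>` expressions inside an operationPath string. The helper name `expand_braced`, the source name `petstore`, and the example URL are hypothetical and chosen only for illustration; this is not the code in this PR, which uses the runner's own expression evaluator.

```python
import re

# Matches {$sourceDescriptions.<name>.<field>} embedded in a larger string.
_BRACED = re.compile(r"\{\$sourceDescriptions\.([A-Za-z0-9_\-]+)\.([A-Za-z0-9_]+)\}")


def expand_braced(value: str, source_descriptions: dict[str, dict[str, str]]) -> str:
    """Replace each braced expression with the referenced field, if known."""

    def _sub(match: re.Match) -> str:
        name, field = match.group(1), match.group(2)
        source = source_descriptions.get(name, {})
        # Leave unknown references untouched rather than guessing a value.
        return source.get(field, match.group(0))

    return _BRACED.sub(_sub, value)


# Hypothetical source description and operationPath.
sources = {"petstore": {"url": "https://example.com/petstore.yaml"}}
path = "{$sourceDescriptions.petstore.url}#/paths/~1pets/get"
print(expand_braced(path, sources))
```

Leaving unresolved references in place, instead of raising, is a design choice for this sketch; a stricter runner might prefer to fail early.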

I did not include dependsOn...sourceDescriptions or sourceDescriptions...workflowId support in this PR

The referenced operation will be invoked by this workflow step. If multiple (non arazzo type) sourceDescriptions are defined, then the operationId MUST be specified using a Runtime Expression (https://spec.openapis.org/arazzo/latest.html#runtime-expressions) (e.g., $sourceDescriptions.<name>.<operationId>)

I did not strictly abide by the MUST requirement. The existing lookup behavior is used when $sourceDescriptions is not specified (presumably the first matching operationId from the first OpenAPI spec).
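As a rough illustration of that lenient behavior, the sketch below splits an operationId into an optional source name and a bare operationId. The function name `split_operation_id` and the `PREFIX` constant are hypothetical; the PR's actual parsing lives in `find_by_id` and may differ.

```python
from typing import Optional

PREFIX = "$sourceDescriptions."


def split_operation_id(operation_id: str) -> tuple[Optional[str], str]:
    """Return (source_name, operation_id); source_name is None when unqualified."""
    if not operation_id.startswith(PREFIX):
        # Unqualified reference: the caller falls back to the existing lookup.
        return None, operation_id
    remainder = operation_id[len(PREFIX):]
    # Split on the first dot only, so an operationId containing dots stays intact.
    name, sep, op_id = remainder.partition(".")
    if not sep or not name or not op_id:
        raise ValueError(f"Malformed reference: {operation_id!r}")
    return name, op_id
```

A caller would restrict the search to the named source when the first element is set, and search all sources otherwise.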

🤖 Generated with GPT 5

feat: support $sourceDescriptions operationId references
feat: support sourceDescriptions with curly braces for operationPath refs

displague · 2025-08-28T14:14:47Z

My local version of Python differs and I needed this local change to get pytest to run (excluded from the PR under the presumption that this isn't wanted).

diff --git a/runner/arazzo_runner/blob_store.py b/runner/arazzo_runner/blob_store.py
index a9e34f0..0d84d25 100644
--- a/runner/arazzo_runner/blob_store.py
+++ b/runner/arazzo_runner/blob_store.py
@@ -11,7 +11,7 @@ import logging
 import os
 import time
 import uuid
-from typing import Any, Protocol
+from typing import Any, Protocol, Optional, Dict, List
 
 # Configure logging
 logger = logging.getLogger("arazzo-runner.blob_store")
@@ -20,7 +20,7 @@ logger = logging.getLogger("arazzo-runner.blob_store")
 class BlobStore(Protocol):
     """Protocol for blob storage backends."""
 
-    def save(self, data: bytes, meta: dict[str, Any]) -> str:
+    def save(self, data: bytes, meta: Dict[str, Any]) -> str:
         """Save binary data and return a blob ID."""
         ...
 
@@ -28,7 +28,7 @@ class BlobStore(Protocol):
         """Load binary data by blob ID."""
         ...
 
-    def info(self, blob_id: str) -> dict[str, Any]:
+    def info(self, blob_id: str) -> Dict[str, Any]:
         """Get metadata for a blob."""
         ...
 
@@ -40,7 +40,7 @@ class BlobStore(Protocol):
 class LocalFileBlobStore:
     """File-based blob storage implementation."""
 
-    def __init__(self, root: str | None = None, janitor_after_h: int = 24):
+    def __init__(self, root: Optional[str] = None, janitor_after_h: int = 24):
         """
         Initialize the local file blob store.
 
@@ -182,10 +182,10 @@ class InMemoryBlobStore:
         Args:
             max_size: Maximum number of blobs to keep in memory
         """
-        self.blobs: dict[str, bytes] = {}
-        self.metadata: dict[str, dict[str, Any]] = {}
+        self.blobs: Dict[str, bytes] = {}
+        self.metadata: Dict[str, Dict[str, Any]] = {}
         self.max_size = max_size
-        self.access_order: list[str] = []  # For LRU eviction
+        self.access_order: List[str] = []  # For LRU eviction
 
     def _evict_if_needed(self) -> None:
         """Evict oldest blobs if we've exceeded max_size."""
@@ -194,7 +194,7 @@ class InMemoryBlobStore:
             self.blobs.pop(oldest_id, None)
             self.metadata.pop(oldest_id, None)
 
-    def save(self, data: bytes, meta: dict[str, Any]) -> str:
+    def save(self, data: bytes, meta: Dict[str, Any]) -> str:
         """Save binary data with metadata."""
         self._evict_if_needed()
 
diff --git a/runner/arazzo_runner/models.py b/runner/arazzo_runner/models.py
index e94e504..5585adc 100644
--- a/runner/arazzo_runner/models.py
+++ b/runner/arazzo_runner/models.py
@@ -6,13 +6,21 @@ This module defines the data models and enums used by the Arazzo Runner.
 """
 
 from dataclasses import dataclass, field
-from enum import Enum, StrEnum
-from typing import Any, Optional
+from enum import Enum
+from typing import Any, Optional, Dict, List
+
+# StrEnum was added in Python 3.11. Provide a lightweight fallback for older Pythons.
+try:
+    from enum import StrEnum  # type: ignore
+except Exception:  # pragma: no cover - fallback for older Python
+    class StrEnum(str, Enum):
+        def __str__(self) -> str:  # keep behavior similar to stdlib
+            return str(self.value)
 
 from pydantic import BaseModel, ConfigDict, Field
 
-OpenAPIDoc = dict[str, Any]
-ArazzoDoc = dict[str, Any]
+OpenAPIDoc = Dict[str, Any]
+ArazzoDoc = Dict[str, Any]
 
 
 class StepStatus(Enum):
@@ -69,10 +77,10 @@ class WorkflowExecutionResult:
 
     status: WorkflowExecutionStatus
     workflow_id: str
-    outputs: dict[str, Any] = field(default_factory=dict)
-    step_outputs: dict[str, dict[str, Any]] | None = None
-    inputs: dict[str, Any] | None = None
-    error: str | None = None
+    outputs: Dict[str, Any] = field(default_factory=dict)
+    step_outputs: Optional[Dict[str, Dict[str, Any]]] = None
+    inputs: Optional[Dict[str, Any]] = None
+    error: Optional[str] = None
 
 
 @dataclass
@@ -80,12 +88,12 @@ class ExecutionState:
     """Represents the current execution state of a workflow"""
 
     workflow_id: str
-    current_step_id: str | None = None
-    inputs: dict[str, Any] = None
-    step_outputs: dict[str, dict[str, Any]] = None
-    workflow_outputs: dict[str, Any] = None
-    dependency_outputs: dict[str, dict[str, Any]] = None
-    status: dict[str, StepStatus] = None
+    current_step_id: Optional[str] = None
+    inputs: Optional[Dict[str, Any]] = None
+    step_outputs: Optional[Dict[str, Dict[str, Any]]] = None
+    workflow_outputs: Optional[Dict[str, Any]] = None
+    dependency_outputs: Optional[Dict[str, Dict[str, Any]]] = None
+    status: Optional[Dict[str, StepStatus]] = None
     runtime_params: Optional["RuntimeParams"] = None
 
     def __post_init__(self):
@@ -105,9 +113,9 @@ class ExecutionState:
 class ServerVariable(BaseModel):
     """Represents a variable for server URL template substitution."""
 
-    description: str | None = None
-    default_value: str | None = Field(None, alias="default")
-    enum_values: list[str] | None = Field(None, alias="enum")
+    description: Optional[str] = None
+    default_value: Optional[str] = Field(None, alias="default")
+    enum_values: Optional[List[str]] = Field(None, alias="enum")
 
     model_config = ConfigDict(populate_by_name=True, extra="allow")
 
@@ -116,9 +124,9 @@ class ServerConfiguration(BaseModel):
     """Represents an API server configuration with a templated URL and variables."""
 
     url_template: str = Field(alias="url")
-    description: str | None = None
-    variables: dict[str, ServerVariable] = Field(default_factory=dict)
-    api_title_prefix: str | None = None  # Derived from spec's info.title
+    description: Optional[str] = None
+    variables: Dict[str, ServerVariable] = Field(default_factory=dict)
+    api_title_prefix: Optional[str] = None  # Derived from spec's info.title
 
     model_config = ConfigDict(populate_by_name=True, extra="allow")
 
@@ -128,6 +136,6 @@ class RuntimeParams(BaseModel):
     Container for all runtime parameters that may influence workflow or operation execution.
     """
 
-    servers: dict[str, str] | None = Field(
+    servers: Optional[Dict[str, str]] = Field(
         default=None, description="Server variable overrides for server resolution."
     )

runner/arazzo_runner/executor/operation_finder.py

char0n · 2025-08-28T15:23:14Z

Hi @displague,

Thanks for your contribution! We appreciate you taking the time to work on this feature. We'll review it and get back to you with our thoughts.

Signed-off-by: Marques Johansson <mjohansson@equinix.com>


Copilot

Pull Request Overview

This PR adds support for Arazzo specification sourceDescriptions references in both operationId and operationPath fields. The changes enable the arazzo-runner to properly resolve operations when multiple OpenAPI specs are defined in an Arazzo workflow, addressing confusion when source descriptions references are used.

- Added support for $sourceDescriptions.<name>.<operationId> syntax in operationId fields
- Enhanced operationPath handling to support runtime expressions with sourceDescriptions references
- Implemented braced runtime expression evaluation for operationPath fields

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
runner/arazzo_runner/executor/operation_finder.py	Added sourceDescriptions parsing logic to find_by_id and enhanced operationPath resolution in get_operations_for_workflow
runner/tests/executor/test_operation_finder.py	Added comprehensive test coverage for new sourceDescriptions functionality


Copilot · 2025-08-28T22:11:31Z

runner/arazzo_runner/executor/operation_finder.py

                decoded_path = encoded_path.replace("~1", "/").replace("~0", "~")
+                # Normalize leading slashes: decoded_path may start with multiple slashes
+                # because encoded_path often begins with a leading '/'. Ensure a single leading slash.
+                decoded_path = "/" + decoded_path.lstrip("/")


This path normalization logic is duplicated from existing code and could cause unexpected behavior. The comment mentions "multiple slashes" but the logic strips all leading slashes then adds one back, which could incorrectly modify paths that legitimately start with multiple slashes. Consider extracting this to a dedicated path normalization utility function to ensure consistent behavior across the codebase.
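One possible shape for such a utility is sketched below. The name `decode_pointer_path` is hypothetical, and the function deliberately keeps the same behavior as the snippet under review (collapsing all leading slashes to one), so it centralizes the logic without resolving the concern about paths that legitimately start with multiple slashes.

```python
def decode_pointer_path(encoded_path: str) -> str:
    """Decode a JSON Pointer-escaped path and ensure a single leading slash."""
    # RFC 6901 order matters: unescape "~1" to "/" first, then "~0" to "~".
    decoded = encoded_path.replace("~1", "/").replace("~0", "~")
    # Same normalization as the reviewed code: strip leading slashes, add one back.
    return "/" + decoded.lstrip("/")
```

With a single helper, any later change to the slash handling would apply to every call site at once.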

Copilot · 2025-08-28T22:11:31Z

runner/arazzo_runner/executor/operation_finder.py

+                        (
+                            name
+                            for name, desc in self.source_descriptions.items()
+                            if desc.get("url") and (
+                                desc.get("url") == resolved_left
+                                or resolved_left.endswith(desc.get("url"))
+                                or desc.get("url") in resolved_left
+                                or resolved_left in desc.get("url")
+                            )
+                        ),
+                        None,
+                    )


The URL matching logic is overly permissive and could lead to false positives. Using substring matching (in operator) on both directions could match unintended sources. For example, if one source has URL "api.com" and another has "myapi.com", both would match "api.com". Consider using more precise matching logic such as exact matches or proper URL parsing.
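A minimal sketch of the stricter matching suggested here, using the standard library's `urllib.parse.urlsplit`. The function names and the example URLs in the usage note are hypothetical, and this version only handles absolute URLs that match exactly after normalization; relative references would need to be resolved beforehand.

```python
from typing import Optional
from urllib.parse import urlsplit


def _normalize(url: str) -> tuple[str, str, str]:
    parts = urlsplit(url)
    # Scheme and host are case-insensitive; the path is compared as-is.
    return parts.scheme.lower(), parts.netloc.lower(), parts.path


def find_source_by_url(resolved: str, source_descriptions: dict[str, dict]) -> Optional[str]:
    """Return the name of the source whose url equals `resolved`, else None."""
    target = _normalize(resolved)
    for name, desc in source_descriptions.items():
        url = desc.get("url")
        if url and _normalize(url) == target:
            return name
    return None
```

With sources at `https://api.com/openapi.yaml` and `https://myapi.com/openapi.yaml`, a lookup for the second URL returns only the second source, which avoids the substring false positive described above.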

Copilot · 2025-08-28T22:11:31Z

runner/arazzo_runner/executor/operation_finder.py

+                    eval_state = ExecutionState(workflow_id="__internal__")
+
+                    def _eval_braced(m: re.Match) -> str:
+                        expr = m.group(1)


Creating an ExecutionState with a hardcoded workflow_id "__internal__" for expression evaluation may not provide the proper context for runtime expressions. This could lead to incorrect evaluation results if the expressions depend on actual workflow state. Consider passing the actual workflow context or creating a more appropriate evaluation context.

Suggested change

-                    eval_state = ExecutionState(workflow_id="__internal__")
-
-                    def _eval_braced(m: re.Match) -> str:
-                        expr = m.group(1)
+                    workflow_id = workflow.get("id", "__internal__")
+                    eval_state = ExecutionState(workflow_id=workflow_id)
+
+                    def _eval_braced(m: re.Match) -> str:

char0n · 2025-09-01T11:25:13Z

My local version of python differs and I needed this local change to get pytest to run (excluded from the PR under the presumption that this isn't wanted)

Hi @displague,

This project requires Python 3.11 or higher. You're probably using an older version?

char0n · 2025-09-01T11:27:40Z

I did not strictly abide by the MUST requirement. The existing lookup behavior is used when $sourceDescriptions is not specified (presumably the first matching operationId from the first OpenAPI spec)

That's fine. This can be a lenient fallback: the first matching operation from one of the sources (in the order defined in the Arazzo Document) SHOULD be matched.
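The ordered fallback described above could look roughly like the following sketch. It assumes `source_docs` maps source names to parsed OpenAPI documents and was built in the same order as the Arazzo document's sourceDescriptions; the function name `find_first_operation` is hypothetical and is not the runner's API.

```python
from typing import Any, Optional

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def find_first_operation(
    operation_id: str, source_docs: dict[str, dict[str, Any]]
) -> Optional[tuple[str, str, str]]:
    """Return (source_name, path, method) for the first match, else None."""
    # Dicts preserve insertion order, so sources are visited in document order.
    for source_name, doc in source_docs.items():
        for path, item in doc.get("paths", {}).items():
            for method, operation in item.items():
                if (
                    method in HTTP_METHODS
                    and isinstance(operation, dict)
                    and operation.get("operationId") == operation_id
                ):
                    return source_name, path, method
    return None
```

When two sources define the same operationId, the one listed first wins, which matches the lenient behavior agreed on in this thread.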

Copilot AI review requested due to automatic review settings August 28, 2025 14:05


displague added 4 commits August 28, 2025 17:57

feat: support $sourceDescriptions operationId references

c96335f

Signed-off-by: Marques Johansson <mjohansson@equinix.com>

feat: support sourceDescriptions with curly braces for operationPath refs

a4f46ac

use ExpressionEvaluator in sourceDescriptions evals

fe22044

Signed-off-by: Marques Johansson <mjohansson@equinix.com>

optimize for line-count in handling sourceDescriptions

7691a5b

Signed-off-by: Marques Johansson <mjohansson@equinix.com>

displague force-pushed the sourceDescriptions_refs branch from c4fb15b to 7691a5b Compare August 28, 2025 21:58

optimize readbility and diff-size in sourceDescriptions handling

66ba2e9

Signed-off-by: Marques Johansson <mjohansson@equinix.com>

displague requested a review from Copilot August 28, 2025 22:10

Copilot AI reviewed Aug 28, 2025


char0n added enhancement New feature or request runner labels Aug 29, 2025

char0n self-requested a review August 29, 2025 11:00

char0n self-assigned this Aug 29, 2025

displague wants to merge 5 commits into jentic:main from displague:sourceDescriptions_refs