51 changes: 51 additions & 0 deletions .github/workflows/ci-tests.yaml
@@ -10,6 +10,57 @@ concurrency:
   group: build-${{ github.event.pull_request.number || github.ref }}
   cancel-in-progress: true
 jobs:
+  benchmarks:
+    name: "StreamFlow benchmarks"
+    strategy:
+      matrix:
+        on: ["ubuntu-24.04"]
+        python: ["3.10", "3.11", "3.12", "3.13", "3.14"]
+        include:
+          - on: "macos-15-intel"
+            python: "3.14"
+    runs-on: ${{ matrix.on }}
+    env:
+      TOXENV: ${{ format('py{0}-benchmark', matrix.python) }}
+    steps:
+      - uses: actions/checkout@v6
+      - uses: astral-sh/setup-uv@v7
+        with:
+          version: "0.9.16"
+      - uses: actions/setup-python@v6
+        with:
+          python-version: ${{ matrix.python }}
+      - uses: actions/setup-node@v6
+        with:
+          node-version: "24"
+      - name: "Install Docker (MacOs X)"
+        uses: docker/setup-docker-action@v4
+        env:
+          LIMA_START_ARGS: '--vm-type=vz --mount-type=virtiofs --mount /private/var/folders:w'
+        if: ${{ startsWith(matrix.on, 'macos-') }}
+      - uses: docker/setup-qemu-action@v3
+        if: ${{ startsWith(matrix.on, 'ubuntu-') }}
+      - name: "Install Apptainer"
+        uses: eWaterCycle/setup-apptainer@v2
+        with:
+          apptainer-version: 1.4.2
+        if: ${{ startsWith(matrix.on, 'ubuntu-') }}
+      - name: "Install KinD"
+        uses: helm/kind-action@v1.14.0
+        with:
+          config: .github/kind/config.yaml
+          kubectl_version: v1.35.0
+          version: v0.31.0
+        if: ${{ startsWith(matrix.on, 'ubuntu-') }}
+      - name: "Configure Calico on KinD"
+        run: |
+          kubectl apply -f https://docs.projectcalico.org/v3.25/manifests/calico.yaml
+          kubectl -n kube-system set env daemonset/calico-node FELIX_IGNORELOOSERPF=true
+        if: ${{ startsWith(matrix.on, 'ubuntu-') }}
+      - name: "Install Tox"
+        run: uv tool install tox --with tox-uv
+      - name: "Run StreamFlow benchmarks via Tox"
+        run: tox
   code-ql-check:
     name: "StreamFlow CodeQL check"
     runs-on: ubuntu-24.04
3 changes: 2 additions & 1 deletion .gitignore
@@ -131,7 +131,8 @@ dmypy.json
 .DS_Store
 
 # StreamFlow
-.streamflow
+.benchmarks/
+.streamflow/
 
 #SQLite
 *.db-shm
245 changes: 245 additions & 0 deletions AGENTS.md
@@ -0,0 +1,245 @@
# StreamFlow Agent Guidelines

This document provides essential guidelines for AI coding agents working on the StreamFlow codebase.

## Project Overview

StreamFlow is a container-native Workflow Management System (WMS) written in Python 3 (versions 3.10-3.14). It implements the Common Workflow Language (CWL) standard (v1.0-v1.3) for multi-cloud/HPC hybrid workflow executions.

**Key Architecture:**
- **Deployment** → **Service** → **Location** (hierarchical execution units)
- Supports multiple connectors: local, docker, kubernetes, ssh, slurm, pbs, singularity, etc.

## Setup & Installation

```bash
# Clone and install dependencies
git clone git@github.com:alpha-unito/streamflow.git
cd streamflow
uv sync --all-extras
```

## Essential Commands

### Testing
```bash
# Run all tests
uv run make test

# Run specific test file
uv run pytest tests/test_file.py

# Run single test function
uv run pytest tests/test_file.py::test_function_name

# Run tests with coverage
uv run make testcov

# Test specific connectors only (all tested in CI)
uv run pytest --deploys local,docker tests/test_remotepath.py
```

**Requirements:** Docker (for most connector tests), Singularity/Apptainer, Kubernetes (minikube)

### Linting & Formatting (REQUIRED BEFORE COMMIT)
```bash
# Check all (must pass before committing)
uv run make format-check flake8 codespell-check

# Auto-fix formatting
uv run make format codespell

# Apply pyupgrade for Python 3.10+ compatibility
uv run make pyupgrade
```

## Mandatory Agent Behavior

All agents **MUST** adhere to these non-negotiable rules:

### Package & Dependency Management (MANDATORY)

**MUST** obtain explicit user permission before installing packages or updating dependencies. Specify what is being installed/updated, why, and await confirmation before proceeding.

### Git Commit Requirements (MANDATORY)

**MUST** follow this sequence before committing:

1. Propose a commit message following the "Git Commit Message Guidelines"
2. Present it to the user and request confirmation
3. Allow user modifications
4. Proceed **only after** explicit user approval

### Pre-Commit Checks (MANDATORY)

**MUST** run `uv run make format-check flake8 codespell-check` and ensure all checks pass before committing. Fix any failures and re-run checks. **MUST NOT** commit if any checks fail.
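
The gate above can be sketched as a small helper that runs the check command and reports success only on a zero exit status. This is a minimal illustration, not part of the StreamFlow tooling: `CHECK_COMMAND` and `checks_pass` are hypothetical names, and the demonstration at the bottom substitutes trivial Python subprocesses for the real `make` targets.

```python
import subprocess
import sys
from collections.abc import Sequence

# The command quoted in the rule above (hypothetical constant name).
CHECK_COMMAND = ("uv", "run", "make", "format-check", "flake8", "codespell-check")


def checks_pass(command: Sequence[str] = CHECK_COMMAND) -> bool:
    """Run the check command and report whether it exited with status 0."""
    completed = subprocess.run(list(command), check=False)
    return completed.returncode == 0


# Demonstration with stand-in commands instead of the real make targets.
passing = checks_pass([sys.executable, "-c", "raise SystemExit(0)"])
failing = checks_pass([sys.executable, "-c", "raise SystemExit(1)"])
```

An agent would commit only when `checks_pass()` returns `True`, and otherwise fix the failures and re-run.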

## Code Style Guidelines

**Target:** Python 3.10-3.14 | **Line length:** 88 chars | **Format:** Black + isort | **Exclude:** `streamflow/cwl/antlr`

### Import Organization
```python
from __future__ import annotations  # Always first

import asyncio
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Any

from typing_extensions import Self  # Third-party

from streamflow.core.context import StreamFlowContext  # Local
from streamflow.log_handler import logger

if TYPE_CHECKING:  # Avoid circular imports
    from streamflow.core.data import DataManager
```

### Type Hints & Async
```python
# Always use type hints
def process_data(
    self,
    config: MutableMapping[str, Any],
    items: MutableSequence[str],
) -> dict[str, Any]:
    pass

# Use Self for classmethods
@classmethod
async def load(cls, context: StreamFlowContext) -> Self:
    pass

# Proper async cleanup
async def close(self) -> None:
    try:
        await asyncio.gather(
            asyncio.create_task(self.manager.close()),
            asyncio.create_task(self.scheduler.close()),
        )
    except Exception as e:
        logger.exception(e)
    finally:
        await self.database.close()
```
```
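
The cleanup pattern above can be exercised in isolation to see why the `finally` clause matters. The sketch below is self-contained and uses hypothetical stand-ins (`FakeService`, `FakeContext`) rather than real StreamFlow classes; it records errors in a list where the guideline would call `logger.exception`.

```python
import asyncio


class FakeService:
    """Hypothetical stand-in for a component that exposes an async close()."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail
        self.closed = False

    async def close(self) -> None:
        if self.fail:
            raise RuntimeError(f"{self.name} failed to close")
        self.closed = True


class FakeContext:
    """Hypothetical owner of three services, closed with the pattern above."""

    def __init__(
        self, manager: FakeService, scheduler: FakeService, database: FakeService
    ) -> None:
        self.manager = manager
        self.scheduler = scheduler
        self.database = database
        self.errors: list[str] = []

    async def close(self) -> None:
        try:
            await asyncio.gather(
                asyncio.create_task(self.manager.close()),
                asyncio.create_task(self.scheduler.close()),
            )
        except Exception as e:
            # The guideline logs here; the sketch records the message instead.
            self.errors.append(str(e))
        finally:
            # Runs whether or not the gathered tasks raised.
            await self.database.close()


async def demo() -> FakeContext:
    context = FakeContext(
        manager=FakeService("manager", fail=True),
        scheduler=FakeService("scheduler"),
        database=FakeService("database"),
    )
    await context.close()
    return context


context = asyncio.run(demo())
```

Even though the manager fails to close, the database is still closed and the failure is captured rather than propagated.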

### Naming & Error Handling
- **Classes:** `PascalCase` | **Functions:** `snake_case` | **Constants:** `UPPER_SNAKE_CASE`
- **Private:** `_method_name` | **Type vars:** `_KT`, `_VT`

```python
# Use custom exceptions from streamflow.core.exception
from streamflow.core.exception import WorkflowExecutionException
from streamflow.log_handler import logger

try:
    result = await process()
except SpecificException as e:
    logger.exception(e)
    raise WorkflowExecutionException(f"Failed: {e}") from e
```
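
The `raise ... from e` form preserves the original error as `__cause__`, so the traceback shows both the low-level failure and the workflow-level one. A minimal runnable sketch, assuming a local stand-in for `WorkflowExecutionException` (the real class lives in `streamflow.core.exception`) and a hypothetical `parse_retries` helper:

```python
class WorkflowExecutionException(Exception):
    """Local stand-in for the class in streamflow.core.exception."""


def parse_retries(value: str) -> int:
    """Hypothetical helper: convert a setting to int, wrapping any failure."""
    try:
        return int(value)
    except ValueError as e:
        # Chain the original error so it stays available as __cause__.
        raise WorkflowExecutionException(f"Failed: {e}") from e


try:
    parse_retries("three")
except WorkflowExecutionException as err:
    message = str(err)
    cause = err.__cause__
```

Here `cause` is the original `ValueError`, which a caller or a log handler can still inspect.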

**Available exceptions:** `ProcessorTypeError`, `WorkflowException`, `WorkflowDefinitionException`, `WorkflowExecutionException`, `WorkflowProvenanceException`, `FailureHandlingException`, `InvalidPluginException`

### Documentation (American English, reStructuredText)
```python
def process_workflow(workflow: Workflow, config: dict[str, Any]) -> bool:
    """
    Process a workflow with the given configuration.

    :param workflow: The workflow to process
    :param config: Configuration dictionary for processing
    :returns: True if processing succeeded, False otherwise
    :raises WorkflowExecutionException: If workflow processing fails
    """
    pass
```

### Testing (REQUIRED for new features/bugfixes)
```python
# Use pytest with async support
async def test_workflow_execution(context: StreamFlowContext) -> None:
    """Test basic workflow execution."""
    workflow = await build_workflow(context)
    result = await workflow.execute()
    assert result.status == "completed"
```

**Coverage:** https://app.codecov.io/gh/alpha-unito/streamflow

## Git Commit Message Guidelines

**Format:**
```
<type>(<scope>): <subject>

<body>
```

**Types:** `Add`, `Fix`, `Refactor`, `Update`, `Remove`, `Bump`, `Docs`, `Test`, `Chore`

**Rules:**
- **Subject:** Imperative mood, capitalize, no period, max 50 chars
- **Scope (optional):** Module/component (e.g., `cwl`, `deployment`, `scheduling`)
- **Body (required):** Explain *what* and *why* (not *how*), wrap at 72 chars, separate with blank line, include issue refs (e.g., `Fixes #123`). Exception: trivial changes like typo fixes.
- **Language:** American English

**Examples:**
```
Add restore method to DataManager

Implement restore method to enable workflow recovery from checkpoints.
This allows jobs to resume from the last completed step.

Fix SSH connector authentication timeout

Increase default timeout for SSH authentication from 5s to 30s to handle
slow networks and high-latency connections. Fixes #931.

Bump kubernetes-asyncio from 33.3.0 to 34.3.3
```
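
The mechanical parts of these rules (subject length, capitalization, trailing period, blank separator line, body wrap) can be checked automatically before proposing a message. The sketch below is a hypothetical helper, not an existing StreamFlow tool; it does not attempt to verify imperative mood or whether a body is required.

```python
def check_commit_message(message: str) -> list[str]:
    """Return the rule violations found; an empty list means none were found."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    if not subject:
        return ["subject is missing"]

    problems: list[str] = []
    if len(subject) > 50:
        problems.append("subject exceeds 50 characters")
    if subject.endswith("."):
        problems.append("subject ends with a period")
    if not subject[0].isupper():
        problems.append("subject is not capitalized")
    # The second line, when present, must be blank to separate subject and body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("subject and body are not separated by a blank line")
    if any(len(line) > 72 for line in lines[2:]):
        problems.append("body line exceeds 72 characters")
    return problems


good = check_commit_message(
    "Add restore method to DataManager\n"
    "\n"
    "Implement restore method to enable workflow recovery from checkpoints."
)
bad = check_commit_message("fix typo.")
```

In this sketch `good` is an empty list, while `bad` reports the trailing period and the lowercase first letter.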

## Common Workflows

**Adding a feature:**
1. Write tests first in `tests/`
2. Implement feature with type hints and docstrings
3. Run `uv run make format` to auto-format
4. Run `uv run make format-check flake8 codespell-check`
5. Run `uv run pytest` to verify tests pass
6. Update docs if needed
7. Commit with proper message

**Fixing a bug:**
1. Add regression test in `tests/`
2. Fix the bug
3. Follow linting/formatting guidelines
4. Verify with tests
5. Commit with proper message

## Key Project Structure

```
streamflow/
├── core/ # Abstractions (context, deployment, exception, workflow)
├── cwl/ # CWL implementation (v1.0-v1.3)
├── deployment/ # Connectors (docker, k8s, ssh, slurm, pbs, singularity)
├── workflow/ # Workflow execution engine
├── data/ # Data management
├── persistence/ # Database (SQLite)
├── scheduling/ # Scheduling policies
├── recovery/ # Checkpointing/fault tolerance
└── ext/ # Plugin system
tests/ # Pytest test suite
docs/ # Sphinx documentation
```

## Quick Reference

**Extension Points:** Connector, BindingFilter, CWLDockerTranslator, Scheduler, Database, DataManager, CheckpointManager, FailureManager

**CWL Conformance:** `./cwl-conformance-test.sh` (supports VERSION, DOCKER, EXCLUDE env vars)

**Documentation:** `uv run make html` | Update checksum: `cd docs && uv run make checksum`

**Resources:** [Website](https://streamflow.di.unito.it/) | [Docs](https://streamflow.di.unito.it/documentation/0.2/) | [GitHub](https://github.com/alpha-unito/streamflow) | [Contributing](CONTRIBUTING.md)
19 changes: 12 additions & 7 deletions Makefile
@@ -1,3 +1,8 @@
+.PHONY: benchmark codespell codespell-check coverage-report flake8 format format-check pyupgrade test testcov typing
+
+benchmark:
+	python -m pytest benchmark -rs ${PYTEST_EXTRA}
+
 codespell:
 	codespell -w $(shell git ls-files | grep -v streamflow/cwl/antlr)
 
@@ -11,24 +16,24 @@ coverage-report: testcov
 	coverage report
 
 flake8:
-	flake8 --exclude streamflow/cwl/antlr streamflow tests
+	flake8 --exclude streamflow/cwl/antlr streamflow tests benchmark
 
 format:
-	isort streamflow tests
-	black --target-version py310 streamflow tests
+	isort streamflow tests benchmark
+	black --target-version py310 streamflow tests benchmark
 
 format-check:
-	isort --check-only streamflow tests
-	black --target-version py310 --diff --check streamflow tests
+	isort --check-only streamflow tests benchmark
+	black --target-version py310 --diff --check streamflow tests benchmark
 
 pyupgrade:
 	pyupgrade --py3-only --py310-plus $(shell git ls-files | grep .py | grep -v streamflow/cwl/antlr)
 
 test:
-	python -m pytest -rs ${PYTEST_EXTRA}
+	python -m pytest tests -rs ${PYTEST_EXTRA}
 
 testcov:
-	python -m pytest -rs --cov --junitxml=junit.xml -o junit_family=legacy --cov-report= ${PYTEST_EXTRA}
+	python -m pytest tests -rs --cov --junitxml=junit.xml -o junit_family=legacy --cov-report= ${PYTEST_EXTRA}
 
 typing:
 	mypy streamflow tests
Empty file added benchmark/__init__.py
Empty file.