diff --git a/.claude/skills/docs/SKILL.md b/.claude/skills/docs/SKILL.md new file mode 100644 index 000000000..1e5255223 --- /dev/null +++ b/.claude/skills/docs/SKILL.md @@ -0,0 +1,162 @@ +--- +name: Documentation Writing +description: This skill should be used when the user asks to "create documentation", "add docs", "update documentation", "write user guide", "document commands", "verify commands in docs", "fix hallucinated commands", or when creating/editing .rst files in docs/source/. Ensures all commands are tested before documenting to prevent hallucinations. +version: 1.1.0 +--- + +# Documentation Writing Skill + +## Purpose + +Ensure accurate technical documentation for StreamFlow. **ZERO TOLERANCE for command hallucinations** - verify every command, flag, and output before documenting. + +## Critical Principle + +**TEST EVERY COMMAND BEFORE DOCUMENTING** + +Documentation is NOT complete until all commands are verified. + +## Documentation Structure + +``` +docs/source/ +├── user-guide/ # Tutorial-style guides +├── developer-guide/ # Contributor docs (TODO) +└── reference/ # Technical specs, CLI, schemas +``` + +**Section-Specific Guidelines:** +- **User Guide:** See [user-guide/SKILL.md](user-guide/SKILL.md) - Tutorial writing, dependency ordering, progressive examples +- **Reference:** See [reference/SKILL.md](reference/SKILL.md) - CLI docs, schema-driven config, lookup content + +## Mandatory Verification Workflow + +### Before Writing ANY Documentation + +1. **Test Command Exists:** `uv run streamflow --help` +2. **Test Flags:** `uv run streamflow run --help | grep -i debug` +3. **Capture Actual Output:** Use EXACT output in documentation +4. **Document Limitations:** Note platform/Docker requirements + +### Common Hallucination Traps + +**NEVER assume commands exist - verify first!** + +See `references/verified-commands.md` for the complete list. + +## File Requirements + +Every file must include: + +```rst +============== +Document Title +============== + +.. 
meta:: + :description: Brief description + :keywords: keyword1, keyword2 + :audience: users|developers + :difficulty: beginner|intermediate|advanced + +**Prerequisites:** + +* :doc:`prerequisite1` + +**What You'll Learn:** + +* Learning objective 1 +``` + +**Writing Style:** See `references/style-guide.md` + +- Imperative mood: "Run the command" (not "You should run") +- Technical and concise +- Show actual outputs +- Document edge cases + +## Complete Verification Process + +### Phase 1: Identify Commands + +```bash +grep -nE '^\s*(\$ )?(streamflow|docker)' docs/source/file.rst +``` + +### Phase 2: Test Each Command + +```bash +uv run streamflow --help +uv run streamflow run --help +uv run cwl-runner --help # Separate command! +``` + +### Phase 3: Write Documentation + +Use ONLY verified commands. No assumptions. + +### Phase 4: Final Verification + +```bash +# Build documentation and save the build output +cd docs +uv run make clean && uv run make html 2>&1 | tee build.log + +# Check for errors +grep -i "error" build.log +``` + +## Quality Checklist + +**Accuracy:** +- [ ] All commands tested with `--help` +- [ ] All flags verified +- [ ] All outputs match actual output +- [ ] Platform limitations documented + +**Completeness:** +- [ ] Meta tags included +- [ ] Prerequisites listed +- [ ] Cross-references added + +**Style:** +- [ ] Technical style +- [ ] Imperative mood +- [ ] Unified command format ($ prompt) + +**Build:** +- [ ] `make clean && make html` succeeds +- [ ] No errors in build output + +## Emergency Fix Protocol + +If a hallucinated command is discovered: + +1. **STOP** - Do not proceed +2. **Find correct command** - Test alternatives +3. **Find ALL instances** - `grep -rn "wrong-command" docs/source/` +4. **Fix ALL instances** +5. **Test the fix** +6. **Rebuild docs** +7. 
**Document in commit** + +## Additional Resources + +**Reference Files:** +- `references/verified-commands.md` - Complete command reference +- `references/style-guide.md` - Detailed style guidelines + +**Section-Specific Skills:** +- `user-guide/SKILL.md` - Tutorial-oriented docs +- `reference/SKILL.md` - Lookup-oriented docs + +## Remember + +**Task is NOT complete until:** + +1. ✅ Every command tested and verified +2. ✅ Every flag confirmed to exist +3. ✅ Every output matches reality +4. ✅ Documentation builds without errors + +**If in doubt, TEST IT. NO HALLUCINATIONS. EVER.** diff --git a/.claude/skills/docs/reference/SKILL.md b/.claude/skills/docs/reference/SKILL.md new file mode 100644 index 000000000..25c7c3b31 --- /dev/null +++ b/.claude/skills/docs/reference/SKILL.md @@ -0,0 +1,257 @@ +--- +name: Reference Documentation +description: This skill should be used when the user asks to "document CLI command", "create reference docs", "document connector", "write API reference", "document configuration schema", or when editing files in docs/source/reference/. Provides lookup-oriented documentation with schema-driven configuration and comprehensive technical specifications. +version: 1.1.0 +parent_skill: ../SKILL.md +--- + +# Reference Documentation Subskill + +## Purpose + +Reference documentation is **lookup-oriented** - providing complete, accurate technical specifications for quick consultation. + +## Parent Skill + +Extends [Documentation Writing Skill](../SKILL.md). 
**All rules apply**, especially: +- **TEST EVERY COMMAND** before documenting +- **ZERO TOLERANCE** for hallucinations + +## Reference Structure + +``` +docs/source/reference/ +├── cli/ # run, list, report, prov, plugin, ext, schema, cwl-runner +├── configuration/ # streamflow-yml, workflow-config, deployment-config, binding-config +├── connectors/ # docker, singularity, kubernetes, ssh, slurm, pbs (FLAT) +├── cwl-docker-translators/ # docker, kubernetes, singularity, no-container +└── glossary.rst +``` + +## Critical: Use Schema Directive + +**MANDATORY:** Use `sphinx-jsonschema` for ALL configuration options. + +```rst +Configuration +============= + +.. jsonschema:: https://streamflow.di.unito.it/schemas/deployment/connector/docker.json + :lift_description: true +``` + +**NEVER** manually create configuration tables. Schemas are source of truth. + +**Schema URLs:** +- **Connectors:** `https://streamflow.di.unito.it/schemas/deployment/connector/{name}.json` +- **CWL Docker Translators:** `https://streamflow.di.unito.it/schemas/cwl/requirement/docker/{name}.json` + +## File Requirements + +Every Reference file MUST include: + +1. **Meta tags** - Keywords, description +2. **Overview** - Brief introduction +3. **Quick Reference** - Summary table +4. **Examples** - Practical usage (BEFORE configuration) +5. **Configuration** - Schema directive (AT END, before Related Docs) +6. **Related Documentation** - Cross-references + +See `references/templates.md` for complete templates. + +## Standard Section Order + +### For Connector/Translator Pages + +1. **Title, Meta, Overview** +2. **Quick Reference** - Essential info table +3. **Examples** - Basic → advanced +4. **Prerequisites** +5. **Platform Support** +6. **Configuration** - Schema-driven (AT END) +7. **Related Documentation** + +### For CLI Pages + +1. **Title, Meta, Overview** +2. **Synopsis** - Command syntax +3. **Arguments** +4. **Options** - All flags from --help +5. **Examples** +6. 
**Exit Status** (if applicable) +7. **Related Documentation** + +### For Configuration Pages + +1. **Title, Meta, Overview** +2. **File Structure** +3. **Schema** - jsonschema directive +4. **Examples** +5. **Validation** +6. **Related Documentation** + +## Quick Reference Tables + +```rst +Quick Reference +=============== + +============ ==================================== +Type ``connector-type`` +Category Container|Cloud|HPC +Scalability Single node|Multi-node|Cluster +Best For Use case description +============ ==================================== +``` + +## Synopsis Format + +```rst +Synopsis +======== + +.. code-block:: text + + streamflow run [OPTIONS] FILE + streamflow list [OPTIONS] [NAME] +``` + +## Option Format + +```rst +``--flag-name VALUE`` + Description. Include default if applicable. + + **Type:** string|integer|boolean + **Default:** value (if applicable) + **Required:** Yes|No +``` + +## Verification Process + +### CLI Commands + +```bash +# 1. Get actual help text +uv run streamflow run --help > /tmp/run-help.txt + +# 2. Document ONLY what's in help +cat /tmp/run-help.txt + +# 3. Test every flag +uv run streamflow run --debug streamflow.yml +uv run streamflow run --quiet streamflow.yml +``` + +### Configuration Schemas + +```bash +# 1. Verify schema URL exists +curl -I https://streamflow.di.unito.it/schemas/deployment/connector/docker.json + +# 2. Test jsonschema renders +cd docs +uv run make clean && uv run make html + +# 3. Check HTML output +open build/html/reference/connectors/docker.html +``` + +### Connectors + +```bash +# 1. Test connector works +cat > test-streamflow.yml << 'EOF' +version: v1.0 +deployments: + test: + type: docker + config: + image: alpine:latest +EOF + +uv run streamflow run test-streamflow.yml + +# 2. 
Document only tested configurations +``` + +## Writing Style + +### DO +- **Be comprehensive** - Document every option +- **Be precise** - Exact syntax, types, defaults +- **Use tables** - Quick Reference for scanning +- **Show examples** - Simple → complex +- **Link to schemas** - Use jsonschema directive +- **Organize logically** - Group related options + +### DON'T +- **Explain how to use** - That's User Guide +- **Manual config tables** - Use jsonschema +- **Assume knowledge** - Define all terms +- **Skip options** - Document everything from --help + +## Cross-Reference Strategy + +**From User Guide to Reference:** +```rst +For complete options, see :doc:`/reference/connectors/docker`. +For all CLI flags, see :doc:`/reference/cli/run`. +``` + +**From Reference to User Guide:** +```rst +For tutorial, see :doc:`/user-guide/configuring-deployments`. +``` + +**Within Reference:** +```rst +See :doc:`streamflow-yml` for file structure. +See :doc:`../cli/run` for execution commands. +``` + +## Quality Checklist + +**Accuracy:** +- [ ] CLI flags verified with --help +- [ ] Configuration uses jsonschema directive +- [ ] Examples tested and produce results +- [ ] Schema URLs valid + +**Completeness:** +- [ ] Every --help option documented +- [ ] Configuration at end (before Related Docs) +- [ ] Quick Reference table included +- [ ] Platform Support documented + +**Structure:** +- [ ] Follows standard section order +- [ ] Examples before Configuration +- [ ] Configuration uses jsonschema only +- [ ] No manual config tables + +**Validation:** +- [ ] jsonschema renders in HTML +- [ ] Cross-references work +- [ ] Build produces no errors + +## Additional Resources + +**Templates:** +- `references/templates.md` - Complete file templates + +**Parent Skill:** +- `../SKILL.md` - Main documentation skill + +## Remember + +**Reference = Lookup** + +Users find specific technical details quickly: +- Quick Reference tables +- Exact syntax in Synopsis +- Complete details from 
jsonschema +- Working examples + +**If it's not in `--help` or JSON schema, don't document it.** +**If you haven't tested it, don't claim it works.** diff --git a/.claude/skills/docs/reference/references/templates.md b/.claude/skills/docs/reference/references/templates.md new file mode 100644 index 000000000..e5fb0e41f --- /dev/null +++ b/.claude/skills/docs/reference/references/templates.md @@ -0,0 +1,372 @@ +# Reference Documentation Templates + +This reference contains complete templates for Reference documentation files. + +## CLI Command Reference Template + +```rst +============== +streamflow run +============== + +.. meta:: + :keywords: StreamFlow, CLI, run, workflow, execution + :description: Run StreamFlow workflows - complete command reference + +Overview +======== + +Execute a StreamFlow workflow defined in a configuration file. + +Synopsis +======== + +.. code-block:: text + + streamflow run [OPTIONS] FILE + +Arguments +========= + +``FILE`` + Path to StreamFlow configuration file (``streamflow.yml``) + +Options +======= + +``--debug`` + Enable debug logging output + +``--quiet`` + Suppress non-error output + +``--outdir DIR`` + Output directory for workflow results + + **Type:** string + **Default:** Current directory + +[Continue with ALL options from --help] + +Examples +======== + +Basic Execution +--------------- + +.. code-block:: bash + + $ streamflow run streamflow.yml + + [ACTUAL TESTED OUTPUT] + +With Debug Logging +------------------ + +.. 
code-block:: bash + + $ streamflow run --debug streamflow.yml + + [ACTUAL TESTED OUTPUT] + +Exit Status +=========== + +**0** - Success +**1** - Error during execution +**2** - Invalid arguments + +Related Documentation +===================== + +**User Guide:** + - :doc:`/user-guide/running-workflows` - Workflow execution guide + - :doc:`/user-guide/troubleshooting` - Troubleshooting execution + +**Reference:** + - :doc:`/reference/configuration/streamflow-yml` - Configuration file reference + - :doc:`list` - List workflows +``` + +## Connector Reference Template + +```rst +================ +Docker Connector +================ + +.. meta:: + :keywords: StreamFlow, docker, container, deployment + :description: Docker connector reference for StreamFlow + +Overview +======== + +The Docker connector executes workflow tasks in Docker containers, providing isolation, reproducibility, and portability for local and single-node deployments. + +Quick Reference +=============== + +============ ==================================== +Type ``docker`` +Category Container +Scalability Single host +Best For Local development, CI/CD pipelines +============ ==================================== + +Examples +======== + +Basic Docker Deployment +----------------------- + +.. code-block:: yaml + :caption: streamflow.yml - Minimal Docker + + deployments: + docker-python: + type: docker + config: + image: python:3.10 + +With Volume Mounts +------------------ + +.. code-block:: yaml + :caption: streamflow.yml - With volumes + + deployments: + docker-volumes: + type: docker + config: + image: ubuntu:22.04 + volume: + - /host/data:/container/data:ro + - /host/output:/container/output:rw + +With Resource Limits +-------------------- + +.. 
code-block:: yaml + :caption: streamflow.yml - Resource limits + + deployments: + docker-limited: + type: docker + config: + image: python:3.10 + memory: 8g + cpus: 4 + +Prerequisites +============= + +* Docker installed and running +* User has Docker permissions (member of ``docker`` group on Linux) +* Required images available or pullable from registry + +Platform Support +================ + +**Linux:** Full support +**macOS:** Full support +**Windows:** Not supported + +.. note:: + StreamFlow only supports Linux and macOS. Windows is not supported. + +Configuration +============= + +.. jsonschema:: https://streamflow.di.unito.it/schemas/deployment/connector/docker.json + :lift_description: true + +Related Documentation +===================== + +**User Guide:** + - :doc:`/user-guide/configuring-deployments` - Deployment configuration guide + - :doc:`/user-guide/binding-workflows` - Binding workflows to deployments + - :doc:`/user-guide/troubleshooting` - Docker troubleshooting + +**Connectors:** + - :doc:`index` - All connectors + - :doc:`singularity` - For HPC container deployments + - :doc:`kubernetes` - For container orchestration + +**External Resources:** + - :doc:`/reference/cwl-docker-translators/docker` - CWL Docker translator configuration +``` + +## Configuration File Reference Template + +```rst +=================== +The StreamFlow File +=================== + +.. meta:: + :keywords: StreamFlow, configuration, YAML, streamflow.yml + :description: Main StreamFlow configuration file reference + +Overview +======== + +The ``streamflow.yml`` file is the main configuration file for StreamFlow. It defines workflows, deployments, and bindings in a single YAML document. + +File Structure +============== + +.. code-block:: yaml + + version: v1.0 + + workflows: + # Workflow definitions + + deployments: + # Deployment configurations + + bindings: + # Step-to-deployment bindings + +Schema +====== + +.. 
jsonschema:: https://streamflow.di.unito.it/schemas/config/v1.0/config_schema.json + :lift_description: true + +Examples +======== + +Complete Configuration +---------------------- + +.. code-block:: yaml + :caption: streamflow.yml - Complete example + + version: v1.0 + + workflows: + my-workflow: + type: cwl + config: + file: workflow.cwl + + deployments: + docker-python: + type: docker + config: + image: python:3.10 + + bindings: + - step: /process + target: + deployment: docker-python + +Validation +========== + +Validate configuration: + +.. code-block:: bash + + $ streamflow schema + +Related Documentation +===================== + +**User Guide:** + - :doc:`/user-guide/quickstart` - First configuration + - :doc:`/user-guide/configuring-deployments` - Deployment configuration + +**Reference:** + - :doc:`workflow-config` - Workflow configuration + - :doc:`deployment-config` - Deployment configuration + - :doc:`binding-config` - Binding configuration +``` + +## CWL Docker Translator Template + +```rst +================= +Docker Translator +================= + +.. meta:: + :keywords: StreamFlow, CWL, Docker, translator + :description: Docker translator for CWL DockerRequirement + +Overview +======== + +The **Docker Translator** is the default CWL Docker Translator in StreamFlow. It instantiates a Docker deployment for every CWL ``DockerRequirement`` specification. + +Examples +======== + +Default Behavior +---------------- + +No ``streamflow.yml`` configuration needed - Docker is used automatically. + +.. code-block:: yaml + :caption: workflow.cwl - With DockerRequirement + + class: CommandLineTool + requirements: + DockerRequirement: + dockerPull: python:3.10 + +Custom Docker Configuration +--------------------------- + +Override Docker settings: + +.. code-block:: yaml + :caption: streamflow.yml - Custom Docker config + + cwl: + docker: + type: docker + config: + memory: 8g + cpus: 4 + +Configuration +============= + +.. 
jsonschema:: https://streamflow.di.unito.it/schemas/cwl/requirement/docker/docker.json + :lift_description: true + +Related Documentation +===================== + +**User Guide:** + - :doc:`/user-guide/writing-workflows` - CWL workflow guide + +**Connectors:** + - :doc:`/reference/connectors/docker` - Docker connector reference + +**CWL Translators:** + - :doc:`index` - All CWL translators + - :doc:`kubernetes` - Kubernetes translator + - :doc:`singularity` - Singularity translator +``` + +## Glossary Entry Template + +```rst +Term +---- + +**Definition:** Brief one-sentence definition + +**Usage:** How the term is used in StreamFlow context + +**Example:** Code example or usage example + +**Related Terms:** :term:`Related Term 1`, :term:`Related Term 2` + +**See Also:** :doc:`/reference/related-doc` +``` diff --git a/.claude/skills/docs/references/style-guide.md b/.claude/skills/docs/references/style-guide.md new file mode 100644 index 000000000..d9e07308a --- /dev/null +++ b/.claude/skills/docs/references/style-guide.md @@ -0,0 +1,150 @@ +# Documentation Style Guide + +This reference contains detailed writing style guidelines for StreamFlow documentation. 
+ +## Writing Style Principles + +### DO + +- **Use imperative mood:** "Run the command" (not "You should run") +- **Be technical and concise:** No fluff or marketing language +- **Show actual command outputs:** Use real, tested examples +- **Document edge cases:** Include limitations and requirements +- **Use American English:** Consistent spelling and grammar +- **Include captions:** All code blocks need `:caption:` + +### DON'T + +- **Use blog-style language:** No "Let's dive in!", emojis, or exclamation marks +- **Add quick reference tables with Time/Difficulty:** Not needed +- **Assume command syntax:** Always verify with --help +- **Skip platform limitations:** Document OS requirements +- **Use placeholders:** Show real, runnable examples + +## Command Output Format + +Use unified command format with `$` prompt: + +```rst +.. code-block:: bash + + $ streamflow version + + StreamFlow version 0.2.0 +``` + +## Code Block Captions + +Always include captions for context: + +```rst +.. code-block:: yaml + :caption: streamflow.yml - Docker deployment + + deployments: + docker-local: + type: docker +``` + +## Cross-Reference Style + +### Within Same Section + +```rst +See :doc:`other-document` for details. +``` + +### To Other Sections + +```rst +See :doc:`/user-guide/installation` for setup. +See :doc:`/reference/cli/run` for all flags. +``` + +### External Links + +```rst +See `Docker Documentation `_ for more. +``` + +## Platform Support Format + +Always document OS support: + +```rst +Platform Support +================ + +**Linux:** Full support +**macOS:** Full support +**Windows:** Not supported + +.. note:: + StreamFlow only supports Linux and macOS. Windows is not supported. +``` + +## Note and Warning Directives + +Use Sphinx directives for important information: + +```rst +.. note:: + + This command requires Docker running. + +.. warning:: + + This will delete all data. + +.. important:: + + Back up your data first. 
+``` + +## Dependency-Based Ordering + +Documents must be ordered by prerequisites: + +**Correct Order:** +1. Documents with no prerequisites first +2. Documents requiring earlier documents next +3. Advanced topics last + +**Example:** +- Installation (no prerequisites) +- Quickstart (requires Installation) +- Writing Workflows (requires Installation) +- Running Workflows (requires Writing Workflows + Deployments) + +## Meta Tags Format + +Every file needs meta tags for AI optimization: + +```rst +.. meta:: + :keywords: keyword1, keyword2, keyword3 + :description: Brief 1-2 sentence description + :audience: users|developers + :difficulty: beginner|intermediate|advanced + :reading_time_minutes: 10 +``` + +## File Structure Template + +```rst +============== +Document Title +============== + +.. meta:: + :description: Brief description + :keywords: keyword1, keyword2, keyword3 + +[Prerequisites section - if applicable] + +[What You'll Learn section - if applicable] + +[Main content sections] + +[Related Documentation section] +``` diff --git a/.claude/skills/docs/references/verified-commands.md b/.claude/skills/docs/references/verified-commands.md new file mode 100644 index 000000000..eb9e55b77 --- /dev/null +++ b/.claude/skills/docs/references/verified-commands.md @@ -0,0 +1,88 @@ +# StreamFlow Verified Commands Reference + +This file contains all verified StreamFlow commands with their exact syntax and flags. 
+ +## Main Commands + +```bash +# Core commands +streamflow version # ✅ Show version +streamflow run FILE # ✅ Run workflow +streamflow list [NAME] # ✅ List workflows +streamflow report WORKFLOWS # ✅ Generate report +streamflow prov WORKFLOW # ✅ Generate provenance +streamflow ext list # ✅ List extensions +streamflow plugin list # ✅ List plugins +streamflow schema # ✅ Show JSON schema +``` + +## Flags for streamflow run + +```bash +--debug # ✅ Debug logging +--quiet # ✅ Minimal output +--color # ✅ Colored output +--outdir DIR # ✅ Output directory +--name NAME # ✅ Workflow name +``` + +## Flags for streamflow report + +```bash +--all # ✅ Include all executions +--format [html,json,...] # ✅ Output format +--outdir DIR # ✅ Output directory +--name NAME # ✅ Report name +``` + +## CWL Runner (Standalone Command) + +```bash +cwl-runner workflow.cwl inputs.yml # ✅ +cwl-runner --streamflow-file config.yml workflow.cwl # ✅ +cwl-runner --debug workflow.cwl inputs.yml # ✅ +``` + +## Common Hallucinations to Avoid + +### ❌ WRONG Commands (Never Document) + +```bash +streamflow --version # Wrong - use: streamflow version +streamflow run --log-level DEBUG # Wrong - use: --debug +streamflow cwl-runner # Wrong - use: cwl-runner (separate!) +streamflow list connectors # Wrong - use: streamflow ext list +streamflow report --execution 0 # Wrong - use: --all +--keep-intermediates # Doesn't exist yet +``` + +### ✅ CORRECT Commands (Verified) + +```bash +streamflow version # Show version +streamflow run FILE --debug # Run with debug logging +cwl-runner workflow.cwl # CWL runner (standalone command) +streamflow ext list # List extensions +streamflow report NAME --all # Include all executions +``` + +## Verification Workflow + +Test every command before documenting: + +```bash +# 1. Check command exists +uv run streamflow --help + +# 2. Check subcommand exists +uv run streamflow run --help + +# 3. Test specific flags +uv run streamflow run --help | grep -i debug + +# 4. 
Test actual execution +uv run streamflow run streamflow.yml --debug + +# 5. Capture output for documentation +uv run streamflow run streamflow.yml > /tmp/output.txt 2>&1 +``` diff --git a/.claude/skills/docs/user-guide/SKILL.md b/.claude/skills/docs/user-guide/SKILL.md new file mode 100644 index 000000000..26a20f831 --- /dev/null +++ b/.claude/skills/docs/user-guide/SKILL.md @@ -0,0 +1,211 @@ +--- +name: User Guide Documentation +description: This skill should be used when the user asks to "write user guide", "create tutorial", "document workflow execution", "write installation guide", "document quickstart", or when editing files in docs/source/user-guide/. Provides tutorial-oriented documentation guidelines with dependency-based ordering and progressive examples. +version: 1.1.0 +parent_skill: ../SKILL.md +--- + +# User Guide Documentation Subskill + +## Purpose + +User Guide documentation is **tutorial-oriented** - teaching users through practical examples and step-by-step instructions. + +## Parent Skill + +Extends [Documentation Writing Skill](../SKILL.md). **All rules apply**, especially: +- **TEST EVERY COMMAND** before documenting +- **ZERO TOLERANCE** for hallucinations + +## User Guide Structure + +``` +docs/source/user-guide/ +├── installation.rst # No prerequisites +├── quickstart.rst # Prerequisites: installation +├── writing-workflows.rst # Prerequisites: installation +├── configuring-deployments.rst # Prerequisites: installation +├── binding-workflows.rst # Prerequisites: writing + configuring +├── running-workflows.rst # Prerequisites: binding +├── inspecting-results.rst # Prerequisites: running +├── advanced-patterns/ # Prerequisites: running +└── troubleshooting.rst # Prerequisites: all +``` + +## Critical: Dependency-Based Ordering + +**Documents MUST be ordered by prerequisites.** + +❌ **WRONG:** Any other order violates dependency principle + +## File Requirements + +Every User Guide file MUST include: + +1. 
**Meta tags** - Keywords, description, audience +2. **Prerequisites section** - What to read first (with WHY) +3. **What You'll Learn** - Specific, measurable objectives +4. **Progressive examples** - Simple to complex +5. **Next Steps** - Where to go next +6. **Related Documentation** - Cross-references + +See `references/templates.md` for complete templates. + +## Writing Style + +### DO +- **Teach step-by-step** - Guide through tasks +- **Use real examples** - Show working configurations +- **Explain concepts** - Don't just list commands +- **Provide context** - Why would you do this? +- **Show output** - What users should see +- **Handle errors** - Common problems and solutions + +### DON'T +- **Assume knowledge** - Link to prerequisites +- **Skip steps** - Be explicit +- **Use placeholders** - Show real, runnable examples +- **Omit output** - Users need to verify + +## Prerequisites Format + +Always explain WHY: + +```rst +Prerequisites +============= + +Before reading this guide: + +* :doc:`installation` - StreamFlow must be installed +* :doc:`writing-workflows` - Understanding CWL structure +* Docker installed and running +``` + +## Learning Objectives Format + +✅ **GOOD:** +```rst +* Run workflow with ``streamflow run`` command +* Configure Docker deployments in ``streamflow.yml`` +* Debug execution with ``--debug`` flag +``` + +❌ **BAD:** +```rst +* Learn about workflows +* Understand deployments +``` + +## Example Structure + +Progressive examples - start simple, add complexity: + +```rst +Basic Example +============= + +.. code-block:: yaml + :caption: streamflow.yml - Simple deployment + + version: v1.0 + deployments: + docker-local: + type: docker + +.. code-block:: bash + + $ streamflow run streamflow.yml + + [ACTUAL TESTED OUTPUT HERE] + +Advanced Example +================ + +.. 
code-block:: yaml + :caption: streamflow.yml - With resource limits + + version: v1.0 + deployments: + docker-limited: + type: docker + config: + memory: 4g + cpus: 2 +``` + +## Cross-References + +**Within User Guide:** +```rst +See :doc:`writing-workflows` for CWL syntax. +``` + +**To Reference:** +```rst +See :doc:`/reference/cli/run` for all flags. +See :doc:`/reference/connectors/docker` for options. +``` + +## Verification Workflow + +```bash +# 1. Create example files +mkdir /tmp/streamflow-test +cd /tmp/streamflow-test + +# 2. Write actual streamflow.yml +cat > streamflow.yml << 'EOF' +version: v1.0 +workflows: + example: + type: cwl + config: + file: workflow.cwl +EOF + +# 3. Test EVERY command +uv run streamflow run streamflow.yml + +# 4. Capture actual output +uv run streamflow run streamflow.yml > /tmp/output.txt 2>&1 + +# 5. Use EXACT output in docs +``` + +## Quality Checklist + +**Content:** +- [ ] Prerequisites explicit and accurate +- [ ] Learning objectives measurable +- [ ] Examples progress simple → complex +- [ ] All commands tested, output verified +- [ ] Next steps guide to logical next doc + +**Structure:** +- [ ] Ordered by dependencies +- [ ] Follows template +- [ ] Cross-references bidirectional +- [ ] Advanced patterns after basics + +**Examples:** +- [ ] All YAML valid and tested +- [ ] All commands produce documented output +- [ ] File paths realistic +- [ ] Captions describe examples + +## Additional Resources + +**Templates:** +- `references/templates.md` - Complete file templates + +**Parent Skill:** +- `../SKILL.md` - Main documentation skill + +## Remember + +**User Guide = Teaching** + +Every document should enable users to **do something** by the end. + +If you can't test it, don't document it. If users can't reproduce it, rewrite it. 
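The "Test EVERY command" step in the user-guide skill above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions: documented commands follow the `$ ` prompt convention from the style guide, and `extract_commands` is a hypothetical helper name, not something StreamFlow provides. It only lists the distinct commands found in an `.rst` file; each one still has to be run and checked by hand (for example with `--help`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of StreamFlow): print the distinct commands
# that an .rst file documents with the "$ " prompt, one per line.
extract_commands() {
    # Select lines whose first non-blank characters are "$ ", drop the
    # indentation and the prompt, then sort and de-duplicate.
    grep -E '^[[:space:]]*\$ ' "$1" \
        | sed -E 's/^[[:space:]]*\$ //' \
        | sort -u
}
```

For example, `extract_commands docs/source/user-guide/quickstart.rst` would print each documented command once; the path is illustrative and should be replaced with the file under review.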
diff --git a/.claude/skills/docs/user-guide/references/templates.md b/.claude/skills/docs/user-guide/references/templates.md new file mode 100644 index 000000000..482b9a698 --- /dev/null +++ b/.claude/skills/docs/user-guide/references/templates.md @@ -0,0 +1,374 @@ +# User Guide File Templates + +This reference contains complete templates for User Guide documentation files. + +## Standard User Guide Template + +```rst +============== +Document Title +============== + +.. meta:: + :keywords: keyword1, keyword2, keyword3 + :description: Brief 1-2 sentence description of what this document teaches + +Prerequisites +============= + +Before reading this guide: + +* :doc:`prerequisite-doc-1` - What you need from it +* :doc:`prerequisite-doc-2` - What you need from it + +What You'll Learn +================= + +After completing this guide, you will: + +* Learning objective 1 (specific and measurable) +* Learning objective 2 (specific and measurable) +* Learning objective 3 (specific and measurable) + +[MAIN CONTENT SECTIONS] + +Quick Reference +=============== + +[Summary table or command list] + +Next Steps +========== + +* :doc:`next-logical-document` - What to learn next +* :doc:`related-document` - Related concepts + +Related Documentation +===================== + +**User Guide:** + - :doc:`related-guide-1` + - :doc:`related-guide-2` + +**Reference:** + - :doc:`/reference/related-reference-1` + - :doc:`/reference/related-reference-2` +``` + +## Installation Guide Template + +```rst +============ +Installation +============ + +.. meta:: + :keywords: installation, setup, pip, uv + :description: Install StreamFlow on Linux or macOS + +What You'll Learn +================= + +* Install StreamFlow using pip or uv +* Verify installation +* Install optional dependencies + +System Requirements +=================== + +* Python 3.10-3.14 +* Linux or macOS (Windows not supported) +* 4GB RAM minimum + +Installation Methods +==================== + +Using pip +--------- + +.. 
code-block:: bash + + $ pip install streamflow + + [ACTUAL OUTPUT] + +Using uv +-------- + +.. code-block:: bash + + $ uv pip install streamflow + + [ACTUAL OUTPUT] + +Verification +============ + +.. code-block:: bash + + $ streamflow version + + StreamFlow version 0.2.0 + +Next Steps +========== + +* :doc:`quickstart` - Run your first workflow +``` + +## Quickstart Template + +```rst +========== +Quickstart +========== + +.. meta:: + :keywords: quickstart, getting started, first workflow + :description: Run your first StreamFlow workflow in 10 minutes + +Prerequisites +============= + +* :doc:`installation` - StreamFlow must be installed + +What You'll Learn +================= + +* Create a simple workflow +* Configure StreamFlow +* Run a workflow +* View results + +Create Workflow Files +===================== + +Create ``workflow.cwl``: + +.. code-block:: yaml + :caption: workflow.cwl - Simple workflow + + [COMPLETE EXAMPLE] + +Create ``streamflow.yml``: + +.. code-block:: yaml + :caption: streamflow.yml - Configuration + + [COMPLETE EXAMPLE] + +Run the Workflow +================ + +.. code-block:: bash + + $ streamflow run streamflow.yml + + [ACTUAL OUTPUT] + +Next Steps +========== + +* :doc:`writing-workflows` - Learn CWL syntax +* :doc:`configuring-deployments` - Configure execution environments +``` + +## Configuration Guide Template + +```rst +======================= +Configuring Deployments +======================= + +.. meta:: + :keywords: deployments, configuration, docker, kubernetes + :description: Configure execution environments for StreamFlow workflows + +Prerequisites +============= + +* :doc:`installation` - StreamFlow installed +* Docker installed (for container examples) + +What You'll Learn +================= + +* Define deployments in streamflow.yml +* Configure Docker deployments +* Configure remote deployments +* Use multiple deployments + +Deployment Structure +==================== + +Deployments define where workflow steps execute: + +.. 
code-block:: yaml + :caption: streamflow.yml - Basic structure + + deployments: + deployment-name: + type: connector-type + config: + # Configuration options + +Basic Docker Deployment +======================= + +.. code-block:: yaml + :caption: streamflow.yml - Docker deployment + + deployments: + docker-python: + type: docker + config: + image: python:3.10 + +For complete configuration options, see :doc:`/reference/connectors/docker`. + +Next Steps +========== + +* :doc:`binding-workflows` - Bind steps to deployments +* :doc:`/reference/connectors/index` - All connector options +``` + +## Troubleshooting Template + +```rst +=============== +Troubleshooting +=============== + +.. meta:: + :keywords: troubleshooting, errors, debugging, problems + :description: Common problems and solutions for StreamFlow + +Prerequisites +============= + +* :doc:`running-workflows` - Understanding workflow execution + +Common Problems +=============== + +Command Not Found +----------------- + +**Symptom:** + +.. code-block:: text + + streamflow: command not found + +**Solution:** + +Ensure StreamFlow is installed: + +.. code-block:: bash + + $ pip list | grep streamflow + +If not installed, run: + +.. code-block:: bash + + $ pip install streamflow + +Docker Connection Failed +------------------------ + +**Symptom:** + +.. code-block:: text + + Error: Cannot connect to Docker daemon + +**Solution:** + +1. Verify Docker is running: + + .. code-block:: bash + + $ docker ps + +2. Check user permissions: + + .. code-block:: bash + + $ groups | grep docker + +3. Add user to docker group (Linux): + + .. code-block:: bash + + $ sudo usermod -aG docker $USER + $ newgrp docker + +Related Documentation +===================== + +* :doc:`installation` - Installation troubleshooting +* :doc:`running-workflows` - Execution troubleshooting +``` + +## Advanced Pattern Template + +```rst +================= +[Pattern Name] +================= + +.. 
meta:: + :keywords: advanced, pattern-name, specific-keywords + :description: Brief description of advanced pattern + +Prerequisites +============= + +* :doc:`/user-guide/running-workflows` - Basic workflow execution +* Understanding of [specific concept] + +What You'll Learn +================= + +* Specific technique 1 +* Specific technique 2 +* When to use this pattern + +Overview +======== + +[1-2 paragraph explanation of the pattern] + +Basic Example +============= + +.. code-block:: yaml + :caption: streamflow.yml - Basic pattern + + [MINIMAL WORKING EXAMPLE] + +Advanced Example +================ + +.. code-block:: yaml + :caption: streamflow.yml - Advanced usage + + [MORE COMPLEX EXAMPLE] + +Use Cases +========= + +Use this pattern when: + +* Scenario 1 +* Scenario 2 +* Scenario 3 + +Related Documentation +===================== + +* :doc:`/user-guide/related-guide` +* :doc:`/reference/related-reference` +``` diff --git a/.claude/skills/mypy/SKILL.md b/.claude/skills/mypy/SKILL.md new file mode 100644 index 000000000..2fac308ef --- /dev/null +++ b/.claude/skills/mypy/SKILL.md @@ -0,0 +1,122 @@ +--- +name: Mypy Type Checking +description: This skill should be used when the user asks to "fix mypy errors", "add type annotations", "fix type checking", "resolve no-untyped-def", "fix mypy type errors", or when working with type hints while respecting StreamFlow's forbidden type constraints (no Any, dict[str, Any], etc.). +version: 0.1.0 +--- + +# Mypy Type Checking Skill + +Comprehensive guidance for fixing mypy type errors while respecting forbidden type constraints. + +## Core Constraint: Forbidden Types + +**See AGENTS.md** for the complete list. **Key rule: Never use `Any` in any form.** + +**Examples of forbidden types:** +- `Any` (bare type) +- `dict[str, Any]`, `list[Any]`, `tuple[Any, ...]` +- `MutableMapping[str, Any]`, `MutableSequence[Any]` +- `object` (bare type) + +**If a fix requires a forbidden type, DO NOT FIX IT.** Skip the error entirely. 
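As a hedged illustration of the constraint above: a dictionary with mixed value types often looks like it needs `dict[str, Any]`, but when the keys are known a `TypedDict` gives each value a concrete type. The sketch below assumes a hypothetical `DeploymentSettings` structure; the class, its keys, and `describe` are illustrative names, not StreamFlow APIs.

```python
from typing import Literal, TypedDict


class DeploymentSettings(TypedDict):
    """Hypothetical structure; the key names are illustrative only."""

    name: str
    type: Literal["docker", "kubernetes"]
    replicas: int
    enabled: bool


def describe(settings: DeploymentSettings) -> str:
    """Build a one-line summary; every accessed value has a concrete type."""
    state = "enabled" if settings["enabled"] else "disabled"
    return f"{settings['name']} ({settings['type']}, x{settings['replicas']}): {state}"


# Mixed str/int/bool values, yet no `Any` appears in any annotation.
settings: DeploymentSettings = {
    "name": "worker",
    "type": "docker",
    "replicas": 2,
    "enabled": True,
}
summary = describe(settings)
```

When the keys are not known in advance and the values are genuinely heterogeneous, no such rewrite exists, and the rule above applies: skip the error.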
+ +## When to Use This Skill + +- Fixing mypy type errors +- Adding missing type annotations +- Improving type coverage without violating constraints + +## Sub-Skills by Error Type + +This skill covers multiple mypy error types. Select the appropriate sub-skill for the specific error: + +### 1. no-untyped-def +→ See `no-untyped-def/SKILL.md` + +**Error:** Function/method missing type annotations +**Most common error type** + +Example: +```python +# Error: Function is missing a type annotation [no-untyped-def] +def process_data(self, config): + return {"result": config} +``` + +### 2. var-annotated +→ See `var-annotated/SKILL.md` + +**Error:** Variable missing type annotation +**Common with empty collections** + +Example: +```python +# Error: Need type annotation for "cache" [var-annotated] +cache = {} +``` + +### 3. type-arg (Coming Soon) + +**Error:** Generic type missing type arguments +**Example:** Using `list` instead of `list[str]` + +### 4. attr-defined (Coming Soon) + +**Error:** Attribute not defined on type +**Solutions:** Protocols, unions, type guards + +## General Workflow (All Error Types) + +For any mypy error, follow this high-level process. Each sub-skill provides detailed steps for specific error types. + +1. **Identify Error Type:** Determine which sub-skill to use (no-untyped-def, var-annotated, etc.) + +2. **Analyze & Classify:** Determine if concrete types suffice + - **FIXABLE:** Can use concrete types only → proceed with fix + - **UNFIXABLE:** Requires forbidden types → skip entirely + +3. **Fix:** Add type annotations using concrete types (see sub-skill for details) + +4. **Validate:** Run forbidden types check **(MANDATORY)** + ```bash + git diff | grep -E "Any\b|MutableMapping\\[.*Any|MutableSequence\\[.*Any|list\\[Any|dict\\[.*Any" + ``` + - If ANY match found → STOP and revert immediately + - If no match → proceed + +5. 
**Quality Check:** Run all linters/formatters + ```bash + uv run make format-check flake8 codespell-check typing + ``` + +6. **Commit:** Follow git approval workflow (see AGENTS.md) + +## Concrete Types Quick Reference + +**Allowed types:** +- **Primitives:** `None`, `bool`, `int`, `str`, `float`, `bytes` +- **Concrete classes:** `Workflow`, `Connector`, `Port`, `Token`, etc. +- **Unions:** `str | None`, `int | bool` +- **Generics:** `list[str]`, `dict[str, int]`, `set[Token]` +- **Special:** `Self`, `type[MyClass]`, `Literal["value1", "value2"]` + +**Prefer `Literal` over generic types:** For fixed sets of values (status strings, connector types, mode flags), use `Literal["docker", "kubernetes"]` instead of generic `str` for stronger type safety. + +**Broad arguments, narrow returns:** +- **Arguments:** Use abstract types (`Iterable`, `Sequence`, `Mapping`) when only basic operations are needed +- **Returns:** Use concrete types (`list`, `dict`, `set`) to provide specific guarantees +- Example: `def process(items: Iterable[T]) -> list[T]` + +## Tips for Success + +1. **Start with easy fixes** - Functions with clear return types like `-> None` or `-> bool` +2. **Use TypedDict** - When a dict has known keys, define a TypedDict instead of `dict[str, Any]` +3. **Check existing patterns** - Look at similar functions in the codebase for guidance +4. **Don't force it** - If a function requires forbidden types, skip it +5. **Test changes** - Run mypy after adding types to ensure no new errors + +**Key principle:** When in doubt, skip it. Better to leave an error than violate the forbidden types constraint. 
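The forbidden types check from step 4 of the workflow above can also be sketched in Python. This is an assumption-laden sketch rather than the project's tooling: `forbidden_additions` is a hypothetical helper, it only looks for the bare word `Any`, and unlike the `grep` pipeline it inspects added lines only, so a removed `Any` does not trigger a false alarm.

```python
import re

# Matches `Any` as a whole word; slightly stricter than the grep's `Any\b`.
FORBIDDEN = re.compile(r"\bAny\b")


def forbidden_additions(diff_text: str) -> list[str]:
    """Return the added lines of a unified diff that mention `Any`."""
    hits: list[str] = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" marks the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if FORBIDDEN.search(line[1:]):
            hits.append(line[1:].strip())
    return hits


# Illustrative diff text; in practice this would be the output of `git diff`.
sample_diff = """\
--- a/example.py
+++ b/example.py
@@ -1,3 +1,3 @@
-def load(config: dict[str, Any]) -> None:
+def load(config: dict[str, str]) -> None:
+def save(config: dict[str, Any]) -> None:
     pass
"""
violations = forbidden_additions(sample_diff)
```

A non-empty result corresponds to the "STOP and revert" branch of step 4. The diff text could be obtained with `subprocess.run(["git", "diff"], capture_output=True, text=True).stdout`.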
+ +## Related Documentation + +See AGENTS.md sections: Mandatory Agent Behavior, Git Commit Requirements, Git Commit Message Guidelines, Code Style Guidelines diff --git a/.claude/skills/mypy/no-untyped-def/SKILL.md b/.claude/skills/mypy/no-untyped-def/SKILL.md new file mode 100644 index 000000000..d52c06ecd --- /dev/null +++ b/.claude/skills/mypy/no-untyped-def/SKILL.md @@ -0,0 +1,179 @@ +--- +name: Mypy no-untyped-def Fixer +description: This skill should be used when the user encounters "no-untyped-def" mypy errors, asks to "fix no-untyped-def errors", "add type annotations to functions", or mentions "Function is missing a type annotation". Provides specific workflow for fixing missing type annotations while respecting forbidden type constraints. +version: 0.1.0 +--- + +# Mypy Sub-Skill: Fixing no-untyped-def Errors + +**Error Code:** `[no-untyped-def]` - Function is missing type annotations for parameters or return type. + +## Decision Process + +For each function with this error: +1. **Can return type use concrete types?** (e.g., `None`, `bool`, `Connector`, `str | None`) → If NO, skip function +2. **Can all parameters use concrete types?** → If NO, skip function +3. **Both YES?** → Function is FIXABLE, proceed to add annotations + +**Key rule:** If ANY type needs forbidden types (`Any`, `dict[str, Any]`, etc.), skip the entire function. + +## Workflow + +### 1. Identify & Classify +```bash +uv run mypy path/to/file.py | grep "no-untyped-def" +``` +Create two lists: **FIXABLE** (concrete types only) vs **UNFIXABLE** (need forbidden types - skip these) + +### 2. Add Type Annotations +For FIXABLE functions only, add concrete types to parameters and return values. + +### 3. 
Validate & Quality Check +```bash +# Check for forbidden types (must return no matches) +git diff | grep -E "Any\b|MutableMapping\\[.*Any|MutableSequence\\[.*Any|list\\[Any|dict\\[.*Any" + +# Run all quality checks (must pass) +uv run make format-check flake8 codespell-check typing +``` + +### 4. Commit +See AGENTS.md "Git Commit Requirements" and "Git Commit Message Guidelines" + +## Key Patterns + +**Fixable Examples:** +```python +# Async functions returning None +async def close(self) -> None: + await self.manager.close() + +# Boolean checks +def is_ready(self) -> bool: + return bool(self.status == "ready") + +# Literal types for fixed value sets (PREFERRED over generic str/int) +from typing import Literal + +def get_status(self) -> Literal["pending", "running", "completed", "failed"]: + return self.status + +def set_mode(self, mode: Literal["sync", "async"]) -> None: + self.mode = mode + +def get_connector_type(self) -> Literal["docker", "kubernetes", "ssh", "slurm"]: + return self.connector_type + +# Broad arguments, narrow returns (container types) +from collections.abc import Iterable, Sequence, Mapping + +def process_connectors(self, connectors: Iterable[Connector]) -> list[str]: + """Argument: Iterable (broad), Return: list (narrow)""" + return [c.name for c in connectors] + +def filter_configs(self, configs: Sequence[Config]) -> list[Config]: + """Argument: Sequence (broad - supports indexing), Return: list (narrow)""" + return [c for c in configs if c.enabled] + +def validate_settings(self, settings: Mapping[str, str]) -> bool: + """Argument: Mapping (broad - read-only), Return: bool (narrow)""" + return "host" in settings and "port" in settings + +def get_config_dict(self) -> dict[str, str]: + """Return: dict (narrow - specific mutable mapping)""" + return {"host": self.host, "port": str(self.port)} + +# Optional returns +def get_connector(self, name: str) -> Connector | None: + return self.connectors.get(name) + +# Classmethods with Self +@classmethod 
+async def load(cls, context: StreamFlowContext) -> Self: + instance = cls(context) + await instance.initialize() + return instance + +# Multiple parameters with Literal types +def deploy( + self, + name: str, + config: DeploymentConfig, + location: Location, + mode: Literal["foreground", "background"] = "foreground" +) -> Connector: + connector = self.create_connector(name, config) + connector.deploy(location, mode) + return connector +``` + +**Unfixable Examples (SKIP these):** +```python +# Dynamic dict - requires dict[str, Any] +def get_config(self): + return {"name": self.name, "config": self.config} # SKIP + +# Mixed list - requires list[Any] +def get_items(self): + return [self.id, self.config, self.status] # SKIP + +# Generic mapping from interface - requires MutableMapping[str, Any] +def save(self): + return self.config # self.config is MutableMapping[str, Any] - SKIP +``` + +## Quick Tips + +**Prefer `Literal` over generic types:** +```python +from typing import Literal + +# GOOD - Specific literal values +def set_status(self, status: Literal["active", "inactive"]) -> None: + self.status = status + +# LESS GOOD - Generic str allows any value +def set_status(self, status: str) -> None: + self.status = status +``` + +**Broad arguments, narrow returns:** +```python +from collections.abc import Iterable, Sequence, Mapping + +# GOOD - Accepts any iterable, returns specific list +def pretty_print(array: Iterable[str]) -> list[str]: + items = list(array) # Materialize once: a one-shot iterator cannot be read twice + for i in items: + print(i) + return items + +# LESS GOOD - Requires list, but only uses iteration +def pretty_print(array: list[str]) -> list[str]: + for i in array: + print(i) + return [i for i in array] +``` + +**Common abstract types for arguments:** +- `Iterable[T]` - Only iteration needed (broadest) +- `Sequence[T]` - Indexing + iteration (list-like, read-only) +- `Mapping[K, V]` - Read-only dict-like access + +**TypedDict for structured dicts:** +```python +from typing import TypedDict + +class ConnectorConfig(TypedDict): + name: str + type: 
str + enabled: bool + +def get_config(self) -> ConnectorConfig: + return {"name": "docker", "type": "container", "enabled": True} +``` + +**Complex control flow:** Analyze ALL return paths. If any path needs forbidden types, skip the function. + +**New mypy errors after adding types?** Either fix them if simple, or revert if they indicate forbidden types are needed. + +## See Also + +Parent skill: `../SKILL.md` | AGENTS.md: Forbidden Types, Git Commit Requirements, Code Style Guidelines diff --git a/.claude/skills/mypy/var-annotated/SKILL.md b/.claude/skills/mypy/var-annotated/SKILL.md new file mode 100644 index 000000000..26e62f263 --- /dev/null +++ b/.claude/skills/mypy/var-annotated/SKILL.md @@ -0,0 +1,210 @@ +--- +name: Mypy var-annotated Fixer +description: This skill should be used when the user encounters "var-annotated" mypy errors, asks to "fix var-annotated errors", "add type annotations to variables", or mentions "Need type annotation for" errors. Provides workflow for fixing missing variable type annotations while respecting forbidden type constraints (no Any, dict[str, Any], list[Any], etc.). +version: 0.1.0 +--- + +# Mypy var-annotated Error Fixer + +Fix `[var-annotated]` errors by adding explicit type annotations to variables where mypy cannot infer types. + +## What is var-annotated? 
+ +This error occurs when mypy cannot infer a variable's type, typically with: +- Empty collections: `cache = {}` or `items = []` +- Generic types: `future = asyncio.Future()` +- Complex object creation: `port = workflow.create_port()` + +**Solution:** Add explicit type annotation: `variable: Type = value` + +## Critical Constraint + +**Never use forbidden types** (see AGENTS.md or parent mypy/SKILL.md): +- ❌ `Any`, `dict[str, Any]`, `list[Any]`, `MutableMapping[str, Any]` +- ✅ If a fix requires forbidden types → **skip the error entirely** + +## Quick Fix Patterns + +### Pattern 1: Empty Collections + +**Strategy:** Examine what gets stored/retrieved to determine element types. + +```python +# ❌ Error +cache = {} +items = [] + +# ✅ Fix - Check usage to find types +cache: dict[str, Port] = {} # Port is value type +items: list[Token] = [] # Token is element type +ports: dict[str, list[str]] = {} # Nested: dict of string lists +``` + +**Common types:** +- `dict[str, Port]`, `dict[str, Token]`, `dict[str, Storage]` +- `list[Token]`, `list[Port]`, `list[ExecutionLocation]` +- `dict[str, list[str]]`, `dict[int, str]` + +### Pattern 2: Generic Types + +**Strategy:** Specify type parameters based on what the generic holds and variance rules. 
+ +```python +# ❌ Error +future = asyncio.Future() +cache = LRUCache(maxsize=5) + +# ✅ Fix - Use concrete type parameters +future: asyncio.Future[T] = asyncio.Future() # T from function signature +cache: LRUCache[int, int] = LRUCache(maxsize=5) # key, value types +``` + +**Variance Rules:** + +When choosing type parameters for generic types, follow these variance principles: + +- **Contravariant types** (type parameters marked with `contravariant=True`): + - Prefer the **most specific type** possible + - Use the `bound` argument for `TypeVar` objects + - Example: For function parameters that accept callbacks + +- **Covariant types** (type parameters marked with `covariant=True`): + - Prefer the **most generic type** possible (excluding forbidden `Any`) + - Example: For return types and read-only containers like `Sequence[T]` + +- **Invariant types** (default, no variance specified): + - Check if the invariant generic can be substituted with a **covariant alternative** + - If substitution not possible, use the **exact type** that matches usage + - Example: Use `Mapping[K, V]` (covariant) instead of `MutableMapping[K, V]` (invariant) if you only need read access + - Example: Use `Sequence[T]` (covariant) instead of `MutableSequence[T]` (invariant) if you only need read access + +**Reference:** Check variance in Python's [typeshed repository](https://github.com/python/typeshed) for standard library types + +```python +# Examples of variance-aware typing: + +# Covariant (Sequence is covariant in T) - use more generic types +items: Sequence[Token] = [] # Can accept list[Token], tuple[Token], etc. 
+results: Mapping[str, int] = {} # Read-only dict, covariant in value type + +# Invariant (dict/MutableMapping/list are invariant) - use exact types +cache: dict[str, int] = {} # Exact types required +mutable_items: MutableSequence[Token] = [] # Exact type required + +# Better: If you only need read access, use covariant alternatives +# ✅ Instead of: items: MutableSequence[Token] = [] +# ✅ Use: items: Sequence[Token] = [] (if you only read) + +# Contravariant (rarely seen in variable annotations) +# Usually appears in function signatures, not variable declarations +``` + +### Pattern 3: StreamFlow Objects + +**Strategy:** Import the concrete type from streamflow modules. + +```python +# ❌ Error +storage = {} + +# ✅ Fix - Import and use concrete types +from streamflow.data.utils import Storage + +storage: dict[str, Storage] = {} +``` + +**Common StreamFlow types:** +- `Port`, `Token` - from `streamflow.core.workflow` +- `Storage` - from `streamflow.data.utils` +- `ExecutionLocation` - from `streamflow.core.deployment` +- `ValueFromTransformer` - from `streamflow.cwl.step` + +### Pattern 4: Nested Dicts with Known Keys + +**Strategy:** Prefer TypedDict for known keys; use generic dicts only for dynamic keys. + +```python +# ❌ Error +config = {} +config["name"] = "test" +config["count"] = 42 + +# ✅ Best - TypedDict for known structure +from typing import TypedDict + +class ConfigDict(TypedDict): + name: str + count: int + +config: ConfigDict = {"name": "test", "count": 42} + +# ⚠️ Fallback - Generic nested dict (only if keys are dynamic) +extensions: dict[str, dict[str, dict[str, str]]] = {} # Use when keys unknown +``` + +## Workflow + +**For each error:** + +1. **Read surrounding code** (10-20 lines) - What gets assigned/retrieved? + +2. **Find the concrete type:** + - Check function signatures for parameter/return types + - Examine usage: what gets stored, retrieved, or called? 
+ - Search for similar patterns in the codebase + - Look up type definitions and imports + +3. **Verify no forbidden types** - Does the type use `Any`? + - ✅ No → apply fix + - ❌ Yes → skip this error + +4. **Apply annotation:** `variable: Type = value` + +5. **Validate:** + ```bash + uv run make format # Auto-format + uv run make format-check flake8 codespell-check # Must pass + git diff | grep -E "Any\b|dict\[.*Any" # Must be empty + ``` + +## Type Discovery Commands + +```bash +# Find function signatures +grep -n "def function_name" file.py + +# Find type definitions +grep -rn "class TypeName" streamflow/ + +# Find similar annotations +grep -rn "variable_name: " streamflow/ + +# Check imports +head -50 file.py | grep "from streamflow" +``` + +## Common Mistakes + +**❌ Don't guess types** - Always verify by reading code +**❌ Don't use `Any`** - Skip error if concrete type not possible +**❌ Don't forget imports** - Add necessary imports for types +**❌ Don't use bare generics** - Use `list[str]` not `list` +**❌ Don't use nested dicts for known keys** - Use TypedDict instead + +## Validation Commands + +```bash +# List all var-annotated errors +uv run make typing 2>&1 | grep "\[var-annotated\]" + +# Count remaining errors +uv run make typing 2>&1 | grep "\[var-annotated\]" | wc -l + +# Check for forbidden types in changes +git diff | grep -E "^\+.*: .*(Any|object)\b" +``` + +## Related Documentation + +- **AGENTS.md** - Forbidden types, validation commands, commit requirements +- **mypy/SKILL.md** - General mypy workflow, concrete types reference diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..fb6434608 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,276 @@ +# StreamFlow Agent Guidelines + +This document provides essential guidelines for agentic coding agents working on the StreamFlow codebase. + +## Project Overview + +StreamFlow is a container-native Workflow Management System (WMS) written in Python 3 (versions 3.10-3.14). 
It implements the Common Workflow Language (CWL) standard (v1.0-v1.3) for multi-cloud/HPC hybrid workflow executions. + +**Key Architecture:** +- **Deployment** → **Service** → **Location** (hierarchical execution units) +- Supports multiple connectors: local, docker, kubernetes, ssh, slurm, pbs, singularity, etc. + +## Setup & Installation + +```bash +# Clone and install dependencies +git clone git@github.com:alpha-unito/streamflow.git +cd streamflow +uv sync --all-extras +``` + +## Essential Commands + +### Testing +```bash +# Run all tests +uv run make test + +# Run specific test file +uv run pytest tests/test_file.py + +# Run single test function +uv run pytest tests/test_file.py::test_function_name + +# Run tests with coverage +uv run make testcov + +# Test specific connectors only (all tested in CI) +uv run pytest --deploys local,docker tests/test_remotepath.py +``` + +**Requirements:** Docker (for most connector tests), Singularity/Apptainer, Kubernetes (minikube) + +### Linting & Formatting (REQUIRED BEFORE COMMIT) +```bash +# Check all (must pass before committing) +uv run make format-check flake8 codespell-check typing + +# Auto-fix formatting +uv run make format codespell + +# Apply pyupgrade for Python 3.10+ compatibility +uv run make pyupgrade + +# Run type checking with mypy (strict mode) +uv run make typing +``` + +## Mandatory Agent Behavior + +All agents **MUST** adhere to these non-negotiable rules: + +### Package & Dependency Management (MANDATORY) + +**MUST** obtain explicit user permission before installing packages or updating dependencies. Specify what is being installed/updated, why, and await confirmation before proceeding. 
+ +### ⚠️ FORBIDDEN TYPES - CRITICAL CONSTRAINT (MANDATORY) + +**AGENTS MUST NEVER USE THESE TYPES:** + +- ❌ `Any` (bare type) +- ❌ `object` (bare type) +- ❌ `MutableMapping[str, Any]`, `MutableSequence[Any]` +- ❌ `list[Any]`, `dict[str, Any]`, `dict[Any, Any]`, `tuple[Any, ...]` + +**For validation procedures and detailed mypy workflows:** See `.claude/skills/mypy/SKILL.md` + +### Git Commit Requirements (MANDATORY) + +**CRITICAL: Agents MUST NEVER create git commits without explicit user approval.** + +**Required sequence:** + +1. **Validate changes:** + - Run `uv run make format-check flake8 codespell-check typing` (all must pass) + - For type changes: See `.claude/skills/mypy/SKILL.md` for validation procedures + - Fix any failures and re-run checks before proceeding + +2. **Present to user:** + - Show complete commit message (following Git Commit Message Guidelines) + - Show `git diff --stat` and full `git diff` + - **Ask explicitly:** "Should I create this commit? Yes/No" + - **Wait for explicit approval** + +3. 
**Create commit (only after approval):** + - If "No" → keep changes in working directory for review + - If "Yes" → run `git commit` and verify with `git log -1` + +## Code Style Guidelines + +**Target:** Python 3.10-3.14 | **Line length:** 88 chars | **Format:** Black + isort | **Type checking:** mypy (strict mode) | **Exclude:** `streamflow/cwl/antlr` + +### Import Organization +```python +from __future__ import annotations # Always first + +import asyncio +from abc import ABC, abstractmethod +from typing import TYPE_CHECKING + +from typing_extensions import Self # Third-party + +from streamflow.core.context import StreamFlowContext # Local +from streamflow.log_handler import logger + +if TYPE_CHECKING: # Avoid circular imports + from streamflow.core.data import DataManager +``` + +### Type Hints & Async +```python +# Always use concrete type hints +def process_workflow( + self, + workflow: Workflow, + config: WorkflowConfig # Concrete type, not dict[str, Any] +) -> bool: # Concrete return type + pass + +# Use Self for classmethods +@classmethod +async def load(cls, context: StreamFlowContext) -> Self: + pass + +# Proper async cleanup +async def close(self) -> None: + try: + await asyncio.gather( + asyncio.create_task(self.manager.close()), + asyncio.create_task(self.scheduler.close()), + ) + except Exception as e: + logger.exception(e) + finally: + await self.database.close() +``` + +### Naming & Error Handling +- **Classes:** `PascalCase` | **Functions:** `snake_case` | **Constants:** `UPPER_SNAKE_CASE` +- **Private:** `_method_name` | **Type vars:** `_KT`, `_VT` + +```python +# Use custom exceptions from streamflow.core.exception +from streamflow.core.exception import WorkflowExecutionException +from streamflow.log_handler import logger + +try: + result = await process() +except SpecificException as e: + logger.exception(e) + raise WorkflowExecutionException(f"Failed: {e}") from e +``` + +**Available exceptions:** `ProcessorTypeError`, `WorkflowException`, 
`WorkflowDefinitionException`, `WorkflowExecutionException`, `WorkflowProvenanceException`, `FailureHandlingException`, `InvalidPluginException` + +### Documentation + +**For Sphinx documentation (.rst files):** See `.claude/skills/docs/SKILL.md` for comprehensive guidelines on writing reStructuredText documentation. + +**Python docstrings:** Use standard Python docstring format with field lists: + +```python +def process_workflow(workflow: Workflow, config: WorkflowConfig) -> bool: + """ + Process a workflow with the given configuration. + + :param workflow: The workflow to process + :param config: Configuration for processing + :returns: True if processing succeeded, False otherwise + :raises WorkflowExecutionException: If workflow processing fails + """ + pass +``` + +**Language:** American English for all documentation and docstrings. + +### Testing (REQUIRED for new features/bugfixes) +```python +# Use pytest with async support +async def test_workflow_execution(context: StreamFlowContext) -> None: + """Test basic workflow execution.""" + workflow = await build_workflow(context) + result = await workflow.execute() + assert result.status == "completed" +``` + +**Coverage:** https://app.codecov.io/gh/alpha-unito/streamflow + +## Git Commit Message Guidelines + +**Format:** +``` +<type>(<scope>): <subject> + +<body> +``` + +**Types:** `Add`, `Fix`, `Refactor`, `Update`, `Remove`, `Bump`, `Docs`, `Test`, `Chore` + +**Rules:** +- **Subject:** Imperative mood, capitalize, no period, max 50 chars +- **Scope (optional):** Module/component (e.g., `cwl`, `deployment`, `scheduling`) +- **Body (required):** Explain *what* and *why* (not *how*), wrap at 72 chars, separate with blank line, include issue refs (e.g., `Fixes #123`). Exception: trivial changes like typo fixes. +- **Language:** American English + +**Examples:** +``` +Add restore method to DataManager + +Implement restore method to enable workflow recovery from checkpoints. +This allows jobs to resume from the last completed step. 
+ +Fix SSH connector authentication timeout (Fixes #931) + +Increase default timeout for SSH authentication from 5s to 30s to handle +slow networks and high-latency connections. + +Bump kubernetes-asyncio from 33.3.0 to 34.3.3 +``` + +## Common Workflows + +**Adding a feature:** +1. Write tests first in `tests/` +2. Implement feature with type hints and docstrings +3. Run `uv run make format` to auto-format +4. Run `uv run make format-check flake8 codespell-check typing` +5. Run `uv run pytest` to verify tests pass +6. Update docs if needed +7. Follow Git Commit Requirements section for submitting changes + +**Fixing a bug:** +1. Add regression test in `tests/` +2. Fix the bug with proper type annotations +3. Follow linting/formatting guidelines (steps 3-4 above) +4. Verify with tests +5. Follow Git Commit Requirements section for submitting changes + +**For mypy type error fixes:** See `.claude/skills/mypy/SKILL.md` + +## Key Project Structure + +``` +streamflow/ +├── core/ # Abstractions (context, deployment, exception, workflow) +├── cwl/ # CWL implementation (v1.0-v1.3) +├── deployment/ # Connectors (docker, k8s, ssh, slurm, pbs, singularity) +├── workflow/ # Workflow execution engine +├── data/ # Data management +├── persistence/ # Database (SQLite) +├── scheduling/ # Scheduling policies +├── recovery/ # Checkpointing/fault tolerance +└── ext/ # Plugin system +tests/ # Pytest test suite +docs/ # Sphinx documentation +``` + +## Quick Reference + +**Extension Points:** Connector, BindingFilter, CWLDockerTranslator, Scheduler, Database, DataManager, CheckpointManager, FailureManager + +**CWL Conformance:** `./cwl-conformance-test.sh` (supports VERSION, DOCKER, EXCLUDE env vars) + +**Documentation:** `uv run make html` | Update checksum: `cd docs && uv run make checksum` + +**Resources:** [Website](https://streamflow.di.unito.it/) | [Docs](https://streamflow.di.unito.it/documentation/0.2/) | [GitHub](https://github.com/alpha-unito/streamflow) | 
[Contributing](CONTRIBUTING.md)