diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..0472c305 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,156 @@ +# Contributing to ml-intern + +Thanks for your interest in contributing. This document covers development setup, code conventions, the PR workflow, and the review standards we apply. + +## Table of Contents + +- [Development Setup](#development-setup) +- [Before You Open a PR](#before-you-open-a-pr) +- [Branch and Commit Conventions](#branch-and-commit-conventions) +- [Code Style](#code-style) +- [Testing](#testing) +- [PR Types and Guidelines](#pr-types-and-guidelines) +- [Review Standards](#review-standards) +- [FAQ](#faq) + +--- + +## Development Setup + +```bash +git clone https://github.com/YOUR_USERNAME/ml-intern.git +cd ml-intern +uv sync +uv tool install -e . +``` + +Requires Python 3.11+ and [uv](https://github.com/astral-sh/uv). + +Create a `.env` file in the project root with your API keys (see [README](./README.md) for the full list). At minimum you need `HF_TOKEN` to run the agent. + +--- + +## Before You Open a PR + +1. **For bug fixes** — reproduce the issue locally and confirm your fix resolves it. +2. **For new features or tools** — open an Issue first and describe what you want to add. Wait for a maintainer signal before investing significant implementation time. A PR without prior discussion is more likely to be declined on design grounds. +3. **For documentation changes** — no prior discussion required. + +Read [REVIEW.md](./REVIEW.md) to understand how PRs are evaluated. The review is automated via Claude and uses the severity levels defined there (P0 / P1 / P2). + +--- + +## Branch and Commit Conventions + +Branch names follow a `type/description` pattern, matching existing branches in this repo: + +| Change type | Prefix | Example | +|---|---|---| +| Bug fix | `fix/` | `fix/cli-model-loose-validation` | +| Feature | `feat/` | `feat/anthropic-prompt-caching` | +| Documentation | `docs/` | `docs/update-readme` | +| CI/tooling | `ci/` | `ci/review-md` | + +Commit messages should be short and imperative (`fix: handle empty tool call list`, not `Fixed the thing`). No strict format is enforced, but be specific. + +--- + +## Code Style + +- **Python 3.11+** with type annotations throughout. +- Use `from __future__ import annotations` at the top of new files. +- Every public function needs a docstring. Document parameters and return type. See any file in `agent/tools/` for examples. +- Match the async patterns already in the codebase — handlers are `async def` and follow the `ToolSpec` pattern in `agent/core/tools.py`. +- Keep new tool definitions consistent with existing ones: a `TOOL_SPEC` constant plus a separate `handler` function, both imported and registered in `agent/core/tools.py`. + +There is no auto-formatter configured in CI yet. Match the style of the file you're editing. + +--- + +## Testing + +Tests live in `tests/unit/`. The test runner is `pytest` with `asyncio_mode = auto` (see `pyproject.toml`). + +```bash +# Install dev dependencies +uv sync --extra dev + +# Run all unit tests +pytest tests/unit/ + +# Run a single file +pytest tests/unit/test_llm_params.py +``` + +**What to test:** + +- New pure functions (no network, no filesystem) should have unit tests. +- Tool handlers that hit external services do not need unit tests, but include a brief note in the PR description about how you manually verified the behavior. +- If you're modifying the agent loop, doom-loop detector, or context manager, run the existing tests and confirm nothing regresses. + +PRs that add meaningful new behavior without any test coverage are harder to merge. A small focused test is better than no test. + +--- + +## PR Types and Guidelines + +### Bug Fixes + +Describe the bug in the PR body: what breaks, how to reproduce it, and what the fix does. Reference the relevant issue if one exists. No need to write a test if the behavior is difficult to unit-test, but explain how you verified the fix. + +### Documentation + +Typos, README improvements, docstring additions — welcome without prior discussion. Keep the scope small; one fix per PR is easier to review. + +### New Tools + +New tools must follow the `ToolSpec` pattern in `agent/core/tools.py`. Each tool needs: + +- A `TOOL_SPEC` constant (name, description, JSON-schema parameters). +- An async handler function with a docstring. +- Registration in `create_builtin_tools()`. +- A note in the PR body about how you tested it end-to-end. + +Open an Issue before starting implementation. The tool should serve ml-intern's core mission (researching papers, training models, shipping ML code in the HF ecosystem). + +### Configuration Changes + +Changes to `configs/cli_agent_config.json` or `configs/frontend_agent_config.json` should explain *why* in the PR description — what behavior changes, and why the new values are better. See PR #118 for an example. + +### CI / Infrastructure + +Changes to `.github/workflows/` or build tooling follow the same code review process. Be explicit about what the change enables or fixes. + +--- + +## Review Standards + +PRs are automatically reviewed by Claude using the rules in [REVIEW.md](./REVIEW.md). The key points: + +- **P0** — blocks merge. Must be addressed before the PR can land. +- **P1** — worth fixing, not blocking. You can defer to a follow-up issue at your discretion. +- **P2** — informational. No action required. + +The review philosophy is *rigor over speed*. Every behavior claim in a finding will cite a `file:line` reference. If you believe a P0 finding is incorrect, point to the code or test that contradicts it — the reviewer will reconsider. If you believe a P1 or P2 is not worth addressing, say so and move on. + +Maintainers may add their own review on top of the automated one. + +--- + +## FAQ + +**My PR has been open for a few days with no response. What should I do?** + +Leave a comment asking for feedback. The team is small and reviews sometimes get delayed. + +**Can I submit a draft PR to get early feedback?** + +Yes. The automated review only runs on non-draft PRs (see `.github/workflows/claude-review.yml`), so a draft is a good way to share work-in-progress without triggering a full review. + +**My change touches `uv.lock`. Do I need to do anything special?** + +Lock file changes are reviewed for provenance and correctness of the accompanying `pyproject.toml` change. Make sure the package version in `pyproject.toml` matches what you intended and include a brief note on why the new dependency is needed. + +**I want to contribute but don't know where to start.** + +Check the open Issues. Documentation improvements and small bug fixes are low-risk starting points and help you get familiar with the codebase before tackling larger features.