Skip to content

Add valkyrie contract#45

Open
oalmatov wants to merge 8 commits intomainfrom
oa/docent-ingestion
Open

Add valkyrie contract#45
oalmatov wants to merge 8 commits intomainfrom
oa/docent-ingestion

Conversation

@oalmatov
Copy link
Copy Markdown
Contributor

@oalmatov oalmatov commented Apr 27, 2026

Summary

Adds the Valkyrie agent contract scaffolding so finance-agent can run on the Agentic Harness (under the name docent-finance-agent) and wires up Docent ingestion so each task's transcript and metadata land in a Docent collection alongside the eval result.

This PR is the "reference agent" half of the Docent ingestion work — paired with vals-ai/Valkyrie#309 (tracker side) and vals-ai/model-proxy#709 (the model-library-ingest CLI it invokes).

Changes

  • contract.yaml — Valkyrie agent contract: install_cmd, run_cmd, ingest_cmd, defaults. The ingest_cmd runs model-library-ingest against the agent output dir and writes the resulting AgentRun JSON to the path the tracker provides.
  • setup.sh + Makefile + README.md — sandbox bootstrap, push target (make pushvalkyrie agent push . --name docent-finance-agent), and short instructions.
  • pyproject.toml — adds model-library[docent], vendored as a wheel under wheels/ and pinned via [tool.uv.sources]. Required because model-proxy is private and the Valkyrie sandbox has no SSH key for GitHub, so git+ssh:// clones hang on the host-key prompt. The [docent] extra still pulls docent-python from PyPI normally.
  • wheels/model_library-0.1.13.dev183+gd461e5e36-py3-none-any.whl — vendored model-library build that includes the loader fix for populating query_result.history from per-turn history.bin.
  • Merge of origin/main: takes main's (503, "sec.gov") tuple-style retry API for finance_agent/exceptions.py + tools.py, dropping the older url_patterns={503: ["sec.gov"]} variant from this branch.

Related Issues

Type of Change

  • New feature
  • Docs / config

Testing

  • Manually tested end-to-end: pushed docent-finance-agent to Valkyrie via make push, ran a finance benchmark task, watched the tracker logs report successful Docent upload, then confirmed via the Docent SDK that the run was in the collection with the initial user prompt at message 0, complete tracker-owned metadata (job_id, task_id, benchmark, agent, eval_result, ingested_at), and the eval grading attached.

Checklist

  • Self-reviewed the diff
  • No debug/dead code left in
  • Docs updated if needed

Reference agent that exercises the new ingest_cmd contract field plus
model_library.ingest. Renames the agent to docent-finance-agent so it
ships alongside finance-agent without disrupting it.

- contract.yaml: declare ingest_cmd (runs model-library-ingest against
  the agent output dir and writes the AgentRun JSON to the path the
  tracker provides).
- pyproject.toml: switch model-library to the model-proxy branch with
  the [docent] extra (pulls in docent-python).
- Makefile / setup.sh: rename references.
The Valkyrie sandbox has no SSH key and no known_hosts for github.com,
so `uv sync` was hanging on the interactive host-key prompt when
pulling model-proxy via git+ssh. Switch to a vendored wheel under
wheels/ via [tool.uv.sources]; docent-python still resolves from PyPI.
Pulls in the loader fix that populates query_result.history from the
per-turn pickled history.bin, so the initial user prompt and any
hook-injected messages survive into the Docent transcript.
# Conflicts:
#	finance_agent/exceptions.py
#	finance_agent/tools.py
@assert-app
Copy link
Copy Markdown

assert-app Bot commented Apr 27, 2026

Review on Assert →

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant