Conversation
Reference agent that exercises the new ingest_cmd contract field plus model_library.ingest. Renames the agent to docent-finance-agent so it ships alongside finance-agent without disrupting it. - contract.yaml: declare ingest_cmd (runs model-library-ingest against the agent output dir and writes the AgentRun JSON to the path the tracker provides). - pyproject.toml: switch model-library to the model-proxy branch with the [docent] extra (pulls in docent-python). - Makefile / setup.sh: rename references.
The Valkyrie sandbox has no SSH key and no known_hosts for github.com, so `uv sync` was hanging on the interactive host-key prompt when pulling model-proxy via git+ssh. Switch to a vendored wheel under wheels/ via [tool.uv.sources]; docent-python still resolves from PyPI.
Pulls in the loader fix that populates query_result.history from the per-turn pickled history.bin, so the initial user prompt and any hook-injected messages survive into the Docent transcript.
# Conflicts: # finance_agent/exceptions.py # finance_agent/tools.py
10 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds the Valkyrie agent contract scaffolding so finance-agent can run on the Agentic Harness (under the name
docent-finance-agent) and wires up Docent ingestion so each task's transcript and metadata land in a Docent collection alongside the eval result.This PR is the "reference agent" half of the Docent ingestion work — paired with vals-ai/Valkyrie#309 (tracker side) and vals-ai/model-proxy#709 (the
model-library-ingestCLI it invokes).Changes
contract.yaml— Valkyrie agent contract:install_cmd,run_cmd,ingest_cmd, defaults. Theingest_cmdrunsmodel-library-ingestagainst the agent output dir and writes the resulting AgentRun JSON to the path the tracker provides.setup.sh+Makefile+README.md— sandbox bootstrap, push target (make push→valkyrie agent push . --name docent-finance-agent), and short instructions.pyproject.toml— addsmodel-library[docent], vendored as a wheel underwheels/and pinned via[tool.uv.sources]. Required becausemodel-proxyis private and the Valkyrie sandbox has no SSH key for GitHub, sogit+ssh://clones hang on the host-key prompt. The[docent]extra still pullsdocent-pythonfrom PyPI normally.wheels/model_library-0.1.13.dev183+gd461e5e36-py3-none-any.whl— vendored model-library build that includes the loader fix for populatingquery_result.historyfrom per-turnhistory.bin.origin/main: takes main's(503, "sec.gov")tuple-style retry API forfinance_agent/exceptions.py+tools.py, dropping the olderurl_patterns={503: ["sec.gov"]}variant from this branch.Related Issues
Type of Change
Testing
docent-finance-agentto Valkyrie viamake push, ran a finance benchmark task, watched the tracker logs report successful Docent upload, then confirmed via the Docent SDK that the run was in the collection with the initial user prompt at message 0, complete tracker-owned metadata (job_id,task_id,benchmark,agent,eval_result,ingested_at), and the eval grading attached.Checklist