AIML-Solutions/SnorkelTools

Snorkel-Terminus

Task template and training materials for TerminalBench Expert Coders (ECs). Terminus EC Training is Snorkel AI's official training portal for contributing the real-world software engineering tasks used to evaluate AI coding agents.

This repo is built from the template-task skeleton provided by the Terminus EC Training portal. Use it as a starter when creating or working on TerminalBench tasks.

What is TerminalBench?

TerminalBench is a benchmark that tests AI coding agents on tasks that mirror real software engineering work: debugging race conditions, implementing API endpoints, refactoring legacy code, fixing security vulnerabilities, and more. Expert Coders create these tasks; the Terminus portal is the home base for contributing.

Template structure

| Path | Purpose |
| --- | --- |
| task.toml | Task metadata (difficulty, category, timeouts, resources) |
| instruction.md | Human-readable task instructions |
| environment/Dockerfile | Runtime environment for the task |
| tests/ | Verifier: test.sh (runs tests, writes reward) and test_outputs.py |
| solution/ | Oracle/reference solution (e.g. solve.sh) |

  • Verifier: Runs in Docker; uses uv + pytest; writes /logs/verifier/reward.txt (1 = pass, 0 = fail).
  • Agent: Gets the same environment and must satisfy the same verifier.
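
Before building the environment, it can help to confirm that a task directory still matches the template layout above. The following is a minimal sketch, not part of the template: the function name `missing_template_paths` and the `REQUIRED_PATHS` list are hypothetical, and the list only mirrors the paths named in this README.

```python
from __future__ import annotations

from pathlib import Path

# Paths named in this README's template table. This list is an assumption:
# extend or trim it if your copy of the template differs.
REQUIRED_PATHS = [
    "task.toml",
    "instruction.md",
    "environment/Dockerfile",
    "tests/test.sh",
    "tests/test_outputs.py",
    "solution",
]


def missing_template_paths(task_dir: str | Path) -> list[str]:
    """Return the required template paths that do not exist under task_dir."""
    root = Path(task_dir)
    return [rel for rel in REQUIRED_PATHS if not (root / rel).exists()]
```

Calling `missing_template_paths(".")` from the repo root returns an empty list when every listed path is present, and otherwise the relative paths that still need to be created.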

Quick start

  1. Clone and customize
    Replace instruction.md with your task, implement solution/, and adjust task.toml and environment/Dockerfile as needed.

  2. Run tests locally
    Build and run the environment (see portal docs for full workflow). The verifier runs tests/test.sh, which executes tests/test_outputs.py and produces the reward file.

  3. Submit via the portal
    Sign in at Terminus EC Training, claim or create a task, and follow the submit workflow.
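
The reward convention from step 2 (run the tests, then write 1 for pass or 0 for fail) can be sketched in Python. This is an illustrative stand-in, not the template's actual tests/test.sh, which is a shell script that uses uv and pytest. The function names `write_reward` and `run_verifier` are hypothetical, and the exact file contents (a bare `1` or `0` with no trailing newline) are an assumption.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path

# Reward location stated in this README.
DEFAULT_REWARD_PATH = "/logs/verifier/reward.txt"


def write_reward(passed: bool, reward_path: str | Path = DEFAULT_REWARD_PATH) -> Path:
    """Write "1" on pass or "0" on fail, creating parent directories as needed."""
    path = Path(reward_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("1" if passed else "0")
    return path


def run_verifier(
    test_file: str = "tests/test_outputs.py",
    reward_path: str | Path = DEFAULT_REWARD_PATH,
) -> int:
    """Run pytest on test_file and record the outcome as a reward file."""
    # pytest exits with code 0 only when every collected test passes.
    result = subprocess.run([sys.executable, "-m", "pytest", test_file])
    passed = result.returncode == 0
    write_reward(passed, reward_path)
    return 1 if passed else 0
```

For a local dry run outside Docker, pass a writable location such as `run_verifier(reward_path="logs/verifier/reward.txt")`, since `/logs` may not exist on the host. This requires pytest to be installed in the current interpreter.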

Training and docs

  • Portal (login required): Terminus EC Training — onboarding, task lifecycle, docs, and submission.
  • This repo: docs/ contains onboarding and task-creation notes captured from the public portal and Terminus materials.

License

The template structure aligns with Snorkel AI’s Terminus EC Training program. See the portal and Snorkel AI for license terms.

About

Applied benchmarking and utility lane for AI workflows, testing, and execution quality checks.
