Add pando PBVerified submission (2026-03-12) #23

Open
georgeciobanu wants to merge 1 commit into amazon-science:submission from georgeciobanu:submission

@georgeciobanu

Adds a new PBVerified submission for pando using gpt-5.2-codex.

Included:

  • evaluation/PBVerified/20260312_pando_gpt-5.2-codex/all_preds.jsonl
  • evaluation/PBVerified/20260312_pando_gpt-5.2-codex/logs/
  • evaluation/PBVerified/20260312_pando_gpt-5.2-codex/trajs/
  • evaluation/PBVerified/20260312_pando_gpt-5.2-codex/metadata.yaml
  • evaluation/PBVerified/20260312_pando_gpt-5.2-codex/README.md
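
A quick layout check for the files listed above could look like the sketch below. The names `EXPECTED_ENTRIES` and `missing_entries` are hypothetical, and treating these five entries as the required set is an assumption drawn only from this PR's list, not from documented repository rules.

```python
from pathlib import Path

# Entries named in this PR description. Whether the benchmark repository
# requires exactly this set is an assumption, not a documented rule.
EXPECTED_ENTRIES = ("all_preds.jsonl", "logs", "trajs", "metadata.yaml", "README.md")


def missing_entries(submission_dir):
    """Return the expected entries that are absent from submission_dir."""
    root = Path(submission_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

Calling `missing_entries("evaluation/PBVerified/20260312_pando_gpt-5.2-codex")` from the repository root would return an empty list when every listed file and directory is present.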

Notes:

  • This is a TypeScript-only PBVerified submission package.
  • all_preds.jsonl contains 382 verified-set entries.
  • logs/ and trajs/ contain the 100 evaluated TypeScript instances.
  • Reported evaluated-subset pass rate: 47.00% (47/100).
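
For anyone re-checking the figures in these notes, a minimal sketch follows. It assumes `all_preds.jsonl` holds one JSON object per non-empty line; the helper names `count_jsonl_entries` and `pass_rate` are hypothetical and not part of the benchmark tooling.

```python
import json
from pathlib import Path


def count_jsonl_entries(path):
    """Count non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed line
                count += 1
    return count


def pass_rate(resolved, evaluated):
    """Return resolved/evaluated as a percentage string with two decimals."""
    if evaluated <= 0:
        raise ValueError("evaluated must be positive")
    return f"{100 * resolved / evaluated:.2f}%"


print(pass_rate(47, 100))  # 47.00%
```

Under that assumption, `count_jsonl_entries` run on the submitted `all_preds.jsonl` would be expected to return 382, and the rate is computed over the 100 evaluated instances rather than over all 382 entries.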

@mshihabr
Contributor

Hi George, thanks for creating the PR and congrats on the numbers! We will verify the results on our end and update the website.
