emilwagman/SOP-bench

 
 


SOP-bench

SOP-bench is a benchmark for evaluating how well LLM agents carry out real-world standard operating procedures (SOPs). We will release the dataset and metrics in 2024.

About

Benchmark for evaluating LLM agents on real-world standard operating procedures
