# simpleevals-be

A backend service for evaluating AI model responses against reference answers.

## Features

- Evaluate responses from multiple AI models (OpenAI, Anthropic Claude, Google Gemini)
- Compare model outputs against reference answers
- Store evaluation results
- API endpoints for single and batch evaluations
## Setup

- Clone the repository

  ```bash
  git clone https://github.com/Arnasltlt/simpleevals-be.git
  cd simpleevals-be
  ```

- Install dependencies

  ```bash
  npm install
  ```

- Set up environment variables

  ```bash
  cp .env.example .env
  # Edit .env with your API keys
  ```
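  The variable names this project actually reads are defined in `.env.example`; the keys below are an illustrative sketch only, not confirmed names.

  ```bash
  # Hypothetical .env layout; confirm the real variable names against .env.example
  OPENAI_API_KEY=sk-...
  ANTHROPIC_API_KEY=sk-ant-...
  GOOGLE_API_KEY=...
  ```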
- Start the server

  ```bash
  npm run dev
  ```

## API Endpoints

- `POST /api/mvp/evaluate` - Evaluate a single question across models
- `POST /api/mvp/evaluate-set` - Evaluate multiple questions in batch
- `GET /api/mvp/sets` - Get all evaluation sets
- `GET /api/mvp/sets/:id` - Get a specific evaluation set
- `GET /api/mvp/share/:id` - Get a shareable evaluation set
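The request and response schemas are not documented in this README, so the sketch below is illustrative only: the port, JSON field names, and model identifiers are assumptions rather than values confirmed by the repository.

```bash
# Hypothetical request; check the route handler for the actual body schema.
curl -X POST http://localhost:3000/api/mvp/evaluate \
  -H "Content-Type: application/json" \
  -d '{
        "question": "What is the capital of Lithuania?",
        "reference_answer": "Vilnius",
        "models": ["openai", "anthropic", "google"]
      }'

# List all stored evaluation sets.
curl http://localhost:3000/api/mvp/sets
```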
## License

MIT