Gowlin

Secure autograder for LLM agent development and evaluation

Warning: This project is currently in alpha development and not ready for production use. Core features are being actively developed and the API may change significantly.

Overview

Gowlin is transitioning to a framework-agnostic evaluation platform for LLM agents. This alpha release lays the foundation for integrating open-source tools such as LLM-Sandbox and DeepEval, and introduces an initial framework adapter architecture.

Current Features (Phase 1)

  • Framework Adapter Architecture: Base classes for framework-specific evaluation
  • LangChain Detection: Basic pattern matching for LangChain usage
  • Adapter Structure: Foundation for LLM-Sandbox and DeepEval integration
  • Modular Design: Separated concerns between frameworks, sandboxing, and evaluation
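To illustrate what basic pattern matching for LangChain usage might look like, here is a minimal sketch that scans submission source for LangChain imports with a regular expression. The function name `detect_langchain` and the pattern itself are hypothetical and are not taken from Gowlin's code; the real adapter may use different or additional patterns.

```python
import re

# Hypothetical pattern (an assumption, not Gowlin's actual list): matches
# "import langchain", "from langchain.x import y", "from langchain_core import z".
LANGCHAIN_IMPORT = re.compile(
    r"^\s*(?:import|from)\s+langchain(?:[._]\w+)*\b",
    re.MULTILINE,
)


def detect_langchain(source: str) -> bool:
    """Return True if the source text appears to import LangChain."""
    return LANGCHAIN_IMPORT.search(source) is not None
```

A text-level check like this is only a heuristic: it ignores commented-out imports at the start of a line, but it cannot tell whether the imported code is actually used.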

Planned Features

  • Full LLM-Sandbox integration for secure execution
  • DeepEval metrics implementation
  • Additional framework adapters (AutoGen, CrewAI)
  • Production readiness scoring

Installation

Gowlin is not yet published to PyPI. Install from source for development:

git clone https://github.com/aidevelopertraining/gowlin.git
cd gowlin
./scripts/setup-dev.sh
source .venv/bin/activate

Usage

# Submit a solution
gowlin submit solution.py --mission hello-agent

# Check evaluation status
gowlin status

# View detailed results
gowlin results --detailed

Architecture

Current implementation structure:

src/gowlin/
├── frameworks/          # Framework adapter system
│   ├── base.py         # Abstract base classes
│   └── langchain_adapter.py  # Basic LangChain detection
├── integrations/       # Placeholder adapters
│   ├── llm_sandbox_adapter.py  # LLM-Sandbox stub
│   └── deepeval_adapter.py     # DeepEval stub
└── [existing modules]

Implemented Components

  • FrameworkAdapter: Abstract base class for framework-specific adapters
  • FrameworkEvaluationResult: Data model for evaluation results
  • LangChainAdapter: Basic framework detection (patterns only)
  • Integration Stubs: Placeholder classes for LLM-Sandbox and DeepEval
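The relationship between these components could be sketched roughly as follows. Only the class names come from the list above; the fields, method names, and the substring-based detection are assumptions made for illustration and may not match the code in `src/gowlin/frameworks/`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class FrameworkEvaluationResult:
    """Illustrative result model; the field names are assumptions."""

    framework: str
    detected: bool
    notes: list[str] = field(default_factory=list)


class FrameworkAdapter(ABC):
    """Illustrative base class; the method names are assumptions."""

    name: str = "unknown"

    @abstractmethod
    def detect(self, source: str) -> bool:
        """Return True if the submission appears to use this framework."""

    def evaluate(self, source: str) -> FrameworkEvaluationResult:
        # Evaluation logic is not implemented yet, so only report detection.
        return FrameworkEvaluationResult(
            framework=self.name,
            detected=self.detect(source),
            notes=["detection only; evaluation not implemented"],
        )


class LangChainAdapter(FrameworkAdapter):
    name = "langchain"

    def detect(self, source: str) -> bool:
        # Naive substring check standing in for real pattern matching.
        return "langchain" in source
```

In this sketch, a new framework adapter would only need to subclass `FrameworkAdapter` and implement `detect`, for example `LangChainAdapter().evaluate(source_text)`.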

Development Status

Current Phase: Transitioning to open-source integrations

Completed

  • ✓ Framework adapter architecture (base classes)
  • ✓ Basic project structure reorganization
  • ✓ Placeholder integration adapters

In Progress

  • ⚠ LangChain adapter (detection only, no evaluation)
  • ⚠ Integration implementations (stubs only)

Not Started

  • ✗ LLM-Sandbox integration
  • ✗ DeepEval integration
  • ✗ AutoGen, CrewAI adapters
  • ✗ Evaluation logic

Development Setup

Prerequisites

  • Python 3.12+ (required for security consistency with CI)
  • Docker (for future sandbox testing)
  • Firecracker (planned for production sandboxing)

Dependencies Note

The new dependencies (llm-sandbox, deepeval) are commented out in requirements.txt due to version conflicts with existing packages. They will be added when the integration is implemented.
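Until those dependencies are enabled, integration code may need to check whether the optional packages are present before using them. The sketch below shows one way to do that with the standard library. The helper names are hypothetical, and the import names `llm_sandbox` and `deepeval` are assumed to correspond to the two distributions.

```python
from importlib.util import find_spec

# Assumed mapping from distribution name to top-level import name.
OPTIONAL_INTEGRATIONS = {
    "llm-sandbox": "llm_sandbox",
    "deepeval": "deepeval",
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def available_integrations() -> dict[str, bool]:
    """Report which optional integrations are importable in this environment."""
    return {
        package: is_installed(module)
        for package, module in OPTIONAL_INTEGRATIONS.items()
    }
```

Calling `available_integrations()` in a fresh development environment would be expected to report both packages as unavailable, since neither is installed by default.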

Setup

git clone https://github.com/aidevelopertraining/gowlin.git
cd gowlin
./scripts/setup-dev.sh
source .venv/bin/activate
pytest

Contributing

We welcome contributions. Please read our Contributing Guide for details on our development process and code of conduct.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request

Security

Security features are planned through LLM-Sandbox integration (not yet implemented).

To report a security vulnerability, please email security@aidevelopertraining.com.

License

Licensed under the Apache License 2.0. See LICENSE for details.


Built by the AI Developer Training community
