An open-source, production-ready Generative AI platform for institutions. Built by Boise State University, designed for everyone.
Features · Architecture · Admin Dashboard · Deployment · Release Notes · Contributing
🚀 Ready to deploy? Fork the repo and follow the GitHub Actions Quick Start to deploy to your AWS environment in under an hour.
AgentCore Public Stack is the working codebase behind boisestate.ai, a full-stack AI assistant platform serving students, faculty, and staff at Boise State University. It combines AWS Bedrock AgentCore and Strands Agents into a turnkey system that any institution can fork, deploy, and make their own — without writing a single line of agent code.
The platform ships with a modern chat interface, a full admin dashboard, multi-model support, tool and MCP server management, quota enforcement, cost tracking, and role-based access control — all configurable through the UI.
Most institutions face the same AI dilemma: vendor-hosted per-seat subscriptions don't scale, and building from scratch is too expensive.
| The Problem | Our Answer |
|---|---|
| 💸 Per-seat pricing doesn't scale | Consumption-based billing — pay only for tokens used, not seats purchased |
| 🔐 Student data leaves your control | All data stays inside your AWS account. Bedrock model interactions are never shared with vendors |
| 🔒 Vendor lock-in limits flexibility | Swap models freely — Claude, Llama, Mistral, GPT, Gemini — no contracts, no lock-in |
| 🧩 Fragmented tool ecosystems | One platform with MCP servers connecting Canvas, Google Workspace, PeopleSoft, and more |
| ⚖️ Inequitable access | Every student gets the same models and tools — no premium tiers required |
At 30,000 users, commercial subscriptions would cost upwards of $7.2M per year. This platform delivers the same capabilities, and more, at a fraction of the cost.
Add, swap, or disable AI models without redeploying. The platform supports any model available through AWS Bedrock — Claude, Llama, Mistral, and more — as well as external providers like OpenAI and Google Gemini via API. Administrators configure models through the dashboard, set per-model pricing, and control which roles have access. When a better model launches, add it in minutes.
Every interaction is tracked at the token level. Automatic prompt caching reduces repeated input costs. Turn-based message buffering cuts memory API calls by 75%. Quota tiers let administrators set spending limits by role, department, or individual — with soft warnings before hard stops. Real-time cost analytics surface exactly where budget is going.
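The soft-warning/hard-stop behavior described above can be sketched as a small resolution function. This is an illustrative model only, not the platform's actual implementation; the field names and the 80% warning threshold are assumptions.

```python
from dataclasses import dataclass

@dataclass
class QuotaTier:
    """Hypothetical quota tier: a monthly spend limit with a soft-warning threshold."""
    monthly_limit_usd: float
    soft_warning_pct: float = 0.8  # assumed default: warn at 80% of the limit

def enforce(tier: QuotaTier, month_spend_usd: float) -> str:
    """Return 'ok', 'warn' (soft threshold crossed), or 'block' (hard stop)."""
    if month_spend_usd >= tier.monthly_limit_usd:
        return "block"
    if month_spend_usd >= tier.monthly_limit_usd * tier.soft_warning_pct:
        return "warn"
    return "ok"
```

A request that would return `"warn"` still goes through; only `"block"` stops inference, which is the "soft warnings before hard stops" behavior the section describes.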
Your data never leaves your AWS account. The platform uses OIDC authentication (Entra ID, Cognito, Google) with PKCE, role-based access control at every layer, and SigV4-signed requests for all MCP tool communication. There are no external data pipelines, no third-party analytics, and no model training on your interactions.
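For readers unfamiliar with PKCE, the client-side portion of the flow reduces to generating a random `code_verifier` and deriving its S256 `code_challenge` (per RFC 7636). This is a standalone sketch of that standard step, not code from this repository:

```python
import base64
import hashlib
import secrets

def pkce_pair() -> tuple[str, str]:
    """Generate an RFC 7636 code_verifier and its S256 code_challenge."""
    # 32 random bytes, base64url-encoded without padding -> 43-char verifier
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA256(verifier)), also without padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The challenge is sent with the authorization request; the verifier is sent only at token exchange, so an intercepted authorization code is useless without it.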
Connect institutional systems through MCP (Model Context Protocol) servers — no code changes, no redeployment. Register an MCP server URL in the admin dashboard and its tools are automatically discovered, assigned to roles, and made available to users. Deploy MCP servers on your own schedule, on your own infrastructure, completely independent of the core platform.
Example integrations:
- 📚 Canvas LMS — Course materials, grades, assignments
- 📁 Google Workspace — Drive search, Docs, Calendar
- 🎓 PeopleSoft — Student records, registration
- 🔍 Library Systems — Research databases, catalog search
- 🔧 Any custom API — Wrap it in an MCP server and plug it in
Users can upload images (PNG, JPEG, GIF, WebP) and documents (PDF, CSV, DOCX) directly into conversations. The agent can generate charts and diagrams. More features to come.
Train and run inference on open-source models directly from the platform. Users with admin-granted access can upload datasets, launch SageMaker training jobs on models like BERT, RoBERTa, GPT-2, and more, then run batch inference on trained models — all with real-time progress tracking, quota enforcement, and automatic 30-day artifact retention. No ML infrastructure setup required.
Enable programmatic access to the platform's AI models via REST API. Users generate personal API keys from the Settings page and use them to call the /chat/api-converse endpoint — the same models, quotas, and RBAC permissions as the web UI, accessible from any language or tool.
How it works:
- One key per user — creating a new key automatically revokes the previous one
- 90-day expiration — keys expire automatically; create a new one to renew
- Secure by design — keys are SHA-256 hashed at rest; the raw key is shown only once at creation
- Rate limited — 60 requests per minute per key
- Streaming support — both SSE streaming and standard JSON responses
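The "SHA-256 hashed at rest" property can be illustrated with a minimal key-issuance sketch. Function names and key format here are hypothetical, not the platform's actual code:

```python
import hashlib
import secrets

def issue_api_key() -> tuple[str, str]:
    """Generate a raw key (shown to the user once) and the digest stored at rest."""
    raw = secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode()).hexdigest()

def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Re-hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return secrets.compare_digest(candidate, stored_digest)
```

Because only the digest is persisted, a database leak does not expose usable keys; the raw key cannot be recovered from its SHA-256 hash.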
Authentication: Include the key in the X-API-Key header with each request. The platform resolves the user, checks quotas, enforces RBAC model access, and tracks costs — identical to browser-based usage.
Built-in code examples: The API Keys settings page includes a code generator with cURL, Python, and JavaScript examples, a model selector, and configurable parameters (temperature, max tokens, system prompt) so users can start integrating immediately.
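As a rough illustration of the authentication flow, here is how a request to `/chat/api-converse` might be assembled in Python using only the standard library. The base URL, key, model ID, and request-body fields are placeholders; consult the built-in code generator for your deployment's exact payload shape.

```python
import json
import urllib.request

# Placeholder values -- substitute your deployment URL, key, and model ID.
BASE_URL = "https://your-deployment.example.edu"
API_KEY = "sk-example-not-a-real-key"

def build_converse_request(prompt: str, model_id: str) -> urllib.request.Request:
    """Build a POST to /chat/api-converse carrying the key in X-API-Key."""
    body = json.dumps({
        "model_id": model_id,  # assumed field name; see the in-app code generator
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        BASE_URL + "/chat/api-converse",
        data=body,
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )

req = build_converse_request("Summarize my last session", "example-model-id")
# Sending it is one line: urllib.request.urlopen(req)
```

The server resolves the user from the key, then applies the same quota and RBAC checks as a browser session before invoking the model.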
A two-tier memory system combines short-term session history with long-term user preferences. The agent remembers coding style preferences, language choices, and learned facts across sessions — personalization without compromising privacy.
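The two-tier split can be pictured with a toy data structure: session turns live in a short-term buffer, while learned preferences persist and are prepended as context. This is a conceptual sketch under assumed names, not the AgentCore Memory implementation:

```python
class TwoTierMemory:
    """Illustrative model: ephemeral session history + durable user preferences."""

    def __init__(self) -> None:
        self.session: list[dict] = []   # short-term: discarded when the session ends
        self.preferences: dict = {}     # long-term: persists across sessions

    def record_turn(self, role: str, text: str) -> None:
        self.session.append({"role": role, "content": text})

    def learn(self, key: str, value: str) -> None:
        # e.g. learn("language", "Python") after the user states a preference
        self.preferences[key] = value

    def context(self) -> list[dict]:
        """Merge long-term preferences (as a system note) with session history."""
        prefs = "; ".join(f"{k}: {v}" for k, v in self.preferences.items())
        system = [{"role": "system", "content": f"User preferences: {prefs}"}] if prefs else []
        return system + self.session
```

Keeping preferences out of the raw transcript is what lets personalization survive session resets without retaining full conversation histories.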
The admin dashboard gives institutional administrators full control over the platform through a web interface — no code or CLI required.
| Feature | Description |
|---|---|
| 📈 Cost Analytics | Real-time dashboards showing token usage and costs by user, model, and time period. Identify spending trends, view top users, and export reports for budget planning. |
| 🤖 Model Management | Add, configure, enable, or disable AI models from any supported provider. Set per-model pricing, control access by role, and adjust availability instantly. |
| 🧰 Tool Catalog | Browse, enable, and configure all available tools — local, built-in, and MCP-sourced. Sync tools from registered MCP servers and control availability per role. |
| 🔗 MCP Server Registration | Register external MCP servers by URL. Tools are auto-discovered from the server manifest, assigned RBAC permissions, and made available to users — no redeployment needed. |
| 🔑 OAuth Provider Management | Configure third-party OAuth integrations (Google, Microsoft, GitHub, custom) so MCP tools can authenticate on behalf of users against external services. |
| 👥 Role Management | Create application roles with granular permissions over models, tools, and quotas. Map roles to JWT claims from your identity provider for automatic assignment. |
| 🏛️ Auth Provider Configuration | Configure OIDC authentication providers including issuer URLs, client credentials, claim mappings, and login page branding — all from the UI. |
| 👤 User Management | Search and browse users, view individual cost breakdowns, quota status, and role assignments. Apply per-user overrides when needed. |
| 📏 Quota Tiers | Define quota tiers with monthly and daily cost limits, soft warning thresholds, and hard stops. Assign tiers to roles, email domains, or individual users. |
| ⚡ Quota Overrides | Grant temporary exceptions — unlimited access for a research sprint, elevated limits for a class project — with automatic expiration dates. |
| 🔎 Quota Inspector | Debug quota resolution for any user. See which tier resolved, current usage against limits, and recent enforcement events (warnings, blocks, resets). |
| 📋 Quota Events | Monitor all quota enforcement activity in real time. Filter by event type, export to CSV, and audit enforcement decisions. |
| 🧪 Fine-Tuning Access | Grant or revoke fine-tuning access per user. Set monthly compute-hour quotas and monitor usage across training and inference jobs. |
┌─────────────────────────────────────────────────────────────────────────────┐
│ User Interface │
│ Angular v21 + Tailwind CSS v4.2+ │
└──────────┬──────────────────────────────────────────────┬───────────────────┘
│ REST (auth, sessions, admin, files) │ SSE Streaming
v │
┌──────────────────────────────┐ │
│ App API (FastAPI) │ │
│ Auth · RBAC · Sessions │ /auth/runtime-endpoint │
│ Admin · Costs · Quotas │── returns runtime URL ───┘
└──────────────────────────────┘ │
v
┌─────────────────────────────────┐
│ AgentCore Runtime (AWS Managed)│
│ Inference API + Strands Agent │
│ /invocations (SSE streaming) │
└───────┬─────────┬───────┬───────┘
│ │ │
┌───────────────────────────┘ │ └───────────┐
v v v
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Local Tools │ │ AgentCore Tools │ │ AgentCore │
│ (Direct) │ │ (AWS Managed) │ │ Gateway (MCP) │
│ │ │ │ │ │
│ Weather, Search, │ │ Code Interpreter │ │ Lambda Functions │
│ Visualization │ │ Browser (Nova) │ │ Wikipedia, ArXiv │
└──────────────────┘ │ Memory │ │ Google, Tavily, │
└──────────────────┘ │ Finance │
└──────────────────┘
| Layer | Technologies |
|---|---|
| Frontend | Angular v21, TypeScript, Tailwind CSS v4.2+ |
| Backend | Python 3.13+, FastAPI |
| Agent Framework | Strands Agents SDK |
| Cloud Services | AWS Bedrock AgentCore (Runtime, Memory, Gateway, Code Interpreter, Browser), Amazon SageMaker |
| Infrastructure | AWS CDK (TypeScript), ECS Fargate, CloudFront, DynamoDB |
| Authentication | OIDC (Entra ID, Cognito, Google) with PKCE |
| CI/CD | GitHub Actions with full CDK deployment automation |
The fastest path to production is the GitHub Actions pipeline, which automates the entire AWS deployment — infrastructure, backend services, frontend, and MCP gateway — through a series of workflow runs.
GitHub Actions Quick Start | Step-by-Step Deployment Guide
| Component | AWS Service | Purpose |
|---|---|---|
| Networking | VPC, ALB, Security Groups | Isolated network with load balancing |
| Fine-Tuning (optional) | SageMaker, S3, DynamoDB | Model training, batch inference, artifact storage |
| RAG Ingestion | Lambda, S3 | Document ingestion for retrieval-augmented generation |
| Inference API | Bedrock AgentCore | Agent orchestration with Bedrock |
| App API | ECS Fargate | Authentication, admin, session management |
| Frontend | S3 + CloudFront | Angular SPA with global CDN |
| MCP Gateway | Lambda + API Gateway | Serverless MCP tool endpoints |
| Bootstrapper | DynamoDB | Users, sessions, costs, quotas, roles |
See backend/README.md for detailed backend setup. Authentication is handled by Cognito's first-boot flow — the first user to access the application creates the admin account directly.
agentcore-public-stack/
├── backend/
│ └── src/
│ ├── agents/main_agent/ # Agent core: factory, tools, memory, streaming
│ └── apis/
│ ├── app_api/ # Application API (port 8000)
│ ├── inference_api/ # AgentCore Runtime API (port 8001)
│ └── shared/ # Auth, RBAC, shared utilities
├── frontend/ai.client/ # Angular SPA
│ └── src/app/
│ ├── auth/ # OIDC authentication
│ ├── session/ # Chat UI
│ ├── admin/ # Admin dashboard
│ ├── fine-tuning/ # SageMaker fine-tuning UI
│ └── services/ # State management
├── infrastructure/ # AWS CDK stacks
│ └── lib/ # Infra, App API, Inference API, Frontend,
│ # Gateway, RAG Ingestion, SageMaker Fine-Tuning
└── .github/
├── workflows/ # CI/CD pipelines
└── docs/deploy/ # Deployment guides
See RELEASE_NOTES.md for the full changelog, including new features, bug fixes, platform upgrades, and deployment notes for each release.
Current release: v1.0.0-beta.22
Contributions are welcome! Please open an issue to discuss proposed changes before submitting a pull request.
This project is licensed under the PolyForm Noncommercial License 1.0.0. You may use, modify, and distribute this software for noncommercial purposes only. Commercial use, including use in a product or service that generates revenue, is prohibited without a separate commercial license.
See the LICENSE file for the full license text. For commercial licensing inquiries, please contact: techtransfer@boisestate.edu
Built at Boise State University
