Project: Reasoning agents - Certification preparation AI #76

@athiq-ahmed

Description

Track

Reasoning Agents (Azure AI Foundry)

Project Name

CertPrep Multi-Agent System — Personalised Microsoft Exam Preparation

GitHub Username

athiq-ahmed

Repository URL

https://github.com/athiq-ahmed/agentsleague

Project Description

The CertPrep Multi-Agent System is a production-grade AI solution for personalised Microsoft certification exam preparation, supporting 9 exam families (AI-102, DP-100, AZ-204, AZ-305, AZ-400, SC-100, AI-900, DP-203, MS-102).

Eight specialised reasoning agents collaborate through a typed pipeline of sequential and concurrent stages, among them:

  1. LearnerProfilingAgent — converts free-text background into a structured LearnerProfile via Azure AI Foundry SDK (Tier 1) or direct GPT-4o JSON-mode (Tier 2), with a deterministic rule-based fallback (Tier 3).
  2. StudyPlanAgent — generates a week-by-week Gantt study schedule using the Largest Remainder algorithm to allocate hours without exceeding the learner's budget.
  3. LearningPathCuratorAgent — maps each exam domain to curated MS Learn modules with trusted URLs, resource types, and estimated hours.
  4. ProgressAgent — computes an exam-weighted readiness score (0.55 × domain ratings + 0.25 × hours utilisation + 0.20 × practice score).
  5. AssessmentAgent — generates a 10-question domain-proportional mock quiz and scores it against the 60% pass threshold.
  6. CertificationRecommendationAgent — issues a GO / CONDITIONAL GO / NOT YET booking verdict with next-cert suggestions and a remediation plan.

A 17-rule GuardrailsPipeline runs at every agent boundary. Two human-in-the-loop gates ensure agents act on real learner data, not assumptions. The full pipeline runs in under 1 second in mock mode (zero Azure credentials), enabling reliable live demonstrations at any time.

Demo Video or Screenshots

Demo video: https://www.youtube.com/watch?v=okWcFnQoBsE
Live app: https://agentsleague.streamlit.app/ [with mock data]

Primary Programming Language

Python

Key Technologies Used

  • Azure AI Foundry Agent Service SDK (azure-ai-projects) | Tier 1 managed agent + conversation thread for LearnerProfilingAgent |
  • Azure OpenAI GPT-4o (JSON mode) | Tier 2 structured profiling fallback; temperature=0.2 |
  • Azure Content Safety (azure-ai-contentsafety) | G-16 guardrail — profanity and harmful-content filter |
  • Streamlit | 7-tab interactive UI + Admin Dashboard |
  • Pydantic v2 BaseModel | Typed handoff contracts at every agent boundary |
  • concurrent.futures.ThreadPoolExecutor | Parallel fan-out of StudyPlanAgent and LearningPathCuratorAgent |
  • Plotly | Gantt chart, domain radar, agent timeline |
  • SQLite (sqlite3 stdlib) | Cross-session learner profile + reasoning trace persistence |
  • ReportLab | PDF generation for profile and assessment reports |
  • Python smtplib (STARTTLS) | Optional weekly study-progress email digest |
  • hashlib SHA-256 | PIN hashing before SQLite storage |
  • Custom GuardrailsPipeline (17 rules) | BLOCK / WARN / INFO at every agent boundary; PII, URL trust, content safety |
  • pytest + parametrize | 342 automated tests across 15 modules; zero credentials required |
  • Streamlit Community Cloud | Auto-deploy on git push; secrets via environment variables |
  • Visual Studio Code + GitHub Copilot | Primary IDE; AI-assisted development throughout |

Submission Type

Individual

Team Members

(Individual submission — no team members)

Submission Requirements

  • My project meets the track-specific challenge requirements
  • My repository includes a comprehensive README.md with setup instructions
  • My code does not contain hardcoded API keys or secrets
  • I have included demo materials (video or screenshots)
  • My project is my own work with proper attribution for any third-party code
  • I agree to the Code of Conduct
  • I have read and agree to the Disclaimer
  • My submission does NOT contain any confidential, proprietary, or sensitive information
  • I confirm I have the rights to submit this content and grant the necessary licenses

Quick Setup Summary

# 1. Clone the repository
git clone https://github.com/athiq-ahmed/agentsleague.git
cd agentsleague

# 2. Create and activate virtual environment
python -m venv .venv
.venv\Scripts\Activate.ps1   # Windows PowerShell

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment variables
notepad .env   # Fill in AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and optionally
               # AZURE_AI_PROJECT_CONNECTION_STRING for Foundry SDK mode
               # Run `az login` once for local Foundry authentication

# 5. Run the application
python -m streamlit run streamlit_app.py

# 6. Run the test suite (no Azure credentials needed)
python -m pytest tests/ -v

Zero-credential demo mode: Leave .env keys blank or set FORCE_MOCK_MODE=true — the full 8-agent pipeline runs deterministically in under 1 second using the rule-based mock engine.

Technical Highlights

  • 3-tier LLM fallback chain — LearnerProfilingAgent attempts Azure AI Foundry SDK (Tier 1), falls back to direct Azure OpenAI JSON-mode (Tier 2), and finally to a deterministic rule-based engine (Tier 3). All three tiers share the same Pydantic output contract, so downstream agents never know which tier ran.
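The cascade can be sketched as a generic tier runner (hypothetical function and tier names — the actual agent's interfaces may differ; the key property is that every tier returns the same contract):

```python
def run_with_fallback(text, tiers):
    """Try each (name, fn) tier in order; all tiers share one output
    contract, so downstream code never knows which tier produced it."""
    errors = []
    for name, fn in tiers:
        try:
            return name, fn(text)
        except Exception as err:  # network outage, bad JSON, missing creds
            errors.append((name, err))
    raise RuntimeError(f"all tiers failed: {errors}")

def _tier1_foundry(text):
    # simulate Tier 1 failing because no Azure credentials are set
    raise ConnectionError("no Azure credentials")

def _tier3_rules(text):
    # deterministic rule-based profile: always succeeds, needs no network
    return {"exam_target": "AI-102", "weekly_hours": 6}

tier, profile = run_with_fallback(
    "3 yrs Python, new to Azure",
    [("foundry", _tier1_foundry), ("rules", _tier3_rules)],
)
# tier == "rules": the chain degraded silently to the deterministic engine
```

Because Tier 3 is deterministic and dependency-free, the chain is guaranteed to terminate with a valid profile.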

  • Largest Remainder day allocation — StudyPlanAgent uses the parliamentary Largest Remainder Method to distribute study time at the day level (total_days = weeks × 7) across domains, then converts day blocks to week bands and hours. Guarantees: (1) total days exactly equals budget; (2) every active domain receives at least 1 day (max(1, int(d)) floor) — no domain is silently zeroed out.
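A minimal sketch of that allocation (hypothetical signature; assumes the budget covers at least one day per domain — the real StudyPlanAgent code may differ in detail):

```python
def allocate_days(weights, total_days, min_days=1):
    """Largest Remainder Method: proportional day allocation that sums
    exactly to total_days and gives every domain at least min_days."""
    assert total_days >= min_days * len(weights), "budget too small"
    total_w = sum(weights)
    raw = [w / total_w * total_days for w in weights]
    # floor each share, but never below the per-domain minimum
    alloc = [max(min_days, int(r)) for r in raw]
    # domain indices ordered by descending fractional remainder
    by_frac = sorted(range(len(raw)),
                     key=lambda i: raw[i] - int(raw[i]), reverse=True)
    diff = total_days - sum(alloc)
    while diff != 0:
        # hand out leftover days to the largest remainders first;
        # reclaim overshoot (caused by the floor) from the smallest
        for i in (by_frac if diff > 0 else reversed(by_frac)):
            if diff > 0:
                alloc[i] += 1
                diff -= 1
            elif alloc[i] > min_days:
                alloc[i] -= 1
                diff += 1
            if diff == 0:
                break
    return alloc

# e.g. 4 weeks (28 days) across 50/30/20 domain weights → [14, 8, 6]
```

The reclaim branch handles the edge case where the min-days floor pushes the initial sum above the budget, preserving both guarantees.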

  • Concurrent agent fan-out — StudyPlanAgent and LearningPathCuratorAgent have no data dependency on each other; they run in true parallel via ThreadPoolExecutor, cutting Block 1 wall-clock time by ~50%.
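The fan-out is a standard two-task submit/join pattern — a sketch with hypothetical names (the real block passes richer objects):

```python
from concurrent.futures import ThreadPoolExecutor

def run_block1(profile, plan_agent, curator_agent):
    """Run two independent agents in parallel threads.

    Neither agent reads the other's output, so wall-clock time is
    roughly max(t_plan, t_curate) instead of their sum."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        plan_future = pool.submit(plan_agent, profile)
        path_future = pool.submit(curator_agent, profile)
        # .result() blocks until each agent finishes (and re-raises
        # any exception from inside the worker thread)
        return plan_future.result(), path_future.result()
```

Unlike asyncio.gather(), this works unchanged inside Streamlit's script runner, which already owns an event loop.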

  • 17-rule exam-agnostic guardrail pipeline — Every agent input and output is validated by a dedicated GuardrailsPipeline before the next stage proceeds. BLOCK-level violations call st.stop() immediately; nothing downstream ever sees invalid data.
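The BLOCK / WARN / INFO contract can be sketched as follows (rule shape and names are hypothetical — the real pipeline calls st.stop() rather than raising, and its 17 rules are far richer):

```python
from enum import Enum

class Severity(Enum):
    INFO = 1
    WARN = 2
    BLOCK = 3

class GuardrailBlocked(Exception):
    """Raised when a BLOCK-level rule fires; nothing downstream runs."""

def run_guardrails(payload, rules):
    """Evaluate every rule against the payload; collect findings,
    halt immediately on the first BLOCK."""
    findings = []
    for rule in rules:
        finding = rule(payload)          # (Severity, message) or None
        if finding is None:
            continue
        findings.append(finding)
        if finding[0] is Severity.BLOCK:
            raise GuardrailBlocked(finding[1])
    return findings

def no_untrusted_urls(payload):
    # simplified URL-trust rule: only learn.microsoft.com links pass
    for url in payload.get("urls", []):
        if not url.startswith("https://learn.microsoft.com/"):
            return (Severity.BLOCK, f"untrusted URL: {url}")
    return None
```

Running the pipeline at every agent boundary means each stage can assume its input already passed validation.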

  • Exam-weighted readiness formula — Progress scoring uses 0.55 × domain ratings + 0.25 × hours utilisation + 0.20 × practice score, with domain weights pulled from the per-exam registry (not hardcoded for AI-102), so the formula is accurate across all 9 supported certifications.
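The stated formula translates directly to code (function and parameter names are hypothetical; all components are assumed normalised to 0–1):

```python
def readiness_score(domain_ratings, domain_weights,
                    hours_used, hours_planned, practice_score):
    """0.55 x weighted domain ratings + 0.25 x hours utilisation
    + 0.20 x practice score, each component in [0, 1]."""
    total_w = sum(domain_weights)
    # weights come from the per-exam registry, not a hardcoded table
    domain_component = sum(w * r for w, r in
                           zip(domain_weights, domain_ratings)) / total_w
    utilisation = min(1.0, hours_used / hours_planned)  # cap over-study at 1
    return (0.55 * domain_component
            + 0.25 * utilisation
            + 0.20 * practice_score)
```

Normalising by the weight sum keeps the score correct even if a registry's weights do not add to exactly 1.0.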

  • Demo PDF cache — For demo personas, PDFs are generated once and served from demo_pdfs/ on all subsequent clicks — no pipeline re-run needed, making live demos instant and reliable.

  • Schema-evolution safe SQLite — All *_from_dict deserialization helpers use a _dc_filter() guard that silently drops unknown keys, preventing TypeError crashes when the data model evolves and old rows are read back.

Challenges & Learnings

Challenge — How We Solved It — Learning

  • Streamlit + asyncio conflict — asyncio.gather() raises RuntimeError: event loop already running inside Streamlit. Solution: replaced with concurrent.futures.ThreadPoolExecutor (identical I/O latency, no event-loop conflict, stdlib only). Learning: always profile async options in the target host runtime before committing to the pattern.

  • Schema evolution crashes — adding new fields to agent output dataclasses caused TypeError when loading old SQLite rows. Solution: added a _dc_filter() helper to all *_from_dict functions; unknown keys are silently dropped. Learning: design for forward and backward compatibility from day one; use a key guard on every deserialization boundary.

  • Hardcoded AI-102 domain weights — ProgressAgent used AI-102 weights for all exams, giving wrong readiness scores for DP-100 learners. Solution: refactored to call get_exam_domains(profile.exam_target) dynamically. Learning: never hardcode domain-specific constants in shared utility functions; always derive from the registry.

  • st.checkbox key collision — using hash()[:8] string slicing raised TypeError in Streamlit widget key generation. Solution: changed to abs(hash(item)) (an integer key), which Streamlit handles natively. Learning: read widget key type requirements; integer keys are always safe.

  • PDF generation crashes on None fields — AttributeError when optional profile fields were absent. Solution: added getattr(obj, field, default) guards on every field access in PDF generation. Learning: defensive attribute access is essential for any code path that renders stored data.

  • 3-tier fallback complexity — keeping the Foundry SDK, direct OpenAI, and mock engine in sync as the output contract evolved. Solution: defined a single _PROFILE_JSON_SCHEMA constant and a shared Pydantic parser used by all three tiers. Learning: a single source-of-truth schema makes multi-tier systems maintainable; contract-first design prevents drift.

  • Live demo reliability — API latency or missing credentials could cause demo failures. Solution: Mock Mode runs the full 8-agent pipeline with zero credentials in under 1 second; demo personas are pre-seeded in SQLite. Learning: always build a zero-dependency demo path; live mode is a bonus, not a requirement.
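The getattr guard used in the PDF fix can be captured as one small helper (hypothetical name — the real code may inline the guard at each field access):

```python
def render_field(obj, field, default="N/A"):
    """Defensive attribute access for report rendering: stored profiles
    may predate newer optional fields, so never assume presence."""
    value = getattr(obj, field, None)
    return default if value is None else str(value)
```

Centralising the guard means every new field added to the PDF template is crash-safe by construction.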

Contact Information

athiqahmed.ai@gmail.com

Country/Region

India
