Project: Reasoning agents - Certification preparation AI #76

@athiq-ahmed

Description

Track

Reasoning Agents (Azure AI Foundry)

Project Name

CertPrep Multi-Agent System — Personalised Microsoft Exam Preparation

GitHub Username

athiq-ahmed

Repository URL

https://github.com/athiq-ahmed/agentsleague

Project Description

The CertPrep Multi-Agent System is a production-grade AI solution for personalised Microsoft certification exam preparation, supporting 9 exam families (AI-102, DP-100, AZ-204, AZ-305, AZ-400, SC-100, AI-900, DP-203, MS-102).

Eight specialised reasoning agents collaborate through a typed pipeline of sequential and concurrent stages, among them:

  1. LearnerProfilingAgent — converts free-text background into a structured LearnerProfile via Azure AI Foundry SDK (Tier 1) or direct GPT-4o JSON-mode (Tier 2), with a deterministic rule-based fallback (Tier 3).
  2. StudyPlanAgent — generates a week-by-week Gantt study schedule using the Largest Remainder algorithm to allocate hours without exceeding the learner's budget.
  3. LearningPathCuratorAgent — maps each exam domain to curated MS Learn modules with trusted URLs, resource types, and estimated hours.
  4. ProgressAgent — computes an exam-weighted readiness score (0.55 × domain ratings + 0.25 × hours utilisation + 0.20 × practice score).
  5. AssessmentAgent — generates a 10-question domain-proportional mock quiz and scores it against the 60% pass threshold.
  6. CertificationRecommendationAgent — issues a GO / CONDITIONAL GO / NOT YET booking verdict with next-cert suggestions and a remediation plan.

A 17-rule GuardrailsPipeline runs at every agent boundary. Two human-in-the-loop gates ensure agents act on real learner data, not assumptions. The full pipeline runs in under 1 second in mock mode (zero Azure credentials), enabling reliable live demonstrations at any time.

Demo Video or Screenshots

Demo video: https://www.youtube.com/watch?v=okWcFnQoBsE
Live app: https://agentsleague.streamlit.app/ [with mock data]

Primary Programming Language

Python

Key Technologies Used

  • Azure AI Foundry Agent Service SDK (azure-ai-projects) | Tier 1 managed agent + conversation thread for LearnerProfilingAgent |
  • Azure OpenAI GPT-4o (JSON mode) | Tier 2 structured profiling fallback; temperature=0.2 |
  • Azure Content Safety (azure-ai-contentsafety) | G-16 guardrail — profanity and harmful-content filter |
  • Streamlit | 7-tab interactive UI + Admin Dashboard |
  • Pydantic v2 BaseModel | Typed handoff contracts at every agent boundary |
  • concurrent.futures.ThreadPoolExecutor | Parallel fan-out of StudyPlanAgent and LearningPathCuratorAgent |
  • Plotly | Gantt chart, domain radar, agent timeline |
  • SQLite (sqlite3 stdlib) | Cross-session learner profile + reasoning trace persistence |
  • ReportLab | PDF generation for profile and assessment reports |
  • Python smtplib (STARTTLS) | Optional weekly study-progress email digest |
  • hashlib SHA-256 | PIN hashing before SQLite storage |
  • Custom GuardrailsPipeline (17 rules) | BLOCK / WARN / INFO at every agent boundary; PII, URL trust, content safety |
  • pytest + parametrize | 342 automated tests across 15 modules; zero credentials required |
  • Streamlit Community Cloud | Auto-deploy on git push; secrets via environment variables |
  • Visual Studio Code + GitHub Copilot | Primary IDE; AI-assisted development throughout |

Submission Type

Individual

Team Members

(Individual submission — no team members)

Submission Requirements

  • My project meets the track-specific challenge requirements
  • My repository includes a comprehensive README.md with setup instructions
  • My code does not contain hardcoded API keys or secrets
  • I have included demo materials (video or screenshots)
  • My project is my own work with proper attribution for any third-party code
  • I agree to the Code of Conduct
  • I have read and agree to the Disclaimer
  • My submission does NOT contain any confidential, proprietary, or sensitive information
  • I confirm I have the rights to submit this content and grant the necessary licenses

Quick Setup Summary

# 1. Clone the repository
git clone https://github.com/athiq-ahmed/agentsleague.git
cd agentsleague

# 2. Create and activate virtual environment
python -m venv .venv
.venv\Scripts\Activate.ps1   # Windows PowerShell

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment variables
notepad .env   # Fill in AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and optionally
               # AZURE_AI_PROJECT_CONNECTION_STRING for Foundry SDK mode
               # Run `az login` once for local Foundry authentication

# 5. Run the application
python -m streamlit run streamlit_app.py

# 6. Run the test suite (no Azure credentials needed)
python -m pytest tests/ -v

Zero-credential demo mode: Leave .env keys blank or set FORCE_MOCK_MODE=true — the full 8-agent pipeline runs deterministically in under 1 second using the rule-based mock engine.

Technical Highlights

  • 3-tier LLM fallback chain — LearnerProfilingAgent attempts Azure AI Foundry SDK (Tier 1), falls back to direct Azure OpenAI JSON-mode (Tier 2), and finally to a deterministic rule-based engine (Tier 3). All three tiers share the same Pydantic output contract, so downstream agents never know which tier ran.
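The cascade can be sketched as a generic tier runner (hypothetical function and tier names — the actual agent's interfaces may differ; the key property is that every tier returns the same contract):

```python
def run_with_fallback(text, tiers):
    """Try each (name, fn) tier in order; all tiers share one output
    contract, so downstream code never knows which tier produced it."""
    errors = []
    for name, fn in tiers:
        try:
            return name, fn(text)
        except Exception as err:  # network outage, bad JSON, missing creds
            errors.append((name, err))
    raise RuntimeError(f"all tiers failed: {errors}")

def _tier1_foundry(text):
    # simulate Tier 1 failing because no Azure credentials are set
    raise ConnectionError("no Azure credentials")

def _tier3_rules(text):
    # deterministic rule-based profile: always succeeds, needs no network
    return {"exam_target": "AI-102", "weekly_hours": 6}

tier, profile = run_with_fallback(
    "3 yrs Python, new to Azure",
    [("foundry", _tier1_foundry), ("rules", _tier3_rules)],
)
# tier == "rules": the chain degraded silently to the deterministic engine
```

Because Tier 3 is deterministic and dependency-free, the chain is guaranteed to terminate with a valid profile.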

  • Largest Remainder day allocation — StudyPlanAgent uses the parliamentary Largest Remainder Method to distribute study time at the day level (total_days = weeks × 7) across domains, then converts day blocks to week bands and hours. Guarantees: (1) total days exactly equals budget; (2) every active domain receives at least 1 day (max(1, int(d)) floor) — no domain is silently zeroed out.
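A minimal sketch of that allocation (hypothetical signature; assumes the budget covers at least one day per domain — the real StudyPlanAgent code may differ in detail):

```python
def allocate_days(weights, total_days, min_days=1):
    """Largest Remainder Method: proportional day allocation that sums
    exactly to total_days and gives every domain at least min_days."""
    assert total_days >= min_days * len(weights), "budget too small"
    total_w = sum(weights)
    raw = [w / total_w * total_days for w in weights]
    # floor each share, but never below the per-domain minimum
    alloc = [max(min_days, int(r)) for r in raw]
    # domain indices ordered by descending fractional remainder
    by_frac = sorted(range(len(raw)),
                     key=lambda i: raw[i] - int(raw[i]), reverse=True)
    diff = total_days - sum(alloc)
    while diff != 0:
        # hand out leftover days to the largest remainders first;
        # reclaim overshoot (caused by the floor) from the smallest
        for i in (by_frac if diff > 0 else reversed(by_frac)):
            if diff > 0:
                alloc[i] += 1
                diff -= 1
            elif alloc[i] > min_days:
                alloc[i] -= 1
                diff += 1
            if diff == 0:
                break
    return alloc

# e.g. 4 weeks (28 days) across 50/30/20 domain weights → [14, 8, 6]
```

The reclaim branch handles the edge case where the min-days floor pushes the initial sum above the budget, preserving both guarantees.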

  • Concurrent agent fan-out — StudyPlanAgent and LearningPathCuratorAgent have no data dependency on each other; they run in true parallel via ThreadPoolExecutor, cutting Block 1 wall-clock time by ~50%.
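The fan-out is a standard two-task submit/join pattern — a sketch with hypothetical names (the real block passes richer objects):

```python
from concurrent.futures import ThreadPoolExecutor

def run_block1(profile, plan_agent, curator_agent):
    """Run two independent agents in parallel threads.

    Neither agent reads the other's output, so wall-clock time is
    roughly max(t_plan, t_curate) instead of their sum."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        plan_future = pool.submit(plan_agent, profile)
        path_future = pool.submit(curator_agent, profile)
        # .result() blocks until each agent finishes (and re-raises
        # any exception from inside the worker thread)
        return plan_future.result(), path_future.result()
```

Unlike asyncio.gather(), this works unchanged inside Streamlit's script runner, which already owns an event loop.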

  • 17-rule exam-agnostic guardrail pipeline — Every agent input and output is validated by a dedicated GuardrailsPipeline before the next stage proceeds. BLOCK-level violations call st.stop() immediately; nothing downstream ever sees invalid data.
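The BLOCK / WARN / INFO contract can be sketched as follows (rule shape and names are hypothetical — the real pipeline calls st.stop() rather than raising, and its 17 rules are far richer):

```python
from enum import Enum

class Severity(Enum):
    INFO = 1
    WARN = 2
    BLOCK = 3

class GuardrailBlocked(Exception):
    """Raised when a BLOCK-level rule fires; nothing downstream runs."""

def run_guardrails(payload, rules):
    """Evaluate every rule against the payload; collect findings,
    halt immediately on the first BLOCK."""
    findings = []
    for rule in rules:
        finding = rule(payload)          # (Severity, message) or None
        if finding is None:
            continue
        findings.append(finding)
        if finding[0] is Severity.BLOCK:
            raise GuardrailBlocked(finding[1])
    return findings

def no_untrusted_urls(payload):
    # simplified URL-trust rule: only learn.microsoft.com links pass
    for url in payload.get("urls", []):
        if not url.startswith("https://learn.microsoft.com/"):
            return (Severity.BLOCK, f"untrusted URL: {url}")
    return None
```

Running the pipeline at every agent boundary means each stage can assume its input already passed validation.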

  • Exam-weighted readiness formula — Progress scoring uses 0.55 × domain ratings + 0.25 × hours utilisation + 0.20 × practice score, with domain weights pulled from the per-exam registry (not hardcoded for AI-102), so the formula is accurate across all 9 supported certifications.
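The stated formula translates directly to code (function and parameter names are hypothetical; all components are assumed normalised to 0–1):

```python
def readiness_score(domain_ratings, domain_weights,
                    hours_used, hours_planned, practice_score):
    """0.55 x weighted domain ratings + 0.25 x hours utilisation
    + 0.20 x practice score, each component in [0, 1]."""
    total_w = sum(domain_weights)
    # weights come from the per-exam registry, not a hardcoded table
    domain_component = sum(w * r for w, r in
                           zip(domain_weights, domain_ratings)) / total_w
    utilisation = min(1.0, hours_used / hours_planned)  # cap over-study at 1
    return (0.55 * domain_component
            + 0.25 * utilisation
            + 0.20 * practice_score)
```

Normalising by the weight sum keeps the score correct even if a registry's weights do not add to exactly 1.0.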

  • Demo PDF cache — For demo personas, PDFs are generated once and served from demo_pdfs/ on all subsequent clicks — no pipeline re-run needed, making live demos instant and reliable.

  • Schema-evolution safe SQLite — All *_from_dict deserialization helpers use a _dc_filter() guard that silently drops unknown keys, preventing TypeError crashes when the data model evolves and old rows are read back.

Challenges & Learnings

Challenge — How We Solved It — Learning

  • Streamlit + asyncio conflict — asyncio.gather() raises RuntimeError: event loop already running inside Streamlit. Solution: replaced with concurrent.futures.ThreadPoolExecutor (identical I/O latency, no event-loop conflict, stdlib only). Learning: always profile async options in the target host runtime before committing to the pattern.

  • Schema evolution crashes — adding new fields to agent output dataclasses caused TypeError when loading old SQLite rows. Solution: added a _dc_filter() helper to all *_from_dict functions; unknown keys are silently dropped. Learning: design for forward and backward compatibility from day one; use a key guard on every deserialization boundary.

  • Hardcoded AI-102 domain weights — ProgressAgent used AI-102 weights for all exams, giving wrong readiness scores for DP-100 learners. Solution: refactored to call get_exam_domains(profile.exam_target) dynamically. Learning: never hardcode domain-specific constants in shared utility functions; always derive from the registry.

  • st.checkbox key collision — using hash()[:8] string slicing raised TypeError in Streamlit widget key generation. Solution: changed to abs(hash(item)) (an integer key), which Streamlit handles natively. Learning: read widget key type requirements; integer keys are always safe.

  • PDF generation crashes on None fields — AttributeError when optional profile fields were absent. Solution: added getattr(obj, field, default) guards on every field access in PDF generation. Learning: defensive attribute access is essential for any code path that renders stored data.

  • 3-tier fallback complexity — keeping the Foundry SDK, direct OpenAI, and mock engine in sync as the output contract evolved. Solution: defined a single _PROFILE_JSON_SCHEMA constant and a shared Pydantic parser used by all three tiers. Learning: a single source-of-truth schema makes multi-tier systems maintainable; contract-first design prevents drift.

  • Live demo reliability — API latency or missing credentials could cause demo failures. Solution: Mock Mode runs the full 8-agent pipeline with zero credentials in under 1 second; demo personas are pre-seeded in SQLite. Learning: always build a zero-dependency demo path; live mode is a bonus, not a requirement.
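The getattr guard used in the PDF fix can be captured as one small helper (hypothetical name — the real code may inline the guard at each field access):

```python
def render_field(obj, field, default="N/A"):
    """Defensive attribute access for report rendering: stored profiles
    may predate newer optional fields, so never assume presence."""
    value = getattr(obj, field, None)
    return default if value is None else str(value)
```

Centralising the guard means every new field added to the PDF template is crash-safe by construction.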

Contact Information

athiqahmed.ai@gmail.com

Country/Region

India
