wingtonrbrito/README.md

Wington Brito

Senior AI Engineer building production AI systems — multi-agent orchestration, RAG & search infrastructure, observability pipelines, and streaming data platforms. 12+ years shipping systems at scale, 3+ years focused on AI/ML engineering.

I don't build prototypes. Everything I ship runs in production.


🤖 AI & Infrastructure Portfolio

| Project | Description | Tech Stack |
| --- | --- | --- |
| Agentic Patterns | Multi-protocol AI agent platform with production observability | FastAPI · Pydantic AI · MCP · A2A · gRPC · SSE · PostgreSQL · pgvector · Kafka Streams · ClickHouse · OpenTelemetry |
| MLOps-GCP | MLOps platform on GCP with multi-framework training and RAG pipelines | FastAPI · Vertex AI · BigQuery · Cloud Run · Terraform · MLflow |
| DocumentMind AI | Enterprise RAG system on AWS Bedrock | LangChain · AWS Bedrock · Pinecone · FastAPI |
| AgentBench | Multi-agent evaluation framework | LangGraph · Agno · MCP Protocol · Docker |
| BetterBets AI | Recommendation engine with BERT classification | SageMaker · PyTorch · Pinecone · HuggingFace |
| VoiceFlow POS | Voice-enabled POS with multi-agent orchestration | LangGraph · Whisper · ChromaDB · Streamlit |
| Field GenAI MCP | Multi-document chat with MCP Protocol | Socket.io · Redis · Gemini AI · MCP |
| SD-MLOps-Studio | Stable Diffusion platform with LoRA training | PyTorch · ComfyUI · FastAPI · Docker · Kubernetes |

🏗️ Agentic Patterns — Multi-Protocol Agent Infrastructure

Production-grade architecture for AI agent platforms across multiple industry verticals.

3-Layer Architecture:

  • Agent Layer: Pydantic AI runtime, tool orchestration, RAG with ColBERT reranking, cognitive memory (episodic/semantic/procedural), MCP tool integrations
  • Observability Layer: OpenTelemetry → Kafka Streams (Java 21) → ClickHouse with tiered storage (hot/warm/cold). SLO error budget policies; T-Digest, HyperLogLog, and Welford's algorithm for streaming percentiles, cardinality, and variance
  • Domain Verticals: built across Healthcare, Education, Logistics, Finance, Trading, and Creative industries
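To illustrate the streaming statistics mentioned in the observability layer, here is a minimal sketch of Welford's online algorithm for mean and variance. The production pipeline runs on Kafka Streams (Java 21); this Python version only shows the update rule, and the `RunningStats` class and the sample latencies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online algorithm: one pass, constant memory."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the mean *after* the update, which keeps the recurrence stable.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; reported as 0.0 until two observations exist.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for latency_ms in [120.0, 135.0, 128.0, 410.0, 131.0]:  # made-up sample data
    stats.update(latency_ms)
```

Because each update touches only three numbers, the same state can be kept per key (for example per service or per endpoint) inside a stream processor.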

7 Protocols: REST · MCP · A2A v0.3 · gRPC · Webhook · Temporal · SSE

Agentic Patterns Implemented:

| Pattern | How It's Applied |
| --- | --- |
| ReAct Loop | Reusable tool-use loop with SSE streaming, session persistence, and parallel tool execution |
| Supervisor + Specialist | Orchestrator delegates to domain-specific agents with mandatory execution chains and behavioral detection |
| Role-Based Routing | Same agent backend serves different personas with different tools, prompts, and data visibility per role |
| Human-in-the-Loop | Two-step confirmation: agent drafts, user confirms, agent executes; used for approvals, outreach, bookings |
| Cross-Agent State | Agents share state across domains: an action in one agent triggers effects in another |
| Parallel Tool Execution | Multiple tool calls resolved concurrently within a single agent turn |
| Multi-Protocol Endpoints | Single registration generates REST + MCP + A2A + SSE endpoints per vertical |
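As a rough illustration of the Parallel Tool Execution pattern, the sketch below resolves several tool calls from one agent turn concurrently with `asyncio.gather`. The tool functions, the `TOOLS` registry, and the call format are hypothetical stand-ins, not the platform's actual interfaces.

```python
import asyncio
from typing import Any, Awaitable, Callable

ToolFn = Callable[..., Awaitable[Any]]


async def lookup_order(order_id: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for a network call
    return {"order_id": order_id, "status": "shipped"}


async def check_inventory(sku: str) -> dict:
    await asyncio.sleep(0.01)
    return {"sku": sku, "in_stock": 12}


# Hypothetical tool registry keyed by the name the model emits.
TOOLS: dict[str, ToolFn] = {
    "lookup_order": lookup_order,
    "check_inventory": check_inventory,
}


async def run_tool_calls(calls: list[dict]) -> list[Any]:
    """Run every tool call from a single agent turn concurrently.

    Results come back in the same order as `calls`. With
    return_exceptions=True a failing tool yields its exception object
    instead of aborting the other calls.
    """
    tasks = [TOOLS[call["name"]](**call["args"]) for call in calls]
    return await asyncio.gather(*tasks, return_exceptions=True)


results = asyncio.run(
    run_tool_calls(
        [
            {"name": "lookup_order", "args": {"order_id": "A-1"}},
            {"name": "check_inventory", "args": {"sku": "SKU-9"}},
        ]
    )
)
```

Returning exceptions as values lets the agent loop report a failed tool back to the model while still using the results that did succeed.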

Guardrails: 5-layer system (hallucination, compliance, PII, toxicity, industry-specific)

Stack: FastAPI · Pydantic AI · MCP · A2A · gRPC · SSE · PostgreSQL · pgvector · Qdrant · Redis · Kafka Streams · ClickHouse · OpenTelemetry · Composio · Temporal · Docker


🔍 Knowledge Engine — Production RAG Infrastructure

Production RAG with swappable backends — same pipeline abstraction runs on self-hosted pgvector or managed Pinecone with zero code changes.

End-to-end pipeline:

Ingest → Chunk (recursive 1500/150) → Enrich (SHA-256, category, timestamps)
  → Scan (injection detection) → Embed → Index
  → Query → Hybrid Search (dense + sparse) → Rerank → Guardrails → Generate
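The chunk and enrich steps above can be sketched roughly as follows. This is a simplification: the pipeline uses recursive chunking, while the sketch uses a plain sliding window with the same 1500/150 size and overlap, assumed here to be measured in characters. The function names and metadata field names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def chunk_text(text: str, size: int = 1500, overlap: int = 150) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def enrich(chunk: str, category: str) -> dict:
    """Attach a content hash, a category, and an ingestion timestamp."""
    return {
        "text": chunk,
        "sha256": hashlib.sha256(chunk.encode("utf-8")).hexdigest(),
        "category": category,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }


doc = "".join(str(i % 10) for i in range(3000))  # synthetic 3000-character document
records = [enrich(chunk, category="docs") for chunk in chunk_text(doc)]
```

The content hash gives each chunk a stable identity, which is one common way to skip re-embedding unchanged chunks on re-ingest.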

Two production implementations against the same ABCs:

| Component | pgvector (Self-Hosted) | Pinecone (Managed) |
| --- | --- | --- |
| Indexing | HNSW + GIN indexes, manual tuning | Serverless, auto-scaling |
| Hybrid Search | Dense + BM25 + RRF fusion (k=60) in single SQL query | Dense + sparse in single query, alpha-weighted fusion |
| Reranking | ms-marco-MiniLM-L-6-v2 (local Docker) | bge-reranker-v2-m3 (Pinecone managed) + local fallback |
| Embeddings | all-MiniLM-L6-v2 (384d, local) | multilingual-e5-large (1024d, Pinecone Inference API) |
| Multi-tenancy | tenant_id column + WHERE clause | Namespace isolation (physically separated) |
| Fusion | RRF in Python | Built-in (single API call) |
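For reference, Reciprocal Rank Fusion with k=60 can be sketched in a few lines. Each document scores the sum of 1 / (k + rank) over every ranking it appears in. The function name and the document IDs below are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense = ["doc_a", "doc_b", "doc_c"]   # e.g. vector-similarity order
sparse = ["doc_b", "doc_d", "doc_a"]  # e.g. BM25 order
fused = rrf_fuse([dense, sparse])
# fused order: doc_b, doc_a, doc_d, doc_c
```

RRF works on ranks rather than raw scores, so the dense and sparse scores do not need to be normalized to a common scale before fusion.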

Confidence-driven response routing:

  • >= 0.85: Answer directly
  • 0.60 to < 0.85: Hedge with caveats
  • 0.40 to < 0.60: Cite sources explicitly
  • < 0.40: Decline to answer
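The routing bands above map naturally onto a small function. This is a sketch that assumes the bands are half-open, so a score such as 0.845 falls into the hedge band; the enum and function names are hypothetical.

```python
from enum import Enum


class ResponseMode(Enum):
    ANSWER = "answer_directly"
    HEDGE = "hedge_with_caveats"
    CITE = "cite_sources_explicitly"
    DECLINE = "decline_to_answer"


def route_by_confidence(confidence: float) -> ResponseMode:
    """Map a confidence score in [0, 1] to a response mode."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= 0.85:
        return ResponseMode.ANSWER
    if confidence >= 0.60:
        return ResponseMode.HEDGE
    if confidence >= 0.40:
        return ResponseMode.CITE
    return ResponseMode.DECLINE
```

Checking the thresholds from highest to lowest leaves no gaps between bands, whatever precision the score has.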

5-layer confidence pipeline: domain check → grounding → claim verification → weighted confidence score → compliance (FCRA, HIPAA, GDPR, FERPA per-tenant config)

Dual reranker system: Cross-encoder + ColBERT late-interaction, RRF fusion of both, 3-level fallback (both → one → original). Query agent with rewriting and sub-query decomposition.
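One way to read the 3-level fallback (both → one → original) is sketched below: fuse with RRF when both rerankers succeed, use the surviving one when only one succeeds, and return the original order otherwise. The `Reranker` signature and the toy rerankers are hypothetical; real cross-encoder or ColBERT calls would sit behind them.

```python
from typing import Callable, Sequence

# Hypothetical interface: (query, docs) -> docs reordered by relevance.
Reranker = Callable[[str, Sequence[str]], list[str]]


def rerank_with_fallback(
    query: str, docs: Sequence[str], rerankers: Sequence[Reranker], k: int = 60
) -> tuple[list[str], str]:
    """Return (ranking, level), where level is 'fused', 'single', or 'original'."""
    rankings: list[list[str]] = []
    for reranker in rerankers:
        try:
            rankings.append(reranker(query, docs))
        except Exception:
            continue  # a failed reranker is skipped, not fatal

    if not rankings:
        return list(docs), "original"
    if len(rankings) == 1:
        return rankings[0], "single"

    # Both succeeded: combine the two orderings with Reciprocal Rank Fusion.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True), "fused"


def shortest_first(query: str, docs: Sequence[str]) -> list[str]:
    return sorted(docs, key=len)  # toy stand-in for a cross-encoder


def reversed_order(query: str, docs: Sequence[str]) -> list[str]:
    return list(reversed(docs))  # toy stand-in for a late-interaction model


def broken(query: str, docs: Sequence[str]) -> list[str]:
    raise RuntimeError("reranker unavailable")
```

Returning the fallback level alongside the ranking makes it easy to log how often the degraded paths are taken.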

Evaluation: Correction rate decay (primary metric), MRR, NDCG@k, confidence calibration (ECE), A/B comparison (with RAG vs without)
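Two of the ranking metrics listed above, MRR and NDCG@k, can be computed as in the following sketch. It uses the standard textbook definitions with linear gain; the function names and argument shapes are hypothetical and not taken from the evaluation harness itself.

```python
import math


def mrr(rankings: list[list[str]], relevant: list[set[str]]) -> float:
    """Mean reciprocal rank of the first relevant document per query."""
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, rel in zip(rankings, relevant):
        for rank, doc in enumerate(ranking, start=1):
            if doc in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


def ndcg_at_k(ranking: list[str], gains: dict[str, float], k: int) -> float:
    """NDCG@k with linear gain and a log2(position + 1) discount."""
    dcg = sum(
        gains.get(doc, 0.0) / math.log2(pos + 1)
        for pos, doc in enumerate(ranking[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(pos + 1) for pos, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

For example, `mrr([["a", "b"], ["c", "d"]], [{"b"}, {"c"}])` averages 1/2 and 1/1, giving 0.75.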

Stack: pgvector · Pinecone Serverless · Pinecone Inference API · PostgreSQL 16 · FastAPI · MCP Protocol · Docker


🧠 What I Work On

Multi-Agent Systems

  • Pydantic AI for type-safe agent orchestration with structured outputs and dependency injection
  • LangGraph (StateGraph) for workflow orchestration and checkpointing
  • Raw Anthropic SDK tool-use for maximum control when frameworks add too much abstraction
  • MCP servers with FastMCP, discovery endpoints, multi-server catalogs
  • A2A v0.3 agent cards for agent-to-agent communication
  • Google ADK and Agno for additional agent patterns
  • Temporal for durable execution with retry logic
  • Composio for 500+ tool integrations

Observability & Streaming

  • OpenTelemetry → Kafka Streams → ClickHouse full pipeline
  • 8-processor OTel collector with tail-based sampling and attribute enrichment
  • Kafka Streams dual-path topology: real-time aggregation + raw archival
  • ClickHouse MergeTree with tiered storage and automatic rollup
  • SLO framework: latency P99, availability, throughput, error rate with burn-rate alerts
  • T-Digest, HyperLogLog, and Welford's algorithm for streaming percentiles, cardinality, and variance
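To make the burn-rate alerts in the SLO framework concrete, here is a minimal sketch. Burn rate is the observed error rate divided by the error budget (1 minus the SLO target). The two-window check and the 14.4 threshold follow a commonly cited multi-window convention and are assumptions here, not this platform's actual policy; the function names are hypothetical.

```python
def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (errors / total) / error_budget


def should_page(
    long_window: tuple[int, int],
    short_window: tuple[int, int],
    slo_target: float,
    threshold: float = 14.4,  # assumed threshold, tune per SLO policy
) -> bool:
    """Page only if both the long and the short window exceed the threshold.

    Each window is an (errors, total_requests) pair. Requiring both windows
    avoids paging on an incident that has already stopped burning budget.
    """
    return (
        burn_rate(*long_window, slo_target) >= threshold
        and burn_rate(*short_window, slo_target) >= threshold
    )
```

With a 99.9% availability target, 20 errors in 1000 requests is a burn rate of about 20: the budget is being spent roughly twenty times faster than the SLO allows.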

MLOps & Training

  • TensorFlow/Keras + PyTorch + scikit-learn training pipelines
  • MLflow experiment tracking and model versioning
  • Vertex AI + SageMaker deployment with auto-scaling
  • LoRA/QLoRA fine-tuning, DistilBERT for classification tasks
  • Terraform for reproducible GCP/AWS infrastructure

🛠️ Tech Stack

Agent Systems: Pydantic AI · LangGraph · MCP · A2A · Anthropic SDK · Google ADK · Agno · Composio · Temporal · n8n

RAG & Search: pgvector · Pinecone · Qdrant · ChromaDB · HNSW · BM25 · RRF · Cross-Encoder Reranking · ColBERT · Hybrid Search · Recursive Chunking · Confidence Routing · Injection Defense

ML/AI: PyTorch · TensorFlow · HuggingFace · BERT/DistilBERT · LoRA/QLoRA · SageMaker · Vertex AI · MLflow · AWS Bedrock · LLM-as-Judge

Observability: OpenTelemetry · Kafka Streams · ClickHouse · T-Digest · HyperLogLog · SLO Frameworks

Infrastructure: Python · TypeScript · FastAPI · Next.js · React · PostgreSQL · Redis · Docker · Kubernetes · Terraform · AWS · GCP · gRPC · GitHub Actions


📫 Connect

Pinned

  1. full-stack-node-ts-react-template (Public)

    Production-grade Node.js + TypeScript + React starter template with Docker, CI/CD (GitHub Actions), and AWS deployment

    TypeScript

  2. e2e-leads-machine-learning-pipeline (Public)

    Production-grade ML pipeline on GCP with Airflow, Dataflow, BigQuery & Vertex AI for sentiment analysis and lead scoring — plus a REST API layer to expose enriched leads.

    Python