Juan Sebastian Mateus Perdomo (ElMatiOfficial)

Hey, I'm Sebastian 👋

Senior ML Engineer based in Colombia 🇨🇴. I build production ML systems at the intersection of credit risk, recommender systems, classical ML, agentic infrastructure, and GenAI.

🔭 What I'm working on

🏦 Credit risk ML — loss rate forecasting, PD/LGD models, cohort analysis
🤖 GenAI in production — grounded RAG, hybrid retrieval, LLM evaluation, observability, agentic workflows
📊 Data/ML infrastructure — dbt, BigQuery, Airflow, FastAPI, MLflow

🚀 Currently

Active in competitive ML — Kaggle: 1 Silver, top 20% in 8 challenges

🤖 GenAI — deeper dive

Retrieval — hybrid dense + BM25 with reciprocal rank fusion, cross-encoder reranking (bge)
Grounding & safety — citation-enforcing prompts, refusal paths, hallucination reduction
Evaluation — Recall@K / MRR on retrieval, faithfulness + LLM-as-judge on generation, offline eval sets wired into CI
Observability — Langfuse traces, token-cost monitoring, embedding-drift detection
Agents & tools — structured output, multi-step orchestration, guardrailed tool use
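
For readers unfamiliar with the retrieval and evaluation terms above, here is a minimal sketch of reciprocal rank fusion plus Recall@K and reciprocal rank. It is an illustration, not code from any of my projects: the function names, the document IDs, and the smoothing constant k=60 (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs; k=60 is an assumed default."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every doc it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant docs that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant doc, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical results from a dense retriever and a BM25 retriever.
dense = ["d1", "d2", "d3"]
bm25 = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([dense, bm25])
```

MRR is the mean of `reciprocal_rank` over a set of queries. In a hybrid setup the fused list would typically go to the cross-encoder reranker next.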

📚 Research interests

Credit-risk methodology — IV/WoE feature assessment, Platt calibration, SHAP-driven adverse-action reasons
Production RAG — hybrid retrieval benchmarks, grounding/faithfulness metrics, cost-latency trade-offs
LLM observability — drift detection on query embeddings, telemetry-driven prompt iteration
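
As a quick illustration of the IV/WoE idea mentioned above, here is a minimal sketch that computes Weight of Evidence per bin and the total Information Value from good/bad counts. The function name, the input shape, and the optional additive smoothing are assumptions made for this example, and sign conventions for WoE vary between references.

```python
import math


def woe_iv(bins, smoothing=0.5):
    """bins: list of (n_good, n_bad) counts per bin.

    smoothing is added to every count to avoid log(0) on empty cells;
    the 0.5 default is an assumption, not a standard.
    Returns (list of WoE values, total IV).
    """
    goods = [g + smoothing for g, _ in bins]
    bads = [b + smoothing for _, b in bins]
    total_good, total_bad = sum(goods), sum(bads)

    woes, iv = [], 0.0
    for g, b in zip(goods, bads):
        dist_good = g / total_good
        dist_bad = b / total_bad
        woe = math.log(dist_good / dist_bad)  # convention: ln(%good / %bad)
        woes.append(woe)
        iv += (dist_good - dist_bad) * woe
    return woes, iv


# Hypothetical two-bin feature: the first bin is mostly good accounts.
woes, iv = woe_iv([(80, 20), (20, 80)], smoothing=0)
```

With smoothing turned off, this toy feature gives an IV of 1.2 · ln 4, roughly 1.66. A value that high on real data often warrants a leakage check.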

🛠️ Tech I reach for

Languages · Python · SQL
ML · scikit-learn · XGBoost · LightGBM · PyTorch · Hugging Face
GenAI · LangChain · RAG · sentence-transformers · ChromaDB · cross-encoder rerankers · Langfuse · OpenAI · Anthropic
Data · BigQuery · dbt · Airflow · Snowflake · Delta Lake
Infra · AWS · Docker · FastAPI · Terraform · GitHub Actions
MLOps · MLflow · Champion-Challenger · PSI drift monitoring
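
For context on the PSI drift monitoring item, here is a minimal sketch of the Population Stability Index over pre-binned counts. The function name, the eps floor, and the sample counts are assumptions for illustration; binning strategy and alert thresholds are left out because they depend on the use case.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned distributions.

    expected_counts: counts per bin in the reference (e.g. training) sample.
    actual_counts: counts per bin in the monitored sample, same bin order.
    eps floors each proportion to avoid log(0); its value is an assumption.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must use the same bins")
    total_e, total_a = sum(expected_counts), sum(actual_counts)

    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        p_e = max(e / total_e, eps)
        p_a = max(a / total_a, eps)
        value += (p_a - p_e) * math.log(p_a / p_e)
    return value


# Hypothetical score distribution: balanced at training time, skewed later.
drift = psi([50, 50], [80, 20])
no_drift = psi([50, 50], [50, 50])
```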

🎤 Beyond code

TEDx speaker · Trilingual (ES/EN/PT) · Mentored 13+ engineers

📬 Let's connect

LinkedIn · Kaggle
