
Structured approach in AI and ML. Fundamentals and Advanced topics. RAG, Scoring & Profiling, LangChain & LangGraph, Certified Azure AI Engineer materials.


Glareone/AI-RAG-Basics-To-Advanced-With-Examples


OpenAI and ChatGPT repo

My Workshops and Posts

My LinkedIn Posts & Presentations

  1. GenAI. Where could be applied. Post 1.pdf
  2. GenAI in Application Refactoring field, Slides.pdf
  3. Legal problems with AI.pdf
  4. Paradigms: Rag, Self-RAG, Re-Ranking RAG, FLARE v.2.pdf
  5. Working with opinionated requests. S2A, RLHF, RLAIF.pdf
  6. Multi-Modal RAG and its features.pdf
  7. Measuring the GenAI Quality.pdf
  8. LLM leveraging RLHF in code review
  9. Everything of Thoughts (XoT). All modern techniques in one place
  10. Non-deterministic embedding results
  11. AI Search vs PostgreSQL with pgvector in PROD
  12. Prod-Ready LLM Solutions. Cook Book.
  13. Quality Framework For RAG Applications.pdf
  14. Crew.AI. Agents in LLM Applications (In Progress)
  15. Pydantic data classes and how to manage the output format (In Progress)
  16. XML vs Markdown vs Json for tagging in prompting and metaprompting (In Progress)
  17. Crawlers for LLMs
  18. Table extraction in RAG systems (In Progress)
  19. Choosing the right programming language for your next AI LLM project
  20. Misjudgements using LogProbs (In Progress)

My Workshops

  1. June 2023. My Workshop Presentation. Run 1.pptx
  2. Online Workshop. ChatGPT -> Azure Function -> PowerAutomate. Run 2.pptx
  3. Online Workshop. Run 3. Deep Learning -> Prompting -> ChatGPT -> Azure Function -> PowerAutomate
  4. Online+Offline Workshop for EHU University
  5. Talk #3. RAG, FLARE, S2A, RLHF, RLAIF, Self-RAG, Re-Ranking. Common approaches and their pros & cons

Theoretical Part

  1. Six Principles of responsible AI
  2. Responsible AI. Trusted AI Framework. Content Filters. Harmful Content. Prerelease Reviews
  3. What Is ChatGPT Doing… and Why Does It Work
  4. LLM UseCase in Google. Sorting Optimization
  5. Embeddings. Words to Vector. Useful in Search Scenarios and for Cognitive Search
  6. Cognitive Search. Video
  7. Cognitive Search. From Zero to Hero
  8. Cognitive Search. Indexers. AI Enrichment. Built-in Skills
  9. Transformers. Embeddings. Foundational Model
  10. Computer Vision. Cognitive. AI Face. Custom Vision
  11. Document Intelligence
  12. Azure AI Speech. Speech To Text. Text To Speech. Azure Services
  13. Natural Language Processing (NLP). Text meaning and analysis. General approaches
  14. Azure Language Service. Commands interpretation
  15. Azure Language Service. Question-Answer Knowledge base for bots. Question Answering service.
  16. Regression. Logistic and Linear Regression. Multiclass regression

Azure Learn Useful Materials

  1. AI Search. Debug Search Issues
  2. AI Search. Performance and Monitoring
  3. AI Search. Search and Scoring
  4. AI Search. Implement Advanced Search Features. Scoring Profiles, Fuzzy Search, Term Boosting, Term Proximity
  5. AI Search. Scoring profile lab. Add Different Language descriptions
  6. AI Search. Enhance Index with translations using skills
  7. AI Search. Custom Skill using Azure Function
  8. AI Search. Use Custom Analyzers (not default Microsoft Lucene)
  9. AI Search. Geo-spatial functions
  10. AI Search. Knowledge Mining. Lab
  11. AI Search -> PowerBI Table Projection from OCR Document Intelligence
  12. Composed Document Intelligence Models. For cases where you need to analyze several doc types
  13. Vision. Train a Custom Model using COCO
  14. Containers. AI Services in Containers, in AKS, ACI, or even locally
  15. Containers. Run services in Isolated Environment disconnected from the internet
  16. Analyze Video Indexer. Widgets Integration and API
  17. Semantic Ranking configuration in AI Search Index
  18. Knowledge Store & Knowledge Mining with AI Search
  19. Integrate OpenAI into App. Useful Lab
  20. Host Mistral and other models in AI Hub
  21. AI Language. Multi-turn multi-step conversation
  22. AI Language. Conversation Language understanding. Classical way to build AI-assistant. Utterances: Turn-on Turn-off & Smart home
  23. AI Language. Custom Named Entities Recognition. Laws, Business Cases
  24. Key Phrases Extraction from text, Sentiment Analysis, Linked Entities
  25. Translate speech to text. Materials
  26. Translate speech to text and synthesize the output if needed. Example
  27. AI Speech. Speech Synthesis
  28. Run Cognitive Services in Docker
  29. Custom Vision. Deploy Custom Vision on the edge devices (phones) using compact models
  30. Custom Vision. Recognize issues in factory. Upload images, Tag Images, and Train the Model
  31. Anomaly Detector for IoT. Univariate Detector (multi-stream)

Azure Search & Document Intelligence

  1. Cognitive Search. Video
  2. Cognitive Search. From Zero to Hero
  3. Cognitive Search. Indexers. AI Enrichment. Built-in Skills
  4. Document Intelligence

Machine Learning Materials

  1. Machine Learning
    a. Machine Learning lab by Microsoft
  2. How Deep Learning Works

Extra materials

  1. Vector Database selection & comparison. VectorDB
  2. Transformer Explainer - an interactive visualization tool designed to help anyone learn how Transformer-based models like GPT work
  3. Table extraction in RAG systems

Practical Part. Table of Contents

  1. Example: ConsoleApp CommandGuess
  2. Example: Azure Function with ChatGPT (completion and chat-completion)
  3. Example: Integration with PowerAutomate
  4. Example: Integration with PowerApp
  5. Integration with Outlook (In progress)
  6. OpenAI + PowerAutomate Workshop by me.pptx
  7. Example: OpenAI + Redis
  8. BMW Dealer assistant. ChatGPT Chat + Startup + Redis + Context
  9. Get Embedding
  10. Form Recognizer Cognitive Service
  11. Content Filters (in progress)
  12. OpenAI straightforward examples
  13. Azure Bot Service & Chatbot Framework
  14. LangChain meets Go
  15. TensorZero Framework (In Progress)
  16. Key Phrases Extraction. AI Language. Sentiment Analysis. Extracted Linked Entities
  17. AI Search and Custom Skill using Azure Function
  18. Document Intelligence, Best Practices (In progress)
  19. MCP Server example using FastMCP

Advanced Topics. Theory and Practice.

1. Advanced Evaluation Metrics & Methodologies

  1. Document Retrieval Metrics
    a. NDCG@K (Normalized Discounted Cumulative Gain) - Ranking quality with relevance grades (see the metric sketch after this section)
    b. Mean Reciprocal Rank (MRR) - First relevant document positioning. How quickly users find their first relevant result. Critical for RAG user experience.
    c. Contextual Relevancy - How relevant is the retrieved context to the user's question?
    d. Expected Reciprocal Rank (ERR) - User behavior modeling with graded relevance
    e. Rank-Biased Precision (RBP) - Early result weighting strategies
    f. Embedding Quality Metrics - Intra-cluster vs inter-cluster distance analysis. Quality of your vector space - are similar documents close together?
  2. Document Retrieval Metrics 2
    a. Fidelity - Measures recall quality - what percentage of all relevant documents in your dataset were actually retrieved in the top-n results.
    b. XDCG - Ranking quality within your retrieved top-k chunks, ignoring the rest of your document collection
    c. XDCG vs NDCG
    d. Max Relevance N - highest relevance score among your top-k retrieved chunks
    e. Holes - Counts missing ground truth data
  3. Response Quality Metrics
    a. F1, Recall, Precision. Fundamental metrics
    b. BLEU Score - N-gram overlap evaluation
    c. ROUGE (L/1/2) - Recall-oriented summarization metrics
    d. G-Eval (LLM as a judge) - Sophisticated evaluation framework that uses LLMs themselves to evaluate outputs based on detailed criteria.
    e. BERTScore - Semantic similarity using contextualized embeddings
    f. BLEURT - BERT-based learned evaluation metric
    g. SacreBLEU - Standardized BLEU with proper tokenization
    h. METEOR - Synonym and paraphrase consideration
    i. CIDEr - Consensus-based evaluation
    j. CHRF - Character-level F-score for multilingual evaluation
  4. Human-Correlation Metrics
    a. Preference-Based Ranking - Win/loss ratios in A/B testing
    b. Pearson/Spearman Correlation - Human judge alignment
    c. Likert Scale Rating Systems - Multi-point evaluation frameworks
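
To make the ranking metrics above concrete, here is a minimal Python sketch of MRR and NDCG@K, assuming relevance labels (binary for MRR, graded for NDCG) are already attached to each retrieved list; the label values are illustrative.

```python
import math

def mrr(ranked_relevance_lists):
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit per query."""
    total = 0.0
    for labels in ranked_relevance_lists:            # labels: 1 = relevant, 0 = not
        rr = 0.0
        for rank, rel in enumerate(labels, start=1):
            if rel > 0:
                rr = 1.0 / rank
                break
        total += rr
    return total / len(ranked_relevance_lists)

def ndcg_at_k(graded_labels, k):
    """NDCG@K with graded relevance: DCG of the ranking divided by DCG of the ideal ranking."""
    def dcg(labels):
        return sum((2 ** rel - 1) / math.log2(rank + 1)
                   for rank, rel in enumerate(labels[:k], start=1))
    ideal = dcg(sorted(graded_labels, reverse=True))
    return dcg(graded_labels) / ideal if ideal > 0 else 0.0

# Illustrative relevance judgments for three queries (1 = relevant, 0 = not)
print(mrr([[0, 1, 0], [1, 0, 0], [0, 0, 0]]))        # 0.5
# Graded labels (0-3) for one retrieved list
print(ndcg_at_k([3, 0, 2, 1], k=4))
```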

2. RAG Evaluation Frameworks and Libraries. Agentic Application Evaluation

  1. RAG System assessment and quality control:
    a. LangWatch
    b. LangFuse
    c. Galileo
    d. Ragas
    e. DeepEval
    f. TruLens
    g. HuggingFace (NEW, in progress)
    h. AI Foundry
  2. Agentic Application Evaluation
    a. General Agentic Application Evaluations
    b. Monitoring
    c. Trajectory Evaluation
    d. Structure of the Evaluation
    e. Application Improvements using G-Eval (LLM-as-a-Judge)
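
As a companion to the G-Eval / LLM-as-a-Judge item above, here is a hedged sketch of judge-style scoring. `call_llm` is a hypothetical placeholder for whatever chat-completion client you use, and the criteria and score scale are illustrative rather than tied to any of the frameworks listed.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM client - replace with your provider's chat-completion call."""
    raise NotImplementedError

JUDGE_PROMPT = """You are evaluating a RAG answer.
Criteria: faithfulness to the retrieved context and relevance to the question.
Question: {question}
Context: {context}
Answer: {answer}
Return JSON: {{"faithfulness": 1-5, "relevance": 1-5, "reasoning": "<short>"}}"""

def judge(question: str, context: str, answer: str) -> dict:
    """LLM-as-a-judge (G-Eval style): the model scores an answer against explicit criteria."""
    raw = call_llm(JUDGE_PROMPT.format(question=question, context=context, answer=answer))
    return json.loads(raw)   # in production, guard against malformed JSON and retry
```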

3. Advanced ML Architecture & Training

  1. Neural Network Fundamentals
    a. ReLU vs Advanced Activations (GELU, Swish/SiLU)
    b. Layer Normalization vs Batch Normalization - Training stability techniques
    c. Gradient Clipping - Exploding gradient prevention
    d. Mixed Precision Training - FP16/BF16 memory optimization
  2. CNN Advanced Concepts
    a. Kernel Size Impact - Local vs global feature extraction (3x3 vs 7x7)
    b. Parameter Sharing Benefits - Translation invariance principles
    c. Hierarchical Feature Learning - Low-level to high-level progression
    d. CNN vs MLP Scalability - O(k×c×f) vs O(n×m) parameter complexity
  3. Advanced Training Techniques
    a. Learning Rate Scheduling - Cosine annealing, linear decay (see the sketch after this list)
    b. Warmup Steps - Training stability (10% of total steps)
    c. Checkpoint Averaging - Model stability improvement
    d. Gradient Accumulation - Simulating larger batch sizes
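
A minimal sketch of the warmup + cosine annealing schedule mentioned above, in plain Python; the 10% warmup fraction and peak learning rate are illustrative defaults.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-4, warmup_frac=0.1, min_lr=0.0):
    """Linear warmup for the first ~10% of steps, then cosine annealing down to min_lr."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps             # linear ramp-up
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Illustrative: peak LR is reached after 100 of 1000 steps, then decays toward 0
schedule = [lr_at_step(s, total_steps=1000) for s in range(1000)]
print(round(max(schedule), 6), round(schedule[-1], 8))
```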

4. Fine-Tuning. LoRA & Training Parameters

  1. LoRA (Low-Rank Adaptation) (see the config sketch after this list)
    a. Rank Parameter (r) - 8-64 range, efficiency vs capacity trade-off
    b. Alpha Scaling Factor - Typically 16-32
    c. Target Module Selection - Query, value, key, output projections
    d. AdaLoRA - Adaptive rank allocation
    e. QLoRA - 4-bit quantized LoRA for memory efficiency
  2. Training Parameters
    a. Learning Rate Ranges - 1e-5 to 5e-4 for LLMs with warmup
    b. Batch Size Optimization - 8-32 full fine-tuning, 64-128 LoRA
    c. Sequence Length Limits - 512-4096 tokens task dependency
    d. Weight Decay (L2 Regularization) - λ||w||² with λ = 1e-4 to 1e-2
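
A hedged configuration sketch for the LoRA parameters above, assuming the Hugging Face peft and transformers packages; the base checkpoint name and target module names are illustrative and vary by model architecture.

```python
# Assumes: pip install transformers peft ; the checkpoint name is illustrative
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # illustrative base model

lora_config = LoraConfig(
    r=16,                                   # rank: 8-64, capacity vs efficiency trade-off
    lora_alpha=32,                          # scaling factor, typically 16-32
    target_modules=["q_proj", "v_proj"],    # attention query/value projections (names are model-dependent)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()          # only the low-rank adapter weights are trainable
```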

5. Advanced Retrieval & Re-ranking

  1. Re-ranking Algorithms
    a. Reciprocal Rank Fusion (RRF) - RRF_score = Σ(1/(k + rank_i)) (see the sketch after this section)
    b. Cross-encoder vs Bi-encoder - Accuracy vs speed trade-offs
    c. Neural Re-rankers - BERT/T5-based cross-attention models
    d. Learning to Rank (LTR) - ML-based ranking optimization
    e. Score Normalization Techniques - Min-max, z-score, sigmoid

  2. Advanced Retrieval Concepts
    a. Semantic Similarity Scoring - Cosine similarity between embeddings
    b. Context Preservation - Chunk coherence maintenance
    c. Window Size Optimization - Re-ranking candidate selection (100-1000)
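
A small sketch of Reciprocal Rank Fusion as referenced in 1a, using the common k = 60 constant; the two input rankings (e.g. BM25 and vector search) and the document IDs are illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """RRF_score(d) = sum over rankings of 1 / (k + rank of d); higher is better."""
    scores = defaultdict(float)
    for ranking in rankings:                       # each ranking: list of doc IDs, best first
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative: fuse a keyword (BM25) ranking with a vector-similarity ranking
bm25 = ["doc3", "doc1", "doc7"]
vector = ["doc1", "doc7", "doc2"]
print(reciprocal_rank_fusion([bm25, vector]))      # doc1 wins: it is strong in both lists
```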

6. MLOps & Production Platforms

  1. Evaluation Platforms
    a. AI Foundry (Microsoft) - Model testing and evaluation
    b. Weights & Biases (W&B) - Experiment tracking
    c. Neptune.ai - MLOps platform capabilities
    d. LangSmith (LangChain) - LLM application testing
    e. Phoenix (Arize AI) - LLM observability and evaluation

  2. Model Management
    a. MLflow - Model lifecycle management (tracking sketch after this list)
    b. DVC (Data Version Control) - Data and model versioning
    c. BentoML - Model serving framework architecture
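
A minimal experiment-tracking sketch with MLflow, as mentioned under Model Management; the experiment name, parameters, and metric values are illustrative.

```python
# Assumes: pip install mlflow ; names and values are illustrative
import mlflow

mlflow.set_experiment("rag-retriever-tuning")

with mlflow.start_run(run_name="chunk512-baseline"):
    mlflow.log_param("embedding_model", "illustrative-embedding-model")
    mlflow.log_param("chunk_size", 512)
    mlflow.log_metric("ndcg_at_10", 0.81)
    mlflow.log_metric("mrr", 0.67)
    # mlflow.log_artifact("eval_report.html")   # attach evaluation reports if you produce them
```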

7. Advanced LLM Frameworks. LangGraph. Semantic Kernel.

  1. LangGraph Basics
    a. When to Use What (Decision Framework)
    b. Core LangGraph Primitives: StateGraph & MessageGraph, Compilation model, Checkpointers, Thread/Run concepts (see the minimal sketch after this section)
    c. Graph Execution Model: how LangGraph executes iteratively. StateGraph & MessageGraph
    d. LangGraph Checkpointers: MemorySaver, SqliteSaver, PostgresSaver
    e. LangGraph composition: START, END, Conditional Edge. Parallel node execution. Cycle limit (recursion_limit), infinite loops
    f. Subgraphs & Composition: when to use subgraphs vs separate graphs
    g. Error Handling & Interrupts (Critical for production)
  2. LangGraph Advanced Topics
    a. State Management - Persistent conversation state
    b. Graph Architecture - Nodes and edges for complex workflows
    c. Conditional Routing - Dynamic flow based on LLM decisions
    d. Human-in-the-Loop - Approval gates and manual interventions
    e. Parallel Processing - Concurrent graph branch execution
  3. LangGraph Examples and Prototypes
  4. LangGraph System Prompt Techniques
    a. Decision-Tree Prompts & Pattern
    b. Multi-Agent Prompt & Pattern. Primitive Version
    c. Plan-Execute Prompt & Pattern
    d. ReAct. Prompts & Ideas
    e. Prompt-Reflection Pattern. Idea
  5. Semantic Kernel (Microsoft)
    a. Kernel Architecture - Central orchestration engine
    b. Plugin System - Reusable functions (native C# or prompt-based)
    c. Planners - Automatic workflow generation
    d. Memory Management - Vector-based semantic memory patterns
  6. Magentic One + Semantic Kernel
  7. Advanced Framework Concepts
    a. Multi-Agent Systems - Collaborative AI agent coordination
    b. Error Recovery Strategies - Retry logic, fallback mechanisms
    c. Async Execution - Resource management at scale
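
A minimal LangGraph sketch covering the primitives from 1b-1e: a StateGraph with one node, a conditional edge forming a bounded cycle, a MemorySaver checkpointer, and a thread_id. The node is a stub, and exact APIs can differ between langgraph versions.

```python
# Assumes: pip install langgraph ; API details may differ between langgraph versions
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

class State(TypedDict):
    question: str
    answer: str
    attempts: int

def generate(state: State) -> dict:
    # Stub node: a real implementation would call an LLM here
    return {"answer": f"draft answer to: {state['question']}", "attempts": state["attempts"] + 1}

def good_enough(state: State) -> str:
    # Conditional edge: retry until a check passes or the attempt budget is spent
    return "done" if state["attempts"] >= 2 else "retry"

builder = StateGraph(State)
builder.add_node("generate", generate)
builder.add_edge(START, "generate")
builder.add_conditional_edges("generate", good_enough, {"retry": "generate", "done": END})

graph = builder.compile(checkpointer=MemorySaver())   # checkpointer persists state per thread
result = graph.invoke(
    {"question": "What is RRF?", "answer": "", "attempts": 0},
    config={"configurable": {"thread_id": "demo-thread"}},
)
print(result["answer"], result["attempts"])
```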

8. Structured Output & Schema Design

  1. Pydantic Advanced Usage
    a. Field Validation - Custom validators, constraints (min/max, regex) - see the sketch after this list
    b. JSON Schema Generation - Automatic API documentation
    c. Error Handling - Detailed validation error message design
    d. Schema Compliance Monitoring - Production tracking metrics
  2. Best Practices
    a. Schema Complexity vs Success Rates - Optimization strategies
    b. Retry Logic Implementation - Parse failure handling
    c. Validation Feedback Loops - Error correction workflows
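
A short Pydantic (v2) sketch of field constraints, a custom validator, and the validation errors you can feed back into a retry loop; the invoice schema itself is illustrative.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator

class ExtractedInvoice(BaseModel):
    """Illustrative schema for structured LLM output."""
    vendor: str = Field(min_length=1)
    total: float = Field(ge=0)                      # constraint: non-negative amount
    currency: str = Field(pattern=r"^[A-Z]{3}$")    # constraint: ISO-4217-style code

    @field_validator("vendor")
    @classmethod
    def strip_vendor(cls, v: str) -> str:
        return v.strip()

raw = '{"vendor": " Contoso ", "total": 199.5, "currency": "EUR"}'
print(ExtractedInvoice.model_validate_json(raw).model_dump())

try:
    ExtractedInvoice.model_validate_json('{"vendor": "X", "total": -1, "currency": "euro"}')
except ValidationError as err:
    print(err.errors())    # detailed errors you can feed back to the LLM for a retry
```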

9. Massive Parallel Training (Enterprise Scale)

  1. Distributed Training Strategies
    a. Data Parallelism - Batch distribution across GPUs (DDP skeleton after this list)
    b. Model Parallelism - Layer splitting across devices
    c. Pipeline Parallelism - Sequential processing stages
    d. Gradient Synchronization - AllReduce, parameter servers
    e. Mixed Precision Training - Memory efficiency optimization
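
A hedged skeleton of data parallelism with PyTorch DistributedDataParallel, meant to be launched with torchrun; the model and dataset are stubs, and the NCCL backend assumes one GPU per process.

```python
# Launch with: torchrun --nproc_per_node=<gpus> train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    dist.init_process_group(backend="nccl")                    # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 2).cuda(local_rank)           # stub model
    model = DDP(model, device_ids=[local_rank])                # gradients are AllReduce-averaged

    dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
    sampler = DistributedSampler(dataset)                      # shards batches across ranks
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)                               # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()                    # backward triggers gradient sync
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```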

10. Advanced Overfitting Prevention

  1. Regularization Techniques
    a. Early Stopping - Validation loss plateau detection (sketch after this list)
    b. Dropout Rates - 0.1-0.3 optimal ranges
    c. Training/Validation Loss Curves - Overfitting gap analysis
    d. Cross-validation Strategies - 5-10 fold robust evaluation
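
A minimal early-stopping sketch driven by validation loss with a patience window; the loss curve and thresholds are illustrative.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` evaluations."""
    def __init__(self, patience: int = 3, min_delta: float = 1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss          # improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1          # plateau or regression
        return self.bad_epochs >= self.patience

# Illustrative validation-loss curve: improves, then plateaus
stopper = EarlyStopping(patience=2)
for epoch, loss in enumerate([0.90, 0.75, 0.70, 0.71, 0.705]):
    if stopper.step(loss):
        print(f"early stop at epoch {epoch}")
        break
```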

11. Homomorphic Encryption & Privacy-Preserving ML

  1. Main Topics. What, Why, How
  2. Q&A. Quick references
    a. Whether or not. Use Cases
    b. How to Start
    c. Parameters
    d. Performance and Trade-offs
  3. Examples
    a. TenSeal, Concrete-ML, Microsoft Seal
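
A hedged sketch of encrypted arithmetic with TenSEAL's CKKS scheme (one of the libraries listed above); the context parameters follow common tutorial values and usually need tuning for real workloads.

```python
# Assumes: pip install tenseal ; parameters follow common TenSEAL tutorial values
import tenseal as ts

# CKKS context for approximate arithmetic over encrypted real numbers
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

enc_a = ts.ckks_vector(context, [1.0, 2.0, 3.0])
enc_b = ts.ckks_vector(context, [0.5, 0.5, 0.5])

enc_sum = enc_a + enc_b            # computed without decrypting
enc_dot = enc_a.dot(enc_b)         # encrypted dot product

print(enc_sum.decrypt())           # ~[1.5, 2.5, 3.5] (approximate by design)
print(enc_dot.decrypt())           # ~[3.0]
```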

LangWatch vs LangFuse vs AI Foundry for SemanticKernel

  1. LangWatch vs LangFuse vs AI Foundry comparison

Advanced Topics. Practice. Semantic Kernel


Advanced Topics. Practice. Semantic Kernel Knowledge base

  1. Semantic kernel and AI Assistant
  2. Creative Writing Assistant with Semantic Kernel and .Net Aspire

Advanced Topics. Practice. SemanticKernel.

  1. Initial Example
  2. Interactive Chat with Chat History
  3. Model Switching. Hugging Face
  4. Semantic Function for Conversational Chat
  5. Semantic Kernel Pipeline

Advanced Topics. Practice. LangGraph & LangChain


Table of Contents:

  1. LangChain using Golang (In Progress)
  2. LangChain. Demo examples with pipelines
  3. LangGraph. Patterns. Examples
  4. ReAct. Pre-coded loop + LLM to calculate the total weight of dogs (see the sketch below)
  5. ReAct. Using LangGraph. In Progress
  6. ReAct. Simple LangGraph Prototype
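
A hedged sketch of the pre-coded ReAct loop idea from item 4 (not the repo's actual example): the LLM emits Thought/Action steps, the loop executes the named tool, and the Observation is fed back. `call_llm` and the dog-weight tool are illustrative stubs.

```python
import re

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call - replace with your provider's client."""
    raise NotImplementedError

def average_dog_weight(breed: str) -> str:
    # Illustrative tool with hard-coded facts
    weights = {"border collie": "37 lbs", "toy poodle": "7 lbs"}
    return weights.get(breed.lower(), "50 lbs")

TOOLS = {"average_dog_weight": average_dog_weight}

SYSTEM = """Answer using Thought / Action / Observation steps.
Available action: average_dog_weight[<breed>]
Finish with: Answer: <final answer>"""

def react_loop(question: str, max_turns: int = 5) -> str:
    transcript = f"{SYSTEM}\nQuestion: {question}\n"
    for _ in range(max_turns):
        step = call_llm(transcript)                  # model emits Thought + Action (or Answer)
        transcript += step + "\n"
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        match = re.search(r"Action: (\w+)\[(.+?)\]", step)
        if match:
            tool, arg = match.groups()
            observation = TOOLS[tool](arg)           # the pre-coded loop runs the tool
            transcript += f"Observation: {observation}\n"
    return "no answer within turn budget"
```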

RAG. Cheatsheet


(RAG cheatsheet image)

1: PowerAutomate. Reacting to a manual trigger


2: PowerAutomate. Reacting to a mentioned keyword

