Build a Simple ETL Pipeline (MLOps) #550

Jeet009 · 2025-10-07T12:21:21Z

No description provided.

moe18

Here are some small comments on the ETL question.

moe18 · 2025-10-13T14:10:33Z

questions/184_mlops-etl-pipeline/meta.json

@@ -0,0 +1,12 @@
+{
+  "id": "184",


Make this question 187 instead of 184.

moe18 · 2025-10-13T14:11:00Z

questions/184_mlops-etl-pipeline/tests.json

@@ -0,0 +1,14 @@
+[
+  {
+    "test": "from solution import run_etl; print(run_etl('user_id,event_type,value\n u1, purchase, 10.0\n u2, view, 1.0\n u1, purchase, 5\n u3, purchase, not_a_number\n u2, purchase, 3.5 \n'))",


for processing our test cases we need the \n to be \n

Can you elaborate on this?

I am not exactly sure what you mean by "need the \n to be \n". Did you mean something else?

Sorry for the confusion. Each newline needs two backslashes (\\n instead of \n), so it should look like this, for example: print(run_etl('user_id,event_type,value\\n u1, purchase, 10.0\\n u2, view, 1.0\\n u1, purchase, 5\\n u3, purchase, not_a_number\\n u2, purchase, 3.5 \\n'))
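A minimal sketch of why the doubled backslash matters, assuming the test harness decodes the `test` string from JSON and then runs it as Python source (the harness is not shown in this PR, so that flow is an assumption, and `compiles` is a hypothetical helper used only for illustration). With a single backslash, the JSON decoder turns `\n` into a real line break inside the quoted Python string, which is a syntax error; with `\\n`, the snippet keeps a literal `\n` escape that Python then interprets itself.

```python
import json

# Raw strings hold the JSON text exactly as it would appear in tests.json.
single_escaped = r'''{"test": "print('a\nb')"}'''
double_escaped = r'''{"test": "print('a\\nb')"}'''


def compiles(json_text):
    """Decode the JSON, then check whether the snippet is valid Python source.

    Hypothetical helper: it stands in for whatever the real harness does.
    """
    snippet = json.loads(json_text)["test"]
    try:
        compile(snippet, "<test>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(single_escaped))  # False: a real newline splits the string literal
print(compiles(double_escaped))  # True: the snippet still contains the \n escape
```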

moe18 · 2025-10-13T14:12:46Z

questions/184_mlops-etl-pipeline/tests.json

@@ -0,0 +1,14 @@
+[
+  {
+    "test": "from solution import run_etl; print(run_etl('user_id,event_type,value\n u1, purchase, 10.0\n u2, view, 1.0\n u1, purchase, 5\n u3, purchase, not_a_number\n u2, purchase, 3.5 \n'))",


You also do not need to import run_etl; the test could look like:
print(run_etl('user_id,event_type,value\n u1, purchase, 10.0\n u2, view, 1.0\n u1, purchase, 5\n u3, purchase, not_a_number\n u2, purchase, 3.5 \n'))

moe18 · 2025-10-16T12:29:50Z

build/185.json

+    }
+  ],
+  "description": "## Problem\n\nImplement a basic data drift check comparing two numeric datasets (reference vs. current).\n\nWrite a function `check_drift(ref, cur, mean_threshold, var_threshold)` that:\n\n- Accepts two lists of numbers `ref` and `cur`.\n- Computes the absolute difference in means and variances.\n- Returns a tuple `(mean_drift, var_drift)` where each element is a boolean indicating whether drift exceeds the corresponding threshold:\n\t- `mean_drift = abs(mean(ref) - mean(cur)) > mean_threshold`\n\t- `var_drift  = abs(var(ref)  - var(cur))  > var_threshold`\n\nAssume population variance (divide by N). Handle empty inputs by returning `(False, False)`.",
+  "learn_section": "## Solution Explanation\n\nWe compare two numeric samples (reference vs. current) using mean and variance with user-defined thresholds.\n\n### Definitions\n- Mean: \\( \\mu = \\frac{1}{N}\\sum_i x_i \\)\n- Population variance: \\( \\sigma^2 = \\frac{1}{N}\\sum_i (x_i - \\mu)^2 \\)\n\n### Drift rules\n- Mean drift if \\(|\\mu_{ref} - \\mu_{cur}| > \\tau_{mean}\\)\n- Variance drift if \\(|\\sigma^2_{ref} - \\sigma^2_{cur}| > \\tau_{var}\\)\n\n### Edge cases\n- If either sample is empty, return `(False, False)` to avoid false alarms.\n- Population vs. sample variance: we use population here to match many monitoring setups. Either is fine if used consistently.\n\n### Complexity\n- O(N + M) to compute stats; O(1) extra space.",


Use $ (inline) and $$ (display) for equations in the learn section instead of \( and \).
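As an illustration, the two definitions from the learn section could be written with these delimiters roughly as follows. This shows the rendered Markdown source; inside the JSON file each backslash would still need to be doubled.

```latex
Mean: $\mu = \frac{1}{N}\sum_i x_i$

Population variance:

$$
\sigma^2 = \frac{1}{N}\sum_i (x_i - \mu)^2
$$
```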

moe18 · 2025-10-16T12:32:33Z

build/185.json

+  },
+  "test_cases": [
+    {
+      "test": "from solution import check_drift; print(check_drift([1,2,3], [1.1,2.2,3.3], 0.05, 0.1))",


No need to import anything from solution; the test case could just be:
print(check_drift([1,2,3], [1.1,2.2,3.3], 0.05, 0.1))
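For context, here is a minimal sketch of a `check_drift` that follows the problem description in build/185.json (population variance, `(False, False)` on empty input). It is an illustration written from that description, not the solution file from this PR; the inner helpers `mean` and `pvar` are hypothetical names.

```python
def check_drift(ref, cur, mean_threshold, var_threshold):
    """Return (mean_drift, var_drift) booleans for two numeric samples."""
    # Empty inputs: report no drift, as the description specifies.
    if not ref or not cur:
        return (False, False)

    def mean(xs):
        return sum(xs) / len(xs)

    def pvar(xs):
        # Population variance: divide by N, not N - 1.
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    mean_drift = abs(mean(ref) - mean(cur)) > mean_threshold
    var_drift = abs(pvar(ref) - pvar(cur)) > var_threshold
    return (mean_drift, var_drift)


print(check_drift([1, 2, 3], [1.1, 2.2, 3.3], 0.05, 0.1))  # (True, True)
```

For the test case above, the means differ by about 0.2 and the population variances by about 0.14, so both thresholds are exceeded.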

Jeet009 added 4 commits September 15, 2025 19:23

Added Two New Questions(Probability)

5e2a9a9

Added new ML Ops questions

1c96b23

added new question on mlops

784bf19

etl pipeline question

495a310

Jeet009 mentioned this pull request Oct 10, 2025

Basic Data Drift Check: Mean and Variance Thresholds #551

Open

Final commit

cfc2130

moe18 reviewed Oct 13, 2025

moe18 reviewed Oct 16, 2025

fixed test cases

5c8919f

moe18 approved these changes Oct 23, 2025

moe18 merged commit 7998b03 into Open-Deep-ML:main Oct 23, 2025
1 check passed
