Commit 22b43e4

Merge pull request #474 from Open-Deep-ML/add-q-141
added new Q in correct format
2 parents 14c404a + 58ce0e0 commit 22b43e4

8 files changed

+120
-79
lines changed

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Write a Python function `convert_range` that shifts and scales the values of a NumPy array from their original range $[a, b]$ (where $a=\min(x)$ and $b=\max(x)$) to a new target range $[c, d]$. Your function should work for both 1D and 2D arrays, returning an array of the same shape, and only use NumPy. Return floating-point results, and ensure you use the correct formula to map the input interval to the output interval.
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
  "input": "import numpy as np\nx = np.array([0, 5, 10])\nc, d = 2, 4\nprint(convert_range(x, c, d))",
  "output": "[2. 3. 4.]",
  "reasoning": "The minimum value (a) is 0 and the maximum value (b) is 10. The formula maps 0 to 2, 5 to 3, and 10 to 4 using: f(x) = c + (d-c)/(b-a)*(x-a)."
}
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
# **Shifting and Scaling a Range (Rescaling Data)**

## **1. Motivation**

Rescaling (shifting and scaling) is a common preprocessing step in data analysis and machine learning. It is often necessary to map data from an original range (e.g., test scores, pixel values, GPA) to a new range that suits downstream tasks or makes datasets comparable. For example, you might want to rescale a GPA from $[0, 10]$ to $[0, 4]$ for comparison or model input.

---

## **2. The General Mapping Formula**

Suppose you have input values in the range $[a, b]$ with $a < b$, and you want to map them to the interval $[c, d]$.

- First, shift the lower bound to $0$ by applying $x \mapsto x - a$, so $[a, b] \rightarrow [0, b-a]$.
- Next, scale to the unit interval: $t \mapsto \frac{1}{b-a} \cdot t$, yielding $[0, 1]$.
- Now, scale to $[0, d-c]$ with $t \mapsto (d-c)t$, and shift to $[c, d]$ with $t \mapsto c + t$.
- Combining all steps, the complete formula is:

$$
f(x) = c + \left(\frac{d-c}{b-a}\right)(x-a)
$$

Where:

- $x$ = the input value
- $a = \min(x)$ and $b = \max(x)$, taken over the whole input array
- $c$, $d$ = target interval endpoints
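
The four steps above can be traced one at a time in NumPy. The following is a minimal sketch using the sample array `[0, 5, 10]` and target range `[2, 4]` from the example; the intermediate variable names (`shifted`, `unit`, `stretched`) are illustrative only and do not appear in the original solution.

```python
import numpy as np

x = np.array([0.0, 5.0, 10.0])
a, b = x.min(), x.max()      # original range [a, b] = [0, 10]
c, d = 2.0, 4.0              # target range [c, d]

shifted = x - a              # [a, b]     -> [0, b - a]
unit = shifted / (b - a)     # [0, b - a] -> [0, 1]
stretched = (d - c) * unit   # [0, 1]     -> [0, d - c]
stepwise = c + stretched     # [0, d - c] -> [c, d]

# The combined formula should agree with the step-by-step result.
closed_form = c + (d - c) / (b - a) * (x - a)

print(stepwise)                            # [2. 3. 4.]
print(np.allclose(stepwise, closed_form))  # True
```

Composing the steps in this order is what produces the single expression $c + \frac{d-c}{b-a}(x-a)$, so either form can be used in an implementation.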

---

## **3. Applications**
- **Image Processing**: Rescale pixel intensities
- **Feature Engineering**: Normalize features to a common range
- **Score Conversion**: Convert test scores or grades between systems

---

## **4. Practical Considerations**
- When $a = b$ (constant input), the denominator $b - a$ is zero and the formula is undefined, so this case needs special handling (e.g., output all $c$).
- For multidimensional arrays, call NumPy's `.min()` and `.max()` without an `axis` argument so that $a$ and $b$ describe the full input range rather than a single row or column.
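
Both considerations can be illustrated with a small variant of the function. This is only a sketch: the name `convert_range_safe` is hypothetical, and returning $c$ everywhere for constant input is one possible convention (the one suggested above), not something the reference solution or its tests require.

```python
import numpy as np

def convert_range_safe(values, c, d):
    """Hypothetical variant: rescale to [c, d], mapping constant input to c."""
    values = np.asarray(values, dtype=float)
    # No axis argument: min and max are taken over every element of the array.
    a, b = values.min(), values.max()
    if a == b:
        # Constant input: b - a is zero, so fall back to c everywhere (assumed convention).
        return np.full_like(values, c)
    return c + (d - c) / (b - a) * (values - a)

grid = np.array([[1, 2], [3, 5]])
# Global range is [1, 5], so 1 -> 0, 2 -> 0.25, 3 -> 0.5, 5 -> 1.
print(convert_range_safe(grid, 0, 1))

print(convert_range_safe(np.array([7, 7, 7]), 2, 4))  # [2. 2. 2.]
```

Passing `axis=0` or `axis=1` to `.min()` and `.max()` would instead rescale each column or row independently, which is a different operation from the one described here.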

---

This formula gives a **simple, mathematically justified way to shift and scale data to any target range**, a core tool in machine learning preprocessing pipelines.
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
{
  "id": "141",
  "title": "Shift and Scale Array to Target Range",
  "difficulty": "easy",
  "category": "Machine Learning",
  "video": "",
  "likes": "0",
  "dislikes": "0",
  "contributor": [
    {
      "profile_link": "https://github.com/turkunov",
      "name": "turkunov"
    }
  ]
}
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
import numpy as np

def convert_range(values: np.ndarray, c: float, d: float) -> np.ndarray:
    """
    Shift and scale values from their original range [min, max] to a target [c, d] range.

    Parameters
    ----------
    values : np.ndarray
        Input array (1D or 2D) to be rescaled.
    c : float
        New range lower bound.
    d : float
        New range upper bound.

    Returns
    -------
    np.ndarray
        Scaled array with the same shape as the input.
    """
    a, b = values.min(), values.max()
    return c + (d - c) / (b - a) * (values - a)
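
A quick way to sanity-check a solution like this is to confirm the properties the task statement asks for: the same shape, floating-point output, and endpoints that land on $c$ and $d$. The sketch below repeats the two-line formula so that it runs on its own and reuses the 2D input from the test cases; the specific checks are illustrative and are not part of the repository's test harness.

```python
import numpy as np

def convert_range(values, c, d):
    # Same formula as the solution, repeated so this snippet is self-contained.
    a, b = values.min(), values.max()
    return c + (d - c) / (b - a) * (values - a)

seq = np.array([[2028, 4522], [1412, 2502], [3414, 3694], [1747, 1233], [1862, 4868]])
out = convert_range(seq, 4, 8)

# Shape is preserved and integer input yields floating-point output.
assert out.shape == seq.shape
assert np.issubdtype(out.dtype, np.floating)

# The smallest input (1233) maps to c and the largest (4868) maps to d.
assert np.isclose(out[3, 1], 4.0)
assert np.isclose(out[4, 1], 8.0)

# Every rescaled value lies inside [c, d], up to floating-point rounding.
assert np.all((out >= 4.0 - 1e-12) & (out <= 8.0 + 1e-12))
```

The division in the formula is what promotes integer arrays to floats, so no explicit cast is needed for the inputs used here.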
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
import numpy as np

def convert_range(values: np.ndarray, c: float, d: float) -> np.ndarray:
    """
    Shift and scale values from their original range [min, max] to a target [c, d] range.
    """
    # Your code here
    pass
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
[
  {
    "test": "import numpy as np\nseq = np.array([388, 242, 124, 384, 313, 277, 339, 302, 268, 392])\nc, d = 0, 1\nout = convert_range(seq, c, d)\nprint(np.round(out, 6))",
    "expected_output": "[0.985075, 0.440299, 0., 0.970149, 0.705224, 0.570896, 0.802239, 0.664179, 0.537313, 1. ]"
  },
  {
    "test": "import numpy as np\nseq = np.array([[2028, 4522], [1412, 2502], [3414, 3694], [1747, 1233], [1862, 4868]])\nc, d = 4, 8\nout = convert_range(seq, c, d)\nprint(np.round(out, 6))",
    "expected_output": "[[4.874828 7.619257]\n [4.196974 5.396424]\n [6.4 6.708116]\n [4.565612 4. ]\n [4.69216 8. ]]"
  }
]

utils/convert_single_question.py

Lines changed: 18 additions & 79 deletions
@@ -28,102 +28,41 @@
2828

2929
# ── 1️⃣ EDIT YOUR QUESTION HERE ────────────────────────────────────────────
3030
QUESTION_DICT: Dict[str, Any] = {
31-
'id':'140',
32-
"description": "Write a Python class to implement the Bernoulli Naive Bayes classifier for binary (0/1) feature data. Your class should have two methods: `forward(self, X, y)` to train on the input data (X: 2D NumPy array of binary features, y: 1D NumPy array of class labels) and `predict(self, X)` to output predicted labels for a 2D test matrix X. Use Laplace smoothing (parameter: smoothing=1.0). Return predictions as a NumPy array. Only use NumPy. Predictions must be binary (0 or 1) and you must handle cases where the training data contains only one class. All log/likelihood calculations should use log probabilities for numerical stability.",
31+
"id": "141",
32+
"description": "Write a Python function `convert_range` that shifts and scales the values of a NumPy array from their original range $[a, b]$ (where $a=\\min(x)$ and $b=\\max(x)$) to a new target range $[c, d]$. Your function should work for both 1D and 2D arrays, returning an array of the same shape, and only use NumPy. Return floating-point results, and ensure you use the correct formula to map the input interval to the output interval.",
3333
"test_cases": [
3434
{
35-
"test": "import numpy as np\nmodel = NaiveBayes(smoothing=1.0)\nX = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1]])\ny = np.array([1, 1, 0, 0, 1])\nmodel.forward(X, y)\nprint(model.predict(np.array([[1, 0, 1]])))",
36-
"expected_output": "[1]"
35+
"test": "import numpy as np\nseq = np.array([388, 242, 124, 384, 313, 277, 339, 302, 268, 392])\nc, d = 0, 1\nout = convert_range(seq, c, d)\nprint(np.round(out, 6))",
36+
"expected_output": "[0.985075, 0.440299, 0., 0.970149, 0.705224, 0.570896, 0.802239, 0.664179, 0.537313, 1. ]"
3737
},
3838
{
39-
"test": "import numpy as np\nmodel = NaiveBayes(smoothing=1.0)\nX = np.array([[0], [1], [0], [1]])\ny = np.array([0, 1, 0, 1])\nmodel.forward(X, y)\nprint(model.predict(np.array([[0], [1]])))",
40-
"expected_output": "[0 1]"
41-
},
42-
{
43-
"test": "import numpy as np\nmodel = NaiveBayes(smoothing=1.0)\nX = np.array([[0, 0], [1, 0], [0, 1]])\ny = np.array([0, 1, 0])\nmodel.forward(X, y)\nprint(model.predict(np.array([[1, 1]])))",
44-
"expected_output": "[0]"
45-
},
46-
{
47-
"test": "import numpy as np\nnp.random.seed(42)\nmodel = NaiveBayes(smoothing=1.0)\nX = np.random.randint(0, 2, (100, 5))\ny = np.random.choice([0, 1], size=100)\nmodel.forward(X, y)\nX_test = np.random.randint(0, 2, (10, 5))\npred = model.predict(X_test)\nprint(pred.shape)",
48-
"expected_output": "(10,)"
49-
},
50-
{
51-
"test": "import numpy as np\nmodel = NaiveBayes(smoothing=1.0)\nX = np.random.randint(0, 2, (10, 3))\ny = np.zeros(10)\nmodel.forward(X, y)\nX_test = np.random.randint(0, 2, (3, 3))\nprint(model.predict(X_test))",
52-
"expected_output": "[0, 0, 0]"
39+
"test": "import numpy as np\nseq = np.array([[2028, 4522], [1412, 2502], [3414, 3694], [1747, 1233], [1862, 4868]])\nc, d = 4, 8\nout = convert_range(seq, c, d)\nprint(np.round(out, 6))",
40+
"expected_output": "[[4.874828 7.619257]\n [4.196974 5.396424]\n [6.4 6.708116]\n [4.565612 4. ]\n [4.69216 8. ]]"
5341
}
5442
],
55-
"solution": "import numpy as np\n\nclass NaiveBayes():\n def __init__(self, smoothing=1.0):\n self.smoothing = smoothing\n self.classes = None\n self.priors = None\n self.likelihoods = None\n\n def forward(self, X, y):\n self.classes, class_counts = np.unique(y, return_counts=True)\n self.priors = {cls: np.log(class_counts[i] / len(y)) for i, cls in enumerate(self.classes)}\n self.likelihoods = {}\n for cls in self.classes:\n X_cls = X[y == cls]\n prob = (np.sum(X_cls, axis=0) + self.smoothing) / (X_cls.shape[0] + 2 * self.smoothing)\n self.likelihoods[cls] = (np.log(prob), np.log(1 - prob))\n\n def _compute_posterior(self, sample):\n posteriors = {}\n for cls in self.classes:\n posterior = self.priors[cls]\n prob_1, prob_0 = self.likelihoods[cls]\n likelihood = np.sum(sample * prob_1 + (1 - sample) * prob_0)\n posterior += likelihood\n posteriors[cls] = posterior\n return max(posteriors, key=posteriors.get)\n\n def predict(self, X):\n return np.array([self._compute_posterior(sample) for sample in X])",
43+
"solution": "import numpy as np\n\ndef convert_range(values: np.ndarray, c: float, d: float) -> np.ndarray:\n \"\"\"\n Shift and scale values from their original range [min, max] to a target [c, d] range.\n\n Parameters\n ----------\n values : np.ndarray\n Input array (1D or 2D) to be rescaled.\n c : float\n New range lower bound.\n d : float\n New range upper bound.\n\n Returns\n -------\n np.ndarray\n Scaled array with the same shape as the input.\n \"\"\"\n a, b = values.min(), values.max()\n return c + (d - c) / (b - a) * (values - a)",
5644
"example": {
57-
"input": "X = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1]]); y = np.array([1, 1, 0, 0, 1])\nmodel = NaiveBayes(smoothing=1.0)\nmodel.forward(X, y)\nprint(model.predict(np.array([[1, 0, 1]])))",
58-
"output": "[1]",
59-
"reasoning": "The model learns class priors and feature probabilities with Laplace smoothing. For [1, 0, 1], the posterior for class 1 is higher, so the model predicts 1."
45+
"input": "import numpy as np\nx = np.array([0, 5, 10])\nc, d = 2, 4\nprint(convert_range(x, c, d))",
46+
"output": "[2. 3. 4.]",
47+
"reasoning": "The minimum value (a) is 0 and the maximum value (b) is 10. The formula maps 0 to 2, 5 to 3, and 10 to 4 using: f(x) = c + (d-c)/(b-a)*(x-a)."
6048
},
6149
"category": "Machine Learning",
62-
"starter_code": "import numpy as np\n\nclass NaiveBayes():\n def __init__(self, smoothing=1.0):\n # Initialize smoothing\n pass\n\n def forward(self, X, y):\n # Fit model to binary features X and labels y\n pass\n\n def predict(self, X):\n # Predict class labels for test set X\n pass",
63-
"title": "Bernoulli Naive Bayes Classifier",
64-
"learn_section":r"""# **Naive Bayes Classifier**
65-
66-
## **1. Definition**
67-
68-
Naive Bayes is a **probabilistic machine learning algorithm** used for **classification tasks**. It is based on **Bayes' Theorem**, which describes the probability of an event based on prior knowledge of related events.
69-
70-
The algorithm assumes that:
71-
- **Features are conditionally independent** given the class label (the "naive" assumption).
72-
- It calculates the posterior probability for each class and assigns the class with the **highest posterior** to the sample.
73-
74-
---
75-
76-
## **2. Bayes' Theorem**
77-
78-
Bayes' Theorem is given by:
79-
80-
$$
81-
P(C | X) = \frac{P(X | C) \times P(C)}{P(X)}
82-
$$
83-
84-
Where:
85-
- $P(C | X)$ **Posterior** probability: the probability of class $C $ given the feature vector $X$
86-
- $P(X | C)$ → **Likelihood**: the probability of the data $X$ given the class
87-
- $P(C)$ → **Prior** probability: the initial probability of class $C$ before observing any data
88-
- $ P(X)$ → **Evidence**: the total probability of the data across all classes (acts as a normalizing constant)
89-
90-
Since $P(X)$ is the same for all classes during comparison, it can be ignored, simplifying the formula to:
91-
92-
$$
93-
P(C | X) \propto P(X | C) \times P(C)
94-
$$
95-
---
96-
97-
### 3 **Bernoulli Naive Bayes**
98-
- Used for **binary data** (features take only 0 or 1 values).
99-
- The likelihood is given by:
100-
101-
$$
102-
P(X | C) = \prod_{i=1}^{n} P(x_i | C)^{x_i} \cdot (1 - P(x_i | C))^{1 - x_i}
103-
$$
104-
105-
---
106-
107-
## **4. Applications of Naive Bayes**
108-
109-
- **Text Classification:** Spam detection, sentiment analysis, and news categorization.
110-
- **Document Categorization:** Sorting documents by topic.
111-
- **Fraud Detection:** Identifying fraudulent transactions or behaviors.
112-
- **Recommender Systems:** Classifying users into preference groups.
113-
114-
--- """,
50+
"starter_code": "import numpy as np\n\ndef convert_range(values: np.ndarray, c: float, d: float) -> np.ndarray:\n \"\"\"\n Shift and scale values from their original range [min, max] to a target [c, d] range.\n \"\"\"\n # Your code here\n pass",
51+
"title": "Shift and Scale Array to Target Range",
52+
"learn_section": "# **Shifting and Scaling a Range (Rescaling Data)**\n\n## **1. Motivation**\n\nRescaling (or shifting and scaling) is a common preprocessing step in data analysis and machine learning. It's often necessary to map data from an original range (e.g., test scores, pixel values, GPA) to a new range suitable for downstream tasks or compatibility between datasets. For example, you might want to shift a GPA from $[0, 10]$ to $[0, 4]$ for comparison or model input.\n\n---\n\n## **2. The General Mapping Formula**\n\nSuppose you have input values in the range $[a, b]$ and you want to map them to the interval $[c, d]$.\n\n- First, shift the lower bound to $0$ by applying $x \\mapsto x - a$, so $[a, b] \\rightarrow [0, b-a]$.\n- Next, scale to unit interval: $t \\mapsto \\frac{1}{b-a} \\cdot t$, yielding $[0, 1]$.\n- Now, scale to $[0, d-c]$ with $t \\mapsto (d-c)t$, and shift to $[c, d]$ with $t \\mapsto c + t$.\n- Combining all steps, the complete formula is:\n\n$$\n f(x) = c + \\left(\\frac{d-c}{b-a}\\right)(x-a)\n$$\n\n- $x$ = the input value\n- $a = \\min(x)$ and $b = \\max(x)$\n- $c$, $d$ = target interval endpoints\n\n---\n\n## **3. Applications**\n- **Image Processing**: Rescale pixel intensities\n- **Feature Engineering**: Normalize features to a common range\n- **Score Conversion**: Convert test scores or grades between systems\n\n---\n\n## **4. Practical Considerations**\n- Be aware of the case when $a = b$ (constant input); this may require special handling (e.g., output all $c$).\n- For multidimensional arrays, use NumPy’s `.min()` and `.max()` to determine the full input range.\n\n---\n\nThis formula gives a **simple, mathematically justified way to shift and scale data to any target range**—a core tool for robust machine learning pipelines.\n",
11553
"contributor": [
11654
{
117-
"profile_link": "https://github.com/moe18",
118-
"name": "Moe Chabot"
55+
"profile_link": "https://github.com/turkunov",
56+
"name": "turkunov"
11957
}
12058
],
12159
"likes": "0",
12260
"dislikes": "0",
123-
"difficulty": "medium",
124-
"video":''
61+
"difficulty": "easy",
62+
"video": ""
12563
}
12664

65+
12766
# ────────────────────────────────────────────────────────────────────────────
12867

12968
