
Commit a3e2984

Merge branch 'Open-Deep-ML:main' into Apriori
2 parents be4bbca + 41bdea8 commit a3e2984

File tree

9 files changed: +487 additions, -1 deletion


Problems/110_METEOR/Learn.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
METEOR (Metric for Evaluation of Translation with Explicit ORdering) is a metric generally used for evaluating machine translation and the text output of generative AI models. METEOR was introduced to address the limitations of earlier metrics such as BLEU.

## Key Characteristics
- Considers semantic similarity beyond exact word matching
- Accounts for word order and translation variations
- Provides more human-aligned translation assessment

# Implementation
1. **Tokenization**

2. **Count matching unigrams**: matching needs to be exact

3. **Calculate Precision, Recall and F-mean**
```
F_mean = (Precision * Recall) / (α * Precision + (1 - α) * Recall)
```
- α is typically set to 0.9
- Balances precision and recall

4. **Fragmentation Penalty**
```
Chunks = Count of contiguous matched word sequences
Penalty = γ * (Chunks / Matches)^β
```
- β controls the penalty weight (typically 3)
- γ limits the maximum penalty (typically 0.5)

5. **Final METEOR Score**
```
METEOR = F_mean * (1 - Penalty)
```
- Ranges from 0 (no match) to 1 (perfect match)

**Note**: The [paper](https://aclanthology.org/W05-0909/) that introduced the metric does not expose α, β, and γ as tunable parameters, but implementations in libraries such as NLTK offer this flexibility.

# Example

- Reference: "The quick brown fox jumps over the lazy dog"
- Candidate: "A quick brown fox jumps over a lazy dog"

### 1. Tokenization
- Reference Tokens: ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
- Candidate Tokens: ['a', 'quick', 'brown', 'fox', 'jumps', 'over', 'a', 'lazy', 'dog']

### 2. Unigram Matching
- Matching tokens: ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog']
- Matches: 7

### 3. Unigram Precision and Recall Calculation
- Precision = Matches / Candidate Length = 7 / 9 ≈ 0.778

- Recall = Matches / Reference Length = 7 / 9 ≈ 0.778

### 4. F-mean Calculation (α = 0.9)
```
F_mean = (Precision * Recall) / (α * Precision + (1 - α) * Recall)
       = (0.778 * 0.778) / (0.9 * 0.778 + (1 - 0.9) * 0.778)
       = 0.605 / 0.778
       ≈ 0.778
```

### 5. Chunk Calculation
- Contiguous matched sequences (contiguous in both candidate and reference):
  1. ['quick', 'brown', 'fox', 'jumps', 'over']
  2. ['lazy', 'dog']
- Number of Chunks: 2
- Total Number of Unigram Matches: 7

### 6. Penalty Calculation (β = 3, γ = 0.5)
```
Penalty = γ * (Number of Chunks / Total Number of Unigram Matches)^β
        = 0.5 * (2 / 7)^3
        = 0.5 * 0.0233
        ≈ 0.012
```

### 7. Final METEOR Score
```
METEOR = F_mean * (1 - Penalty)
       = 0.778 * (1 - 0.012)
       = 0.778 * 0.988
       ≈ 0.769
```
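
The arithmetic in steps 3–7 can be reproduced with a short Python snippet. The helper name `meteor_from_stats` and the hard-coded statistics are taken from this worked example; it is a minimal sketch, not the reference implementation below.

```python
def meteor_from_stats(matches, cand_len, ref_len, chunks, alpha=0.9, beta=3, gamma=0.5):
    precision = matches / cand_len
    recall = matches / ref_len
    f_mean = (precision * recall) / (alpha * precision + (1 - alpha) * recall)
    penalty = gamma * (chunks / matches) ** beta
    return f_mean * (1 - penalty)

# 7 matches, 9-token candidate, 9-token reference, 2 chunks
print(round(meteor_from_stats(7, 9, 9, 2), 3))  # ≈ 0.769
```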

Problems/110_METEOR/solution.py

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
from collections import Counter

def meteor_score(reference, candidate, alpha=0.9, beta=3, gamma=0.5):
    if not reference or not candidate:
        raise ValueError("Reference and candidate cannot be empty")

    # Tokenize (lowercase, whitespace split)
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()

    # Unigram counts for reference and candidate
    ref_counts = Counter(ref_tokens)
    cand_counts = Counter(cand_tokens)

    # Number of unigrams shared by candidate and reference
    num_matches = sum((ref_counts & cand_counts).values())
    ref_len = len(ref_tokens)
    cand_len = len(cand_tokens)

    # Unigram precision and recall (guarding against division by zero)
    precision = num_matches / cand_len if cand_len > 0 else 0
    recall = num_matches / ref_len if ref_len > 0 else 0

    if num_matches == 0:
        return 0.0

    fmean = (precision * recall) / (alpha * precision + (1 - alpha) * recall)

    # Chunk calculation
    matched_positions = []
    ref_positions = {}      # Positions of each word in the reference
    used_positions = set()  # Reference indices already consumed by a match

    # Populate reference positions for word alignment tracking
    for i, word in enumerate(ref_tokens):
        ref_positions.setdefault(word, []).append(i)

    # Determine the sequence of matched reference positions in candidate order
    for word in cand_tokens:
        if word in ref_positions:
            for pos in ref_positions[word]:
                if pos not in used_positions:
                    matched_positions.append(pos)
                    used_positions.add(pos)
                    break  # Each reference position is used at most once

    # Count chunks by detecting breaks in the matched position sequence
    num_chunks = 1 if matched_positions else 0
    for i in range(1, len(matched_positions)):
        if matched_positions[i] != matched_positions[i - 1] + 1:
            num_chunks += 1  # Break in sequence -> new chunk

    # Fragmentation penalty
    penalty = gamma * ((num_chunks / num_matches) ** beta) if num_matches > 0 else 0

    # Final score, rounded to 3 decimal places
    return round(fmean * (1 - penalty), 3)

def test_meteor_score():
    # Test Case 1: Identical translations
    # fmean = 1.0; the single chunk still incurs a small penalty of 0.5 * (1/6)^3
    ref_test1 = "The cat sits on the mat"
    cand_test1 = "The cat sits on the mat"
    expected1 = 0.998
    assert meteor_score(ref_test1, cand_test1) == expected1, "Test Case 1 Failed"

    # Test Case 2: Similar translations (7 matches, 2 chunks)
    ref_test2 = "The quick brown fox jumps over the lazy dog"
    cand_test2 = "A quick brown fox jumps over a lazy dog"
    expected2 = 0.769
    assert meteor_score(ref_test2, cand_test2) == expected2, "Test Case 2 Failed"

    # Test Case 3: Completely different translations (no shared unigrams)
    ref_test3 = "The cat sits on the mat"
    cand_test3 = "Dogs run in a park"
    expected3 = 0.0
    assert meteor_score(ref_test3, cand_test3) == expected3, "Test Case 3 Failed"

    # Test Case 4: Partially matching translations (2 matches, 1 chunk)
    ref_test4 = "Machine learning is an exciting field"
    cand_test4 = "Machine learning algorithms are fascinating"
    expected4 = 0.318
    assert meteor_score(ref_test4, cand_test4) == expected4, "Test Case 4 Failed"

    # Test Case 5: Empty input handling
    try:
        meteor_score("", "Some text")
        assert False, "Test Case 5 Failed"
    except ValueError:
        pass

    # Test Case 6: Same words in a different order (6 matches, 3 chunks)
    ref_test6 = "The cat sits on the mat"
    cand_test6 = "The cat on the mat sits"
    expected6 = 0.938
    assert meteor_score(ref_test6, cand_test6) == expected6, "Test Case 6 Failed"

if __name__ == "__main__":
    test_meteor_score()
    print("All Test Cases Passed!")

Problems/111_PMI/Learn.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# Pointwise Mutual Information (PMI)

Pointwise Mutual Information (PMI) is a statistical measure used in information theory and Natural Language Processing (NLP) to quantify the level of association between two events. It compares the probability of two events occurring together with the probability of them occurring independently. It is commonly used in NLP and Information Retrieval for finding associations between words, feature selection in text classification, and measuring document similarity.

## Implementation
1. **Collect count data for events x and y and their joint occurrence**

2. **Calculate the individual probabilities**

3. **Calculate the joint probability**

4. **Final score: PMI(x,y) = log₂(P(x,y) / (P(x) * P(y)))**
   Where:
   - P(x,y) is the probability of events x and y occurring together
   - P(x) is the probability of event x occurring
   - P(y) is the probability of event y occurring

## Interpreting PMI Values

- **Positive PMI**: Events co-occur more often than expected by chance
- **Zero PMI**: Events are statistically independent
- **Negative PMI**: Events co-occur less often than expected by chance
- **Undefined**: When P(x,y) = 0 (events never co-occur)

## Variants

### 1. Normalized PMI (NPMI)
- Scales PMI to the range [-1, 1]
- Easier to compare across different datasets
- Formula: NPMI(x,y) = PMI(x,y) / -log₂(P(x,y))

### 2. Positive PMI (PPMI)
- Sets negative PMI values to zero
- Commonly used in word embedding models
- Formula: PPMI(x,y) = max(PMI(x,y), 0)
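
As a quick illustration of these formulas, here is a minimal Python sketch; the function name `pmi_variants` and the example counts are illustrative only and separate from the reference solution below.

```python
import math

def pmi_variants(joint_count, count_x, count_y, total):
    """Return (PMI, NPMI, PPMI) computed from raw co-occurrence counts."""
    p_x, p_y, p_xy = count_x / total, count_y / total, joint_count / total
    if p_xy == 0:
        raise ValueError("PMI is undefined when the events never co-occur")
    pmi = math.log2(p_xy / (p_x * p_y))
    npmi = pmi / -math.log2(p_xy)   # scaled to [-1, 1]
    ppmi = max(pmi, 0.0)            # negative values clipped to zero
    return pmi, npmi, ppmi

# Example: x and y each occur 50 times out of 100 samples and co-occur 40 times
print(pmi_variants(40, 50, 50, 100))  # PMI ≈ 0.678, NPMI ≈ 0.513, PPMI ≈ 0.678
```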

Problems/111_PMI/solution.py

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
import numpy as np

def compute_pmi(joint_counts, total_counts_x, total_counts_y, total_samples):

    if not all(isinstance(x, (int, float)) for x in [joint_counts, total_counts_x, total_counts_y, total_samples]):
        raise ValueError("All inputs must be numeric")

    if any(x < 0 for x in [joint_counts, total_counts_x, total_counts_y, total_samples]):
        raise ValueError("Counts cannot be negative")

    if total_samples == 0:
        raise ValueError("Total samples cannot be zero")

    if joint_counts > min(total_counts_x, total_counts_y):
        raise ValueError("Joint counts cannot exceed individual counts")

    if any(x > total_samples for x in [total_counts_x, total_counts_y]):
        raise ValueError("Individual counts cannot exceed total samples")

    p_x = total_counts_x / total_samples
    p_y = total_counts_y / total_samples
    p_xy = joint_counts / total_samples

    # Handle edge cases where probabilities are zero
    if p_xy == 0 or p_x == 0 or p_y == 0:
        return float('-inf')

    pmi = np.log2(p_xy / (p_x * p_y))

    return round(pmi, 3)

def test_pmi():
    # Test Case 1: Both events always occur, so P(x,y) = P(x)P(y) = 1 and PMI = 0
    joint_counts1 = 100
    total_counts_x1 = 100
    total_counts_y1 = 100
    total_samples1 = 100
    expected1 = round(np.log2(1 / (1 * 1)), 3)  # 0.0
    assert compute_pmi(joint_counts1, total_counts_x1, total_counts_y1, total_samples1) == expected1, "Test Case 1 Failed"

    # Test Case 2: Independence (PMI = 0)
    joint_counts2 = 25
    total_counts_x2 = 50
    total_counts_y2 = 50
    total_samples2 = 100
    expected2 = round(np.log2((25/100) / ((50/100) * (50/100))), 3)  # 0.0
    assert compute_pmi(joint_counts2, total_counts_x2, total_counts_y2, total_samples2) == expected2, "Test Case 2 Failed"

    # Test Case 3: Negative association (PMI < 0)
    joint_counts3 = 10
    total_counts_x3 = 50
    total_counts_y3 = 50
    total_samples3 = 100
    expected3 = round(np.log2((10/100) / ((50/100) * (50/100))), 3)  # negative
    assert compute_pmi(joint_counts3, total_counts_x3, total_counts_y3, total_samples3) == expected3, "Test Case 3 Failed"

    # Test Case 4: Zero joint occurrence (PMI is -inf)
    joint_counts4 = 0
    total_counts_x4 = 50
    total_counts_y4 = 50
    total_samples4 = 100
    expected4 = float('-inf')
    assert compute_pmi(joint_counts4, total_counts_x4, total_counts_y4, total_samples4) == expected4, "Test Case 4 Failed"

    # Test Case 5: Invalid inputs
    try:
        compute_pmi(-1, 50, 50, 100)
        assert False, "Test Case 5 Failed: Should raise ValueError for negative counts"
    except ValueError:
        pass

    print("All Test Cases Passed!")

if __name__ == "__main__":
    test_pmi()
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
# Min-Max Normalization

Min-Max Normalization is a technique used to scale numerical data to the range *0 to 1*. It transforms values while maintaining their original distribution. A machine learning model usually assigns a higher weight to a feature with larger values and a lower weight to a feature with smaller values.
The goal of normalization is to put every data point on the same scale so that each feature is equally important to the model.

The formula used for Min-Max Normalization is:

$$
X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}
$$

- Note 1: If the input array has identical elements, return an array of the same size filled with zeros.
- Note 2: Round the resulting values to 4 decimal places.

## Where:
- $X$ is the original value.
- $X_{\min}$ is the minimum value in the dataset.
- $X_{\max}$ is the maximum value in the dataset.
- $X'$ is the normalized value.
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
def min_max(x: list[int]) -> list[float]:

    # Modify the input list in place instead of creating a new list

    largest = max(x)   # largest value in the input array
    smallest = min(x)  # smallest value in the input array

    if largest == smallest:  # input has identical elements
        return [0.0] * len(x)

    for i in range(len(x)):
        # Apply the normalization formula to each element
        x[i] = round((x[i] - smallest) / (largest - smallest), 4)

    return x

def test_min_max():
    assert min_max([1, 2, 3, 4, 5]) == [0.0, 0.25, 0.5, 0.75, 1.0], "Test Case 1 failed"

    assert min_max([30, 45, 56, 70, 88]) == [0.0, 0.2586, 0.4483, 0.6897, 1.0], "Test Case 2 failed"

    assert min_max([5, 5, 5, 5]) == [0.0, 0.0, 0.0, 0.0], "Test Case 3 failed"

if __name__ == "__main__":
    test_min_max()
    print("All Min Max Normalization test cases passed.")

Problems/50_lasso_regression_gradient_descent/learn.html

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ <h3>2. Make Predictions at each step using the formula \[ \hat{y}_i = \sum_{j=1}
 <h3>3. Find the residuals (Difference between the actual y values and the predicted ones) </h3>
 <h3>4. Update the weights and bias using the formula </p>
-<p> First, find the gradient with respect to weights $w$ using the formula \[ \frac{\partial J}{\partial w_j} = \frac{1}{n} \sum_{i=1}^nX_{ij}(y_i - \hat{y}_i) + \alpha \cdot sign(w_j) \] </h3>
+<p> First, find the gradient with respect to weights $w$ using the formula \[ \frac{\partial J}{\partial w_j} = \frac{1}{n} \sum_{i=1}^nX_{ij}(\hat{y}_i - y_i) + \alpha \cdot sign(w_j) \] </h3>
 <p> Then, we need to find the gradient with respect to the bias $b$. Since the bias term $b$ does not have a regularization component (since Lasso regularization is applied only to the weights $w_j$), the gradient with respect to $b$ is just the partial derivative of the Mean Squared Error (MSE) loss function with respect to $b$ \[ \frac{\partial J(w,b)}{\partial b} = \frac{1}{n} \sum_{i = 1}^{n}(\hat{y}_i-y_i)\]</p>
 <p> Next, we update the weights and bias using the formula \[ w_j = w_j - \eta \cdot \frac{\partial J}{\partial w_j} \] \[ b = b - \eta \cdot \frac{\partial J}{\partial b} \] Where eta is the learning rate defined in the beginning of the function</p>
 <h3> 5. Repeat steps 2-4 until the weights converge. This is determined by evaluating the L1 norm of the weight gradients </h3>
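
For reference, a minimal NumPy sketch of one gradient-descent step using the corrected sign; the function and variable names (`lasso_step`, `eta`) are illustrative assumptions, not the repository's implementation.

```python
import numpy as np

def lasso_step(X, y, w, b, alpha, eta):
    n = X.shape[0]
    y_hat = X @ w + b                                  # predictions
    error = y_hat - y                                  # residuals (y_hat - y), matching the fix above
    grad_w = (X.T @ error) / n + alpha * np.sign(w)    # dJ/dw_j with L1 term
    grad_b = error.mean()                              # dJ/db (no regularization on the bias)
    return w - eta * grad_w, b - eta * grad_b

# Example usage on synthetic data
X = np.random.randn(100, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.3
w, b = np.zeros(3), 0.0
for _ in range(500):
    w, b = lasso_step(X, y, w, b, alpha=0.01, eta=0.1)
```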
