|
| 1 | +import numpy as np |
| 2 | +from collections import Counter |
| 3 | + |
def meteor_score(reference, candidate, alpha=0.9, beta=3, gamma=0.5):
    """Compute a simplified METEOR score using exact unigram matches.

    The score is the parameterized harmonic mean of unigram precision
    and recall, scaled by a fragmentation penalty that grows with the
    number of contiguous match "chunks".

    Args:
        reference: Reference sentence (non-empty string).
        candidate: Candidate sentence (non-empty string).
        alpha: Recall/precision weighting in the harmonic mean.
        beta: Exponent of the fragmentation penalty.
        gamma: Scale of the fragmentation penalty.

    Returns:
        METEOR score rounded to 3 decimal places (0.0 when nothing matches).

    Raises:
        ValueError: If either input string is empty.
    """
    if not reference or not candidate:
        raise ValueError("Reference and candidate cannot be empty")

    # Case-insensitive whitespace tokenization.
    ref_words = reference.lower().split()
    cand_words = candidate.lower().split()

    # Clipped unigram overlap: each reference occurrence matches at most once.
    matches = sum((Counter(ref_words) & Counter(cand_words)).values())
    if matches == 0:
        return 0.0

    # matches > 0 implies both token lists are non-empty, so no zero division.
    precision = matches / len(cand_words)
    recall = matches / len(ref_words)
    fmean = (precision * recall) / (alpha * precision + (1 - alpha) * recall)

    # Index every reference word's positions (ascending order).
    slots = {}
    for idx, word in enumerate(ref_words):
        slots.setdefault(word, []).append(idx)

    # Greedy alignment: each candidate word claims the earliest unused
    # reference position for that word.
    aligned = []
    for word in cand_words:
        available = slots.get(word)
        if available:
            aligned.append(available.pop(0))

    # A new chunk starts wherever the aligned reference positions stop
    # being consecutive.
    chunks = 0
    prev = None
    for pos in aligned:
        if prev is None or pos != prev + 1:
            chunks += 1
        prev = pos

    # Fragmentation penalty: more chunks per match -> larger penalty.
    penalty = gamma * ((chunks / matches) ** beta)

    return round(fmean * (1 - penalty), 3)
| 59 | + |
def test_meteor_score():
    """Self-tests for meteor_score.

    Expected values are computed from the formulas in meteor_score
    (alpha=0.9, beta=3, gamma=0.5) and rounded to 3 decimal places.
    The previous expectations (1.0, 0.991, 0.667, 0.933) did not match
    what the implementation actually returns.
    """
    # Test Case 1: Identical translations.
    # Even a perfect match keeps the fragmentation penalty
    # 0.5 * (1/6) ** 3, so the score is just below 1.0.
    ref_test1 = "The cat sits on the mat"
    cand_test1 = "The cat sits on the mat"
    expected1 = 0.998
    assert meteor_score(ref_test1, cand_test1) == expected1, "Test Case 1 Failed"

    # Test Case 2: Similar translations.
    # 7 of 9 unigrams match in 2 chunks: fmean = 7/9,
    # penalty = 0.5 * (2/7) ** 3 -> score 0.769.
    ref_test2 = "The quick brown fox jumps over the lazy dog"
    cand_test2 = "A quick brown fox jumps over a lazy dog"
    expected2 = 0.769
    assert meteor_score(ref_test2, cand_test2) == expected2, "Test Case 2 Failed"

    # Test Case 3: Completely different translations.
    # Candidate deliberately shares no unigram with the reference
    # (the old candidate "Dogs run in the park" shared "the").
    ref_test3 = "The cat sits on the mat"
    cand_test3 = "Dogs run in a park"
    expected3 = 0.0
    assert meteor_score(ref_test3, cand_test3) == expected3, "Test Case 3 Failed"

    # Test Case 4: Partially matching translations.
    # "machine learning" matches: P = 2/5, R = 2/6, one chunk,
    # penalty = 0.5 * (1/2) ** 3 -> score 0.318.
    ref_test4 = "Machine learning is an exciting field"
    cand_test4 = "Machine learning algorithms are fascinating"
    expected4 = 0.318
    assert meteor_score(ref_test4, cand_test4) == expected4, "Test Case 4 Failed"

    # Test Case 5: Empty input must raise ValueError.
    try:
        meteor_score("", "Some text")
        assert False, "Test Case 5 Failed"
    except ValueError:
        pass

    # Test Case 6: Same words, reordered.
    # All 6 unigrams match in 3 chunks: penalty = 0.5 * (3/6) ** 3
    # -> round(0.9375, 3) = 0.938.
    ref_test6 = "The cat sits on the mat"
    cand_test6 = "The cat on the mat sits"
    expected6 = 0.938
    assert meteor_score(ref_test6, cand_test6) == expected6, "Test Case 6 Failed"
| 97 | + |
# Script entry point: run the self-tests and report success.
# Any failing assertion inside test_meteor_score aborts before the print.
if __name__ == "__main__":
    test_meteor_score()
    print("All Test Cases Passed!")