Adding initial template for KNN question

kartik-git · kartik-git · commit fd99cbcb99d7 · 2025-07-22T20:55:55.000-07:00
diff --git a/questions/162_implement-k-nearest-neighbors/description.md b/questions/162_implement-k-nearest-neighbors/description.md
@@ -0,0 +1,3 @@
+## Problem
+
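+Given a list of points in n-dimensional space, each represented as a tuple, and a query point, implement a function that finds the k points nearest to the query point by Euclidean distance. Return them ordered from nearest to farthest; when two points are equally distant, the one that appears earlier in the input comes first.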
diff --git a/questions/162_implement-k-nearest-neighbors/example.json b/questions/162_implement-k-nearest-neighbors/example.json
@@ -0,0 +1,5 @@
+{
+  "input": "points = [(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)], query_point = (2, 2), k = 3",
+  "output": "[(1, 2), (2, 3), (1, 1)]",
+  "reasoning": "Calculate Euclidean distances from (2, 2): (1, 2) → √((1-2)² + (2-2)²) = 1.0, (3, 4) → √((3-2)² + (4-2)²) = √5 ≈ 2.24, (1, 1) → √((1-2)² + (1-2)²) = √2 ≈ 1.41, (5, 6) → √((5-2)² + (6-2)²) = 5.0, (2, 3) → √((2-2)² + (3-2)²) = 1.0. The 3 closest distances are 1.0, 1.0, and √2, corresponding to points (1, 2), (2, 3), and (1, 1)."
+}
diff --git a/questions/162_implement-k-nearest-neighbors/learn.md b/questions/162_implement-k-nearest-neighbors/learn.md
@@ -0,0 +1,29 @@
+## Solution Explanation
+
+The key insight is to use numpy's vectorized operations to efficiently calculate distances between the query point and all data points simultaneously, then select the k smallest distances.
+
+## Algorithm Steps:
+
+1. Convert to numpy arrays - Transform the input tuples into numpy arrays for vectorized operations
+2. Calculate distances - Use broadcasting to compute Euclidean distance from query point to all points at once
+3. Find k nearest - Use np.argsort() to get indices of points sorted by distance, then take first k
+4. Return as tuples - Convert the selected points back to tuple format
+
+## Key Implementation Details:
+
+- Vectorized distance calculation: np.sqrt(np.sum((points_array - query_array) ** 2, axis=1)) computes all distances in one operation instead of looping
+- Broadcasting: numpy automatically stretches the query point (shape (d,)) across every row of the points array (shape (n, d)) during the subtraction
+- Sorting: np.argsort() returns the indices that would sort the distance array and leaves the array itself unchanged; slicing the first k indices selects the k nearest points. This is still a full O(n log n) sort
+- Dimension agnostic: The solution works for any number of dimensions (2D, 3D, etc.) without modification
+
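+- Time Complexity: O(n·d + n log n) for n points in d dimensions: O(n·d) to compute the distances plus O(n log n) for the sort, which dominates when d is small
+- Space Complexity: O(n·d) for the array copy of the points and the intermediate difference array, plus O(n) for the distance array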
+
+## Edge Case Handling:
+
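+- Empty points list or k <= 0 returns an empty list
+- k larger than available points returns all points, sorted by distance
+- Single point datasets work correctly
+- Points at equal distance keep their input order only when np.argsort() is called with kind="stable"; the default sort kind (quicksort) does not guarantee the order of ties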
+
+The numpy approach is typically much faster than a pure-Python loop, especially for large datasets, because the distance arithmetic runs in numpy's compiled C loops rather than in the Python interpreter.
diff --git a/questions/162_implement-k-nearest-neighbors/meta.json b/questions/162_implement-k-nearest-neighbors/meta.json
@@ -0,0 +1,15 @@
+{
+  "id": "162",
+  "title": "Implement K-Nearest Neighbors",
+  "difficulty": "medium",
+  "category": "Machine Learning",
+  "video": "",
+  "likes": "0",
+  "dislikes": "0",
+  "contributor": [
+      {
+        "profile_link": "https://github.com/kartik-git",
+        "name": "kartik-git"
+      }
+  ]
+}
diff --git a/questions/162_implement-k-nearest-neighbors/solution.py b/questions/162_implement-k-nearest-neighbors/solution.py
@@ -0,0 +1,33 @@
+import numpy as np
+
+
+def k_nearest_neighbors(points, query_point, k):
+    """
+    Find k nearest neighbors to a query point
+    
+    Args:
+        points: List of equal-length tuples, one per point, e.g. [(x1, y1), (x2, y2), ...] in 2D
+        query_point: Tuple with the same number of coordinates as each point, e.g. (x, y) in 2D
+        k: Number of nearest neighbors to return
+    
+    Returns:
+        List of k nearest neighbor points as tuples
+    """
+    if not points or k <= 0:
+        return []
+    
+    if k > len(points):
+        k = len(points)
+    
+    # Convert to numpy arrays for vectorized operations
+    points_array = np.array(points)
+    query_array = np.array(query_point)
+    
+    # Calculate Euclidean distances using broadcasting
+    distances = np.sqrt(np.sum((points_array - query_array) ** 2, axis=1))
+    
+    # Get indices of the k smallest distances; kind="stable" keeps ties in input order
+    k_nearest_indices = np.argsort(distances, kind="stable")[:k]
+    
+    # Index the original list so the result holds plain Python values, not numpy scalars
+    return [tuple(points[i]) for i in k_nearest_indices]
diff --git a/questions/162_implement-k-nearest-neighbors/starter_code.py b/questions/162_implement-k-nearest-neighbors/starter_code.py
@@ -0,0 +1,13 @@
+def k_nearest_neighbors(points, query_point, k):
+    """
+    Find k nearest neighbors to a query point
+    
+    Args:
+        points: List of equal-length tuples, one per point, e.g. [(x1, y1), (x2, y2), ...] in 2D
+        query_point: Tuple with the same number of coordinates as each point, e.g. (x, y) in 2D
+        k: Number of nearest neighbors to return
+    
+    Returns:
+        List of k nearest neighbor points as tuples
+    """
+    pass
diff --git a/questions/162_implement-k-nearest-neighbors/tests.json b/questions/162_implement-k-nearest-neighbors/tests.json
@@ -0,0 +1,14 @@
+[
+  {
+      "test": "print(k_nearest_neighbors([(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)], (2, 2), 3))",
+      "expected_output": "[(1, 2), (2, 3), (1, 1)]"
+  },
+  {
+      "test": "print(k_nearest_neighbors([(0, 0), (1, 1), (2, 2), (3, 3)], (1.5, 1.5), 2))",
+      "expected_output": "[(1, 1), (2, 2)]"
+  },
+  {
+      "test": "print(k_nearest_neighbors([(1, 1), (2, 2), (3, 3)], (0, 0), 1))",
+      "expected_output": "[(1, 1)]"
+  }
+]
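
The approach described in learn.md can be sketched as a standalone script and checked against the three cases in tests.json. This is a minimal sketch, not the committed solution: the name `knn_sketch` is hypothetical, and it assumes that ties should keep input order (hence `kind="stable"`) and that the result should contain the caller's original values rather than numpy scalars, so that the printed output matches the expected strings.

```python
import numpy as np


def knn_sketch(points, query_point, k):
    """Return the k points closest to query_point, nearest first.

    Hypothetical helper for illustration; ties keep their input order.
    """
    if not points or k <= 0:
        return []
    k = min(k, len(points))

    points_array = np.asarray(points, dtype=float)      # shape (n, d)
    query_array = np.asarray(query_point, dtype=float)  # shape (d,)

    # Broadcasting subtracts the query from every row in one operation.
    distances = np.sqrt(np.sum((points_array - query_array) ** 2, axis=1))

    # A stable sort keeps equally distant points in input order.
    order = np.argsort(distances, kind="stable")[:k]

    # Index the original list so the output holds plain Python values.
    return [tuple(points[i]) for i in order]


if __name__ == "__main__":
    print(knn_sketch([(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)], (2, 2), 3))
    print(knn_sketch([(0, 0), (1, 1), (2, 2), (3, 3)], (1.5, 1.5), 2))
    print(knn_sketch([(1, 1), (2, 2), (3, 3)], (0, 0), 1))
```

In the first and second cases two points are exactly equidistant from the query, so the expected output depends on the tie-breaking rule; with a stable sort the earlier input point is listed first.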
