Commit fd99cbc

Adding initial template for KNN question
1 parent 306fe25 commit fd99cbc

7 files changed: +109 -0 lines changed
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
## Problem

Given a list of points in n-dimensional space represented as tuples and a query point, implement a function to find the k nearest neighbors to the query point using Euclidean distance.
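
For reference, the Euclidean distance named in the problem can be written as follows for a point $p$ and a query $q$ in $n$ dimensions. The symbols $p$, $q$, and $i$ are notation assumed here for illustration, not taken from the problem statement.

$$
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
$$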
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
    "input": "points = [(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)], query_point = (2, 2), k = 3",
    "output": "[(1, 2), (2, 3), (1, 1)]",
    "reasoning": "Calculate Euclidean distances from (2, 2): (1, 2) → √((1-2)² + (2-2)²) = 1.0, (3, 4) → √((3-2)² + (4-2)²) = √5 ≈ 2.24, (1, 1) → √((1-2)² + (1-2)²) = √2 ≈ 1.41, (5, 6) → √((5-2)² + (6-2)²) = 5.0, (2, 3) → √((2-2)² + (3-2)²) = 1.0. The 3 closest distances are 1.0, 1.0, and √2, corresponding to points (1, 2), (2, 3), and (1, 1)."
}
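
The arithmetic in the reasoning above can be checked with the standard library alone. This is a minimal sketch, assuming Python 3.8+ for `math.dist`; the variable names `distances` and `nearest` are illustrative and not part of the commit.

```python
import math

points = [(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)]
query_point = (2, 2)

# math.dist returns the Euclidean distance between two equal-length sequences.
distances = {p: math.dist(p, query_point) for p in points}

for p, d in distances.items():
    print(p, round(d, 2))

# sorted() is stable, so points at equal distance keep their input order.
nearest = sorted(points, key=lambda p: math.dist(p, query_point))[:3]
print(nearest)  # [(1, 2), (2, 3), (1, 1)]
```

Note that (1, 2) and (2, 3) are both at distance 1.0, so the order of the first two results depends on how ties are broken; here they appear in input order.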
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
## Solution Explanation

The key insight is to use numpy's vectorized operations to efficiently calculate distances between the query point and all data points simultaneously, then select the k smallest distances.

## Algorithm Steps:

1. Convert to numpy arrays - Transform the input tuples into numpy arrays for vectorized operations
2. Calculate distances - Use broadcasting to compute the Euclidean distance from the query point to all points at once
3. Find k nearest - Use `np.argsort()` to get the indices of the points sorted by distance, then take the first k
4. Return as tuples - Convert the selected points back to tuple format

## Key Implementation Details:

- Vectorized distance calculation: `np.sqrt(np.sum((points_array - query_array) ** 2, axis=1))` computes all distances in one operation instead of looping in Python
- Broadcasting: numpy automatically stretches the query point across every row of the points array during the subtraction
- Sorting: `np.argsort()` returns the indices that would sort the distances and leaves the distance array itself unchanged; slicing the first k indices selects the k nearest points
- Dimension agnostic: The solution works for any number of dimensions (2D, 3D, etc.) without modification
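
To make the shapes involved in the details above concrete, here is a small sketch using the example points from this commit. The intermediate name `diff` and the comparison against `np.linalg.norm` are additions for illustration only.

```python
import numpy as np

points_array = np.array([(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)])  # shape (5, 2)
query_array = np.array((2, 2))                                      # shape (2,)

# Broadcasting stretches the (2,) query across all 5 rows.
diff = points_array - query_array                # shape (5, 2)
distances = np.sqrt(np.sum(diff ** 2, axis=1))   # shape (5,)

# np.linalg.norm over axis=1 computes the same row-wise Euclidean norm.
assert np.allclose(distances, np.linalg.norm(diff, axis=1))

# argsort returns an index order; kind="stable" keeps ties in input order.
order = np.argsort(distances, kind="stable")
print(diff.shape, distances.shape)  # (5, 2) (5,)
print(order.tolist())               # [0, 4, 2, 1, 3]
```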

Time Complexity: O(n·d + n log n) for n points in d dimensions; for fixed d, the O(n log n) sorting step dominates
Space Complexity: O(n·d) for the array copy of the points and the intermediate difference array, plus O(n) for the distance array
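
When k is much smaller than n, a full sort does more work than needed. The following is a sketch of an alternative, not the method used in this commit: it assumes `np.argpartition` to select the k smallest distances in linear average time, then sorts only those k. The function name `k_nearest_partition` is hypothetical.

```python
import numpy as np


def k_nearest_partition(points, query_point, k):
    """Hypothetical variant: select the k nearest with np.argpartition."""
    if not points or k <= 0:
        return []
    k = min(k, len(points))

    pts = np.asarray(points, dtype=float)
    query = np.asarray(query_point, dtype=float)
    distances = np.linalg.norm(pts - query, axis=1)

    # Move the indices of the k smallest distances to the front (unordered).
    candidates = np.sort(np.argpartition(distances, k - 1)[:k])

    # Order only the k candidates by distance: O(k log k).
    ordered = candidates[np.argsort(distances[candidates], kind="stable")]

    # Index the original list so the result holds the caller's own tuples.
    return [tuple(points[i]) for i in ordered]


print(k_nearest_partition([(1, 1), (2, 2), (3, 3)], (0, 0), 1))  # [(1, 1)]
```

One caveat with this design: when several points tie at the k-th distance, which of them `np.argpartition` selects is not specified, so results can differ from the full-sort approach in that case.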
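
## Edge Case Handling:

- An empty points list, or k <= 0, returns an empty result
- k larger than the number of available points returns all points
- Single-point datasets work correctly
- Points at the same distance keep their input order only when the sort is stable; `np.argsort()` defaults to quicksort, which does not guarantee this, so pass `kind="stable"` if tie order matters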
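
Tie handling deserves a closer look, because NumPy documents quicksort as the default `kind` for `np.argsort`, and that default is not guaranteed to be stable. A minimal sketch, assuming tie order matters to the caller, with made-up distance values:

```python
import numpy as np

distances = np.array([1.0, 2.0, 1.0, 0.5, 1.0])

# kind="stable" guarantees that equal distances keep their original index order.
order = np.argsort(distances, kind="stable")
print(order.tolist())  # [3, 0, 2, 4, 1]

# Slicing past the end is safe, so a k larger than n yields every index.
print(order[:10].tolist())  # [3, 0, 2, 4, 1]

# An empty distance array sorts to an empty index array.
empty_order = np.argsort(np.array([]), kind="stable")
print(empty_order.tolist())  # []
```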

The numpy approach is typically much faster than a pure-Python loop on large datasets, because the arithmetic runs in numpy's compiled C routines rather than in the interpreter.
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
{
    "id": "162",
    "title": "Implement K-Nearest Neighbors",
    "difficulty": "medium",
    "category": "Machine Learning",
    "video": "",
    "likes": "0",
    "dislikes": "0",
    "contributor": [
        {
            "profile_link": "https://github.com/kartik-git",
            "name": "kartik-git"
        }
    ]
}
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
import numpy as np


def k_nearest_neighbors(points, query_point, k):
    """
    Find k nearest neighbors to a query point

    Args:
        points: List of tuples representing points [(x1, y1), (x2, y2), ...]
        query_point: Tuple representing query point (x, y)
        k: Number of nearest neighbors to return

    Returns:
        List of k nearest neighbor points as tuples
    """
    if not points or k <= 0:
        return []

    if k > len(points):
        k = len(points)

    # Convert to numpy arrays for vectorized operations
    points_array = np.array(points)
    query_array = np.array(query_point)

    # Calculate Euclidean distances using broadcasting
    distances = np.sqrt(np.sum((points_array - query_array) ** 2, axis=1))

    # Get indices of the k smallest distances; a stable sort keeps
    # equidistant points in input order
    k_nearest_indices = np.argsort(distances, kind="stable")[:k]

    # Return the original tuples so the output holds plain Python numbers
    # rather than numpy scalars
    return [tuple(points[i]) for i in k_nearest_indices]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
def k_nearest_neighbors(points, query_point, k):
    """
    Find k nearest neighbors to a query point

    Args:
        points: List of tuples representing points [(x1, y1), (x2, y2), ...]
        query_point: Tuple representing query point (x, y)
        k: Number of nearest neighbors to return

    Returns:
        List of k nearest neighbor points as tuples
    """
    pass
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
[
    {
        "test": "print(k_nearest_neighbors([(1, 2), (3, 4), (1, 1), (5, 6), (2, 3)], (2, 2), 3))",
        "expected_output": "[(1, 2), (2, 3), (1, 1)]"
    },
    {
        "test": "print(k_nearest_neighbors([(0, 0), (1, 1), (2, 2), (3, 3)], (1.5, 1.5), 2))",
        "expected_output": "[(1, 1), (2, 2)]"
    },
    {
        "test": "print(k_nearest_neighbors([(1, 1), (2, 2), (3, 3)], (0, 0), 1))",
        "expected_output": "[(1, 1)]"
    }
]
