diff --git a/build/1.json b/build/1.json index ad432627..09ddad8e 100644 --- a/build/1.json +++ b/build/1.json @@ -37,16 +37,20 @@ "expected_output": "[5.5, 10.0]" } ], - "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef matrix_dot_vector_tg(a, b) -> Tensor:\n \"\"\"\n Compute the product of matrix `a` and vector `b` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a 1-D Tensor of length m, or Tensor(-1) if dimensions mismatch.\n \"\"\"\n # Dimension mismatch check\n if len(a[0]) != len(b):\n return Tensor(-1)\n # Convert to Tensor\n a_t = Tensor(a)\n b_t = Tensor(b)\n # Your implementation here\n pass", - "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef matrix_dot_vector_tg(a, b) -> Tensor:\n \"\"\"\n Compute the product of matrix `a` and vector `b` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a 1-D Tensor of length m, or Tensor(-1) if dimensions mismatch.\n \"\"\"\n if len(a[0]) != len(b):\n return Tensor(-1)\n a_t = Tensor(a)\n b_t = Tensor(b)\n return a_t.matmul(b_t)", + "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef matrix_dot_vector_tg(a:Tensor, b:Tensor) -> Tensor:\n \"\"\"\n Compute the product of matrix `a` and vector `b` using tinygrad.\n Will be tinygrad Tensors.\n Returns a 1-D Tensor of length m, or Tensor(-1) if dimensions mismatch.\n \"\"\"\n pass", + "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef matrix_dot_vector_tg(a: Tensor, b: Tensor) -> Tensor:\n \"\"\"\n Compute the product of matrix `a` and vector `b` using tinygrad.\n Inputs will be tinygrad Tensors.\n Returns a 1-D Tensor of length m, or Tensor(-1) if dimensions mismatch.\n \"\"\"\n if len(a[0]) != len(b):\n return Tensor(-1)\n return a @ b", "tinygrad_test_cases": [ { - "test": "from tinygrad.tensor import Tensor\nres = matrix_dot_vector_tg(\n [[1,2,3],[2,4,5],[6,8,9]],\n [1,2,3]\n)\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = matrix_dot_vector_tg(\n Tensor([[1,2,3],[2,4,5],[6,8,9]]),\n Tensor([1,2,3])\n)\nprint(res.numpy().tolist())", "expected_output": "[14.0, 25.0, 49.0]" }, { - "test": "from tinygrad.tensor import Tensor\nres = matrix_dot_vector_tg(\n [[1,2,3],[2,4,5]],\n [1,2]\n)\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = matrix_dot_vector_tg(\n Tensor([[1,2,3],[2,4,5]]),\n Tensor([1,2])\n)\nprint(res.numpy().tolist())", "expected_output": "-1" + }, + { + "test": "from tinygrad.tensor import Tensor\nres = matrix_dot_vector_tg(\n Tensor([[1, 2], [2, 4]]),\n Tensor([1, 2])\n)\nprint(res.numpy().tolist())", + "expected_output": "[5, 10]" } ], "pytorch_starter_code": "import torch\n\ndef matrix_dot_vector(a, b) -> torch.Tensor:\n \"\"\"\n Compute the product of matrix `a` and vector `b` using PyTorch.\n Inputs can be Python lists, NumPy arrays, or torch Tensors.\n Returns a 1-D tensor of length m, or tensor(-1) if dimensions mismatch.\n \"\"\"\n a_t = torch.as_tensor(a, dtype=torch.float)\n b_t = torch.as_tensor(b, dtype=torch.float)\n # Dimension mismatch check\n if a_t.size(1) != b_t.size(0):\n return torch.tensor(-1)\n # Your implementation here\n pass", diff --git a/build/174.json b/build/174.json new file mode 100644 index 00000000..7e51f036 --- /dev/null +++ b/build/174.json @@ -0,0 +1,51 @@ +{ + "id": "174", + "title": "Train a Simple GAN on 1D Gaussian Data", + "difficulty": "hard", + "category": "Deep Learning", + "video": "", + "likes": "0", + "dislikes": "0", + "contributor": [ + { + "profile_link": "https://github.com/moe18", + "name": "moe" + } + ], + "pytorch_difficulty": "medium", + "description": "In this task, you will train a Generative Adversarial Network (GAN) to learn a one-dimensional Gaussian distribution. The GAN consists of a generator that produces samples from latent noise and a discriminator that estimates the probability that a given sample is real. Both networks should have one hidden layer with ReLU activation in the hidden layer. The generator’s output layer is linear, while the discriminator's output layer uses a sigmoid activation.\n\nYou must train the GAN using the standard non-saturating GAN loss for the generator and binary cross-entropy loss for the discriminator. In the NumPy version, parameters should be updated using vanilla gradient descent. In the PyTorch version, parameters should be updated using stochastic gradient descent (SGD) with the specified learning rate. The training loop should alternate between updating the discriminator and the generator each iteration.\n\nYour function must return the trained generator forward function `gen_forward(z)`, which produces generated samples given latent noise.", + "learn_section": "## Understanding GANs for 1D Gaussian Data\nA Generative Adversarial Network (GAN) consists of two neural networks - a **Generator** $G_\\theta$ and a **Discriminator** $D_\\phi$ - trained in a minimax game.\n\n### 1. The Roles\n- **Generator** $G_\\theta(z)$: Takes a latent noise vector $z \\sim \\mathcal{N}(0, I)$ and outputs a sample intended to resemble the real data.\n- **Discriminator** $D_\\phi(x)$: Outputs a probability $p \\in (0, 1)$ that the input $x$ came from the real data distribution rather than the generator.\n\n### 2. The Objective\nThe classical GAN objective is:\n$$\n\\min_{\\theta} \\; \\max_{\\phi} \\; \\mathbb{E}_{x \\sim p_{\\text{data}}} [\\log D_\\phi(x)] + \\mathbb{E}_{z \\sim p(z)} [\\log (1 - D_\\phi(G_\\theta(z)))]\n$$\nHere:\n- $p_{\\text{data}}$ is the real data distribution.\n- $p(z)$ is the prior distribution for the latent noise (often standard normal).\n\n### 3. Practical Losses\nIn implementation, we minimize:\n- **Discriminator loss**:\n$$\n\\mathcal{L}_D = - \\left( \\frac{1}{m} \\sum_{i=1}^m \\log D(x^{(i)}_{\\text{real}}) + \\log(1 - D(x^{(i)}_{\\text{fake}})) \\right)\n$$\n- **Generator loss** (non-saturating form):\n$$\n\\mathcal{L}_G = - \\frac{1}{m} \\sum_{i=1}^m \\log D(G(z^{(i)}))\n$$\n\n### 4. Forward/Backward Flow\n1. **Discriminator step**: Real samples $x_{\\text{real}}$ and fake samples $x_{\\text{fake}} = G(z)$ are passed through $D$, and $\\mathcal{L}_D$ is minimized w.r.t. $\\phi$.\n2. **Generator step**: Fresh $z$ is sampled, $x_{\\text{fake}} = G(z)$ is passed through $D$, and $\\mathcal{L}_G$ is minimized w.r.t. $\\theta$ while keeping $\\phi$ fixed.\n\n### 5. Architecture for This Task\n- **Generator**: Fully connected layer ($\\mathbb{R}^{\\text{latent\\_dim}} \\to \\mathbb{R}^{\\text{hidden\\_dim}}$) -> ReLU -> Fully connected layer ($\\mathbb{R}^{\\text{hidden\\_dim}} \\to \\mathbb{R}^1$).\n- **Discriminator**: Fully connected layer ($\\mathbb{R}^1 \\to \\mathbb{R}^{\\text{hidden\\_dim}}$) → ReLU → Fully connected layer ($\\mathbb{R}^{\\text{hidden\\_dim}} \\to \\mathbb{R}^1$) → Sigmoid.\n\n### 6. Numerical Tips\n- Initialize weights with a small Gaussian ($\\mathcal{N}(0, 0.01)$).\n- Add $10^{-8}$ to logs for numerical stability.\n- Use a consistent batch size $m$ for both real and fake samples.\n- Always sample fresh noise for the generator on each update.\n\n**Your Task**: Implement the training loop to learn the parameters $\\theta$ and $\\phi$, and return the trained `gen_forward(z)` function. The evaluation (mean/std of generated samples) will be handled in the test cases.", + "starter_code": "import numpy as np\n\ndef train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):\n \"\"\"\n Train a simple GAN to learn a 1D Gaussian distribution.\n\n Args:\n mean_real: Mean of the target Gaussian\n std_real: Std of the target Gaussian\n latent_dim: Dimension of the noise input to the generator\n hidden_dim: Hidden layer size for both networks\n learning_rate: Learning rate for gradient descent\n epochs: Number of training epochs\n batch_size: Training batch size\n seed: Random seed for reproducibility\n\n Returns:\n gen_forward: A function that takes z and returns generated samples\n \"\"\"\n # Your code here\n pass", + "solution": "import numpy as np\n\ndef relu(x):\n return np.maximum(0, x)\n\ndef sigmoid(x):\n return 1 / (1 + np.exp(-x))\n\ndef train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):\n np.random.seed(seed)\n data_dim = 1\n\n # Initialize generator weights\n w1_g = np.random.normal(0, 0.01, (latent_dim, hidden_dim))\n b1_g = np.zeros(hidden_dim)\n w2_g = np.random.normal(0, 0.01, (hidden_dim, data_dim))\n b2_g = np.zeros(data_dim)\n\n # Initialize discriminator weights\n w1_d = np.random.normal(0, 0.01, (data_dim, hidden_dim))\n b1_d = np.zeros(hidden_dim)\n w2_d = np.random.normal(0, 0.01, (hidden_dim, 1))\n b2_d = np.zeros(1)\n\n def disc_forward(x):\n h1 = np.dot(x, w1_d) + b1_d\n a1 = relu(h1)\n logit = np.dot(a1, w2_d) + b2_d\n p = sigmoid(logit)\n return p, logit, a1, h1\n\n def gen_forward(z):\n h1 = np.dot(z, w1_g) + b1_g\n a1 = relu(h1)\n x_gen = np.dot(a1, w2_g) + b2_g\n return x_gen, a1, h1\n\n for epoch in range(epochs):\n # Sample real data\n x_real = np.random.normal(mean_real, std_real, batch_size)[:, None]\n z = np.random.normal(0, 1, (batch_size, latent_dim))\n x_fake, _, _ = gen_forward(z)\n\n # Discriminator forward\n p_real, _, a1_real, h1_real = disc_forward(x_real)\n p_fake, _, a1_fake, h1_fake = disc_forward(x_fake)\n\n # Discriminator gradients\n grad_logit_real = - (1 - p_real) / batch_size\n grad_a1_real = grad_logit_real @ w2_d.T\n grad_h1_real = grad_a1_real * (h1_real > 0)\n grad_w1_d_real = x_real.T @ grad_h1_real\n grad_b1_d_real = np.sum(grad_h1_real, axis=0)\n grad_w2_d_real = a1_real.T @ grad_logit_real\n grad_b2_d_real = np.sum(grad_logit_real, axis=0)\n\n grad_logit_fake = p_fake / batch_size\n grad_a1_fake = grad_logit_fake @ w2_d.T\n grad_h1_fake = grad_a1_fake * (h1_fake > 0)\n grad_w1_d_fake = x_fake.T @ grad_h1_fake\n grad_b1_d_fake = np.sum(grad_h1_fake, axis=0)\n grad_w2_d_fake = a1_fake.T @ grad_logit_fake\n grad_b2_d_fake = np.sum(grad_logit_fake, axis=0)\n\n grad_w1_d = grad_w1_d_real + grad_w1_d_fake\n grad_b1_d = grad_b1_d_real + grad_b1_d_fake\n grad_w2_d = grad_w2_d_real + grad_w2_d_fake\n grad_b2_d = grad_b2_d_real + grad_b2_d_fake\n\n w1_d -= learning_rate * grad_w1_d\n b1_d -= learning_rate * grad_b1_d\n w2_d -= learning_rate * grad_w2_d\n b2_d -= learning_rate * grad_b2_d\n\n # Generator update\n z = np.random.normal(0, 1, (batch_size, latent_dim))\n x_fake, a1_g, h1_g = gen_forward(z)\n p_fake, _, a1_d, h1_d = disc_forward(x_fake)\n\n grad_logit_fake = - (1 - p_fake) / batch_size\n grad_a1_d = grad_logit_fake @ w2_d.T\n grad_h1_d = grad_a1_d * (h1_d > 0)\n grad_x_fake = grad_h1_d @ w1_d.T\n\n grad_a1_g = grad_x_fake @ w2_g.T\n grad_h1_g = grad_a1_g * (h1_g > 0)\n grad_w1_g = z.T @ grad_h1_g\n grad_b1_g = np.sum(grad_h1_g, axis=0)\n grad_w2_g = a1_g.T @ grad_x_fake\n grad_b2_g = np.sum(grad_x_fake, axis=0)\n\n w1_g -= learning_rate * grad_w1_g\n b1_g -= learning_rate * grad_b1_g\n w2_g -= learning_rate * grad_w2_g\n b2_g -= learning_rate * grad_b2_g\n\n return gen_forward", + "example": { + "input": "gen_forward = train_gan(4.0, 1.25, epochs=1000, seed=42)\nz = np.random.normal(0, 1, (500, 1))\nx_gen, _, _ = gen_forward(z)\n(round(np.mean(x_gen), 4), round(np.std(x_gen), 4))", + "output": "(0.0004, 0.0002)", + "reasoning": "The test cases call `gen_forward` after training, sample 500 points, and then compute the mean and std." + }, + "test_cases": [ + { + "test": "gen_forward = train_gan(4.0, 1.25, epochs=1000, seed=42)\nz = np.random.normal(0, 1, (500, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))", + "expected_output": "(0.0004, 0.0002)" + }, + { + "test": "gen_forward = train_gan(0.0, 1.0, epochs=500, seed=0)\nz = np.random.normal(0, 1, (300, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))", + "expected_output": "(-0.0002, 0.0002)" + }, + { + "test": "gen_forward = train_gan(-2.0, 0.5, epochs=1500, seed=123)\nz = np.random.normal(0, 1, (400, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))", + "expected_output": "(-0.0044, 0.0002)" + } + ], + "pytorch_starter_code": "import torch\nimport torch.nn as nn\nimport torch.optim as optim\n\ndef train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):\n torch.manual_seed(seed)\n # Your PyTorch implementation here\n pass", + "pytorch_solution": "import torch\nimport torch.nn as nn\nimport torch.optim as optim\n\ndef train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):\n torch.manual_seed(seed)\n\n class Generator(nn.Module):\n def __init__(self):\n super().__init__()\n self.net = nn.Sequential(\n nn.Linear(latent_dim, hidden_dim),\n nn.ReLU(),\n nn.Linear(hidden_dim, 1)\n )\n def forward(self, z):\n return self.net(z)\n\n class Discriminator(nn.Module):\n def __init__(self):\n super().__init__()\n self.net = nn.Sequential(\n nn.Linear(1, hidden_dim),\n nn.ReLU(),\n nn.Linear(hidden_dim, 1),\n nn.Sigmoid()\n )\n def forward(self, x):\n return self.net(x)\n\n G = Generator()\n D = Discriminator()\n\n # Use SGD as requested\n opt_G = optim.SGD(G.parameters(), lr=learning_rate)\n opt_D = optim.SGD(D.parameters(), lr=learning_rate)\n criterion = nn.BCELoss()\n\n for _ in range(epochs):\n # Real and fake batches\n real_data = torch.normal(mean_real, std_real, size=(batch_size, 1))\n noise = torch.randn(batch_size, latent_dim)\n fake_data = G(noise)\n\n # ----- Discriminator step -----\n opt_D.zero_grad()\n pred_real = D(real_data)\n pred_fake = D(fake_data.detach())\n loss_real = criterion(pred_real, torch.ones_like(pred_real))\n loss_fake = criterion(pred_fake, torch.zeros_like(pred_fake))\n loss_D = loss_real + loss_fake\n loss_D.backward()\n opt_D.step()\n\n # ----- Generator step -----\n opt_G.zero_grad()\n pred_fake = D(fake_data)\n # non-saturating generator loss: maximize log D(G(z)) -> minimize -log D(G(z))\n loss_G = criterion(pred_fake, torch.ones_like(pred_fake))\n loss_G.backward()\n opt_G.step()\n\n return G.forward", + "pytorch_test_cases": [ + { + "test": "gen_forward = train_gan(4.0, 1.25, epochs=100, seed=42)\nz = torch.randn(500, 1)\nx_gen = gen_forward(z)\nprint((round(x_gen.mean().item(), 4), round(x_gen.std().item(), 4)))", + "expected_output": "(0.4725, 0.3563)" + }, + { + "test": "gen_forward = train_gan(0.0, 1.0, epochs=50, seed=0)\nz = torch.randn(300, 1)\nx_gen = gen_forward(z)\nprint((round(x_gen.mean().item(), 4), round(x_gen.std().item(), 4)))", + "expected_output": "(0.0644, 0.244)" + } + ] +} \ No newline at end of file diff --git a/build/2.json b/build/2.json index 8faf4c09..e42cf6db 100644 --- a/build/2.json +++ b/build/2.json @@ -33,15 +33,15 @@ "expected_output": "[[1, 4], [2, 5], [3, 6]]" } ], - "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef transpose_matrix_tg(a) -> Tensor:\n \"\"\"\n Transpose a 2D matrix `a` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a transposed Tensor.\n \"\"\"\n # Convert to Tensor\n a_t = Tensor(a)\n # Your implementation here\n pass", - "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef transpose_matrix_tg(a) -> Tensor:\n \"\"\"\n Transpose a 2D matrix `a` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a transposed Tensor.\n \"\"\"\n a_t = Tensor(a)\n return a_t.transpose(0,1)", + "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef transpose_matrix_tg(a:Tensor) -> Tensor:\n \"\"\"\n Transpose a 2D matrix `a` using tinygrad.\n Inputs are tinygrad Tensors.\n Returns a transposed Tensor.\n \"\"\"\n pass", + "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef transpose_matrix_tg(a) -> Tensor:\n \"\"\"\n Transpose a 2D matrix `a` using tinygrad.\n Inputs are tinygrad Tensors.\n Returns a transposed Tensor.\n \"\"\"\n return a.T", "tinygrad_test_cases": [ { - "test": "from tinygrad.tensor import Tensor\nres = transpose_matrix_tg([[1,2,3],[4,5,6]])\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = transpose_matrix_tg(Tensor([[1,2,3],[4,5,6]]))\nprint(res.numpy().tolist())", "expected_output": "[[1, 4], [2, 5], [3, 6]]" }, { - "test": "from tinygrad.tensor import Tensor\nres = transpose_matrix_tg([[1,2],[3,4]])\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = transpose_matrix_tg(Tensor([[1,2],[3,4]]))\nprint(res.numpy().tolist())", "expected_output": "[[1, 3], [2, 4]]" } ], diff --git a/build/3.json b/build/3.json index 5aecd531..366f7b2a 100644 --- a/build/3.json +++ b/build/3.json @@ -46,15 +46,15 @@ "expected_output": "[[1, 2, 3, 4], [5, 6, 7, 8]]" } ], - "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef reshape_matrix_tg(a, new_shape) -> Tensor:\n \"\"\"\n Reshape a 2D matrix `a` to shape `new_shape` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a Tensor of shape `new_shape`, or an empty Tensor on mismatch.\n \"\"\"\n # Dimension check\n if len(a) * len(a[0]) != new_shape[0] * new_shape[1]:\n return Tensor([])\n # Convert to Tensor and reshape\n a_t = Tensor(a)\n # Your implementation here\n pass", - "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef reshape_matrix_tg(a, new_shape) -> Tensor:\n \"\"\"\n Reshape a 2D matrix `a` to shape `new_shape` using tinygrad.\n Inputs can be Python lists, NumPy arrays, or tinygrad Tensors.\n Returns a Tensor of shape `new_shape`, or an empty Tensor on mismatch.\n \"\"\"\n # Dimension check\n if len(a) * len(a[0]) != new_shape[0] * new_shape[1]:\n return Tensor([])\n a_t = Tensor(a)\n return a_t.reshape(new_shape)", + "tinygrad_starter_code": "from tinygrad.tensor import Tensor\n\ndef reshape_matrix_tg(a:Tensor, new_shape:tuple) -> Tensor:\n \"\"\"\n Reshape a 2D matrix `a` to shape `new_shape` using tinygrad.\n Inputs are tinygrad Tensors.\n Returns a Tensor of shape `new_shape`, or an empty Tensor on mismatch.\n \"\"\"\n pass", + "tinygrad_solution": "from tinygrad.tensor import Tensor\n\ndef reshape_matrix_tg(a, new_shape) -> Tensor:\n \"\"\"\n Reshape a 2D matrix `a` to shape `new_shape` using tinygrad.\n Inputs are tinygrad Tensors.\n Returns a Tensor of shape `new_shape`, or an empty Tensor on mismatch.\n \"\"\"\n # Dimension check\n if len(a) * len(a[0]) != new_shape[0] * new_shape[1]:\n return Tensor([])\n return a.reshape(new_shape)", "tinygrad_test_cases": [ { - "test": "from tinygrad.tensor import Tensor\nres = reshape_matrix_tg(\n [[1,2,3],[4,5,6]],\n (3, 2)\n)\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = reshape_matrix_tg(\n Tensor([[1,2,3],[4,5,6]]),\n (3, 2)\n)\nprint(res.numpy().tolist())", "expected_output": "[[1, 2], [3, 4], [5, 6]]" }, { - "test": "from tinygrad.tensor import Tensor\nres = reshape_matrix_tg(\n [[1,2],[3,4]],\n (3, 2)\n)\nprint(res.numpy().tolist())", + "test": "from tinygrad.tensor import Tensor\nres = reshape_matrix_tg(\n Tensor([[1,2],[3,4]]),\n (3, 2)\n)\nprint(res.numpy().tolist())", "expected_output": "[]" } ], diff --git a/questions/184_empirical_probability_mass_function_(pmf)/description.md b/questions/184_empirical_probability_mass_function_(pmf)/description.md new file mode 100644 index 00000000..9a49f526 --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/description.md @@ -0,0 +1,3 @@ +## Problem + +Given a list of integer samples drawn from a discrete distribution, implement a function to compute the empirical Probability Mass Function (PMF). The function should return a list of `(value, probability)` pairs sorted by the value in ascending order. If the input is empty, return an empty list. diff --git a/questions/184_empirical_probability_mass_function_(pmf)/example.json b/questions/184_empirical_probability_mass_function_(pmf)/example.json new file mode 100644 index 00000000..a00d5e97 --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/example.json @@ -0,0 +1,5 @@ +{ + "input": "samples = [1, 2, 2, 3, 3, 3]", + "output": "[(1, 0.16666666666666666), (2, 0.3333333333333333), (3, 0.5)]", + "reasoning": "Counts are {1:1, 2:2, 3:3} over 6 samples, so probabilities are 1/6, 2/6, and 3/6 respectively, returned sorted by value." +} diff --git a/questions/184_empirical_probability_mass_function_(pmf)/learn.md b/questions/184_empirical_probability_mass_function_(pmf)/learn.md new file mode 100644 index 00000000..f2017eac --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/learn.md @@ -0,0 +1,24 @@ + +# Learn Section + +# Probability Mass Function (PMF) — Simple Explanation + +A **probability mass function (PMF)** describes how probabilities are assigned to the possible outcomes of a **discrete random variable**. + +- It tells you the chance of each specific outcome. +- Each probability is non-negative. +- The total of all probabilities adds up to 1. + +## Estimating from data +If the true probabilities are unknown, you can estimate them with an **empirical PMF**: +- Count how often each outcome appears. +- Divide by the total number of observations. + +## Example +Observed sequence: `1, 2, 2, 3, 3, 3` (6 outcomes total) +- "1" appears once → estimated probability = 1/6 +- "2" appears twice → estimated probability = 2/6 = 1/3 +- "3" appears three times → estimated probability = 3/6 = 1/2 + + + \ No newline at end of file diff --git a/questions/184_empirical_probability_mass_function_(pmf)/meta.json b/questions/184_empirical_probability_mass_function_(pmf)/meta.json new file mode 100644 index 00000000..21bc68aa --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/meta.json @@ -0,0 +1,15 @@ +{ + "id": "184", + "title": "Empirical Probability Mass Function (PMF)", + "difficulty": "easy", + "category": "Probability & Statistics", + "video": "", + "likes": "0", + "dislikes": "0", + "contributor": [ + { + "profile_link": "https://github.com/Jeet009", + "name": "Jeet Mukherjee" + } + ] +} diff --git a/questions/184_empirical_probability_mass_function_(pmf)/solution.py b/questions/184_empirical_probability_mass_function_(pmf)/solution.py new file mode 100644 index 00000000..b54775fe --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/solution.py @@ -0,0 +1,14 @@ +from collections import Counter + +def empirical_pmf(samples): + """ + Given an iterable of integer samples, return a list of (value, probability) + pairs sorted by value ascending. + """ + samples = list(samples) + if not samples: + return [] + total = len(samples) + cnt = Counter(samples) + result = [(k, cnt[k] / total) for k in sorted(cnt.keys())] + return result \ No newline at end of file diff --git a/questions/184_empirical_probability_mass_function_(pmf)/starter_code.py b/questions/184_empirical_probability_mass_function_(pmf)/starter_code.py new file mode 100644 index 00000000..32b35c14 --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/starter_code.py @@ -0,0 +1,7 @@ +def empirical_pmf(samples): + """ + Given an iterable of integer samples, return a list of (value, probability) + pairs sorted by value ascending. + """ + # TODO: Implement the function + pass diff --git a/questions/184_empirical_probability_mass_function_(pmf)/tests.json b/questions/184_empirical_probability_mass_function_(pmf)/tests.json new file mode 100644 index 00000000..d9cbb76b --- /dev/null +++ b/questions/184_empirical_probability_mass_function_(pmf)/tests.json @@ -0,0 +1,18 @@ +[ + { + "test": "print(empirical_pmf([1, 2, 2, 3, 3, 3]))", + "expected_output": "[(1, 0.16666666666666666), (2, 0.3333333333333333), (3, 0.5)]" + }, + { + "test": "print(empirical_pmf([5, 5, 5, 5]))", + "expected_output": "[(5, 1.0)]" + }, + { + "test": "print(empirical_pmf([]))", + "expected_output": "[]" + }, + { + "test": "print(empirical_pmf([0, 0, 1, 1, 1, 2]))", + "expected_output": "[(0, 0.3333333333333333), (1, 0.5), (2, 0.16666666666666666)]" + } +] diff --git a/questions/185_pmf_normalization_constant 2/description.md b/questions/185_pmf_normalization_constant 2/description.md new file mode 100644 index 00000000..7b00dfcc --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/description.md @@ -0,0 +1,16 @@ +## Problem + +A discrete random variable `X` takes values 0 through 7 with probabilities: + +- P(X=0) = 0 +- P(X=1) = K +- P(X=2) = 2K +- P(X=3) = 2K +- P(X=4) = 3K +- P(X=5) = K^2 +- P(X=6) = 2K^2 +- P(X=7) = 7K^2 + K + +Find the value of the normalization constant `K` such that the above defines a valid PMF (i.e., probabilities are non‑negative and sum to 1). + +Implement a function `find_k()` that returns `K` as a Python float. diff --git a/questions/185_pmf_normalization_constant 2/example.json b/questions/185_pmf_normalization_constant 2/example.json new file mode 100644 index 00000000..6d92c364 --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/example.json @@ -0,0 +1,5 @@ +{ + "input": "No input; call find_k()", + "output": "0.1", + "reasoning": "From the normalization condition, 10K^2 + 9K = 1 gives K = 0.1 (non-negative root)." +} diff --git a/questions/185_pmf_normalization_constant 2/learn.md b/questions/185_pmf_normalization_constant 2/learn.md new file mode 100644 index 00000000..a54fb92c --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/learn.md @@ -0,0 +1,50 @@ +## Learning: PMF normalization constant + +### Idea and formula +- **PMF requirement**: A probability mass function must satisfy $\sum_i p(x_i) = 1$ and $p(x_i) \ge 0$. +- **Normalization by a constant**: If probabilities are given up to a constant, determine that constant by enforcing the sum-to-1 constraint. + - If the form is $p(x_i) = K\,w_i$ with known nonnegative weights $w_i$, then + - $\sum_i p(x_i) = K \sum_i w_i = 1 \Rightarrow$ $\displaystyle K = \frac{1}{\sum_i w_i}$. + - If the given expressions involve $K$ in a more general way (e.g., both $K$ and $K^2$ terms), still enforce $\sum_i p(x_i) = 1$ and solve the resulting equation for $K$. Choose the solution that makes all probabilities nonnegative. + +### Worked example (this question) +Suppose the PMF entries are expressed in terms of K such that, when summed, the K-terms group as follows: + +- Linear in K: K + 2K + 2K + 3K + K = 9K +- Quadratic in K: K² + 2K² + 7K² = 10K² + +One concrete way to realize this is via the following table of outcomes and probabilities: + +| X | p(X) | +|----|--------------| +| x₁ | K + K² | +| x₂ | 2K + 2K² | +| x₃ | 2K | +| x₄ | 3K + 7K² | +| x₅ | K | + +These add up to $9K + 10K^2$ as required. + +Enforce the PMF constraint: + +$$ +9K + 10K^2 = 1 \;\Rightarrow\; 10K^2 + 9K - 1 = 0 +$$ + +Quadratic formula reminder: + +$$ +\text{For } aK^2 + bK + c = 0,\quad K = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. +$$ + +Solve the quadratic: + +$$ +K = \frac{-9 \pm \sqrt{9^2 - 4\cdot 10 \cdot (-1)}}{2\cdot 10} += \frac{-9 \pm \sqrt{121}}{20} += \frac{-9 \pm 11}{20}. +$$ + +Feasible ($K \ge 0$) root: $\displaystyle K = \frac{2}{20} = 0.1$. + +Therefore, the normalization constant is **$K = 0.1$**. diff --git a/questions/185_pmf_normalization_constant 2/meta.json b/questions/185_pmf_normalization_constant 2/meta.json new file mode 100644 index 00000000..b57e0a2b --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/meta.json @@ -0,0 +1,15 @@ +{ + "id": "185", + "title": "Find PMF Normalization Constant", + "difficulty": "medium", + "category": "Probability & Statistics", + "video": "", + "likes": "0", + "dislikes": "0", + "contributor": [ + { + "profile_link": "https://github.com/Jeet009", + "name": "Jeet Mukherjee" + } + ] +} diff --git a/questions/185_pmf_normalization_constant 2/solution.py b/questions/185_pmf_normalization_constant 2/solution.py new file mode 100644 index 00000000..5836d3de --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/solution.py @@ -0,0 +1,14 @@ +import math + +def find_k(): + """ + Solve 10*K^2 + 9*K - 1 = 0 and return the non-negative root. + """ + a = 10.0 + b = 9.0 + c = -1.0 + discriminant = b * b - 4 * a * c + sqrt_disc = math.sqrt(discriminant) + k1 = (-b + sqrt_disc) / (2 * a) + k2 = (-b - sqrt_disc) / (2 * a) + return k1 if k1 >= 0 else k2 diff --git a/questions/185_pmf_normalization_constant 2/starter_code.py b/questions/185_pmf_normalization_constant 2/starter_code.py new file mode 100644 index 00000000..0df463f9 --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/starter_code.py @@ -0,0 +1,6 @@ +def find_k(): + """ + Return the normalization constant K for the given PMF as a float. + """ + # TODO: Solve for K from 10*K**2 + 9*K - 1 = 0 and return the non-negative root + pass diff --git a/questions/185_pmf_normalization_constant 2/tests.json b/questions/185_pmf_normalization_constant 2/tests.json new file mode 100644 index 00000000..d6cb3f10 --- /dev/null +++ b/questions/185_pmf_normalization_constant 2/tests.json @@ -0,0 +1,10 @@ +[ + { + "test": "print(round(find_k(), 10))", + "expected_output": "0.1" + }, + { + "test": "k = find_k(); s = 0 + k + 2*k + 2*k + 3*k + k**2 + 2*k**2 + (7*k**2 + k); print(round(s, 10))", + "expected_output": "1.0" + } +] diff --git a/questions/186_linear_regression_ordinary_least_squares/description.md b/questions/186_linear_regression_ordinary_least_squares/description.md new file mode 100644 index 00000000..0dded433 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/description.md @@ -0,0 +1,15 @@ +### Problem + +Implement simple linear regression using Ordinary Least Squares (OLS). Given 1D inputs `X` and targets `y`, compute the slope `m`, intercept `b`, and use them to predict on a provided test input. + +You should implement the closed-form OLS solution: + +$$ +m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},\quad +b = \bar{y} - m\,\bar{x}. +$$ + +Then, given `X_test`, output predictions `y_pred = m * X_test + b`. + +Return values: `m`, `b`, and `y_pred`. + diff --git a/questions/186_linear_regression_ordinary_least_squares/example.json b/questions/186_linear_regression_ordinary_least_squares/example.json new file mode 100644 index 00000000..a8d2e9f0 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/example.json @@ -0,0 +1,7 @@ +{ + "input": "X_train = [1, 2, 3]; y_train = [2, 2.5, 3.5]; X_test = [4]", + "output": "m = 0.75, b = 1.166667, y_pred = [4.166667]", + "reasoning": "Using OLS: m = Cov(X,Y)/Var(X) = 1.5/2 = 0.75 and b = y_bar - m*x_bar = (8/3) - 0.75*2 = 1.166667. Prediction for X_test=[4] is 0.75*4 + 1.166667 = 4.166667." +} + + diff --git a/questions/186_linear_regression_ordinary_least_squares/learn.md b/questions/186_linear_regression_ordinary_least_squares/learn.md new file mode 100644 index 00000000..34e64f28 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/learn.md @@ -0,0 +1,39 @@ +## Learning: Ordinary Least Squares for Simple Linear Regression + +### Idea and formula +- **Goal**: Fit a line $y = m x + b$ that minimizes the sum of squared errors. +- **Closed-form OLS solution** for 1D features: + +$$ +m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},\quad +b = \bar{y} - m\,\bar{x} +$$ + +### Intuition +- The numerator is the sample covariance between $x$ and $y$; the denominator is the sample variance of $x$. +- So $m = \operatorname{Cov}(x,y) / \operatorname{Var}(x)$ measures how much $y$ changes per unit change in $x$. +- The intercept $b$ anchors the best-fit line so it passes through the mean point $(\bar{x},\bar{y})$. + +### Algorithm steps +1. Compute $\bar{x}$ and $\bar{y}$. +2. Accumulate numerator $\sum_i (x_i-\bar{x})(y_i-\bar{y})$ and denominator $\sum_i (x_i-\bar{x})^2$. +3. Compute $m = \text{numerator}/\text{denominator}$ (guard against zero denominator). +4. Compute $b = \bar{y} - m\,\bar{x}$. +5. Predict: $\hat{y} = m\,x + b$ for any new $x$. + +### Edge cases and tips +- If all $x_i$ are identical, $\operatorname{Var}(x)=0$ and the slope is undefined. In practice, return $m=0$ and $b=\bar{y}$ or raise an error. +- Centering data helps numerical stability but is not required for the closed form. +- Outliers can strongly influence OLS; consider robust alternatives if needed. + +### Worked example +Given $X = [1,2,3]$ and $y = [2,2.5,3.5]$: + +- $\bar{x} = 2$, $\bar{y} = 8/3$. +- $\sum (x_i-\bar{x})(y_i-\bar{y}) = (1-2)(2-8/3) + (2-2)(2.5-8/3) + (3-2)(3.5-8/3) = 1.5$ +- $\sum (x_i-\bar{x})^2 = (1-2)^2 + (2-2)^2 + (3-2)^2 = 2$ +- $m = 1.5/2 = 0.75$ +- $b = \bar{y} - m\,\bar{x} = 8/3 - 0.75\cdot 2 = 1.166666\ldots$ + +Prediction for $X_{test} = [4]$: $y_{pred} = 0.75\cdot 4 + 1.1666\ldots = 4.1666\ldots$ + diff --git a/questions/186_linear_regression_ordinary_least_squares/meta.json b/questions/186_linear_regression_ordinary_least_squares/meta.json new file mode 100644 index 00000000..1e57a41d --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/meta.json @@ -0,0 +1,16 @@ +{ + "id": "186", + "title": "Linear Regression via Ordinary Least Squares (OLS)", + "difficulty": "hard", + "category": "Machine Learning", + "video": "", + "likes": "0", + "dislikes": "0", + "contributor": [ + { + "profile_link": "https://github.com/Jeet009", + "name": "Jeet Mukherjee" + } + ] +} + diff --git a/questions/186_linear_regression_ordinary_least_squares/solution.py b/questions/186_linear_regression_ordinary_least_squares/solution.py new file mode 100644 index 00000000..751a05f6 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/solution.py @@ -0,0 +1,23 @@ +from typing import List, Tuple + + +def fit_and_predict(X_train: List[float], y_train: List[float], X_test: List[float]) -> Tuple[float, float, List[float]]: + n = len(X_train) + x_mean = sum(X_train) / n + y_mean = sum(y_train) / n + + num = 0.0 + den = 0.0 + for i in range(n): + dx = X_train[i] - x_mean + dy = y_train[i] - y_mean + num += dx * dy + den += dx * dx + + m = num / den if den != 0 else 0.0 + b = y_mean - m * x_mean + + y_pred = [m * x + b for x in X_test] + return m, b, y_pred + + diff --git a/questions/186_linear_regression_ordinary_least_squares/starter_code.py b/questions/186_linear_regression_ordinary_least_squares/starter_code.py new file mode 100644 index 00000000..50a26990 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/starter_code.py @@ -0,0 +1,14 @@ +from typing import List, Tuple + + +def fit_and_predict(X_train: List[float], y_train: List[float], X_test: List[float]) -> Tuple[float, float, List[float]]: + """ + Implement simple linear regression (OLS) to compute slope m, intercept b, + and predictions on X_test. + + Returns (m, b, y_pred). + """ + # Your code here + pass + + diff --git a/questions/186_linear_regression_ordinary_least_squares/tests.json b/questions/186_linear_regression_ordinary_least_squares/tests.json new file mode 100644 index 00000000..6ebcfee4 --- /dev/null +++ b/questions/186_linear_regression_ordinary_least_squares/tests.json @@ -0,0 +1,28 @@ +[ + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([1,2,3],[2,2.5,3.5],[4]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "0.75 1.166667 [4.166667]" + }, + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([0,1,2,3],[1,3,5,7],[4,5]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "2 1 [9, 11]" + }, + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([0,1,2],[5,2,-1],[3]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "-3 5 [-4]" + }, + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([2,2,2],[1,4,7],[10]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "0.0 4.0 [4.0]" + }, + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([1,2,3,4],[1.1,1.9,3.05,3.9],[5]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "0.955 0.1 [4.875]" + }, + { + "test": "from questions.186_linear_regression_ordinary_least_squares.solution import fit_and_predict; m,b,y=fit_and_predict([3],[7],[10]); print(round(m,6), round(b,6), [round(v,6) for v in y])", + "expected_output": "0.0 7.0 [7.0]" + } +] + +