
Commit f0dac06

Merge pull request #532 from Open-Deep-ML/add-q-174
added Q 174
2 parents 9af7f66 + fc02e08 commit f0dac06

File tree

11 files changed: +328, -31 lines changed

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
In this task, you will train a Generative Adversarial Network (GAN) to learn a one-dimensional Gaussian distribution. The GAN consists of a generator that produces samples from latent noise and a discriminator that estimates the probability that a given sample is real. Both networks have a single hidden layer with ReLU activation; the generator's output layer is linear, while the discriminator's output layer uses a sigmoid activation.

You must train the GAN using the standard non-saturating loss for the generator and binary cross-entropy loss for the discriminator. In the NumPy version, parameters are updated with vanilla gradient descent; in the PyTorch version, they are updated with stochastic gradient descent (SGD) at the specified learning rate. The training loop should alternate between a discriminator update and a generator update each iteration.

Your function must return the trained generator forward function `gen_forward(z)`, which produces generated samples given latent noise.
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
{
  "input": "gen_forward = train_gan(4.0, 1.25, epochs=1000, seed=42)\nz = np.random.normal(0, 1, (500, 1))\nx_gen, _, _ = gen_forward(z)\n(round(np.mean(x_gen), 4), round(np.std(x_gen), 4))",
  "output": "(0.0004, 0.0002)",
  "reasoning": "The test cases call `gen_forward` after training, sample 500 points, and then compute the mean and std."
}
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
## Understanding GANs for 1D Gaussian Data
A Generative Adversarial Network (GAN) consists of two neural networks, a **Generator** $G_\theta$ and a **Discriminator** $D_\phi$, trained in a minimax game.

### 1. The Roles
- **Generator** $G_\theta(z)$: Takes a latent noise vector $z \sim \mathcal{N}(0, I)$ and outputs a sample intended to resemble the real data.
- **Discriminator** $D_\phi(x)$: Outputs a probability $p \in (0, 1)$ that the input $x$ came from the real data distribution rather than the generator.

### 2. The Objective
The classical GAN objective is:
$$
\min_{\theta} \; \max_{\phi} \; \mathbb{E}_{x \sim p_{\text{data}}} [\log D_\phi(x)] + \mathbb{E}_{z \sim p(z)} [\log (1 - D_\phi(G_\theta(z)))]
$$
Here:
- $p_{\text{data}}$ is the real data distribution.
- $p(z)$ is the prior distribution for the latent noise (often standard normal).

### 3. Practical Losses
In implementation, we minimize:
- **Discriminator loss**:
$$
\mathcal{L}_D = - \frac{1}{m} \sum_{i=1}^m \left[ \log D(x^{(i)}_{\text{real}}) + \log(1 - D(x^{(i)}_{\text{fake}})) \right]
$$
- **Generator loss** (non-saturating form):
$$
\mathcal{L}_G = - \frac{1}{m} \sum_{i=1}^m \log D(G(z^{(i)}))
$$

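For concreteness, here is a minimal NumPy sketch of these two losses, assuming `p_real` and `p_fake` hold the discriminator's output probabilities for a batch of real and generated samples (the helper name and variables are illustrative, not part of the required interface):

```python
import numpy as np

EPS = 1e-8  # keeps the logs finite when probabilities approach 0 or 1

def gan_losses(p_real, p_fake):
    """Discriminator BCE loss and non-saturating generator loss for one batch."""
    loss_d = -np.mean(np.log(p_real + EPS) + np.log(1.0 - p_fake + EPS))
    loss_g = -np.mean(np.log(p_fake + EPS))
    return loss_d, loss_g
```

Note that `loss_g` rewards the generator for pushing `p_fake` toward 1, which is exactly the non-saturating form above.
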
### 4. Forward/Backward Flow
1. **Discriminator step**: Real samples $x_{\text{real}}$ and fake samples $x_{\text{fake}} = G(z)$ are passed through $D$, and $\mathcal{L}_D$ is minimized w.r.t. $\phi$.
2. **Generator step**: Fresh $z$ is sampled, $x_{\text{fake}} = G(z)$ is passed through $D$, and $\mathcal{L}_G$ is minimized w.r.t. $\theta$ while keeping $\phi$ fixed.

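In PyTorch, this alternation typically looks like the sketch below. It assumes a generator `G`, a discriminator `D`, SGD optimizers `opt_G`/`opt_D`, an `nn.BCELoss()` criterion, and the usual hyperparameters are already defined; it is one common pattern, not the only valid one:

```python
import torch

for _ in range(epochs):
    # Discriminator step: real batch labelled 1, detached fake batch labelled 0.
    real = torch.normal(mean_real, std_real, size=(batch_size, 1))
    fake = G(torch.randn(batch_size, latent_dim))
    opt_D.zero_grad()
    loss_D = criterion(D(real), torch.ones(batch_size, 1)) + \
             criterion(D(fake.detach()), torch.zeros(batch_size, 1))
    loss_D.backward()
    opt_D.step()

    # Generator step: fresh noise, push D's output toward 1; only G's
    # parameters are updated because opt_G holds only G's parameters.
    opt_G.zero_grad()
    fake = G(torch.randn(batch_size, latent_dim))
    loss_G = criterion(D(fake), torch.ones(batch_size, 1))
    loss_G.backward()
    opt_G.step()
```
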
### 5. Architecture for This Task
- **Generator**: Fully connected layer ($\mathbb{R}^{\text{latent\_dim}} \to \mathbb{R}^{\text{hidden\_dim}}$) → ReLU → Fully connected layer ($\mathbb{R}^{\text{hidden\_dim}} \to \mathbb{R}^1$).
- **Discriminator**: Fully connected layer ($\mathbb{R}^1 \to \mathbb{R}^{\text{hidden\_dim}}$) → ReLU → Fully connected layer ($\mathbb{R}^{\text{hidden\_dim}} \to \mathbb{R}^1$) → Sigmoid.

### 6. Numerical Tips
- Initialize weights with a small Gaussian ($\mathcal{N}(0, 0.01)$).
- Add $10^{-8}$ to logs for numerical stability.
- Use a consistent batch size $m$ for both real and fake samples.
- Always sample fresh noise for the generator on each update.

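A tiny NumPy illustration of the first two tips (the variable names are examples only, not required by the task):

```python
import numpy as np

np.random.seed(0)
w = np.random.normal(0, 0.01, (1, 16))  # small Gaussian initialization
b = np.zeros(16)

p = np.array([0.3, 0.999999, 1e-12])    # example discriminator outputs
stable_log = np.log(p + 1e-8)           # the 1e-8 keeps log() finite near zero
```
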
**Your Task**: Implement the training loop to learn the parameters $\theta$ and $\phi$, and return the trained `gen_forward(z)` function. The evaluation (mean/std of generated samples) will be handled in the test cases.
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
{
  "id": "174",
  "title": "Train a Simple GAN on 1D Gaussian Data",
  "difficulty": "hard",
  "category": "Deep Learning",
  "video": "",
  "likes": "0",
  "dislikes": "0",
  "contributor": [
    {
      "profile_link": "https://github.com/moe18",
      "name": "moe"
    }
  ],
  "pytorch_difficulty": "medium"
}
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
import torch
2+
import torch.nn as nn
3+
import torch.optim as optim
4+
5+
def train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):
6+
torch.manual_seed(seed)
7+
8+
class Generator(nn.Module):
9+
def __init__(self):
10+
super().__init__()
11+
self.net = nn.Sequential(
12+
nn.Linear(latent_dim, hidden_dim),
13+
nn.ReLU(),
14+
nn.Linear(hidden_dim, 1)
15+
)
16+
def forward(self, z):
17+
return self.net(z)
18+
19+
class Discriminator(nn.Module):
20+
def __init__(self):
21+
super().__init__()
22+
self.net = nn.Sequential(
23+
nn.Linear(1, hidden_dim),
24+
nn.ReLU(),
25+
nn.Linear(hidden_dim, 1),
26+
nn.Sigmoid()
27+
)
28+
def forward(self, x):
29+
return self.net(x)
30+
31+
G = Generator()
32+
D = Discriminator()
33+
34+
# Use SGD as requested
35+
opt_G = optim.SGD(G.parameters(), lr=learning_rate)
36+
opt_D = optim.SGD(D.parameters(), lr=learning_rate)
37+
criterion = nn.BCELoss()
38+
39+
for _ in range(epochs):
40+
# Real and fake batches
41+
real_data = torch.normal(mean_real, std_real, size=(batch_size, 1))
42+
noise = torch.randn(batch_size, latent_dim)
43+
fake_data = G(noise)
44+
45+
# ----- Discriminator step -----
46+
opt_D.zero_grad()
47+
pred_real = D(real_data)
48+
pred_fake = D(fake_data.detach())
49+
loss_real = criterion(pred_real, torch.ones_like(pred_real))
50+
loss_fake = criterion(pred_fake, torch.zeros_like(pred_fake))
51+
loss_D = loss_real + loss_fake
52+
loss_D.backward()
53+
opt_D.step()
54+
55+
# ----- Generator step -----
56+
opt_G.zero_grad()
57+
pred_fake = D(fake_data)
58+
# non-saturating generator loss: maximize log D(G(z)) -> minimize -log D(G(z))
59+
loss_G = criterion(pred_fake, torch.ones_like(pred_fake))
60+
loss_G.backward()
61+
opt_G.step()
62+
63+
return G.forward
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
import torch
import torch.nn as nn
import torch.optim as optim

def train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):
    torch.manual_seed(seed)
    # Your PyTorch implementation here
    pass
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
[
  {
    "test": "gen_forward = train_gan(4.0, 1.25, epochs=100, seed=42)\nz = torch.randn(500, 1)\nx_gen = gen_forward(z)\nprint((round(x_gen.mean().item(), 4), round(x_gen.std().item(), 4)))",
    "expected_output": "(0.4725, 0.3563)"
  },
  {
    "test": "gen_forward = train_gan(0.0, 1.0, epochs=50, seed=0)\nz = torch.randn(300, 1)\nx_gen = gen_forward(z)\nprint((round(x_gen.mean().item(), 4), round(x_gen.std().item(), 4)))",
    "expected_output": "(0.0644, 0.244)"
  }
]
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
1+
import numpy as np
2+
3+
def relu(x):
4+
return np.maximum(0, x)
5+
6+
def sigmoid(x):
7+
return 1 / (1 + np.exp(-x))
8+
9+
def train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):
10+
np.random.seed(seed)
11+
data_dim = 1
12+
13+
# Initialize generator weights
14+
w1_g = np.random.normal(0, 0.01, (latent_dim, hidden_dim))
15+
b1_g = np.zeros(hidden_dim)
16+
w2_g = np.random.normal(0, 0.01, (hidden_dim, data_dim))
17+
b2_g = np.zeros(data_dim)
18+
19+
# Initialize discriminator weights
20+
w1_d = np.random.normal(0, 0.01, (data_dim, hidden_dim))
21+
b1_d = np.zeros(hidden_dim)
22+
w2_d = np.random.normal(0, 0.01, (hidden_dim, 1))
23+
b2_d = np.zeros(1)
24+
25+
def disc_forward(x):
26+
h1 = np.dot(x, w1_d) + b1_d
27+
a1 = relu(h1)
28+
logit = np.dot(a1, w2_d) + b2_d
29+
p = sigmoid(logit)
30+
return p, logit, a1, h1
31+
32+
def gen_forward(z):
33+
h1 = np.dot(z, w1_g) + b1_g
34+
a1 = relu(h1)
35+
x_gen = np.dot(a1, w2_g) + b2_g
36+
return x_gen, a1, h1
37+
38+
for epoch in range(epochs):
39+
# Sample real data
40+
x_real = np.random.normal(mean_real, std_real, batch_size)[:, None]
41+
z = np.random.normal(0, 1, (batch_size, latent_dim))
42+
x_fake, _, _ = gen_forward(z)
43+
44+
# Discriminator forward
45+
p_real, _, a1_real, h1_real = disc_forward(x_real)
46+
p_fake, _, a1_fake, h1_fake = disc_forward(x_fake)
47+
48+
# Discriminator gradients
49+
grad_logit_real = - (1 - p_real) / batch_size
50+
grad_a1_real = grad_logit_real @ w2_d.T
51+
grad_h1_real = grad_a1_real * (h1_real > 0)
52+
grad_w1_d_real = x_real.T @ grad_h1_real
53+
grad_b1_d_real = np.sum(grad_h1_real, axis=0)
54+
grad_w2_d_real = a1_real.T @ grad_logit_real
55+
grad_b2_d_real = np.sum(grad_logit_real, axis=0)
56+
57+
grad_logit_fake = p_fake / batch_size
58+
grad_a1_fake = grad_logit_fake @ w2_d.T
59+
grad_h1_fake = grad_a1_fake * (h1_fake > 0)
60+
grad_w1_d_fake = x_fake.T @ grad_h1_fake
61+
grad_b1_d_fake = np.sum(grad_h1_fake, axis=0)
62+
grad_w2_d_fake = a1_fake.T @ grad_logit_fake
63+
grad_b2_d_fake = np.sum(grad_logit_fake, axis=0)
64+
65+
grad_w1_d = grad_w1_d_real + grad_w1_d_fake
66+
grad_b1_d = grad_b1_d_real + grad_b1_d_fake
67+
grad_w2_d = grad_w2_d_real + grad_w2_d_fake
68+
grad_b2_d = grad_b2_d_real + grad_b2_d_fake
69+
70+
w1_d -= learning_rate * grad_w1_d
71+
b1_d -= learning_rate * grad_b1_d
72+
w2_d -= learning_rate * grad_w2_d
73+
b2_d -= learning_rate * grad_b2_d
74+
75+
# Generator update
76+
z = np.random.normal(0, 1, (batch_size, latent_dim))
77+
x_fake, a1_g, h1_g = gen_forward(z)
78+
p_fake, _, a1_d, h1_d = disc_forward(x_fake)
79+
80+
grad_logit_fake = - (1 - p_fake) / batch_size
81+
grad_a1_d = grad_logit_fake @ w2_d.T
82+
grad_h1_d = grad_a1_d * (h1_d > 0)
83+
grad_x_fake = grad_h1_d @ w1_d.T
84+
85+
grad_a1_g = grad_x_fake @ w2_g.T
86+
grad_h1_g = grad_a1_g * (h1_g > 0)
87+
grad_w1_g = z.T @ grad_h1_g
88+
grad_b1_g = np.sum(grad_h1_g, axis=0)
89+
grad_w2_g = a1_g.T @ grad_x_fake
90+
grad_b2_g = np.sum(grad_x_fake, axis=0)
91+
92+
w1_g -= learning_rate * grad_w1_g
93+
b1_g -= learning_rate * grad_b1_g
94+
w2_g -= learning_rate * grad_w2_g
95+
b2_g -= learning_rate * grad_b2_g
96+
97+
return gen_forward
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
import numpy as np

def train_gan(mean_real: float, std_real: float, latent_dim: int = 1, hidden_dim: int = 16, learning_rate: float = 0.001, epochs: int = 5000, batch_size: int = 128, seed: int = 42):
    """
    Train a simple GAN to learn a 1D Gaussian distribution.

    Args:
        mean_real: Mean of the target Gaussian
        std_real: Std of the target Gaussian
        latent_dim: Dimension of the noise input to the generator
        hidden_dim: Hidden layer size for both networks
        learning_rate: Learning rate for gradient descent
        epochs: Number of training epochs
        batch_size: Training batch size
        seed: Random seed for reproducibility

    Returns:
        gen_forward: A function that takes z and returns generated samples
    """
    # Your code here
    pass
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
[
  {
    "test": "gen_forward = train_gan(4.0, 1.25, epochs=1000, seed=42)\nz = np.random.normal(0, 1, (500, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))",
    "expected_output": "(0.0004, 0.0002)"
  },
  {
    "test": "gen_forward = train_gan(0.0, 1.0, epochs=500, seed=0)\nz = np.random.normal(0, 1, (300, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))",
    "expected_output": "(-0.0002, 0.0002)"
  },
  {
    "test": "gen_forward = train_gan(-2.0, 0.5, epochs=1500, seed=123)\nz = np.random.normal(0, 1, (400, 1))\nx_gen, _, _ = gen_forward(z)\nprint((round(np.mean(x_gen), 4), round(np.std(x_gen), 4)))",
    "expected_output": "(-0.0044, 0.0002)"
  }
]
