**Problems/x_batch_normalization/learn.md** (+3 −3)
@@ -15,11 +15,11 @@ The process of Batch Normalization consists of the following steps:
### Structure of Batch Normalization for BCHW Input

- For an input tensor with the shape **BCHW** (where:
+ For an input tensor with the shape **BCHW**, where:
-**B**: batch size,
-**C**: number of channels,
-**H**: height,
- -**W**: width),
+ -**W**: width,
the Batch Normalization process operates on specific dimensions based on the task's requirement.

#### 1. Mean and Variance Calculation
@@ -46,7 +46,7 @@ The mean and variance are computed **over all spatial positions (H, W)** and **a
#### 2. Normalization

- Once the mean $\mu_c$ and variance $\sigma_c^2$ have been computed for each channel, the next step is to **normalize** the input. The normalization is done by subtracting the mean and dividing by the standard deviation (square root of the variance, plus a small constant $\epsilon$ for numerical stability):
+ Once the mean $\mu_c$ and variance $\sigma_c^2$ have been computed for each channel, the next step is to **normalize** the input. The normalization is done by subtracting the mean and dividing by the standard deviation (plus a small constant $\epsilon$ for numerical stability):
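The two steps described in this file translate directly into array operations. Below is a minimal NumPy sketch for a BCHW input, not code from the repository: the function name `batch_norm_bchw`, the learnable `gamma`/`beta` parameters, and `eps = 1e-5` are illustrative assumptions.

```python
import numpy as np

def batch_norm_bchw(x, gamma, beta, eps=1e-5):
    """Batch-normalize a (B, C, H, W) tensor: one mean/variance per channel."""
    # Step 1: statistics over the batch and spatial dims (B, H, W), per channel C.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # shape (1, C, 1, 1)
    var = x.var(axis=(0, 2, 3), keepdims=True)     # shape (1, C, 1, 1)
    # Step 2: subtract the mean, divide by sqrt(variance + eps).
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learnable per-channel scale and shift, broadcast over B, H, W.
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

# Example: a batch of 4 feature maps with 3 channels of size 8x8.
x = np.random.randn(4, 3, 8, 8)
y = batch_norm_bchw(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=(0, 2, 3)), y.var(axis=(0, 2, 3)))  # roughly 0 and 1 per channel
```

The `keepdims=True` flag keeps the per-channel statistics broadcastable against the original (B, C, H, W) shape.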
**Problems/x_group_normalization/learn.md** (+7 −7)
@@ -15,25 +15,25 @@ The process of Group Normalization consists of the following steps:
### Structure of Group Normalization for BCHW Input

- For an input tensor with the shape **BCHW**(where:
+ For an input tensor with the shape **BCHW**, where:
-**B**: batch size,
-**C**: number of channels,
-**H**: height,
- -**W**: width),
+ -**W**: width,
the Group Normalization process operates on specific dimensions based on the task's requirement.

#### 1. Group Division

- The input feature dimension **C** (channels) is divided into several groups. The number of groups is determined by the **n_groups** parameter, and the size of each group is calculated as:

$$
- \text{group\_size} = \frac{C}{n_{\text{groups}}}
+ \text{groupSize} = \frac{C}{n_{\text{groups}}}
$$

Where:
-**C** is the number of channels.
-**n_groups** is the number of groups into which the channels are divided.
- -**group_size** is the number of channels in each group.
+ -**groupSize** is the number of channels in each group.

The input tensor is then reshaped to group the channels into the specified groups.
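As a concrete example of the group division above: with C = 64 channels and n_groups = 8, each group holds groupSize = 8 channels. The reshape that the file refers to can be sketched in NumPy as follows (an illustrative sketch, not code from the repository):

```python
import numpy as np

B, C, H, W = 4, 64, 8, 8
n_groups = 8
group_size = C // n_groups          # groupSize = C / n_groups, here 8

x = np.random.randn(B, C, H, W)
# Split the channel axis into (n_groups, group_size); statistics are
# later computed within each of these groups.
x_grouped = x.reshape(B, n_groups, group_size, H, W)
print(x_grouped.shape)              # (4, 8, 8, 8, 8)
```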
@@ -42,18 +42,18 @@ the Group Normalization process operates on specific dimensions based on the tas
- For each group, the **mean** $\mu_g$ and **variance** $\sigma_g^2$ are computed over the spatial dimensions and across the batch. This normalization helps to stabilize the activations within each group.

$$
- \mu_g = \frac{1}{B \cdot H \cdot W \cdot \text{group\_size}} \sum_{i=1}^{B} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{g=1}^{\text{group\_size}} x_{i,g,h,w}
+ \mu_g = \frac{1}{B \cdot H \cdot W \cdot \text{groupSize}} \sum_{i=1}^{B} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{g=1}^{\text{groupSize}} x_{i,g,h,w}
$$

$$
- \sigma_g^2 = \frac{1}{B \cdot H \cdot W \cdot \text{group\_size}} \sum_{i=1}^{B} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{g=1}^{\text{group\_size}} (x_{i,g,h,w} - \mu_g)^2
+ \sigma_g^2 = \frac{1}{B \cdot H \cdot W \cdot \text{groupSize}} \sum_{i=1}^{B} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{g=1}^{\text{groupSize}} (x_{i,g,h,w} - \mu_g)^2
$$

Where:
- $x_{i,g,h,w}$ is the activation at batch index $i$, group index $g$, height $h$, and width $w$.
- $B$ is the batch size.
- $H$ and $W$ are the spatial dimensions (height and width).
- - $\text{group_size}$ is the number of channels in each group.
+ - $\text{groupSize}$ is the number of channels in each group.
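Continuing the illustrative NumPy sketch, the per-group statistics follow the sums above, averaging over the batch, the channels in each group, and the spatial dimensions. The final division by $\sqrt{\sigma_g^2 + \epsilon}$ mirrors the normalization step in the batch-normalization file and, like the `eps` value and function name, is an assumption rather than code taken from the diff.

```python
import numpy as np

def group_norm_bchw(x, n_groups, eps=1e-5):
    """Group-normalize a (B, C, H, W) tensor using the per-group statistics above."""
    B, C, H, W = x.shape
    group_size = C // n_groups
    xg = x.reshape(B, n_groups, group_size, H, W)
    # Mean and variance per group, taken over the batch, the channels in the
    # group, and the spatial dimensions -- matching the sums in the formulas above.
    mean = xg.mean(axis=(0, 2, 3, 4), keepdims=True)   # shape (1, n_groups, 1, 1, 1)
    var = xg.var(axis=(0, 2, 3, 4), keepdims=True)
    xg_hat = (xg - mean) / np.sqrt(var + eps)
    return xg_hat.reshape(B, C, H, W)

# Example: 64 channels split into 8 groups of 8 channels each.
out = group_norm_bchw(np.random.randn(4, 64, 8, 8), n_groups=8)
```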