4 changes: 2 additions & 2 deletions code/linear-regression/src/Components/ClosedForm.svelte
@@ -51,7 +51,7 @@
true
)}
Add circles to the chart below to see how the Normal Equation calculates two
-featues, the bias and weight, for the corresponding regression model.
+features, the intercept and weight, for the corresponding regression model.
</p>
<div id="cf-container">
<div id="equations-container">
@@ -221,7 +221,7 @@
In research publications and statistical software, coefficients of regression
models are often presented with associated p-values. These p-values come from
traditional null hypothesis statistical tests: t-tests are used to measure whether
-a given cofficient is significantly different than zero (the null hypothesis
+a given coefficient is significantly different than zero (the null hypothesis
that a particular coefficient {@html katexify(`\\beta_i`, false)} equals zero),
while F tests are used to measure whether
<i>any</i>
16 changes: 8 additions & 8 deletions code/linear-regression/src/Components/GradientDescent.svelte
@@ -36,7 +36,7 @@
find suitable coefficients for our regression model that minimize prediction error
(remember, lower MSE equals better model).
<br /><br />
-A full conversation on gradient descent is outside the course of this article (stay-tuned
+A full conversation on gradient descent is outside the scope of this article (stay tuned
for our future article on the subject), but if you'd like to learn more, click
the "Show Math" button below. Otherwise, read on!
<br />
@@ -53,7 +53,7 @@
Gradient descent works as follows. We assume that we have some convex
function representing the error of our machine learning algorithm (in our
case, MSE). Gradient descent will iteratively update our model's
-coefficients in the direction of our error functions minimum <span
+coefficients in a direction that reduces the value of our error<span
class="info-tooltip"
title="Gradient descent won't always yield the best coefficients for our model, because it can sometimes
get stuck in local minima (as opposed to global minima). Many extensions exist to help solve this problem."
@@ -79,7 +79,7 @@
this, we'll use the gradient, which represents the direction that the function
is increasing, and the rate at which it is increasing. Since we want to find
the minimum of this function, we can go in the opposite direction of where it's
-increasing. This is exactly what Gradient Descent does, it works by taking steps
+increasing. This is exactly what Gradient Descent does: it works by taking steps
in the direction opposite of where our error function is increasing, proportional
to the rate of change. To find the coefficients that minimize the function, we
first calculate the derivatives of our error function is increasing. To find
@@ -96,9 +96,9 @@
Now that we have the gradients for our error function (with respect
to each coefficient to be updated), we perform iterative updates:
{@html katexify(
-`\\text{repeat until converge:} = \\begin{cases}
-\\beta_0 = \\beta_0 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}(y_i - \\hat{y_i})) \\\\
-\\beta_1 = \\beta_1 - \\alpha (-\\frac{2}{n} x_i\\sum^{n}_{i=1}(y_i - \\hat{y_i}))
+`\\text{repeat until convergence:} \\begin{cases}
+\\beta_0 \\Larr \\beta_0 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}(y_i - \\hat{y_i})) \\\\
+\\beta_1 \\Larr \\beta_1 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}x_i(y_i - \\hat{y_i}))
\\end{cases}`,
true
)}
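In plain JavaScript, the update loop above might be sketched as follows; the learning rate, step count, and `xs`/`ys` arrays are hypothetical placeholders, not the component's actual stores:

```js
// Iterative updates from the rule above:
//   beta0 <- beta0 - alpha * dMSE/dbeta0
//   beta1 <- beta1 - alpha * dMSE/dbeta1
function gradientDescent(xs, ys, alpha = 0.01, steps = 100) {
  const n = xs.length;
  let beta0 = 0; // intercept estimate
  let beta1 = 0; // weight estimate
  for (let step = 0; step < steps; step++) {
    let grad0 = 0;
    let grad1 = 0;
    for (let i = 0; i < n; i++) {
      const residual = ys[i] - (beta0 + beta1 * xs[i]); // y_i - y_hat_i
      grad0 += (-2 / n) * residual;
      grad1 += (-2 / n) * xs[i] * residual;
    }
    // Step opposite the gradient, proportional to the rate of change.
    beta0 -= alpha * grad0;
    beta1 -= alpha * grad1;
  }
  return { beta0, beta1 };
}
```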
@@ -148,10 +148,10 @@
>100 Steps</button
>
</div>
<div id="bias-slider">
<div id="intercept-slider-2">
<div class="input-container">
<p>
-Bias ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
+Intercept ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
$gdBias
)}
</p>
8 changes: 4 additions & 4 deletions code/linear-regression/src/Components/Intro.svelte
@@ -91,7 +91,7 @@
><span
class="info-tooltip"
title="The coefficient B<sub>0</sub> represents the
-intercept of our model, and each other coefficient
+intercept or bias of our model, and each other coefficient
B<sub>i</sub> (i > 0) is a slope defining how variable
x<sub>i</sub> contributes to the model. We discuss how to
interpret regression coefficients later in the article."
@@ -101,15 +101,15 @@
>
</li>
<li>
-{@html katexify(`\\epsilon`, false)}: the irreducible error in our model.
-A term that collects together all the unmodeled parts of our data.
+{@html katexify(`\\epsilon`, false)}: the residual (or "error") of our model.
+Our model will not make perfect predictions, so we compute this term by subtracting the predicted value from the actual value.
</li>
</ul>
<br />

<p class="body-text">
Fitting a linear regression model is all about finding the set of
-cofficients that best model {@html katexify(`y`, false)} as a function of our
+coefficients that best model {@html katexify(`y`, false)} as a function of our
features. We may never know the true parameters for our model, but we can estimate
them (more on this later). Once we've estimated these coefficients, {@html katexify(
`\\hat{\\beta_i}`,
15 changes: 8 additions & 7 deletions code/linear-regression/src/Components/MeanSquaredError.svelte
@@ -72,18 +72,19 @@
>
More specifically, r-squared measures the percentage of variance explained normalized
against the baseline variance of our model (which is just the variance of the
-mean):
+trivial model that always predicts the mean):
{@html katexify(
`\\begin{aligned} R^2 = 1 - \\frac{\\Sigma^{n}_{i=1}(y_i - \\hat{y_i})^2 }{\\Sigma^{n}_{i=1}(y_i - \\bar{y})^2 } \\end{aligned}`,
true
)}
The highest possible value for r-squared is 1, representing a model that captures
100% of the variance. A negative r-squared means that our model is doing worse
-(capturing less variance) than a flat line through mean of our data would.
+(capturing less variance) than a flat line through the mean of our data would. (The name
+"r-<em>squared</em>" falsely implies that it would not have a negative value.)

<br /><br />To build intuition for yourself, try changing the weight and
-bias terms below to see how the MSE and r-squared change across different
-model fits:
+intercept terms below to see how the MSE and r-squared change across different
+possible models for a toy dataset (click Shuffle Data to make a new toy dataset):
</p>
<br /><br />
<div id="mse-container">
@@ -93,10 +94,10 @@
>Shuffle Data</button
>
</div>
<div id="bias-slider">
<div id="intercept-slider">
<div class="input-container">
<p>
-Bias ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
+Intercept ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
$mseBias
)}
</p>
@@ -181,7 +182,7 @@
<a href="https://en.wikipedia.org/wiki/Root-mean-square_deviation">RMSE</a
>). If instead we wanted our error to reflect the linear distance between
what we predicted and what is correct, or we wanted our data minimized by
-the median, we could try something like Mean Abosulte Error (<a
+the median, we could try something like Mean Absolute Error (<a
href="https://en.wikipedia.org/wiki/Mean_absolute_error">MAE</a
>). Whatever the case, you should be thinking of your evaluation metric as
part of your modeling process, and select the best metric based on the
@@ -32,6 +32,8 @@
$lineType = "regressionLineFlat";
$showRegressionLine = true;
$showResiduals = true;
+$coeff = 0;
+$intercept = 293683;
},

2: () => {
@@ -169,7 +171,7 @@
Once we've fit our model, predicting future values is super easy! We
just plug in any {@html katexify(`x_i`, false)} values into our equation!
<br /><br />For our simple model, that means plugging in a value for
-{@html katexify(`sqft`, false)} into our model:
+{@html katexify(`sqft`, false)} into our model (try adjusting the slider):
</p>
<br />
<div id="input-container">
6 changes: 3 additions & 3 deletions code/linear-regression/src/Components/Tab_Binary.svelte
@@ -174,9 +174,9 @@
class="dot-without"
/>) and houses with swimming pools (<span class="dot-with" />).
<br /><br /> The intercept, {formatter(Math.round(intercept))}, is the average
-predicted price for houses that do not have swimming pools (to see this,
-simply set {@html katexify(`pool`, false)} to 0 and solve the equation).
-To find the average price predicted price for houses with pools, we simply plug
+predicted price for houses that <em>do not</em> have swimming pools (to see this,
+set {@html katexify(`pool`, false)} to 0 and simplify the equation).
+To find the average predicted price for houses that <em>do</em> have pools, we plug
in {@html katexify(`pool=1`, false)} to obtain
{formatter(
Math.round(intercept)
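To spell out that arithmetic, a toy version with made-up coefficients:

```js
// pool is 0 or 1, so the model collapses to two group means.
const intercept = 250000; // hypothetical average price, no pool
const poolCoeff = 40000;  // hypothetical price bump for a pool
const predict = (pool) => intercept + poolCoeff * pool;
predict(0); // 250000: average predicted price without a pool
predict(1); // 290000: average predicted price with a pool
```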
6 changes: 3 additions & 3 deletions code/linear-regression/src/Components/Tab_Interaction.svelte
@@ -175,10 +175,10 @@
`sqft`,
false
)} should differ between houses that do have pools and houses that do not, we can
-add an interaction term to our model, {@html katexify(`(sqft:pool) `, false)}.
+add an interaction term to our model, {@html katexify(`(sqft*pool) `, false)}.
<br /><br />
The coefficient of the interaction term {@html katexify(
-`(sqft:pool)`,
+`(sqft*pool)`,
false
)}, {formatter(Math.round(slopeInteraction))}, represents the difference in
the slope for {@html katexify(`sqft`, false)}, comparing houses that do and
@@ -191,7 +191,7 @@
housing price for houses with no pools and a square-footage of zero.<sup
><span
class="info-tooltip"
title="Because this value doesn't make much intuitive sense, it's common for the features to be centered at zero."
title="Because this value doesn't make much intuitive sense, it's common to preprocess the data so that the features are centered at zero."
use:tooltip
>
[&#8505;]
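A sketch of how the interaction term changes the slope, again with invented coefficients:

```js
// With an interaction term, the sqft slope depends on pool:
//   price = b0 + b1*sqft + b2*pool + b3*(sqft*pool)
const b0 = 100000; // hypothetical intercept
const b1 = 120;    // hypothetical sqft slope for houses without pools
const b2 = 30000;  // hypothetical pool offset
const b3 = 25;     // hypothetical interaction: extra sqft slope for pools
const predict = (sqft, pool) =>
  b0 + b1 * sqft + b2 * pool + b3 * sqft * pool;
// Slope for houses without pools: b1 = 120 per sqft.
// Slope for houses with pools: b1 + b3 = 145 per sqft.
```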