4 changes: 2 additions & 2 deletions code/linear-regression/src/Components/ClosedForm.svelte
@@ -51,7 +51,7 @@
true
)}
Add circles to the chart below to see how the Normal Equation calculates two
-featues, the bias and weight, for the corresponding regression model.
+features, the intercept and weight, for the corresponding regression model.
</p>
<div id="cf-container">
<div id="equations-container">
@@ -221,7 +221,7 @@
In research publications and statistical software, coefficients of regression
models are often presented with associated p-values. These p-values come from
traditional null hypothesis statistical tests: t-tests are used to measure whether
-a given cofficient is significantly different than zero (the null hypothesis
+a given coefficient is significantly different than zero (the null hypothesis
that a particular coefficient {@html katexify(`\\beta_i`, false)} equals zero),
while F tests are used to measure whether
<i>any</i>
16 changes: 8 additions & 8 deletions code/linear-regression/src/Components/GradientDescent.svelte
@@ -36,7 +36,7 @@
find suitable coefficients for our regression model that minimize prediction error
(remember, lower MSE equals better model).
<br /><br />
-A full conversation on gradient descent is outside the course of this article (stay-tuned
+A full conversation on gradient descent is outside the scope of this article (stay tuned
for our future article on the subject), but if you'd like to learn more, click
the "Show Math" button below. Otherwise, read on!
<br />
@@ -53,7 +53,7 @@
Gradient descent works as follows. We assume that we have some convex
function representing the error of our machine learning algorithm (in our
case, MSE). Gradient descent will iteratively update our model's
-coefficients in the direction of our error functions minimum <span
+coefficients in a direction that reduces the value of our error<span
class="info-tooltip"
title="Gradient descent won't always yield the best coefficients for our model, because it can sometimes
get stuck in local minima (as opposed to global minima). Many extensions exist to help solve this problem."
@@ -79,7 +79,7 @@
this, we'll use the gradient, which represents the direction that the function
is increasing, and the rate at which it is increasing. Since we want to find
the minimum of this function, we can go in the opposite direction of where it's
-increasing. This is exactly what Gradient Descent does, it works by taking steps
+increasing. This is exactly what Gradient Descent does: it works by taking steps
in the direction opposite of where our error function is increasing, proportional
to the rate of change. To find the coefficients that minimize the function, we
first calculate the derivatives of our error function is increasing. To find
@@ -96,9 +96,9 @@
Now that we have the gradients for our error function (with respect
to each coefficient to be updated), we perform iterative updates:
{@html katexify(
-`\\text{repeat until converge:} = \\begin{cases}
-\\beta_0 = \\beta_0 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}(y_i - \\hat{y_i})) \\\\
-\\beta_1 = \\beta_1 - \\alpha (-\\frac{2}{n} x_i\\sum^{n}_{i=1}(y_i - \\hat{y_i}))
+`\\text{repeat until convergence:} \\begin{cases}
+\\beta_0 \\Larr \\beta_0 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}(y_i - \\hat{y_i})) \\\\
+\\beta_1 \\Larr \\beta_1 - \\alpha (-\\frac{2}{n} \\sum^{n}_{i=1}x_i(y_i - \\hat{y_i}))
\\end{cases}`,
true
)}
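In plain JavaScript, the update loop above might be sketched as follows; the learning rate, step count, and `xs`/`ys` arrays are hypothetical placeholders, not the component's actual stores:

```js
// Iterative updates from the rule above:
//   beta0 <- beta0 - alpha * dMSE/dbeta0
//   beta1 <- beta1 - alpha * dMSE/dbeta1
function gradientDescent(xs, ys, alpha = 0.01, steps = 100) {
  const n = xs.length;
  let beta0 = 0; // intercept estimate
  let beta1 = 0; // weight estimate
  for (let step = 0; step < steps; step++) {
    let grad0 = 0;
    let grad1 = 0;
    for (let i = 0; i < n; i++) {
      const residual = ys[i] - (beta0 + beta1 * xs[i]); // y_i - y_hat_i
      grad0 += (-2 / n) * residual;
      grad1 += (-2 / n) * xs[i] * residual;
    }
    // Step opposite the gradient, proportional to the rate of change.
    beta0 -= alpha * grad0;
    beta1 -= alpha * grad1;
  }
  return { beta0, beta1 };
}
```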
@@ -148,10 +148,10 @@
>100 Steps</button
>
</div>
<div id="bias-slider">
<div id="intercept-slider-2">
<div class="input-container">
<p>
-Bias ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
+Intercept ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
$gdBias
)}
</p>
8 changes: 4 additions & 4 deletions code/linear-regression/src/Components/Intro.svelte
@@ -91,7 +91,7 @@
><span
class="info-tooltip"
title="The coefficient B<sub>0</sub> represents the
-intercept of our model, and each other coefficient
+intercept or bias of our model, and each other coefficient
B<sub>i</sub> (i > 0) is a slope defining how variable
x<sub>i</sub> contributes to the model. We discuss how to
interpret regression coefficients later in the article."
@@ -101,15 +101,15 @@
>
</li>
<li>
-{@html katexify(`\\epsilon`, false)}: the irreducible error in our model.
-A term that collects together all the unmodeled parts of our data.
+{@html katexify(`\\epsilon`, false)}: the residual (or "error") of our model.
+Our model will not make perfect predictions, so we compute this term by subtracting the predicted value from the actual value.
</li>
</ul>
<br />

<p class="body-text">
Fitting a linear regression model is all about finding the set of
-cofficients that best model {@html katexify(`y`, false)} as a function of our
+coefficients that best model {@html katexify(`y`, false)} as a function of our
features. We may never know the true parameters for our model, but we can estimate
them (more on this later). Once we've estimated these coefficients, {@html katexify(
`\\hat{\\beta_i}`,
15 changes: 8 additions & 7 deletions code/linear-regression/src/Components/MeanSquaredError.svelte
@@ -72,18 +72,19 @@
>
More specifically, r-squared measures the percentage of variance explained normalized
against the baseline variance of our model (which is just the variance of the
-mean):
+trivial model that always predicts the mean):
{@html katexify(
`\\begin{aligned} R^2 = 1 - \\frac{\\Sigma^{n}_{i=1}(y_i - \\hat{y_i})^2 }{\\Sigma^{n}_{i=1}(y_i - \\bar{y})^2 } \\end{aligned}`,
true
)}
The highest possible value for r-squared is 1, representing a model that captures
100% of the variance. A negative r-squared means that our model is doing worse
-(capturing less variance) than a flat line through mean of our data would.
+(capturing less variance) than a flat line through the mean of our data would. (The name
+"r-<em>squared</em>" falsely implies that it would not have a negative value.)

<br /><br />To build intuition for yourself, try changing the weight and
-bias terms below to see how the MSE and r-squared change across different
-model fits:
+intercept terms below to see how the MSE and r-squared change across different
+possible models for a toy dataset (click Shuffle Data to make a new toy dataset):
</p>
<br /><br />
<div id="mse-container">
@@ -93,10 +94,10 @@
>Shuffle Data</button
>
</div>
<div id="bias-slider">
<div id="intercept-slider">
<div class="input-container">
<p>
-Bias ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
+Intercept ({@html katexify(`\\hat{\\beta_0}`, false)}): {formatter(
$mseBias
)}
</p>
@@ -181,7 +182,7 @@
<a href="https://en.wikipedia.org/wiki/Root-mean-square_deviation">RMSE</a
>). If instead we wanted our error to reflect the linear distance between
what we predicted and what is correct, or we wanted our data minimized by
-the median, we could try something like Mean Abosulte Error (<a
+the median, we could try something like Mean Absolute Error (<a
href="https://en.wikipedia.org/wiki/Mean_absolute_error">MAE</a
>). Whatever the case, you should be thinking of your evaluation metric as
part of your modeling process, and select the best metric based on the
@@ -32,6 +32,8 @@
$lineType = "regressionLineFlat";
$showRegressionLine = true;
$showResiduals = true;
+$coeff = 0;
+$intercept = 293683;
},

2: () => {
@@ -169,7 +171,7 @@
Once we've fit our model, predicting future values is super easy! We
just plug in any {@html katexify(`x_i`, false)} values into our equation!
<br /><br />For our simple model, that means plugging in a value for
-{@html katexify(`sqft`, false)} into our model:
+{@html katexify(`sqft`, false)} into our model (try adjusting the slider):
</p>
<br />
<div id="input-container">
6 changes: 3 additions & 3 deletions code/linear-regression/src/Components/Tab_Binary.svelte
@@ -174,9 +174,9 @@
class="dot-without"
/>) and houses with swimming pools (<span class="dot-with" />).
<br /><br /> The intercept, {formatter(Math.round(intercept))}, is the average
-predicted price for houses that do not have swimming pools (to see this,
-simply set {@html katexify(`pool`, false)} to 0 and solve the equation).
-To find the average price predicted price for houses with pools, we simply plug
+predicted price for houses that <em>do not</em> have swimming pools (to see this,
+set {@html katexify(`pool`, false)} to 0 and simplify the equation).
+To find the average predicted price for houses that <em>do</em> have pools, we plug
in {@html katexify(`pool=1`, false)} to obtain
{formatter(
Math.round(intercept)
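To spell out that arithmetic, a toy version with made-up coefficients:

```js
// pool is 0 or 1, so the model collapses to two group means.
const intercept = 250000; // hypothetical average price, no pool
const poolCoeff = 40000;  // hypothetical price bump for a pool
const predict = (pool) => intercept + poolCoeff * pool;
predict(0); // 250000: average predicted price without a pool
predict(1); // 290000: average predicted price with a pool
```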
6 changes: 3 additions & 3 deletions code/linear-regression/src/Components/Tab_Interaction.svelte
@@ -175,10 +175,10 @@
`sqft`,
false
)} should differ between houses that do have pools and houses that do not, we can
-add an interaction term to our model, {@html katexify(`(sqft:pool) `, false)}.
+add an interaction term to our model, {@html katexify(`(sqft*pool) `, false)}.
<br /><br />
The coefficient of the interaction term {@html katexify(
-`(sqft:pool)`,
+`(sqft*pool)`,
false
)}, {formatter(Math.round(slopeInteraction))}, represents the difference in
the slope for {@html katexify(`sqft`, false)}, comparing houses that do and
@@ -191,7 +191,7 @@
housing price for houses with no pools and a square-footage of zero.<sup
><span
class="info-tooltip"
title="Because this value doesn't make much intuitive sense, it's common for the features to be centered at zero."
title="Because this value doesn't make much intuitive sense, it's common to preprocess the data so that the features are centered at zero."
use:tooltip
>
[&#8505;]
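A sketch of how the interaction term changes the slope, again with invented coefficients:

```js
// With an interaction term, the sqft slope depends on pool:
//   price = b0 + b1*sqft + b2*pool + b3*(sqft*pool)
const b0 = 100000; // hypothetical intercept
const b1 = 120;    // hypothetical sqft slope for houses without pools
const b2 = 30000;  // hypothetical pool offset
const b3 = 25;     // hypothetical interaction: extra sqft slope for pools
const predict = (sqft, pool) =>
  b0 + b1 * sqft + b2 * pool + b3 * sqft * pool;
// Slope for houses without pools: b1 = 120 per sqft.
// Slope for houses with pools: b1 + b3 = 145 per sqft.
```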