- [i] Instance-Based Learning
	- Simply stores the training examples, i.e., feature-label pairs.
		- In contrast to constructing a general, explicit description of the target function from the training examples.
	- Delayed/Lazy Learning
		- Generalizes only when an unseen data sample needs to be classified.
		- Each time an unseen sample is encountered, its relationship to the previously stored examples is examined to assign a target function value.
			- The relationship can be estimated locally.
- $k$-Nearest Neighbor Learning
	- The most basic instance-based method.
	- Inputs are numeric:
		- Each data point is $n$-dimensional, lying in the space $\mathbb{R}^n$.
	- Define "nearest" neighbors in terms of Euclidean distance.
Given:
- Two arbitrary instances $x_i, x_j \in X = \{x_1, x_2, \cdots, x_k\}$,
	- where $x_i, x_j$ are $n$-dimensional data points:
		- $x_i=\begin{bmatrix}x_{i1} & x_{i2} & \cdots & x_{in} \end{bmatrix}$, $x_j=\begin{bmatrix}x_{j1} & x_{j2} & \cdots & x_{jn} \end{bmatrix}$

Do:
- The Euclidean distance between $x_i$ and $x_j$ is: $d(x_i,x_j)=\sqrt{\sum_{r=1}^{n}(x_{ir}-x_{jr})^2}$
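As a quick sketch, the distance formula above maps directly to a few lines of Python (the function name here is illustrative, not from the notes):

```python
import math

def euclidean_distance(x_i, x_j):
    """d(x_i, x_j) = sqrt(sum_r (x_ir - x_jr)^2) for two n-dimensional points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

# Example with two 3-dimensional points.
print(euclidean_distance((6, 4, 2), (3, 7, 3)))  # sqrt(19), about 4.359
```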
Objective:
- Learn a discrete-valued target function
	- of the form $f:\mathbb{R}^n\rightarrow Y$, where
		- $Y=\{y_1,y_2,\cdots,y_s\}$ is the set of target classes.
Given:
- Training data-label pairs:
	- $D=\{\langle\mathbf{x}_{1},f(\mathbf{x}_{1})\rangle,\langle\mathbf{x}_{2},f(\mathbf{x}_{2})\rangle,\cdots,\langle\mathbf{x}_{m},f(\mathbf{x}_{m})\rangle\}\subset X\times Y$, where
		- $X$ is the set of training instances:
			- $X=\{x_1, x_2, \cdots, x_m\}\subset \mathbb{R}^n$, where
				- $\forall i\in [1,m],\ x_i\in\mathbb{R}^n$.
		- $Y$ is a set of classes:
			- $Y=\{y_1,y_2,\cdots,y_s\}$.
		- $f$ is a mapping or assignment function from any training sample to a class:
			- $f:X\mapsto Y$.
- A sample query instance $x_q$ to be classified:
	- $\mathbf{x}_q=\begin{bmatrix}x_{q1} & x_{q2} & \cdots & x_{qn}\end{bmatrix} \in \mathbb{R}^n$.
- A constant $k$.

Do:
- Let $\{x_1,x_2,\cdots,x_k\}$ be the $k$ instances from the training examples that are nearest to $x_q$.

Output:
- $\hat{f}(x_q)\leftarrow \text{argmax}_{y\in Y} \sum_{i=1}^{k}\delta(y,f(x_i))$, where
	- $\delta(y,f(x_i)) = \begin{cases}1, & \text{if } f(x_i)=y \\ 0, & \text{if } f(x_i)\neq y\end{cases}$
	- This gives the most common value (class) among the $k$ samples.
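A minimal sketch of the majority-vote rule above in Python (the function name and toy data are illustrative, not from the notes):

```python
import math
from collections import Counter

def knn_classify(D, x_q, k):
    """Return the majority class among the k training examples nearest to x_q.

    D is a list of (point, label) pairs; distance is Euclidean.
    Implements f_hat(x_q) = argmax_y sum_i delta(y, f(x_i)).
    """
    neighbors = sorted(D, key=lambda pair: math.dist(pair[0], x_q))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy 2-D example with two classes.
D = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b"), ((1, 0), "a")]
print(knn_classify(D, (0.5, 0.5), 3))  # "a": all 3 nearest neighbors are class "a"
```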
Objective:
- Learn a continuous-valued target function
	- of the form $f:\mathbb{R}^n\rightarrow \mathbb{R}$, where
		- the output $y\in\mathbb{R}$ is a real value, i.e., a scalar.
Given:
- Training data-label pairs:
	- $D=\{\langle\mathbf{x}_{1},y_{1}\rangle,\langle\mathbf{x}_{2},y_{2}\rangle,\cdots,\langle\mathbf{x}_{m},y_{m}\rangle\}\subset X\times \mathbb{R}$, where
		- $X$ is the set of training instances:
			- $X=\{x_1, x_2, \cdots, x_m\}\subset \mathbb{R}^n$, where
				- $\forall i\in [1,m],\ x_i\in\mathbb{R}^n$.
- A sample query instance $x_q$ whose value is to be predicted:
	- $\mathbf{x}_q=\begin{bmatrix}x_{q1} & x_{q2} & \cdots & x_{qn}\end{bmatrix} \in \mathbb{R}^n$.
- A constant $k$.

Do:
- Let $\{x_1,x_2,\cdots,x_k\}$ be the $k$ instances from the training examples that are nearest to $x_q$.

Output:
- $\hat{f}(x_q)\leftarrow\dfrac{\sum_{i=1}^{k}f(x_i)}{k}$
	- That is, the simple mean of the values of the neighbors.
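The regression variant is the same neighbor search followed by a plain average. A sketch (names and toy data are illustrative):

```python
import math

def knn_regress(D, x_q, k):
    """Predict f_hat(x_q) as the simple mean of the k nearest training labels."""
    neighbors = sorted(D, key=lambda pair: math.dist(pair[0], x_q))[:k]
    return sum(y for _, y in neighbors) / k

# Toy 1-D example: the far-away outlier at x=10 is ignored for k=3.
D = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0), ((10,), 100.0)]
print(knn_regress(D, (1.5,), 3))  # mean of 1.0, 2.0, 3.0 = 2.0
```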
- Distance-Weighted $k$-NN
	- Weight the contribution of each of the $k$ neighbors according to its distance to the query point $x_q$:
		- closer neighbors = greater weights.
	- For classification:
		- $\hat{f}(x_q)\leftarrow\text{argmax}_{y\in Y}\sum_{i=1}^{k}w_{i}\,\delta(y,f(\mathbf{x}_{i}))$
		- $=\text{argmax}_{y\in Y}\sum_{i=1}^{k}\dfrac{\delta(y,f(\mathbf{x}_{i}))}{d(\mathbf{x}_{q},\mathbf{x}_{i})^2}$, where
			- $w_i=\dfrac{1}{d(x_q,x_i)^2}=\dfrac{1}{\sum_{j=1}^{n}(x_{ij}-x_{qj})^2}$
	- For regression, the weighted mean:

$$\hat{f}(\mathbf{x}_{q})\leftarrow \dfrac{\sum_{i=1}^{k}w_{i}f(\mathbf{x}_{i})}{\sum_{i=1}^{k}w_{i}}$$
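A sketch of the weighted-mean rule, assuming a query that coincides with a stored point should just return that point's label (the formula's weight $1/d^2$ is undefined at $d=0$; names and toy data are illustrative):

```python
import math

def weighted_knn_regress(D, x_q, k):
    """Distance-weighted k-NN regression with w_i = 1 / d(x_q, x_i)^2.

    Returns sum(w_i * y_i) / sum(w_i) over the k nearest neighbors.
    If x_q coincides with a training point, return that point's label
    directly to avoid division by zero.
    """
    neighbors = sorted(D, key=lambda pair: math.dist(pair[0], x_q))[:k]
    for x_i, y_i in neighbors:
        if math.dist(x_i, x_q) == 0.0:
            return y_i
    weights = [1.0 / math.dist(x_i, x_q) ** 2 for x_i, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)

# Both neighbors of (1,) sit at distance 1, so the weights are equal
# and the result reduces to the plain mean (0.0 + 4.0) / 2.
D = [((0,), 0.0), ((2,), 4.0), ((10,), 9.0)]
print(weighted_knn_regress(D, (1,), 2))  # 2.0
```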
Given:
- A set of values:

| ID | $x_{i1}$ | $x_{i2}$ | $x_{i3}$ | label |
|---|---|---|---|---|
| 1 | 6 | 4 | 2 | 1 |
| 2 | 2 | 8 | 3 | 9 |
| 3 | 9 | 2 | 1 | 5 |
| 4 | 3 | 8 | 6 | 1 |
| 5 | 4 | 2 | 9 | 8 |

- A query $\mathbf{x}_{q}=(3,7,3)$
Q1. Calculate the weights $w_{i}=\dfrac{1}{d^2(\mathbf{x}_{i},\mathbf{x}_{q})}$ and classify $\mathbf{x}_{q}$ by distance-weighted vote:

$d^2(\mathbf{x}_{1},\mathbf{x}_{q})=(6-3)^2+(4-7)^2+(2-3)^2=19, \ w_{1}=\dfrac{1}{19}$

$d^2(\mathbf{x}_{2},\mathbf{x}_{q})=(2-3)^2+(8-7)^2+(3-3)^2=2, \ w_{2}=\dfrac{1}{2}$

$d^2(\mathbf{x}_{3},\mathbf{x}_{q})=(9-3)^2+(2-7)^2+(1-3)^2=65, \ w_{3}=\dfrac{1}{65}$

$d^2(\mathbf{x}_{4},\mathbf{x}_{q})=(3-3)^2+(8-7)^2+(6-3)^2=10, \ w_{4}=\dfrac{1}{10}$

$d^2(\mathbf{x}_{5},\mathbf{x}_{q})=(4-3)^2+(2-7)^2+(9-3)^2=62, \ w_{5}=\dfrac{1}{62}$
Therefore, the vote is won by class $9$: its total weight $w_{2}=\dfrac{1}{2}$ exceeds that of class $1$ ($w_{1}+w_{4}=\dfrac{1}{19}+\dfrac{1}{10}\approx 0.153$), class $5$ ($w_{3}=\dfrac{1}{65}$), and class $8$ ($w_{5}=\dfrac{1}{62}$), so $\hat{f}(\mathbf{x}_{q})=9$.
Q2. Calculate the distance-weighted mean of the labels:

$\hat{f}(\mathbf{x}_{q})=\dfrac{\sum_{i=1}^{5}w_{i}y_{i}}{\sum_{i=1}^{5}w_{i}}=\dfrac{\frac{1}{19}\cdot 1+\frac{1}{2}\cdot 9+\frac{1}{65}\cdot 5+\frac{1}{10}\cdot 1+\frac{1}{62}\cdot 8}{\frac{1}{19}+\frac{1}{2}+\frac{1}{65}+\frac{1}{10}+\frac{1}{62}}\approx\dfrac{4.859}{0.684}\approx 7.10$
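Assuming all five points are used (i.e., $k=5$), the worked numbers above can be checked with a short script:

```python
data = [  # (features, label) from the table above
    ((6, 4, 2), 1),
    ((2, 8, 3), 9),
    ((9, 2, 1), 5),
    ((3, 8, 6), 1),
    ((4, 2, 9), 8),
]
x_q = (3, 7, 3)

# Squared distances and weights w_i = 1 / d^2.
d2 = [sum((a - b) ** 2 for a, b in zip(x, x_q)) for x, _ in data]
w = [1 / d for d in d2]
print(d2)  # [19, 2, 65, 10, 62]

# Distance-weighted vote (classification).
votes = {}
for (_, y), wi in zip(data, w):
    votes[y] = votes.get(y, 0.0) + wi
print(max(votes, key=votes.get))  # 9, carried by w_2 = 1/2

# Distance-weighted mean (regression).
print(sum(wi * y for (_, y), wi in zip(data, w)) / sum(w))  # roughly 7.10
```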