
Supervised Learning

Notes


In supervised learning, given $\mathcal{D}=\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, the objective is to learn $f_{\theta}(x) \approx y$.
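As a minimal sketch of this objective (assuming a 1-D linear model and squared error, neither of which the note specifies), gradient descent can fit $\theta$ so that $f_{\theta}(x_i) \approx y_i$ on a toy dataset:

```python
import numpy as np

# Hypothetical toy dataset D = {(x_i, y_i)}; values are made up for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.2])

# Assume a linear model f_theta(x) = w * x + b, with theta = (w, b).
w, b = 0.0, 0.0
lr = 0.05  # learning rate

for _ in range(500):
    pred = w * x + b   # f_theta(x)
    err = pred - y     # f_theta(x) - y
    # Gradient of the mean squared error 0.5 * mean((f_theta(x) - y)^2)
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)

print(w, b)  # roughly w ≈ 2, b ≈ 1, so f_theta(x) ≈ y on D
```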

Generally, our goal is to predict $y$ given some $x$.

However, predicting a single label directly is difficult because the real world contains many boundary cases.

Predicting probabilities

So we use probabilities to represent the likelihood that an input falls into a certain category.

Predicting probabilities instead of labels can make training easier, due to smoothness:

Intuitively, a discrete label cannot be changed by a little bit: it is all or nothing, whereas a probability can shift gradually.
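A small sketch of this point, assuming a logistic model (my choice of example, not from the note): the 0-1 loss on a hard label is flat almost everywhere, so it gives no gradient, while cross-entropy on a predicted probability changes smoothly with the parameter:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 2.0, 1.0     # one toy example with true label y = 1
w = -0.5            # current parameter

p = sigmoid(w * x)  # predicted probability p_theta(y=1 | x)

# 0-1 loss on the hard label: flat in w almost everywhere -> zero gradient.
hard_pred = 1 if p >= 0.5 else 0
zero_one_loss = float(hard_pred != y)

# Cross-entropy on the probability: smooth in w, gives a useful gradient.
cross_entropy = -np.log(p)
grad_w = (p - y) * x  # d(cross-entropy)/dw for the logistic model

print(zero_one_loss, cross_entropy, grad_w)
```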

Given $\mathcal{D}=\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, the objective is to learn $p_{\theta}(y \mid x)$.

$x$ is a random variable representing the input.

$x$ is a random variable because we do not know what $x$ we will get. There is some true underlying process in the real world that gives rise to different $x$'s.

$y$ is a random variable representing the output.

$p(x, y) = p(x)\, p(y \mid x)$ by the product rule of probability.

$\displaystyle p(y \mid x)=\frac{p(x, y)}{p(x)}$ by the definition of conditional probability.
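These two identities can be checked numerically on a small made-up joint distribution (the numbers below are arbitrary, for illustration only):

```python
import numpy as np

# Hypothetical joint p(x, y): rows index x in {0, 1}, columns index y in {0, 1}.
p_xy = np.array([[0.1, 0.3],
                 [0.2, 0.4]])

p_x = p_xy.sum(axis=1)             # marginal p(x)
p_y_given_x = p_xy / p_x[:, None]  # conditional p(y | x) = p(x, y) / p(x)

# Product rule: p(x) * p(y | x) recovers the joint.
assert np.allclose(p_x[:, None] * p_y_given_x, p_xy)
print(p_y_given_x)
```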

This is a Discriminative Model, because the goal is to discriminate between different $y$'s.

When predicting probabilities, instead of representing the output by an object label, we represent it as a probability: what is the likelihood of this object falling into this category?
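For example (a sketch assuming a softmax over class scores, which the note does not spell out), a model can map $x$ to a probability for each category rather than a single label:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical class scores f_theta(x) for 3 categories.
scores = np.array([2.0, 0.5, -1.0])

p = softmax(scores)  # p_theta(y | x) over the 3 categories
print(p, p.sum())    # probabilities, summing to 1
```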
