References
Tags: concept
Sources:
Related notes:
Updates:
April 19th, 2021: add lecture notes from CS 182: Lecture 2, Part 2: Machine Learning Basics.
April 18th, 2021: created note.
Notes {{word-count}}
Summary:
Key points:
In Supervised Learning, given $\mathcal{D}=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \ldots,\left(x_{n}, y_{n}\right)\right\}$, the objective is to learn $f_{\theta}(x) \approx y$.
Generally, our goal is to predict $y$ given some $x$.
However, predicting a single hard label is difficult because there are many boundary cases in the real world.
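As a concrete illustration of this setup, here is a minimal sketch, assuming a linear model $f_{\theta}(x)=\theta_{1} x+\theta_{0}$ fit by least squares; the toy data and model choice are assumptions for the example, not from the lecture.
```python
import numpy as np

# Toy supervised dataset D = {(x_i, y_i)}: inputs and noisy targets.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
y = 2.0 * x + 0.5 + rng.normal(scale=0.1, size=50)

# Fit f_theta(x) = theta_1 * x + theta_0 by least squares.
X = np.stack([x, np.ones_like(x)], axis=1)
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict y for a new input x.
x_new = 0.3
y_pred = theta[0] * x_new + theta[1]
print(f"f_theta({x_new}) = {y_pred:.3f}")
```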
Predicting probabilities
So we use probabilities to represent how likely it is that an input falls into each category.
Predicting probabilities instead of hard labels can make training easier because probabilities are smooth: small changes in the model's parameters lead to small changes in the predicted probabilities, and therefore in the loss.
Intuitively, a discrete label cannot change by a little bit; it is all or nothing, so it gives no gradual signal to follow during training.
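A small sketch of this smoothness argument (the logistic model and the specific numbers are assumptions for illustration): nudging the parameter barely moves the cross-entropy loss on predicted probabilities, while the 0-1 loss on hard labels jumps discontinuously.
```python
import numpy as np

def predict_prob(theta, x):
    """Probability of class 1 under a logistic model: p(y=1|x) = sigmoid(theta * x)."""
    return 1.0 / (1.0 + np.exp(-theta * x))

x, y = 1.0, 1  # one example whose true label is 1

for theta in [-0.02, -0.01, 0.0, 0.01, 0.02]:
    p = predict_prob(theta, x)
    cross_entropy = -np.log(p)          # changes smoothly with theta
    zero_one = int((p >= 0.5) != y)     # jumps from 1 to 0 at theta = 0
    print(f"theta={theta:+.2f}  p={p:.4f}  CE={cross_entropy:.4f}  0-1={zero_one}")
```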
Given $\mathcal{D}=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \ldots,\left(x_{n}, y_{n}\right)\right\}$, the objective is now to learn $p_{\theta}(y \mid x)$.
$x$ is a Random Variable representing the input.
$x$ is a random variable because we do not know what we will get: there is some true underlying process in the real world that gives rise to different $x$'s.
$y$ is a Random Variable representing the output.
$p(y \mid x)=\frac{p(x, y)}{p(x)}$ by the definition of Conditional Probability.
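To make the definition concrete, here is a tiny worked example; the joint distribution over two binary variables is made up for illustration.
```python
# Hypothetical joint distribution p(x, y) over binary x and y.
joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.20, (1, 1): 0.40,
}

# Marginal: p(x) = sum over y of p(x, y).
def p_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

# Conditional: p(y | x) = p(x, y) / p(x).
def p_y_given_x(y, x):
    return joint[(x, y)] / p_x(x)

print(p_y_given_x(1, 1))  # 0.40 / 0.60 = 0.666...
```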
When predicting probabilities, instead of representing the output as a single object label, we represent it as a distribution over categories: how likely is it that this object falls into each category?
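As a sketch of what such an output looks like (the category names and raw scores are assumptions for the example), a softmax turns a model's raw scores into a probability distribution over categories:
```python
import numpy as np

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    z = np.exp(scores - np.max(scores))  # subtract max for numerical stability
    return z / z.sum()

labels = ["cat", "dog", "bird"]       # hypothetical categories
scores = np.array([2.0, 1.0, -1.0])   # hypothetical model outputs

for label, p in zip(labels, softmax(scores)):
    print(f"p({label} | x) = {p:.3f}")
```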