Machine Learning Garden

Maximum Likelihood Estimation (MLE)

References

Tags: concept

Sources:

Related notes:

Machine Learning

Updates:

April 20th, 2021: created note.

Notes

Summary:

Key points:

We want to learn $p_\theta(y \mid x)$, a model with parameters $\theta$ that approximates the true distribution $p(y \mid x)$.

A good model should make the data look probable.

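Assuming the examples in the dataset $\mathcal{D}=\left\{\left(x_{i}, y_{i}\right)\right\}$ are independent, we choose $\theta$ such that $p(\mathcal{D})=\prod_{i} p\left(x_{i}\right) p_{\theta}\left(y_{i} \mid x_{i}\right)$ is maximized.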

However, one numerical problem here is that we are multiplying together many numbers less than one, so the product can underflow to zero in floating-point arithmetic.

To solve the problem, we can take the $\log$, which converts the product into a sum. Because $\log$ is monotonically increasing, the maximizing $\theta$ is unchanged.
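The underflow issue can be seen in a minimal sketch like the following. The list `probs` is made-up illustrative data, not something from the note: 400 per-example likelihoods of 0.01 each.

```python
import math

# Hypothetical per-example likelihoods, each less than one.
probs = [0.01] * 400

# Naive product: mathematically 1e-800, far below the smallest
# positive float64 (about 5e-324), so it underflows.
product = 1.0
for p in probs:
    product *= p

# Sum of logs: the same quantity in log space stays well within range.
log_likelihood = sum(math.log(p) for p in probs)

print(product)         # 0.0
print(log_likelihood)  # about -1842.07
```

The product carries no usable information once it hits zero, while the log-space sum still distinguishes better parameters from worse ones.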

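$\log p(\mathcal{D})=\sum_{i}\left[\log p\left(x_{i}\right)+\log p_{\theta}\left(y_{i} \mid x_{i}\right)\right]=\sum_{i} \log p_{\theta}\left(y_{i} \mid x_{i}\right)+\text{const}$

The term $\sum_{i} \log p\left(x_{i}\right)$ does not depend on $\theta$, so it is a constant as far as the optimization is concerned.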

$\theta^{\star} \leftarrow \arg \max _{\theta} \sum_{i} \log p_{\theta}\left(y_{i} \mid x_{i}\right)$
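As one possible illustration of the arg max above, the sketch below assumes a specific model that the note does not prescribe: $p_\theta(y \mid x)$ is a Gaussian with mean $\theta x$ and unit variance. The dataset `data`, the grid `candidates`, and the function name `log_likelihood` are all hypothetical. The objective is evaluated on a grid of $\theta$ values and the best one is kept.

```python
import math

# Hypothetical toy dataset of (x_i, y_i) pairs.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def log_likelihood(theta, data):
    """Sum of log p_theta(y_i | x_i), assuming y ~ Normal(theta * x, 1)."""
    total = 0.0
    for x, y in data:
        residual = y - theta * x
        total += -0.5 * math.log(2.0 * math.pi) - 0.5 * residual ** 2
    return total

# Crude arg max: evaluate the objective on a grid of candidate thetas.
candidates = [i / 1000.0 for i in range(0, 4001)]
theta_star = max(candidates, key=lambda t: log_likelihood(t, data))

# For this assumed model the maximizer also has a closed form
# (least squares through the origin), which serves as a sanity check.
closed_form = sum(x * y for x, y in data) / sum(x * x for x, _ in data)

print(theta_star, closed_form)
```

Grid search is only practical for one or two parameters; it is used here because it mirrors the arg max definition directly.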

This can also be formulated as a minimization problem.

$\theta^{\star} \leftarrow \arg \min _{\theta}-\sum_{i} \log p_{\theta}\left(y_{i} \mid x_{i}\right)$

The objective being minimized here is called the Negative Log-Likelihood (NLL).
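The minimization form is what gradient-based training typically works with. The sketch below is one way to minimize an NLL, under assumptions not stated in the note: a one-parameter logistic model $p_\theta(y=1 \mid x)=\sigma(\theta x)$, plain gradient descent, and a made-up dataset. The names `nll`, `nll_grad`, and `learning_rate` are illustrative.

```python
import math

# Hypothetical binary-labelled data: (x_i, y_i) with y_i in {0, 1}.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.5, 0), (1.0, 1), (2.0, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(theta, data):
    """Negative log-likelihood, assuming p_theta(y=1 | x) = sigmoid(theta * x)."""
    total = 0.0
    for x, y in data:
        p = sigmoid(theta * x)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def nll_grad(theta, data):
    """Derivative of the NLL with respect to theta: sum of (p_i - y_i) * x_i."""
    return sum((sigmoid(theta * x) - y) * x for x, y in data)

# Plain gradient descent on the NLL, starting from theta = 0.
theta = 0.0
learning_rate = 0.1
history = [nll(theta, data)]
for _ in range(200):
    theta -= learning_rate * nll_grad(theta, data)
    history.append(nll(theta, data))

print(theta, history[0], history[-1])
```

The recorded `history` should decrease from its starting value as $\theta$ moves toward the minimizer; in practice an optimizer from a numerical library would usually replace the hand-written loop.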

