**References**

**Notes**

**Summary**:

**Key points**:

The loss function quantifies how bad a given parameter setting $\theta$ is.

We want the least bad $\theta$: the one that minimizes the loss.

Negative Log-Likelihood (NLL) is sometimes also called Cross-Entropy.
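A minimal numeric sketch of why the two names coincide (toy probabilities, assumed for illustration): for a one-hot target, the cross-entropy between the target distribution and the model's predicted distribution reduces to the negative log-probability of the true class.

```python
import numpy as np

p = np.array([0.1, 0.7, 0.2])   # model's predicted class probabilities (toy values)
y = 1                           # index of the true class
y_onehot = np.eye(3)[y]         # one-hot target distribution

# Cross-entropy between the one-hot target and the prediction...
cross_entropy = -np.sum(y_onehot * np.log(p))
# ...equals the negative log-likelihood of the true class.
nll = -np.log(p[y])

assert np.isclose(cross_entropy, nll)
```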

**Examples**:

$-\sum_{i} \log p_{\theta}\left(y_{i} \mid x_{i}\right)$ — negative log-likelihood

$-\sum_{i} \delta\left(f_{\theta}\left(x_{i}\right)=y_{i}\right)$ — negative number of correct predictions (zero-one loss)

$\sum_{i} \frac{1}{2}\left|f_{\theta}\left(x_{i}\right)-y_{i}\right|^{2}$ — squared error
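The three example losses can be evaluated directly on toy data (all values below are hypothetical, chosen only to make the sums concrete):

```python
import numpy as np

# Toy classification data: predicted class probabilities p_theta(y|x)
# for 3 examples over 2 classes, and the true labels.
probs = np.array([[0.7, 0.3], [0.2, 0.8], [0.6, 0.4]])
y_cls = np.array([0, 1, 1])
preds = probs.argmax(axis=1)          # f_theta(x): predicted class

# Negative log-likelihood of the true classes.
nll = -np.sum(np.log(probs[np.arange(3), y_cls]))

# Zero-one loss: minus the number of correct predictions.
neg_correct = -np.sum(preds == y_cls)

# Toy regression data: model outputs f_theta(x) and targets y.
f = np.array([1.2, 0.9, 2.1])
y_reg = np.array([1.0, 1.0, 2.0])
sq_err = np.sum(0.5 * (f - y_reg) ** 2)
```

All three decrease as the model's predictions improve, which is what makes each one a usable loss for comparing values of $\theta$.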

Mean Squared Error is in fact a special case of Negative Log-Likelihood: it is the NLL of a Gaussian likelihood with fixed variance.
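A short derivation of this equivalence, assuming a unit-variance Gaussian likelihood (the standard assumption, not stated explicitly in these notes):

```latex
% Assume p_theta(y | x) = N(y; f_theta(x), I).
-\log p_{\theta}\left(y_{i} \mid x_{i}\right)
  = -\log \mathcal{N}\!\left(y_{i};\, f_{\theta}\left(x_{i}\right),\, I\right)
  = \frac{1}{2}\left|f_{\theta}\left(x_{i}\right) - y_{i}\right|^{2} + \text{const}
```

The constant does not depend on $\theta$, so minimizing the squared error and minimizing the NLL under this Gaussian model pick out the same $\theta$.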


The loss function lets us compare models within the model class: it tells us whether one choice of $\theta$ is better than another.