**References**

**Notes**

**Summary**:

**Key points**:

Models that learn the joint distribution $p(x, y)$ are called generative models, because such a model can be used to generate samples of $x$.

If we can learn $p(x, y)$, we can recover $p(y \mid x)$ from the definition of conditional probability: divide the joint by the marginal $p(x) = \sum_{y} p(x, y)$.
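As a rough illustration of both points, the sketch below stores a small joint distribution as a table, recovers $p(y \mid x) = p(x, y) / p(x)$ with $p(x) = \sum_{y} p(x, y)$, and draws a sample of $x$ from the marginal. The probability values, the variable values, and the helper names (`marginal_x`, `conditional_y_given_x`, `sample_x`) are made up for this example and assume discrete $x$ and $y$.

```python
import random

# Hypothetical joint distribution p(x, y), keyed by (x, y).
# The numbers are invented for illustration and sum to 1.
joint = {
    (0, "a"): 0.25,
    (0, "b"): 0.05,
    (1, "a"): 0.10,
    (1, "b"): 0.60,
}


def marginal_x(joint):
    """Compute p(x) by summing p(x, y) over all y."""
    p_x = {}
    for (x, _y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x


def conditional_y_given_x(joint, x):
    """Compute p(y | x) = p(x, y) / p(x) for a fixed x."""
    p_x = marginal_x(joint)[x]
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}


def sample_x(joint, rng):
    """Draw one x from the marginal p(x) implied by the joint."""
    p_x = marginal_x(joint)
    values = list(p_x)
    weights = [p_x[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


# p(y | x = 0), recovered from the joint alone.
p_y_given_0 = conditional_y_given_x(joint, 0)

# A generated value of x, using a seeded generator for repeatability.
generated_x = sample_x(joint, random.Random(0))
```

Here `p_y_given_0["a"]` works out to $0.25 / 0.30$, and the conditional probabilities for a fixed $x$ sum to 1. A model that only learned $p(y \mid x)$ would have no marginal $p(x)$ to sample from, which is why the ability to generate $x$ is tied to learning the joint.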
