**References**

**Notes**

**Summary**:

**Key points**:

**Independent** means every $(x_i, y_i)$ is independent of every other $(x_j, y_j)$, $j \neq i$.

**Identically distributed** means every $(x_i, y_i)$ comes from the same distribution.

One assumption we need to make here is the Independent and Identically Distributed (i.i.d.) assumption.

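Under the i.i.d. assumption, the probability of the dataset $\mathcal{D} = \{(x_i, y_i)\}$ factorizes as $p(\mathcal{D})=\prod_{i} p\left(x_{i}, y_{i}\right) = \prod_{i} p\left(x_{i}\right) p\left(y_{i} \mid x_{i}\right)$. The first equality follows from the i.i.d. assumption; the second is the product rule $p(x, y) = p(x)\, p(y \mid x)$ applied to each pair.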
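
The factorization can be checked numerically on a small example. The sketch below is illustrative only: the tables `p_x` and `p_y_given_x`, the dataset `data`, and both function names are made-up assumptions, not part of these notes. It computes $p(\mathcal{D})$ as a product over pairs, and also the equivalent sum of log-probabilities, which is usually preferred because long products of small numbers underflow.

```python
import math

# Hypothetical toy model (numbers chosen for illustration only):
# p(x) for x in {0, 1}, and p(y | x) for y in {0, 1}.
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {
    0: {0: 0.75, 1: 0.25},
    1: {0: 0.25, 1: 0.75},
}


def dataset_probability(dataset):
    """p(D) = prod_i p(x_i) * p(y_i | x_i), assuming the pairs are i.i.d."""
    prob = 1.0
    for x, y in dataset:
        prob *= p_x[x] * p_y_given_x[x][y]
    return prob


def dataset_log_probability(dataset):
    """log p(D) = sum_i [log p(x_i) + log p(y_i | x_i)]."""
    return sum(
        math.log(p_x[x]) + math.log(p_y_given_x[x][y])
        for x, y in dataset
    )


# A hypothetical dataset of (x, y) pairs.
data = [(0, 0), (1, 1), (0, 1)]

print(dataset_probability(data))
print(dataset_log_probability(data))
```

Exponentiating the log-probability recovers the product, up to floating-point error. Without the i.i.d. assumption, neither function would be a valid way to compute $p(\mathcal{D})$, because the joint would not split into per-pair terms.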