Models that learn p(x,y)p(x, y)p(x,y) are called Generative Model because such a model can learn to generate xxx.
If we can learn p(x,y)p(x, y)p(x,y), we can recover p(y∣x)p(y \mid x)p(y∣x) from the definition of Conditional Probability.