Machine Learning Garden

Transformer

References

Tags: concept

Sources:

Updates:

Notes

Summary:

Key points:

The paper "Understanding the Difficulty of Training Transformers" discusses problems that arise in Transformer training and proposes solutions to them.

Referenced in

BERT

BERT is a bidirectional Transformer encoder. It is also a language model that has been widely studied and used since its paper was published.
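
As a rough illustration of what "bidirectional" means here, the sketch below builds the attention pattern of an encoder where every position may attend to every other position, and contrasts it with a left-to-right (causal) pattern. The function name `attention_mask` and its parameters are hypothetical, chosen only for this note; this is not BERT's actual implementation.

```python
def attention_mask(seq_len, bidirectional):
    """Return a seq_len x seq_len matrix of booleans.

    mask[i][j] is True when position i is allowed to attend to position j.
    (Hypothetical helper for illustration only.)
    """
    return [
        # Bidirectional: every position sees every position.
        # Causal: position i sees only positions j <= i.
        [bidirectional or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


bidir = attention_mask(3, bidirectional=True)
causal = attention_mask(3, bidirectional=False)

for name, mask in (("bidirectional", bidir), ("causal", causal)):
    print(name)
    for row in mask:
        # Print 1 for "may attend", 0 for "masked out".
        print(" ".join("1" if allowed else "0" for allowed in row))
```

In the bidirectional case every row is all ones, so each token's representation can draw on context from both its left and its right.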

Transformer