-
Understanding Transformers and GPT: An In-depth Overview
In this post, we work through the technical details of the widely used transformer architecture, deriving every formula in its forward and backward passes step by step. With these derivations in hand, we can implement both passes ourselves, and such hand-written implementations often run more efficiently than autograd-based ones. We also describe how the popular GPT-3 model is constructed on top of the transformer architecture.
-
Machine Learning Fundamentals
Supplementary materials for the book "Machine Learning Fundamentals", published by Cambridge University Press.