Jiang's Blogs

Technology blogs on machine learning and artificial intelligence

A Deterministic View of Diffusion Models

In this post, we present a deterministic perspective on diffusion models. In this view, neural networks are trained to invert the deterministic diffusion mapping that progressively corrupts images at each time step. This simplifies the derivation of diffusion models, allowing us to explain and derive them fully with only a few straightforward equations.

14 min read · November 11, 2024

2024 · computer-vision machine-learning AI
Understanding Transformers and GPT: An In-depth Overview

In this post, we delve into the technical details of the widely used transformer architecture by deriving, step by step, all formulas involved in its forward and backward passes. This lets us implement these passes ourselves, often more efficiently than with autograd. We also describe how the popular GPT-3 model is constructed from the transformer architecture.

14 min read · March 05, 2023

2023 · machine-learning AI NLP
Machine Learning Fundamentals

Supplementary materials for the book "Machine Learning Fundamentals", published by Cambridge University Press.

2 min read · January 21, 2022

2022 · machine-learning AI