Hello Deep Learning - Bert Hubert's writings

A from-scratch, GPU-free introduction to modern machine learning. Many tutorials already exist, of course, but this one aims to really explain what is going on, from the ground up. Also, we’ll develop the demo until it is actually useful on real-life data, which you can supply yourself.

Other documents start out from the (very impressive) PyTorch environment, or they attempt to math it up from first principles. Trying to understand deep learning via PyTorch is like trying to learn aerodynamics from flying an Airbus A380.

Meanwhile, the pure maths approach (“see, it is easy, it is just a Jacobian matrix”) is probably only suited to seasoned mathematicians.

The goal of this tutorial is to develop modern neural networks entirely from scratch, while still ending up with really impressive results.

Code is here. The Markdown for the blog posts can also be found on GitHub, so you can turn typos into pull requests (thanks, the first updates have arrived!).

Chapters:

Introduction (which you can skip if you want)
Chapter 1: Linear combinations
Chapter 2: Some actual learning, backward propagation
Chapter 3: Automatic differentiation
Chapter 4: Recognizing handwritten digits using a multi-layer network: batch learning, SGD
Chapter 5: Neural disappointments, convolutional networks, recognizing handwritten letters
Chapter 6: Inspecting and plotting what is going on, hyperparameters, momentum, ADAM
Chapter 7: Dropout, data augmentation and weight decay, quantisation
Chapter 8: An actual 1700-line from-scratch handwritten letter OCR program
Chapter 9: Gated Recurrent Unit / LSTM: Some language processing, DNA scanning
Chapter 10: Attention, transformers, how does this compare to ChatGPT?
Chapter 11: Further reading & worthwhile projects
Chapter 12: What does it all mean?