Hello Deep Learning
A from-scratch, GPU-free introduction to modern machine learning. Many tutorials exist already, of course, but this one aims to really explain what is going on, from the ground up. Also, we’ll develop the demo until it is actually useful on real-life data which you can supply yourself.
Other documents start out from the (very impressive) PyTorch environment, or they attempt to math it up from first principles. Trying to understand deep learning via PyTorch is like trying to learn aerodynamics from flying an Airbus A380.
Meanwhile, the pure maths approach (“see, it is easy, it is just a Jacobian matrix”) is probably only suited to seasoned mathematicians.
The goal of this tutorial is to develop modern neural networks entirely from scratch, while still ending up with really impressive results.
Code is here. The Markdown for the blog posts can also be found on GitHub, so you can turn typos into pull requests (thanks, the first updates have arrived!).
Chapters:
- Introduction (which you can skip if you want)
- Chapter 1: Linear combinations
- Chapter 2: Some actual learning, backward propagation
- Chapter 3: Automatic differentiation
- Chapter 4: Recognizing handwritten digits using a multi-layer network: batch learning, SGD
- Chapter 5: Neural disappointments, convolutional networks, recognizing handwritten letters
- Chapter 6: Inspecting and plotting what is going on, hyperparameters, momentum, ADAM
- Chapter 7: Dropout, data augmentation and weight decay, quantisation
- Chapter 8: An actual 1700-line from-scratch handwritten letter OCR program
- Chapter 9: Gated Recurrent Unit / LSTM: Some language processing, DNA scanning
- Chapter 10: Attention, transformers, how does this compare to ChatGPT?
- Chapter 11: Further reading & worthwhile projects
- Chapter 12: What does it all mean?