Having completed this series of blog posts (well done!), you should have a good grounding in what deep learning is actually doing. However, this was of course only a small 20k word introduction, so there is a lot left to learn.
Unfortunately, there is a lot of nonsense online. Either the explanations are sloppy or they are just plain wrong.
Here is a list, still pretty short, of things I’ve found to be useful. I very much hope to hear from readers about their favorite books and sites. You can send pull requests directly or email me at email@example.com.
- The PyTorch documentation is very useful, even if you are not using PyTorch. It describes quite precisely how many kinds of layers work.
- Andrej Karpathy’s micrograd, an automatic differentiation (autograd) implementation in Python, is a tiny work of art.
- Andrej Karpathy’s post The Unreasonable Effectiveness of Recurrent Neural Networks, and also this post
- FastAI’s Jupyter notebooks.
- Whisper.cpp, by hero worker Georgi Gerganov. An open-source, self-contained C++ version of OpenAI’s Whisper speech recognition model. You can run this locally on very modest hardware and it is incredibly impressive. Because the source code is so small, it is a great learning opportunity.
- Llama.cpp, again by Georgi, a C++ version of Meta’s Llama “small” large language model that can run on reasonable hardware. It uses quantisation to fit in normal amounts of memory. If prompted well, the Llama model shows ChatGPT-like capabilities.
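
To give a feel for the quantisation mentioned in the Llama.cpp entry, here is a minimal sketch of the general idea: store a block of weights as small integers plus a single floating point scale, and multiply back when the values are needed. This is an illustration only and not Llama.cpp's actual storage format. The function names, the 4-bit width and the example weights are all assumptions made up for this sketch.

```python
def quantize_block(values, bits=4):
    """Map a block of floats to small signed integers plus one float scale.

    Hypothetical helper for illustration; real formats differ in detail.
    """
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4 bits, so values land in [-7, 7]
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return scale, quantized


def dequantize_block(scale, quantized):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


# Made-up weights, standing in for one small block of a real model.
weights = [0.12, -0.53, 0.98, -0.07, 0.33, -0.91, 0.45, 0.0]

scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)

# The rounding step bounds the error per weight by half a scale step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("scale:", scale)
print("quantized:", q)
print("largest error:", max_error)
```

Each weight now needs only 4 bits instead of 32, at the cost of a small rounding error. Using one scale per small block, rather than one for the whole model, keeps that error proportional to the largest weight in the block.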