Hello Deep Learning: Further reading & worthwhile projects

This page is part of the Hello Deep Learning series of blog posts. You are very welcome to improve this page via GitHub!

After completing this series of blog posts (well done!), you should have a good grounding in what deep learning is actually doing. However, this was only a short 20k-word introduction, so there is a lot left to learn.

Unfortunately, there is a lot of nonsense online. Either the explanations are sloppy or they are just plain wrong.

Here is an as yet pretty short list of things I’ve found useful. I very much hope to hear from readers about their favorite books and sites. You can send pull requests directly or email me at bert@hubertnet.nl.

Sites:

The PyTorch documentation is very useful, even if you are not using PyTorch. It describes quite well exactly how many kinds of layers work.
Andrej Karpathy’s micrograd, a Python autograd (automatic differentiation) implementation, is a tiny work of art.
Andrej Karpathy’s post The Unreasonable Effectiveness of Recurrent Neural Networks, and also this post.
FastAI’s Jupyter notebooks.
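
To give a taste of what a project like micrograd teaches, here is a minimal sketch of scalar automatic differentiation in Python. It is not micrograd's code. The `Value` class and its fields are written for illustration only, and it supports just addition and multiplication. Each result remembers its inputs and a small function that passes gradients back to them.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative, not micrograd's API)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf values have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a+b)/da = 1 and d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a*b)/da = b and d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Visit every value after all of its inputs (topological order),
        # then apply the chain rule from the output back to the leaves.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


a = Value(2.0)
b = Value(3.0)
c = a * b + a          # c = 2*3 + 2
c.backward()
print(c.data, a.grad, b.grad)  # 8.0 4.0 2.0
```

The gradients match what you would get by hand: dc/da = b + 1 = 4 and dc/db = a = 2.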

Projects:

Whisper.cpp, by hero worker Georgi Gerganov. An open source, self-contained C++ version of OpenAI’s Whisper speech recognition model. You can run this locally on very modest hardware and it is incredibly impressive. Because the source code is so small, it is a great learning opportunity.
Llama.cpp, again by Georgi, a C++ version of Meta’s Llama “small” large language model that can run on reasonable hardware. It uses quantisation to fit in normal amounts of memory. If prompted well, the Llama model shows ChatGPT-like capabilities.
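
To make the quantisation idea a bit more concrete, here is a rough Python sketch of one common approach: store each block of weights as small signed integers plus a single floating point scale. This is an assumption-laden illustration, not Llama.cpp's actual format. The function names, the block contents and the choice of 4 bits are all made up for the example.

```python
def quantize_block(weights, bits=4):
    """Map a block of floats to small signed integers plus one float scale."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4 bits
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return scale, q


def dequantize_block(scale, q):
    """Recover approximate floats from the integers and the shared scale."""
    return [scale * v for v in q]


# Hypothetical block of weights, purely for illustration.
block = [0.12, -0.7, 0.33, 0.04, -0.21, 0.7, 0.0, -0.49]
scale, q = quantize_block(block)
restored = dequantize_block(scale, q)
worst = max(abs(a - b) for a, b in zip(block, restored))

print("scale:", scale)
print("quantised:", q)
print("worst error:", worst)
```

Each weight now needs only 4 bits instead of 32, at the cost of an error of at most about half a scale step per weight. A real implementation would also pack two 4-bit values into each byte, which this sketch does not do.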