It recently occurred to me that as I update each Linux container or VM, I'm downloading a lot of the same files over and over again. While the downloads aren't huge, it still seems wasteful to request the same files from the repo mirrors so many times... So why not just download the update once and then distribute it locally to each of my systems?
That's the purpose of a caching proxy.
I chose apt-cacher-ng as it's very simple to set up and use, so I spun up a dedicated LXC and installed apt-cacher-ng via apt. Once it was up and running, it was just a matter of following the included documentation to point all of my other systems to that cache.
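For reference, the server side really is just a package install. Here's a minimal sketch of what that looks like, assuming a fresh Debian-based LXC and the default port of 3142:

```bash
# On the dedicated LXC that will act as the cache:
sudo apt update
sudo apt install apt-cacher-ng

# The service starts automatically and listens on port 3142 by default.
# Confirm it's running:
systemctl status apt-cacher-ng
```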
After upgrading just a couple of systems, I can already see the cache doing its job:
Those "hits" are requests that were able to be fulfilled locally from the cache instead of needing to download the files from the repo again. Since this is caching every request, it actually becomes more efficient the more that it's used, so hopefully the efficiency will increase even more over time.
So what exactly is happening?
First, this is not a full mirror of the Debian repos. Rather, apt-cacher-ng acts as a proxy and cache. When a local client system wants to perform an update, it requests the updated packages from apt-cacher-ng instead of from the Debian repo directly. If the updated package is already in apt-cacher-ng's local cache, it simply serves the package to the requesting client. If the package is not in the local cache, the proxy requests it from the repo, passes it along to the client, and saves a copy to the cache. Now it has a local copy in case another system requests the same package.
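On the client side, pointing apt at the cache boils down to a one-line proxy setting. A minimal sketch, again assuming the cache is reachable at `aptcache` on the default port 3142 (the snippet filename is arbitrary):

```bash
# On each client system, tell apt to route HTTP requests through the cache:
echo 'Acquire::http::Proxy "http://aptcache:3142";' | \
  sudo tee /etc/apt/apt.conf.d/00aptproxy

# Subsequent updates are fetched via apt-cacher-ng:
sudo apt update && sudo apt upgrade
```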
Some packages, like Crowdsec, are only installed on a single machine on my network, so the cache won't provide a benefit there. However, since most of my systems are running Debian, even though they may be running different services, they still all request a lot of the same packages every time they update, like openssh or Python. These only have to be downloaded the very first time they're requested, and all subsequent requests can be filled from the proxy's local cache.
Do you use a cache in your homelab? Let me know below!