This document explains why uBO works best in Firefox.
Ability to uncloak 3rd-party servers disguised as 1st-party through the use of CNAME records. The effect of this is to make uBO on Firefox the most efficient at blocking 3rd-party trackers relative to other browser/blocker pairs:
The dark green/red bars are uBO before/after it gained ability to uncloak CNAMEs on Firefox.
Source: “Characterizing CNAME Cloaking-Based Tracking on the Web”, Asia Pacific Network Information Centre, August 2020.
HTML ﬁltering is the ability to ﬁlter the response body of HTML documents before it is parsed by the browser.
For example, this allows the removal of speciﬁc tags in HTML documents before they are parsed and executed by the browser, something not possible in a reliable manner in other browsers. This feature requires the webRequest.ﬁlterResponseData() API, currently only available in Firefox.
At browser launch, Firefox will wait for uBO to be up and ready before network requests are fired from already opened tab(s).
This is not the case with Chromium-based browsers: tracker/advertisement payloads may find their way into already opened tabs before uBO is up and ready, while these are properly filtered in Firefox.
Pre-fetching, which is disabled by default in uBO, is reliably prevented in Firefox, while this is not the case in Chromium-based browsers.
Chromium-based browsers give precedence to websites over user settings when deciding whether pre-fetching is disabled.
The Firefox version of uBO makes use of WebAssembly code for core ﬁltering code paths. This is not the case with Chromium-based browsers because this would require an extra permission in the extension manifest which could cause friction when publishing the extension in the Chrome Web Store.
For more about this, see: https://github.com/WebAssembly/content-security-policy/issues/7#issuecomment-441259729.
The Firefox version of uBO uses LZ4 compression by default to store raw filter lists, compiled list data, and memory snapshots to disk storage.
LZ4 compression requires the use of IndexedDB, which is problematic with Chromium-based browsers in incognito mode, where instances of IndexedDB are always reset, causing uBO to always launch inefficiently and with out-of-date filter lists (see #399). An IndexedDB instance is required because it supports storing Blob-based data, a capability not available to the browser.storage.local API.
This is a long-form post breaking down the setup I use to run a SaaS. From load balancing to cron job monitoring to payments and subscriptions. There’s a lot of ground to cover, so buckle up!
As grandiose as the title of this article might sound, I should clarify we’re talking about a low-stress, one-person company that I run from my ﬂat here in Germany. It’s fully self-funded, and I like to take things slow. It’s probably not what most people imagine when I say “tech startup”.
I wouldn’t be able to do this without the vast amount of open-source software and managed services at my disposal. I feel like I’m standing on the shoulders of giants, who did all the hard work before me, and I’m very grateful for that.
For context, I run a one-man SaaS, and this is a more detailed version of my post on the tech stack I use. Please consider your own circumstances before following my advice: your own context matters when it comes to technical choices, and there's no holy grail.
I use Kubernetes on AWS, but don’t fall into the trap of thinking you need this. I learned these tools over several years mentored by a very patient team. I’m productive because this is what I know best, and I can focus on shipping stuff instead. Your mileage may vary.
By the way, I drew inspiration for the format of this post from Wenbin Fang’s blog post. I really enjoyed reading his article, and you might want to check it out too!
With that said, let’s jump right into the tour.
My infrastructure handles multiple projects at once, but to illustrate things I’ll use Panelbear, my most recent SaaS, as a real-world example of this setup in action.
Browser Timings chart in Panelbear, the example project I’ll use for this tour.
From a technical point of view, this SaaS processes a large amount of requests per second from anywhere in the world, and stores the data in an efﬁcient format for real time querying.
Business-wise it’s still in its infancy (I launched six months ago), but it has grown more quickly than I expected, especially as I originally built it for myself as a Django app using SQLite on a single tiny VPS. For my goals at the time, it worked just fine and I could have probably pushed that model quite far.
However, I grew increasingly frustrated having to reimplement a lot of the tooling I was so accustomed to: zero-downtime deploys, autoscaling, health checks, automatic DNS / TLS / ingress rules, and so on. Kubernetes had spoiled me: I was used to dealing with higher-level abstractions while retaining control and flexibility.
Fast forward six months, a couple of iterations, and even though my current setup is still a Django monolith, I’m now using Postgres as the app DB, ClickHouse for analytics data, and Redis for caching. I also use Celery for scheduled tasks, and a custom event queue for buffering writes. I run most of these things on a managed Kubernetes cluster (EKS).
It may sound complicated, but it’s practically an old-school monolithic architecture running on Kubernetes. Replace Django with Rails or Laravel and you know what I’m talking about. The interesting part is how everything is glued together and automated: autoscaling, ingress, TLS certiﬁcates, failover, logging, monitoring, and so on.
It’s worth noting I use this setup across multiple projects, which helps keep my costs down and launch experiments really easily (write a Dockerﬁle and git push). And since I get asked this a lot: contrary to what you might be thinking, I actually spend very little time managing the infrastructure, usually 0-2 hours per month total. Most of my time is spent developing features, doing customer support, and growing the business.
That said, these are the tools I’ve been using for several years now and I’m pretty familiar with them. I consider my setup simple for what it’s capable of, but it took many years of production ﬁres at my day job to get here. So I won’t say it’s all sunshine and roses.
I don’t know who said it ﬁrst, but what I tell my friends is: “Kubernetes makes the simple stuff complex, but it also makes the complex stuff simpler”.
Now that you know I have a managed Kubernetes cluster on AWS and I run various projects in it, let’s make the ﬁrst stop of the tour: how to get trafﬁc into the cluster.
My cluster is in a private network, so you won’t be able to reach it directly from the public internet. There’s a couple of pieces in between that control access and load balance trafﬁc to the cluster.
Essentially, I have Cloudﬂare proxying all trafﬁc to an NLB (AWS L4 Network Load Balancer). This Load Balancer is the bridge between the public internet and my private network. Once it receives a request, it forwards it to one of the Kubernetes cluster nodes. These nodes are in private subnets spread across multiple availability zones in AWS. It’s all managed by the way, but more on that later.
Trafﬁc gets cached at the edge, or forwarded to the AWS region where I operate.
“But how does Kubernetes know which service to forward the request to?” - That’s where ingress-nginx comes in. In short: it’s an NGINX cluster managed by Kubernetes, and it’s the entrypoint for all trafﬁc inside the cluster.
NGINX applies rate-limiting and other trafﬁc shaping rules I deﬁne before sending the request to the corresponding app container. In Panelbear’s case, the app container is Django being served by Uvicorn.
It’s not much different from a traditional nginx/gunicorn/Django in a VPS approach, with added horizontal scaling beneﬁts and an automated CDN setup. It’s also a “setup once and forget” kind of thing, mostly a few ﬁles between Terraform/Kubernetes, and it’s shared by all deployed projects.
When I deploy a new project, it’s essentially 20 lines of ingress conﬁguration and that’s it:
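A sketch of what such a configuration might look like; the hostnames, resource names, and exact annotation set are illustrative (in this kind of setup, cert-manager, external-dns and ingress-nginx are the components reacting to the annotations):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: panelbear
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    external-dns.alpha.kubernetes.io/cloudflare-proxied: "true"
    nginx.ingress.kubernetes.io/limit-rpm: "60"
spec:
  tls:
    - hosts: ["app.panelbear.com"]
      secretName: panelbear-tls
  rules:
    - host: app.panelbear.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: panelbear-api
                port:
                  number: 80
```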
Those annotations describe that I want a DNS record, with trafﬁc proxied by Cloudﬂare, a TLS certiﬁcate via letsencrypt, and that it should rate-limit the requests per minute by IP before forwarding the request to my app.
Kubernetes takes care of making those infra changes to reﬂect the desired state. It’s a little verbose, but it works well in practice.
The chain of actions that occur when I push a new commit.
Whenever I push to master in one of my projects, it kicks off a CI pipeline on GitHub Actions. This pipeline runs some codebase checks and end-to-end tests (using Docker Compose to set up a complete environment), and once these checks pass it builds a new Docker image that gets pushed to ECR (the Docker registry in AWS).
As far as the application repo is concerned, a new version of the app has been tested and is ready to be deployed as a Docker image:
“So what happens next? There’s a new Docker image, but no deploy?” - My Kubernetes cluster has a component called flux. It automatically keeps what is currently running in the cluster in sync with the latest images of my apps.
Flux automatically keeps track of new releases in my infrastructure monorepo.
Flux automatically triggers an incremental rollout when there’s a new Docker image available, and keeps record of these actions in an “Infrastructure Monorepo”.
I want version-controlled infrastructure: whenever I make a new commit to this repo, Terraform and Kubernetes will make the necessary changes on AWS, Cloudflare and the other services to synchronize the deployed state with the state of my repo.
It’s all version-controlled with a linear history of every deployment made. This means less stuff for me to remember over the years, since I have no magic settings conﬁgured via clicky-clicky on some obscure UI.
Think of this monorepo as deployable documentation, but more on that later.
A few years ago I used the Actor model of concurrency for various company projects, and fell in love with many of the ideas around its ecosystem. One thing led to another and soon I was reading books about Erlang, and its philosophy around letting things crash.
I might be stretching the idea too much, but in Kubernetes I like to think of liveness probes and automatic restarts as a means to achieve a similar effect.
From the Kubernetes documentation: “The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.”
In practice this has worked pretty well for me. Containers and nodes are meant to come and go, and Kubernetes will gracefully shift the trafﬁc to healthy pods while healing the unhealthy ones (more like killing). Brutal, but effective.
My app containers auto-scale based on CPU/Memory usage. Kubernetes will try to pack as many workloads per node as possible to fully utilize it.
If there are too many Pods per node in the cluster, it will automatically spawn more servers to increase the cluster capacity and ease the load. Similarly, it will scale down when there’s not much going on.
Here’s what a Horizontal Pod Autoscaler might look like:
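For instance, a minimal manifest along these lines (the target Deployment name and CPU threshold are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: panelbear-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: panelbear-api
  minReplicas: 2
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```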
In this example, it will automatically adjust the number of panelbear-api pods based on the CPU usage, starting at 2 replicas but capping at 8.
When defining the ingress rules for my app, the annotation cloudflare-proxied: “true” is what tells Kubernetes that I want to use Cloudflare for DNS, and to proxy all requests via its CDN and DDoS protection too.
From then on, it’s pretty easy to make use of it. I just set standard HTTP cache headers in my applications to specify which requests can be cached, and for how long.
Cloudﬂare will use those response headers to control the caching behavior at the edge servers. It works amazingly well for such a simple setup.
I use Whitenoise to serve static ﬁles directly from my app container. That way I avoid needing to upload static ﬁles to Nginx/Cloudfront/S3 on each deployment. It has worked really well so far, and most requests will get cached by the CDN as it gets ﬁlled. It’s performant, and keeps things simple.
I also use NextJS for a few static websites, such as the landing page of Panelbear. I could serve it via Cloudfront/S3 or even Netlify or Vercel, but it was easy to just run it as a container in my cluster and let Cloudﬂare cache the static assets as they are being requested. There’s zero added cost for me to do this, and I can re-use all tooling for deployment, logging and monitoring.
Besides static ﬁle caching, there’s also application data caching (eg. results of heavy calculations, Django models, rate-limiting counters, etc…).
On one hand, I leverage an in-memory Least Recently Used (LRU) cache to keep frequently accessed objects in memory, benefiting from zero network calls (pure Python, no Redis involved).
However, most endpoints just use the in-cluster Redis for caching. It’s still fast and the cached data can be shared by all Django instances, even after re-deploys, while an in-memory cache would get wiped.
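As an illustration, the in-memory layer can be as simple as `functools.lru_cache`; the `plan_limits` function and its data below are hypothetical:

```python
from functools import lru_cache

# In-process LRU cache: zero network calls, but local to each app
# instance and wiped on every re-deploy.
@lru_cache(maxsize=1024)
def plan_limits(plan_slug):
    # Hypothetical rarely-changing lookup; a real version might
    # hit the database once and serve from memory afterwards.
    return {"free": {"events": 5_000}, "pro": {"events": 100_000}}[plan_slug]
```

The trade-off is exactly the one described above: this is the fastest option, but the shared, re-deploy-surviving cases belong in Redis.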
My Pricing Plans are based on analytics events per month. For this some sort of metering is necessary to know how many events have been consumed within the current billing period and enforce limits. However, I don’t interrupt the service immediately when a customer crosses the limit. Instead a “Capacity depleted” email is automatically sent, and a grace period is given to the customer before the API starts rejecting new data.
This is meant to give customers enough time to decide if an upgrade makes sense for them, while ensuring no data is lost. For example during a trafﬁc spike in case their content goes viral or if they’re just enjoying the weekend and not checking their emails. If the customer decides to stay in the current plan and not upgrade, there is no penalty and things will go back to normal once usage is back within their plan limits.
So for this feature I have a function that applies the rules above, which requires several calls to the DB and ClickHouse, but whose result gets cached for 15 minutes to avoid recomputing it on every request. It’s good enough and simple. Worth noting: the cache gets invalidated on plan changes, otherwise it might take 15 minutes for an upgrade to take effect.
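A minimal sketch of that pattern, using a plain dict as a stand-in for the Redis-backed Django cache (the `cached` and `invalidate` helpers are hypothetical names):

```python
import time

CACHE_TTL = 15 * 60  # seconds

_cache = {}

def cached(key, compute, ttl=CACHE_TTL):
    """Return a cached value, recomputing it after `ttl` seconds.
    A stand-in for Django's cache.get_or_set backed by Redis."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < ttl:
        return entry[1]
    value = compute()
    _cache[key] = (now, value)
    return value

def invalidate(key):
    # Called on plan changes so an upgrade takes effect immediately.
    _cache.pop(key, None)
```

Usage would look like `usage = cached(f"usage:{customer_id}", compute_usage)`, with `invalidate(f"usage:{customer_id}")` on plan changes.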
While I enforce global rate limits at the nginx-ingress on Kubernetes, I sometimes want more speciﬁc limits on a per endpoint/method basis.
For that I use the excellent Django Ratelimit library to easily declare the limits per Django view. It’s conﬁgured to use Redis as a backend for keeping track of the clients making the requests to each endpoint (it stores a hash based on the client key, and not the IP).
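In practice that's a one-line decorator on the view, something like `@ratelimit(key="user_or_ip", rate="5/m", method="POST", block=True)`. As a rough illustration of what happens under the hood, here is a framework-free fixed-window sketch (all names hypothetical; the real library stores the counters in Redis so every app instance shares them):

```python
import hashlib
import time
from collections import defaultdict

WINDOW = 60  # seconds
LIMIT = 5    # requests per window

_counters = defaultdict(int)

def _window_key(client_id, now):
    # Store a hash of the client key rather than the raw value/IP.
    window = int(now // WINDOW)
    return hashlib.sha256(f"{client_id}:{window}".encode()).hexdigest()

def allow_request(client_id, now=None):
    """Fixed-window limiter: True if under the limit, else the
    caller should respond with HTTP 429 Too Many Requests."""
    now = time.time() if now is None else now
    key = _window_key(client_id, now)
    _counters[key] += 1
    return _counters[key] <= LIMIT
```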
In the example above, if the client attempts to POST to this particular endpoint more than 5 times per minute, subsequent calls will get rejected with an HTTP 429 Too Many Requests status code.
The friendly error message you’d get when being rate-limited.
Django gives me an admin panel for all my models for free. It’s built-in, and it’s pretty handy for inspecting data for customer support work on the go.
Django’s built-in admin panel is very useful for doing customer support on the go.
I added actions to help me manage things from the UI. Things like blocking access to suspicious accounts, sending out announcement emails, and approving full account deletion requests (ﬁrst a soft delete, and within 72 hours a full destroy).
Security-wise: only staff users are able to access the panel (me), and I’m planning to add 2FA for extra security on all accounts.
Additionally every time a user logs in, I send an automatic security email with details about the new session to the account’s email. Right now I send it on every new login, but I might change it in the future to skip known devices. It’s not a very “MVP feature”, but I care about security and it was not complicated to add. At least I’d be warned if someone logged in to my account.
Of course, there’s a lot more to hardening an application than this, but that’s out of the scope of this post.
Example security activity email you might receive when logging in.
Another interesting use case is that I run a lot of different scheduled jobs as part of my SaaS. These are things like generating daily reports for my customers, calculating usage stats every 15 minutes, sending staff emails (I get a daily email with the most important metrics) and whatnot.
My setup is actually pretty simple, I just have a few Celery workers and a Celery beat scheduler running in the cluster. They are conﬁgured to use Redis as the task queue. It took me an afternoon to set it up once, and luckily I haven’t had any issues so far.
I want to get notified via SMS/Slack/Email when a scheduled task is not running as expected, for example when the weekly reports task is stuck or significantly delayed. For that I use Healthchecks.io, but check out Cronitor and CronHub too; I’ve been hearing great things about them as well.
To abstract their API, I wrote a small Python snippet to automate the monitor creation and status pinging:
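Something along these lines; the check UUID is a placeholder, and this sketch only covers the pinging side (check creation goes through their separate management API). `hc-ping.com` is Healthchecks.io's documented ping host, and `/start` and `/fail` are its documented start/failure signals:

```python
import urllib.request

PING_BASE = "https://hc-ping.com"

def ping_url(check_uuid, event=None):
    """Build the ping URL for a check, optionally for /start or /fail."""
    url = f"{PING_BASE}/{check_uuid}"
    return f"{url}/{event}" if event else url

def ping(check_uuid, event=None, timeout=10):
    # Fire-and-forget ping wrapped around a scheduled task run.
    with urllib.request.urlopen(ping_url(check_uuid, event), timeout=timeout):
        pass
```

A task would call `ping(uuid, "start")` before running, `ping(uuid)` on success, and `ping(uuid, "fail")` in its exception handler.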
All my applications are configured via environment variables: old school, but portable and well supported. For example, in my Django settings.py I’d set up a variable with a default value:
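For instance (WEBSITE_URL is a hypothetical setting name):

```python
# settings.py: read config from the environment, falling back to a
# sane local-development default.
import os

WEBSITE_URL = os.environ.get("WEBSITE_URL", "http://localhost:8000")
```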
And use it anywhere in my code like this:
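In a real Django app this would be `from django.conf import settings`; below is a self-contained sketch with a stand-in settings object (the setting name and helper are hypothetical):

```python
import os

class settings:
    # Stand-in for django.conf.settings so the sketch runs without Django.
    WEBSITE_URL = os.environ.get("WEBSITE_URL", "http://localhost:8000")

def absolute_url(path):
    # Build an absolute URL from the configured base, e.g. for emails.
    return f"{settings.WEBSITE_URL}{path}"
```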
I can override the environment variable in my Kubernetes conﬁgmap:
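For example (the names, namespace, and value are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: panelbear-config
  namespace: panelbear
data:
  WEBSITE_URL: "https://panelbear.com"
```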
The way secrets are handled is pretty interesting: I want to also commit them to my infrastructure repo, alongside other conﬁg ﬁles, but secrets should be encrypted.
For that I use kubeseal in Kubernetes. This component uses asymmetric crypto to encrypt my secrets, and only a cluster authorized to access the decryption keys can decrypt them.
For example this is what you might ﬁnd in my infrastructure repo:
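Something like this hypothetical SealedSecret; the ciphertext is a placeholder, not real kubeseal output:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: panelbear-secrets
  namespace: panelbear
spec:
  encryptedData:
    DATABASE_URL: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...
```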
The cluster will automatically decrypt the secrets and pass them to the corresponding container as an environment variable:
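A pod spec fragment sketching that wiring (names illustrative): every key of the decrypted Secret becomes an environment variable in the container.

```yaml
containers:
  - name: panelbear-api
    image: panelbear/api:latest
    envFrom:
      - secretRef:
          name: panelbear-secrets
```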
To protect the secrets within the cluster, I use AWS-managed encryption keys via KMS, which are rotated regularly. This is a single setting when creating the Kubernetes cluster, and it’s fully managed.
Operationally what this means is that I write the secrets as environment variables in a Kubernetes manifest, I then run a command to encrypt them before committing, and push my changes.
The secrets are deployed within a few seconds, and the cluster will take care of automatically decrypting them before running my containers.
For experiments I run a vanilla Postgres container within the cluster, and a Kubernetes cronjob that does daily backups to S3. This helps keep my costs down, and it’s pretty simple for just starting out.
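A sketch of such a CronJob; the schedule, image, bucket and credential handling are illustrative (a real image would need both pg_dump and an S3 client baked in):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 4 * * *"   # every day at 04:00 UTC
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: my-registry/pg-backup:latest  # hypothetical image
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump "$DATABASE_URL" | gzip |
                  aws s3 cp - "s3://my-backups/db-$(date +%F).sql.gz"
```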
However, as a project grows, like Panelbear, I move the database out of the cluster into RDS, and let AWS take care of encrypted backups, security updates and all the other stuff that’s no fun to mess up.
For added security, the databases managed by AWS are still deployed within my private network, so they’re unreachable via the public internet.
I rely on ClickHouse for efﬁcient storage and (soft) real-time queries over the analytics data in Panelbear. It’s a fantastic columnar database, incredibly fast and when you structure your data well you can achieve high compression ratios (less storage costs = higher margins).
I currently self-host a ClickHouse instance within my Kubernetes cluster. I use a StatefulSet with encrypted volume keys managed by AWS. I have a Kubernetes CronJob that periodically backs up all data in an efficient columnar format to S3. For disaster recovery, I have a couple of scripts to manually back up and restore the data from S3.
ClickHouse has been rock-solid so far, and it’s an impressive piece of software. It’s the only tool I wasn’t already familiar with when I started my SaaS, but thanks to their docs I was able to get up and running pretty quickly.
I think there’s a lot of low hanging fruit in case I wanted to squeeze out even more performance (eg. optimizing the ﬁeld types for better compression, pre-computing materialized tables and tuning the instance type), but it’s good enough for now.
Besides Django, I also run containers for Redis, ClickHouse, NextJS, among other things. These containers have to talk to each other somehow, and that somehow is via the built-in service discovery in Kubernetes.
It’s pretty simple: I deﬁne a Service resource for the container and Kubernetes automatically manages DNS records within the cluster to route trafﬁc to the corresponding service.
For example, given a Redis service exposed within the cluster:
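For example (the namespace and labels are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: weekend-project
spec:
  selector:
    app: redis
  ports:
    - port: 6379
```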
I can access this Redis instance anywhere from my cluster via the following URL:
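Following Kubernetes' `<service>.<namespace>.svc.cluster.local` naming convention (the namespace name here is a hypothetical example):

```
redis://redis.weekend-project.svc.cluster.local:6379
```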
Notice the service name and the project namespace are part of the URL. That makes it really easy for all your cluster services to talk to each other, regardless of where in the cluster they run.
For example, here’s how I’d conﬁgure Django via environment variables to use my in-cluster Redis:
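A sketch using django-redis as the cache backend (one common choice; the default URL and names are illustrative):

```python
# settings.py sketch: point Django's cache at the in-cluster Redis
# via an environment variable.
import os

REDIS_URL = os.environ.get(
    "REDIS_URL", "redis://redis.weekend-project.svc.cluster.local:6379/0"
)

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": REDIS_URL,
    }
}
```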
Kubernetes will automatically keep the DNS records in-sync with healthy pods, even as containers get moved across nodes during autoscaling. The way this works behind the scenes is pretty interesting, but out of the scope of this post. Here’s a good explanation in case you ﬁnd it interesting.
I want version-controlled, reproducible infrastructure that I can create and destroy with a few simple commands.
To achieve this, I use Docker, Terraform and Kubernetes manifests in a monorepo that contains all-things infrastructure, even across multiple projects. And for each application/project I use a separate git repo, but this code is not aware of the environment it will run on.
If you’re familiar with The Twelve-Factor App this separation may ring a bell or two. Essentially, my application has no knowledge of the exact infrastructure it will run on, and is conﬁgured via environment variables.
By describing my infrastructure in a git repo, I don’t need to keep track of every little resource and conﬁguration setting in some obscure UI. This enables me to restore my entire stack with a single command in case of disaster recovery.
Here’s an example folder structure of what you might ﬁnd on the infra monorepo:
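Something along these lines; this is a hypothetical layout, not the exact repo:

```
infrastructure/
├── terraform/          # AWS account, VPC, EKS cluster, RDS, ...
│   ├── prod/
│   └── staging/
└── manifests/          # Kubernetes YAML, applied by flux
    ├── cluster/        # ingress-nginx, cert-manager, monitoring, ...
    └── apps/
        ├── panelbear/  # ingress, deployment, HPA, sealed secrets
        └── another-project/
```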
Another advantage of this setup is that all the moving pieces are described in one place. I can conﬁgure and manage reusable components like centralized logging, application monitoring, and encrypted secrets to name a few.
There’s this card trick I saw that I still think about all the time. It’s a simple presentation (which I’ve further simpliﬁed here for clarity): a volunteer chooses a card and seals the card in an envelope. Then, the magician invites the volunteer to choose some tea. There are dozens of boxes of tea, all sealed in plastic. The volunteer chooses one, rips the plastic, and chooses one of the sealed packets containing the tea bags. When the volunteer rips open the packet … inside is their card.
⚠️ If you don’t want to know how the trick is done, stop reading now.
The secret is mundane, but to me it’s thrilling. The card choice is a force. But choice from those dozens of boxes of tea really is a free choice, and the choice of tea bag within that box is also a free choice. There’s no sleight-of-hand: the magician doesn’t touch the tea boxes or the teabag that the volunteer chooses. The card really is inside of that sealed tea packet.
The trick is all in the preparation. Before the trick, the magician buys dozens of boxes of tea, opens every single one, unwraps each tea packet. Puts a Three of Clubs into each packet. Reseals the packet. Puts the packets back in the box. Re-seals each box. And repeats this hundreds of times. This takes hours — days, even.
The only “trick” is that this preparation seems so boring, so impossibly tedious, that when we see the effect we can’t imagine that anyone would do something so tedious just for this simple effect.
Teller writes about this in an article about the seven secrets of magic:
You will be fooled by a trick if it involves more time, money and practice than you (or any other sane onlooker) would be willing to invest. My partner, Penn, and I once produced 500 live cockroaches from a top hat on the desk of talk-show host David Letterman. To prepare this took weeks. We hired an entomologist who provided slow-moving, camera-friendly cockroaches (the kind from under your stove don’t hang around for close-ups) and taught us to pick the bugs up without screaming like preadolescent girls. Then we built a secret compartment out of foam-core (one of the few materials cockroaches can’t cling to) and worked out a devious routine for sneaking the compartment into the hat. More trouble than the trick was worth? To you, probably. But not to magicians.
I often have people newer to the tech industry ask me for secrets to success. There aren’t many, really, but this secret — being willing to do something so terrifically tedious that it appears to be magic — works in tech too.
We’re an industry obsessed with automation, with streamlining, with efﬁciency. One of the foundational texts of our engineering culture, Larry Wall’s virtues of the programmer, includes laziness:
Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will ﬁnd useful and document what you wrote so you don’t have to answer so many questions about it.
I don’t disagree: being able to ofﬂoad repetitive tasks to a program is one of the best things about knowing how to code. However, sometimes problems can’t be solved by automation. If you’re willing to embrace the grind you’ll look like a magician.
For example, I once joined a team maintaining a system that was drowning in bugs. There were something like two thousand open bug reports. Nothing was tagged, categorized, or prioritized. The team couldn’t agree on which issues to tackle. They were stuck essentially pulling bugs at random, and it was never clear whether a given issue was important. New bug reports couldn’t be triaged effectively because finding duplicates was nearly impossible. So the open ticket count continued to climb. The team had been stalled for months. I was tasked with solving the problem: get the team unstuck, reverse the trend in the open ticket count, and come up with a way to eventually drive it down to zero.
So I used the same trick as the magician, which is no trick at all: I did the work. I printed out all the issues - one page of paper for each issue. I read each page. I took over a huge room and started making piles on the ﬂoor. I wrote tags on sticky notes and stuck them to piles. I shufﬂed pages from one stack to another. I wrote ticket numbers on whiteboards in long columns; I imagined I was Ben Afﬂeck in The Accountant. I spent almost three weeks in that room, and emerged with every bug report reviewed, tagged, categorized, and prioritized.
The trend reversed immediately after that: we were able to close several hundred tickets immediately as duplicates, and triaging new issues now took minutes instead of a day. It took I think a year or more to drive the count to zero, but it was all fairly smooth sailing. People said I did the impossible, but that’s wrong: I merely did something so boring that nobody else had been willing to do it.
Sometimes, programming feels like magic: you chant some arcane incantation and a ﬂeet of robots do your bidding. But sometimes, magic is mundane. If you’re willing to embrace the grind, you can pull off the impossible.
It’s all fun and games until someone loses an eye. Likewise, it’s all fun and games until someone loses access to their private and/or business data because they trusted it to someone else.
You don’t have to be an expert searcher to be able to quickly duck out (it’s like the verb ‘googling’, but used to describe searching the interwebs through a decent search engine, like DuckDuckGo) all the stories about little guys being fucked over by “don’t be evil” type of corporate behemoths.
You know what? Let me duck it out for you:
A drinking game recommendation (careful, it may and probably will lead to alcoholism): take a shot every time you ﬁnd out how someone’s data has been locked and their business was jeopardized because they didn’t own, or at least back up their data.
Owning your data is more than just having backup copies of your digital information. It’s also about control and privacy. It’s about trust. I don’t know about you, but I don’t trust a lot of services with my data (the ones I do are few and far between).
As this is a post about self-hosting, I won’t start preaching (trust me, it’s hard for me not to) how you should consider switching from WhatsApp to Signal, Google Maps to OpenStreetMap, or how you should quit Instagram and Facebook. You’re creating a lot of data there, and they don’t do pretty things with it. Fuck, I’m already preaching. Sorry about that.
Note: I’m not fully off social media. I’m using Twitter and LinkedIn. Everything on Twitter is public/disposable and I don’t use their private messaging feature. LinkedIn is there for professional correspondence and I will start to taper it off slowly, but that one is tough to quit.
I’m aware most people are not power users, and not everyone will want to spend time learning to set up their own alternatives to the services mentioned and create the backup strategies as I’ve done. It does take some time (mostly to set everything up) and some money. If you take anything from this post, it should be to always back up your data (yes, even though it’s replicated across 5 of Google’s datacenters). If shit hits the fan, it may take you a while to adopt new tools or flows, but you will still have your backup. Do your backups early and often.
I created a simple diagram to roughly show how my personal setup works. Before you say anything — I’m aware that there’s a group of people that wouldn’t consider my self-hosting as pure self-hosting. I’m using Vultr* to host my web-facing applications and not a server in my house. Unfortunately, the current situation doesn’t allow me to do that (yet).
So, here’s the diagram. The detailed explanation continues below.
I’ve separated the diagram into 4 parts — each part represents a different physical location of the data.
The part that gets the most action is the yellow part, living in the cloud.
I’m living in Germany, so the obvious choice was to spin up my instances in Vultr‘s* data center in Frankfurt, as ping is the lowest to that center for me.
Right now, I have six compute instances running there. You can see types of cloud compute instances in the picture below. It’s pretty similar to what you would get from DigitalOcean or AWS EC2.
Why did I choose Vultr*? They have pretty good customer service there, and I just happened to stumble on them before DigitalOcean got big and popular and AWS became the leader in the cloud computing game. Having said that, for purely private use, I wouldn’t opt for AWS even if I had to choose now. I’ll leave it at that.
The breakdown looks like this:
* 2 x $10/mo VPS — several web projects that I run for myself and friends
Everything combined costs me $55 per month.
Nextcloud is the powerhouse of my everyday data flow and manipulation. With the addition of apps, it’s a really powerful all-in-one solution that serves as an alternative to the widely popular offerings of the FAANG crowd. Once properly set up, not much maintenance is needed.
* Tasks are the alternative to Todoist or Any.do which I used previously.
* Notes are the alternative to Google Keep. Not as fully featured as Evernote or OneNote, which I also tried at one point, but it’s good enough for me.
* Calendar is the alternative to Google Calendar I used previously.
* Contacts are the alternative to Google/Samsung contacts I used previously.
I’m also able to stream music from Nextcloud to my phone, using Nextcloud music. For the client, you can use anything compatible with Ampache or Subsonic. My choice is Power Ampache. I’m not streaming a lot of music though. I always have 30-40 GB of MP3s on my phone that I rotate every now and then.
All the data from Nextcloud is in sync with Synology at my home via CloudSync. A big plus is a nice dark theme for the web UI:
I’m a developer and more than I need the air to breathe and coffee to drink, I need version control. My weapon of choice is git, which is lucky because there are a lot of hosting solutions for it out there. I was thinking about this one for a while and it boiled down to GitLab vs Gitea.
For my needs, GitLab was overkill, so I went with Gitea. It’s lightweight, easy to update, and just works. Its interface is clean and easy to understand, and because the UI is similar to GitHub’s, people I collab with find it a seamless switch. On the negative side, if you want to customize it, it can be a pain in the butt.
Monica is a personal CRM. Some people think I’m weird because I’m using a personal CRM. I ﬁnd it awesome. After I meet people, I often write down some information about them that I would otherwise forget. I sometimes make notes about longer phone calls if I know the information from the call will come in handy later on. Colleagues’ and friends’ birthdays, ideas for their presents, things like that — they go into the CRM.
I mention Monica in my post on how you should not ignore rejection emails, where you can see another example of how it helps with my ﬂow.
Kanboard is a free and open-source Kanban project management software. I use it to manage my side projects, but I also use it for keeping track of books I read, some ﬁnancial planning, study progress tracking, etc. It’s written in PHP, it’s easily customizable and it supports multiple users. Usually, when I do some collaborations, I will immediately create an account for that person on both Gitea and Kanboard.
Plausible is my choice for analytics, and I use it on several websites that I own. It’s lightweight, it’s open-source, and most importantly — it respects your privacy. There’s a how-to that I wrote on how you can install it on an Ubuntu machine yourself. The bonus thing is that I really like the developers’ approach to running a business. They have a cool blog where you can read about it.
Development tools that I’m mentioning are basically a bunch of scripts I have developed and gathered over time. Text encoders/decoders, color pickers, WYSIWYG layout builders, Swagger editor, etc. If I use something often and it’s trivial to implement on my own, I’ll do it.
Desktop PC and NAS are part of my ‘home’ region.
The desktop is nothing special. I don’t play video games, and I don’t do any work that needs a lot of computing performance. It’s an 8th-generation i5 with integrated graphics, a 1TB SSD, and 16 gigs of RAM. The OS I’m using is Ubuntu — the latest LTS version. It’s installed on both the desktop and laptop.
Everything except the OS and the apps is synced in real-time to Synology by using the Synology Drive Client.
Synology NAS I’m using is the DS220j. It’s not the fastest thing, but again, it works for me. I have two Western Digital Red drives (shocking, huh?), 2TB each.
Every last weekend of the month, I manually back up all the data to Blu-ray discs. Not once, but twice. One copy goes to a safe storage space at home and the other one ends up at a completely different location.
This is my ‘everything is fucked, burnt or stolen’ situation remedy. I’m not particularly happy with the physical security I’ve set up at home, so one of the concerns is the theft of disks and backups. Other than moving to a different location where it would be easier to work on upgrading the physical security, my hands are tied regarding this subject (not for long hopefully).
Other things could happen, like ﬁre, ﬂood, etc. Of course, it’s a bit of a hassle, but I believe in being prepared for any type of situation, no matter how improbable it may seem.
When you’re self-hosting, it will, naturally, also reﬂect on the apps that you use on your portable devices. Previously, my phone’s home screen was ﬁlled with mostly Google Apps — calendar, keep, maps, drive. There were also Dropbox, Spotify/Deezer. It’s different now.
I have De-Googled my phone, using /e/ and F-Droid. There are compromises you’ll have to make if you choose to go down this path. Sometimes it will go smoothly, but sometimes it will frustrate the hell out of you. It was worth it for me. I value my freedom and privacy so much more than an occasional headache caused by buggy software.
This is the list of the apps I use frequently that are related to self-hosting:
* PowerAmpache — lets me stream music from my cloud
* PulseMusic — my main music app that I use to listen to the music collection stored directly on my phone (30-40GB at any time that I rotate from time to time)
* Nextcloud — this is the sync client for the phone and a ﬁle browser
* K-9 Mail — really, really ugly looking email client that is also the best email client for Android I have ever used
As I mentioned previously, my laptop is running the latest Ubuntu LTS, just like the desktop PC. To keep things partially synced to Nextcloud, I’m using the official desktop client. Listing the other tools that I use as a developer that may be tangentially related to self-hosting would take another two thousand words, so I won’t go into that right now.
Is it worth the time and hassle? Only you can answer that for yourself.
Researching alternatives to commercial cloud offerings, and setting everything up surely took some time. I didn’t track it, so I can’t say precisely how much time, but it was definitely in the double digits. If I had to guess, I would say ~40 hours.
Luckily, after that phase, things run (mostly) without any interruption. I have a monthly reminder to check for the updates and apply them to the software I’m running. I don’t bother with minor updates, so if it’s not broken, I’m not ﬁxing it.
If I motivate even one person to at least consider the option of self-hosting, I will be happy. Feel free to drop me a message if you decide to do that!
* Links to Vultr contain my referral code, which means that if you choose to subscribe to Vultr after you clicked on that link, it will earn me a small commission.
No one wants to be the bad guy.
When narratives begin to shift and the once good guys are labelled as bad, it’s not surprising they fight back. They’ll point to criticisms as exaggerations, their faults as misunderstandings.
Today’s freshly ordained bad guys are the investors and CEOs of Silicon Valley.
Once championed as the ﬂagbearers of innovation and democratization, now they’re viewed as new versions of the monopolies of old and they’re ﬁghting back.
The title of Paul Graham’s essay, How People Get Rich Now, didn’t prepare me for the real goal of his words. It’s less a tutorial or analysis and more a thinly veiled attempt to ease concerns about wealth inequality.
What he fails to mention is that concerns about wealth inequality aren’t about how wealth was generated but rather the growing wealth gap that has accelerated in recent decades. Tech has made startups both cheaper and easier, but only for a small percentage of people. And when a select group of people has an advantage that others don’t, it compounds over time.
Paul paints a rosy picture but doesn’t mention that incomes for lower and middle-class families have fallen since the 80s. This golden age of entrepreneurship hasn’t beneﬁtted the vast majority of people and the increase in the Gini coefﬁcient isn’t simply that more companies are being started. The rich are getting richer and the poor are getting poorer.
And there we have it. The slight injection of his true ideology, relegated to the notes section and vague enough that some might ignore it. But keep in mind this is the same guy who argued against a wealth tax. His seemingly impartial and logical writing attempts to hide his true intentions.
Is this really about how people get rich, or about why we should all be happy that people like PG are getting richer while tons of people are struggling to meet their basic needs? Wealth inequality is just a radical left fairy tale to villainize the hard-working 1%. We could all be rich too, it’s so much easier now. Just pull yourself up by your bootstraps.
There’s no question that it’s easier now than ever to start a new business and reach your market. The internet has had a democratizing effect in this regard. But it’s also obvious to anyone outside the SV bubble that it’s still only accessible to a small minority of people. Most people don’t have the safety net or mental bandwidth to even consider entrepreneurship. It is not a panacea for the masses.
But to use that fact to push the false claim that wealth inequality is solely due to more startups and not a real problem says a lot. This essay is less about how people get rich and more about why it’s okay that people like PG are getting rich. They’re better than the richest people of 1960. And we can join them. We just need to stop complaining and be rich instead.
Many technologists viscerally felt yesterday’s announcement as a punch to the gut when we heard that the Signal messaging app was bundling an embedded cryptocurrency. This news really cut to the heart of what many technologists have felt before when we as loyal users have been exploited and betrayed by corporations, but this time it felt much deeper because it introduced a conflict of interest from our fellow technologists, whom we truly believed were advancing a cause many of us also believed in. So many of us have spent significant time and social capital moving our friends and family away from the exploitative data siphon platforms that Facebook et al offer, and on to Signal, in the hopes of breaking the cycle of commercial exploitation of our online relationships. And some of us feel used.
Signal users are overwhelmingly tech-savvy consumers, and we’re not idiots. Do they think we don’t see through the thinly veiled pump-and-dump scheme that’s proposed? It’s an old scam with a new face.
Allegedly the controlling entity prints 250 million units of some artiﬁcially scarce trashcoin called MOB (coincidence?) of which the issuing organization controls 85% of the supply. This token then ﬂoats on a shady offshore cryptocurrency exchange hiding in the Cayman Islands or the Bahamas, where users can buy and exchange the token. The token is wash traded back and forth by insiders and the exchange itself to artiﬁcially pump up the price before it’s dumped on users in the UK to buy to allegedly use as “payments”. All of this while insiders are free to silently use information asymmetry to cash out on the inﬂux of pumped hype-driven buys before the token crashes in value. Did I mention that the exchange that ﬂoats the token is the primary investor in the company itself, does anyone else see a major conﬂict of interest here?
Let it be said that everything here is probably entirely legal or there simply is no precedent yet. The question everyone is asking before these projects launch now though is: should it be?
I think I speak for many technologists when I say that any bolted-on cryptocurrency monetization scheme smells like a giant pile of rubbish and feels enormously user-exploitative. We’ve seen this before, after all Telegram tried the same thing in an ICO that imploded when SEC shut them down, and Facebook famously tried and failed to monetize WhatsApp through their decentralized-but-not-really digital money market fund project.
The whole Libra/Diem token (or whatever they’re calling its remains this week) was a failed Facebook initiative exploiting the gaping regulatory loophole where, if you simply call yourself a cryptocurrency platform (regardless of any technology), you can effectively function as a shadow bank and money transmitter with no license, all while performing roughly the same function as a bank but with magic monopoly money that you can print with no oversight while your customers assume full counterparty risk. If that sounds like a terrible idea, it’s because it is. But we fully expect that level of evil behavior from Facebookers because that’s kind of their thing.
The sad part of this entire project is that it launched in the UK—likely because it would be blatantly illegal in the United States—where retail transfers are ubiquitous, instant, and cheap. The digital sterling held at our high street bank or mobile banking app works very well as a currency, and nobody needs or wants to buy a dedicated trashcoin for every single app on our mobiles. This is all just bringing us back to some stone age barter system for no reason.
The larger trend is of activist investors trying to turn every app with a large userbase into a coin-operated slot machine, forcing users to buy from a supply of penny-stock-like tokens that are thinly traded and whose prices investors and market makers collude to manipulate for their own gain. As we saw with the downfall of Keybase, users don’t want a tokenized pay-to-play dystopia, and most technical users rightly associate cryptocurrency with scams, an association that will act like a lightning rod for legal scrutiny. It weakens the entire core value proposition of the Signal app for no reason other than making a few insiders richer.
Signal is still a great piece of software. Just do one thing and do it well: be the trusted de facto platform for private messaging that empowers dissidents, journalists, and grandma alike to communicate freely with the same guarantees of privacy. Don’t become a dodgy money transmitter business. This is not the way.
Every year since 1982, Forbes magazine has published a list of the
richest Americans. If we compare the 100 richest people in 1982 to
the 100 richest in 2020, we notice some big differences.
In 1982 the most common source of wealth was inheritance. Of the
100 richest people, 60 inherited from an ancestor. There were 10
du Pont heirs alone. By 2020 the number of heirs had been cut in
half, accounting for only 27 of the biggest 100 fortunes.
Why would the percentage of heirs decrease? Not because inheritance
taxes increased. In fact, they decreased significantly during this
period. The reason the percentage of heirs has decreased is not
that fewer people are inheriting great fortunes, but that more
people are making them.
How are people making these new fortunes? Roughly 3/4 by starting
companies and 1/4 by investing. Of the 73 new fortunes in 2020, 56
derive from founders’ or early employees’ equity (52 founders, 2
early employees, and 2 wives of founders), and 17 from managing
investment funds.
There were no fund managers among the 100 richest Americans in 1982.
Hedge funds and private equity ﬁrms existed in 1982, but none of
their founders were rich enough yet to make it into the top 100.
Two things changed: fund managers discovered new ways to generate
high returns, and more investors were willing to trust them with
their money.
But the main source of new fortunes now is starting companies, and
when you look at the data, you see big changes there too. People
get richer from starting companies now than they did in 1982, because
the companies do different things.
In 1982, there were two dominant sources of new wealth: oil and
real estate. Of the 40 new fortunes in 1982, at least 24 were due
primarily to oil or real estate. Now only a small number are: of
the 73 new fortunes in 2020, 4 were due to real estate and only 2
to oil.
By 2020 the biggest source of new wealth was what are sometimes
called “tech” companies. Of the 73 new fortunes, about 30 derive
from such companies. These are particularly common among the richest
of the rich: 8 of the top 10 fortunes in 2020 were new fortunes of
this type.
Arguably it’s slightly misleading to treat tech as a category.
Isn’t Amazon really a retailer, and Tesla a car maker? Yes and no.
Maybe in 50 years, when what we call tech is taken for granted, it
won’t seem right to put these two businesses in the same category.
But at the moment at least, there is definitely something they share
in common that distinguishes them. What retailer starts AWS? What
car maker is run by someone who also has a rocket company?
The tech companies behind the top 100 fortunes also form a
well-differentiated group in the sense that they’re all companies
that venture capitalists would readily invest in, and the others
mostly not. And there’s a reason why: these are mostly companies
that win by having better technology, rather than just a CEO who’s
really driven and good at making deals.
To that extent, the rise of the tech companies represents a qualitative
change. The oil and real estate magnates of the 1982 Forbes 400
didn’t win by making better technology. They won by being really
driven and good at making deals.
And indeed, that way of
getting rich is so old that it predates the Industrial Revolution.
The courtiers who got rich in the (nominal) service of European
royal houses in the 16th and 17th centuries were also, as a rule,
really driven and good at making deals.
People who don’t look any deeper than the Gini coefﬁcient look
back on the world of 1982 as the good old days, because those who
got rich then didn’t get as rich. But if you dig into how they
got rich, the old days don’t look so good. In 1982, 84% of the
richest 100 people got rich by inheritance, extracting natural
resources, or doing real estate deals. Is that really better than
a world in which the richest people get rich by starting tech
companies?
Why are people starting so many more new companies than they used
to, and why are they getting so rich from it? The answer to the
ﬁrst question, curiously enough, is that it’s misphrased. We
shouldn’t be asking why people are starting companies, but why
they’re starting companies again.
In 1892, the New York Herald Tribune compiled a list of all the
millionaires in America. They found 4047 of them. How many had
inherited their wealth then? Only about 20% — less than the
proportion of heirs today. And when you investigate the sources of
the new fortunes, 1892 looks even more like today. Hugh Rockoff
found that “many of the richest … gained their initial edge from
the new technology of mass production.”
So it’s not 2020 that’s the anomaly here, but 1982. The real question
is why so few people had gotten rich from starting companies in
1982. And the answer is that even as the Herald Tribune’s list was
being compiled, a wave of consolidation
was sweeping through the
American economy. In the late 19th and early 20th centuries,
ﬁnanciers like J. P. Morgan combined thousands of smaller companies
into a few hundred giant ones with commanding economies of scale.
By the end of World War II, as Michael Lind writes, “the major
sectors of the economy were either organized as government-backed
cartels or dominated by a few oligopolistic corporations.”
In 1960, most of the people who start startups today would have
gone to work for one of them. You could get rich from starting your
own company in 1890 and in 2020, but in 1960 it was not really a
viable option. You couldn’t break through the oligopolies to get
at the markets. So the prestigious route in 1960 was not to start
your own company, but to work your way up the corporate ladder at
an existing one.
Making everyone a corporate employee decreased economic inequality
(and every other kind of variation), but if your model of normal
is the mid 20th century, you have a very misleading model in that
respect.
Please don’t make me go back: Some of the reasons I hope work from home continues and I never have to return to an ofﬁce
I just hope my employer continues to allow me to work remotely. I have been exceeding production for the past year.

* Don’t have to share a keyboard/mouse/desk with another shift (gross).
* Do not spend an hour a day (+/- 10 minutes) driving, and save all the money associated with that commute.
* Do not have to worry about driving on snowy/icy roads, as my job requires us to be there unless there is a county emergency making travel by road illegal.
* Do not have to go to work sick, because they act like you murdered 30 children if you call in sick. I can work comfortably while isolated from my coworkers.
* Do not have to be exposed to multiple sick coworkers at any given time.
* Do not have to fight 100-130 people for 1 of 3 microwaves on my 30-minute lunch break, assuming someone hasn’t taken my food out of the refrigerator and eaten it/thrown it away/left it sitting out to make room for theirs.
* Do not have to use a restroom that, with frequency (to the point of management threatening supervised restroom usage for adults multiple times over the years), has boogers/urine/feces/blood on the floors/walls/toilet seats.
* Don’t have to pack into a noisy open office where I can’t focus, as people are constantly talking, walking by, burning popcorn, or microwaving ox tail and greens that smell worse than burnt popcorn.
* Don’t have to deal with ear-damaging fire drills, or cram into tiny offices with 20 co-workers for tornado drills/warnings.
* Don’t have to deal with an HVAC system that regularly gifts temperatures in the mid-60s Fahrenheit in the winter and mid-80s Fahrenheit in the summer, in an office where business casual is required.
* Don’t have to drink milky-white water from a tap AFTER it comes out of a filter that has algae growing in the transparent tube feeding the faucet.
* Don’t have to deal with coworkers damaging vehicles. Just about everyone’s car has multiple door dings. Personally, I have no less than a dozen that, I know for a fact, occurred at work.
* Don’t have to deal with random coworkers coming to my desk/cornering me at the urinal and blabbing incessantly about their divorce/kids/wife’s boyfriend/vet bill. This was a regular occurrence in person and has happened exactly once via Teams since work from home started.
* Don’t have to breathe jet exhaust that inevitably makes its way into our building through the cracks around the doors, which you can pass multiple pieces of paper through.
* Can be here to sign for packages.
* Can eat healthier because I have an entire kitchen at my disposal during my half-hour lunch.
* Am far more likely to work overtime when needed, as I’m saving an hour commuting!
* Don’t need to spend tens of seconds inspecting the toilet seat and/or constructing an elaborate ring of squares to ease my mind if I need to poop.
* Don’t have a desk and chair forced upon me; instead, I have the freedom to pick a desk and chair that I like, that fits my needs.
We’re Fly.io. We take container images and run them on our hardware around the world. It’s pretty neat, and you should check it out; with an already-working Docker container, you can be up and running on Fly in well under 10 minutes.
Even though most of our users deliver software to us as Docker containers, we don’t use Docker to run them. Docker is great, but we’re high-density multitenant, and despite strides, Docker’s isolation isn’t strong enough for that. So, instead, we transmogrify container images into Firecracker micro-VMs.
They do their best to make it look a lot more complicated, but OCI images — OCI is the standardized container format used by Docker — are pretty simple. An OCI image is just a stack of tarballs.
Backing up: most people build images from Dockerfiles. A useful way to look at a Dockerfile is as a series of shell commands, each generating a tarball; we call these “layers”. To rehydrate a container from its image, we just start with the first layer and unpack each subsequent one on top of it.
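To make the “stack of tarballs” idea concrete, here’s a minimal toy sketch in Python (not Fly.io’s production Go code; all file names are invented) that builds two layer tarballs and rehydrates them in order, with the later layer overwriting the earlier one:

```python
import io
import os
import tarfile
import tempfile

def make_layer(files):
    """Build one layer tarball from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

def rehydrate(layers, rootfs):
    """Unpack layer tarballs in order; later layers overwrite earlier ones."""
    for layer in layers:
        with tarfile.open(fileobj=layer) as tar:
            tar.extractall(rootfs)

# Two toy layers: the second one overrides etc/motd from the first.
base = make_layer({"etc/motd": b"hello from base\n", "bin/app": b"#!/bin/sh\n"})
patch = make_layer({"etc/motd": b"hello from patch\n"})

rootfs = tempfile.mkdtemp()
rehydrate([base, patch], rootfs)
with open(os.path.join(rootfs, "etc/motd")) as f:
    print(f.read())  # -> hello from patch
```

Real layers also carry ownership, modes, and whiteout entries for deletions, but the core operation really is just “untar in order.”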
You can write a shell script to pull a Docker container from its registry, and that might clarify. Start with some conﬁguration; by default, we’ll grab the base image for golang:
We need to authenticate to pull public images from a Docker registry — this is boring but relevant to the next section — and that’s easy:
That token will allow us to grab the “manifest” for the container, which is a JSON index of the parts of a container.
The ﬁrst query we make gives us the “manifest list”, which gives us pointers to images for each supported architecture:
Pull the digest out of the matching architecture entry and perform the same fetch again with it as an argument, and we get the manifest: JSON pointers to each of the layer tarballs:
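For illustration, here’s roughly what that architecture-matching step looks like, run against a trimmed-down, made-up manifest list (real manifest lists carry `mediaType`, `size`, and more platform fields, and these digests are fake):

```python
import json

# A toy "manifest list" of the shape the first fetch returns.
manifest_list = json.loads("""
{
  "manifests": [
    {"digest": "sha256:aaa...", "platform": {"architecture": "arm64", "os": "linux"}},
    {"digest": "sha256:bbb...", "platform": {"architecture": "amd64", "os": "linux"}}
  ]
}
""")

def digest_for(manifests, arch, os_name="linux"):
    """Pick the image digest matching the platform we want to run on."""
    for entry in manifests["manifests"]:
        plat = entry["platform"]
        if plat["architecture"] == arch and plat["os"] == os_name:
            return entry["digest"]
    raise KeyError(f"no image for {arch}/{os_name}")

print(digest_for(manifest_list, "amd64"))  # -> sha256:bbb...
```

The digest that falls out is what you hand back to the registry to fetch the per-architecture manifest itself.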
It’s as easy to grab the actual data associated with these entries as you’d hope:
And with those pieces in place, pulling an image is simply:
Unpack the tarballs in order and you’ve got the ﬁlesystem layout the container expects to run in. Pull the “conﬁg” JSON and you’ve got the entrypoint to run for the container; you could, I guess, pull and run a Docker container with nothing but a shell script, which I’m probably the 1,000th person to point out. At any rate, here’s the whole thing.
You’re likely of one of two mindsets about this: (1) that it’s extremely Unixy and thus excellent, or (2) that it’s extremely Unixy and thus horrifying.
Unix tar is problematic. Summing up Aleksa Sarai: tar isn’t well standardized, can be unpredictable, and is bad at random access and incremental updates. Tiny changes to large ﬁles between layers pointlessly duplicate those ﬁles; the poor job tar does managing container storage is part of why people burn so much time optimizing container image sizes.
Another fun detail is that OCI containers share a security footgun with git repositories: it’s easy to accidentally build a secret into a public container, and then inadvertently hide it with an update in a later image.
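A toy demonstration of that footgun: deleting a file in a later Dockerfile step adds an OCI “whiteout” entry, which hides the file in the merged filesystem, but the secret bytes are still sitting in the earlier layer’s tarball for anyone who pulls the image. (The file names and secret here are invented for the example.)

```python
import io
import tarfile

def make_layer(files):
    """Build one layer tarball from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

# Layer 1 bakes in a secret; layer 2 "deletes" it with a whiteout entry.
layers = [
    make_layer({"app/.env": b"API_KEY=hunter2\n"}),
    make_layer({"app/.wh..env": b""}),  # whiteout: hides app/.env in the merged view
]

# The merged filesystem no longer shows the secret, but anyone can still
# read it straight out of the first layer's tarball:
with tarfile.open(fileobj=layers[0]) as tar:
    leaked = tar.extractfile("app/.env").read()
print(leaked)  # -> b'API_KEY=hunter2\n'
```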
We’re of a third mindset regarding OCI images, which is that they are horrifying, and that’s liberating. They work pretty well in practice! Look how far they’ve taken us! Relax and make crappier designs; they’re all you probably need.
Back to Fly.io. Our users need to give us OCI containers, so that we can unpack and run them. There’s standard Docker tooling to do that, and we use it: we host a Docker registry our users push to.
Running an instance of the Docker registry is very easy. You can do it right now; docker pull registry && docker run registry. But our needs are a little more complicated than the standard Docker registry: we need multi-tenancy, and authorization that wraps around our API. This turns out not to be hard, and we can walk you through it.
A thing to know off the bat: our users drive Fly.io with a command line utility called ﬂyctl. ﬂyctl is a Go program (with public source) that runs on Linux, macOS, and Windows. A nice thing about working in Go in a container environment is that the whole ecosystem is built in the same language, and you can get a lot of stuff working quickly just by importing it. So, for instance, we can drive our Docker repository clientside from ﬂyctl just by calling into Docker’s clientside library.
If you’re building your own platform and you have the means, I highly recommend the CLI-first tack we took. It is so choice. flyctl made it very easy to add new features, like databases, private networks, volumes, and our bonkers SSH access system.
On the serverside, we started out simple: we ran an instance of the standard Docker registry with an authorizing proxy in front of it. ﬂyctl manages a bearer token and uses the Docker APIs to initiate Docker pushes that pass that token; the token authorizes repositories serverside using calls into our API.
What we do now isn’t much more complicated than that. Instead of running a vanilla Docker registry, we built a custom repository server. As with the client, we get a Docker registry implementation just by importing Docker’s registry code as a Go dependency.
We’ve extracted and simpliﬁed some of the Go code we used to build this here, just in case anyone wants to play around with the same idea. This isn’t our production code (in particular, all the actual authentication is ripped out), but it’s not far from it, and as you can see, there’s not much to it.
Our custom server isn’t architecturally that different from the vanilla registry/proxy system we had before. We wrap the Docker registry API handlers with authorizer middleware that checks tokens, references, and rewrites repository names. There are some very minor gotchas:
* Docker is content-addressed, with blobs “named” for their SHA256 hashes, and attempts to reuse blobs shared between different repositories. You need to catch those cross-repository mounts and rewrite them.
* Docker’s registry code generates URLs with _state parameters that embed references to repositories; those need to get rewritten too. _state is HMAC-tagged; our code just shares the HMAC key between the registry and the authorizer.
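As a sketch of that second gotcha (this is not Docker’s actual `_state` encoding, and the key is made up), here’s how an HMAC-tagged parameter lets an authorizer rewrite embedded repository names while anyone without the shared key cannot forge or tamper with the state:

```python
import base64
import hashlib
import hmac
import json

KEY = b"shared-between-registry-and-authorizer"  # hypothetical shared HMAC key

def tag_state(payload):
    """Serialize a _state-like payload and append an HMAC tag."""
    body = base64.urlsafe_b64encode(
        json.dumps(payload, sort_keys=True).encode()).decode()
    mac = hmac.new(KEY, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + mac

def rewrite_repo(state, new_name):
    """Verify the tag, rewrite the embedded repository name, and re-tag."""
    body, mac = state.rsplit(".", 1)
    expected = hmac.new(KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("bad _state tag")
    payload = json.loads(base64.urlsafe_b64decode(body))
    payload["name"] = new_name
    return tag_state(payload)

state = tag_state({"name": "alice/app", "offset": 0})
print(rewrite_repo(state, "org-1234/app"))
```

Sharing the key between registry and authorizer is what lets the middleware swap names without the registry noticing anything unusual.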
In both cases, the source of truth for who has which repositories and where is the database that backs our API server. Your push carries a bearer token that we resolve to an organization ID, and the name of the repository you’re pushing to, and, well, our design is what you’d probably come up with to make that work. I suppose my point here is that it’s pretty easy to slide into the Docker ecosystem.
The pieces are on the board:
* We can accept containers from users
* We can store and manage containers for different organizations.
* We’ve got a VMM engine, Firecracker, that we’ve written about already.
What we need to do now is arrange those pieces so that we can run containers as Firecracker VMs.
As far as we’re concerned, a container image is just a stack of tarballs and a blob of conﬁguration (we layer additional conﬁguration in as well). The tarballs expand to a directory tree for the VM to run in, and the conﬁguration tells us what binary in that ﬁlesystem to run when the VM starts.
Meanwhile, what Firecracker wants is a set of block devices that Linux will mount as it boots up.
There’s an easy way on Linux to take a directory tree and turn it into a block device: create a ﬁle-backed loop device, and copy the directory tree into it. And that’s how we used to do things. When our orchestrator asked to boot up a VM on one of our servers, we would:
1. Pull the matching container from the registry.
2. Create a loop device to store the container’s filesystem on.
3. Unpack the container (in this case, using Docker’s Go libraries) into the mounted loop device.
4. Create a second block device and inject our init, kernel, configuration, and other goop into it.
5. Track down any persistent volumes attached to the application, unlock them with LUKS, and collect their unlocked block devices.
6. Create a TAP device, configure it for our network, and attach BPF code to it.
7. Hand all this stuff off to Firecracker and tell it to boot.
This is all a few thousand lines of Go.
This system worked, but wasn’t especially fast. Part of the point of Firecracker is to boot so quickly that you (or AWS) can host Lambda functions in it and not just long-running programs. A big problem for us was caching; a server in, say, Dallas that’s asked to run a VM for a customer is very likely to be asked to run more instances of that server (Fly.io apps scale trivially; if you’ve got 1 of something running and would be happier with 10 of them, you just run ﬂyctl scale count 10). We did some caching to try to make this faster, but it was of dubious effectiveness.
The system we’d been running was, as far as container ﬁlesystems are concerned, not a whole lot more sophisticated than the shell script at the top of this post. So Jerome replaced it.
What we do now is run, on each of our servers, an instance of containerd. containerd does a whole bunch of stuff, but we use it as a cache.
If you’re a Unix person from the 1990s like I am, and you just recently started paying attention to how Linux storage works again, you’ve probably noticed that a lot has changed. Sometime over the last 20 years, the block device layer in Linux got interesting. LVM2 can pool raw block devices and create synthetic block devices on top of them. It can treat block device sizes as an abstraction, chopping a 1TB block device into 1,000 5GB synthetic devices (so long as you don’t actually use 5GB on all those devices!). And it can create snapshots, preserving the blocks on a device in another synthetic device, and sharing those blocks among related devices with copy-on-write semantics.
containerd knows how to drive all this LVM2 stuff, and while I guess it's out of fashion to use the devmapper backend these days, it works beautifully for our purposes. So now, to get an image, we pull it from the registry into our server-local containerd, configured to run on an LVM2 thin pool. containerd manages snapshots for every instance of a VM/container that we run. Its API provides a simple "lease"-based garbage collection scheme; when we boot a VM, we take out a lease on a container snapshot (which synthesizes a new block device based on the image, which containerd unpacks for us); LVM2 COW means multiple containers don't step on each other. When a VM terminates, we surrender the lease, and containerd eventually GCs.
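You can poke at roughly the same flow with containerd's ctr debugging CLI. Our code goes through the Go client library instead, the image ref and mount target below are illustrative, and exact ctr flags vary by version — treat this as a sketch:

```shell
# Sketch via ctr; our real code uses containerd's Go client library.
# Pull the image into the devmapper (LVM2 thin pool) snapshotter:
ctr images pull --snapshotter devmapper docker.io/library/nginx:latest
# Synthesize a COW snapshot of the image and mount it; under devmapper
# the snapshot is backed by a fresh thin block device:
ctr images mount --snapshotter devmapper \
  docker.io/library/nginx:latest /mnt/vm-rootfs
# Leases are what pin snapshots against containerd's garbage collector:
ctr leases ls
# ...boot the VM from the snapshot's device...
# Release it; containerd GCs the snapshot eventually:
ctr images unmount /mnt/vm-rootfs
```

The lease mechanics are the part worth stealing: you never delete snapshots yourself, you just stop claiming them.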
The first deployment of a VM/container on one of our servers does some lifting, but subsequent deployments are lightning fast (the VM build-and-boot process on a second deployment is faster than the logging that we do).
Jerome wrote our init in Rust, and, after being cajoled by Josh Triplett, we released the code, which you can go read.
The filesystem that Firecracker is mounting on the snapshot checkout we create is pretty raw. The first job our init has is to fill in the blanks to fully populate the root filesystem with the mounts that Linux needs to run normal programs.
We inject a configuration file into each VM that carries the user, network, and entrypoint information needed to run the image. init reads that and configures the system. We use our own DNS server for private networking, so init overrides resolv.conf. We run a tiny SSH server for user logins over WireGuard; init spawns and monitors that process. We spawn and monitor the entry point program. That's it; that's an init.
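A rough shell rendition of those init duties, assuming a bare root filesystem; the mount list and nameserver address are illustrative, and the real init is the Rust program we released:

```shell
# What an init has to do before ordinary programs will run;
# the nameserver address below is made up for illustration.
mount -t proc     proc     /proc
mount -t sysfs    sysfs    /sys
mount -t devtmpfs devtmpfs /dev
mkdir -p /dev/pts && mount -t devpts devpts /dev/pts
mount -t tmpfs    tmpfs    /tmp
# private networking: point DNS at our own server
echo "nameserver fdaa::3" > /etc/resolv.conf
# then: spawn the SSH server, spawn the entrypoint from the injected
# config file, and supervise both until the VM terminates
```

Most of an init's job really is this boring; the interesting parts are supervision and teardown.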
So, that's about half the idea behind Fly.io. We run server hardware in racks around the world; those servers are tied together with an orchestration system that plugs into our API. Our CLI, flyctl, uses Docker's tooling to push OCI images to us. Our orchestration system sends messages to servers to convert those OCI images to VMs. It's all pretty neato, but I hope also kind of easy to get your head wrapped around.
The other "half" of Fly is our Anycast network, which is a CDN built in Rust that uses BGP4 Anycast routing to direct traffic to the nearest instance of your application. About which: more later.