10 interesting stories served every morning and every evening.
How big was the Tonga eruption?
The explosive eruption of the Hunga Tonga-Hunga Ha’apai volcano may be one of the largest recorded in such detail. The blast was visible from space, with images of the massive ash plume going viral over the following days. But just how big was it?
The underwater volcano erupted with a deafening explosion on Jan. 15, triggering deadly tsunamis, covering islands in ash, and knocking out communications for Tonga’s 105,000 people.
The event was captured in astonishing detail by satellites, including NOAA’s GOES-West, shown below.
Breaking down the stages of the eruption into intervals allows us to plot the expansion of the enormous plume of material that volcanologists call an “umbrella cloud”.
Around the time of the initial eruption, a cloud measuring 38 km (24 miles) wide is thrust into the atmosphere. Its diameter already measures almost twice the length of Manhattan, New York. One hour later, it appears to measure around 650 km wide, including shock waves around its edge.
The scale of the umbrella cloud is comparable to the 1991 Pinatubo eruption in the Philippines and is one of the largest of the satellite era, according to Michigan Tech volcanologist Simon Carn in a NASA blog post.
The satellite images of the event show mostly ocean, with the scattered islands of Tonga and Fiji barely noticeable. Gauging the actual size of the eruption is difficult when it occurs in such a remote part of the South Pacific.
Here, we take the cloud of volcanic material and place it over well-known land masses and coastlines to get a true sense of just how big the eruption was.
At 650 km in diameter, the cloud would obscure most of Great Britain and the east coast of Ireland. It is almost the same size as mainland Spain.
When compared to parts of the U.S., it would cover a large part of Florida or a section of California from San Francisco to Los Angeles.
If we compare the scale of the eruption to Southeast Asia, it would cover Cambodia and part of Laos, Vietnam, and Thailand, or obscure almost all of North and South Korea.
If we placed the scale of the eruption over Egypt’s Mount Sinai region, it would cover Israel, spilling over into Jordan and the Mediterranean Sea. It is also large enough to obscure the Horn of Africa.
Scientists are still determining the Volcanic Explosivity Index (VEI) of the eruption, a measurement from 0 to 8 that gauges the explosivity of eruptions. Pinatubo, which is considered similar to the Hunga Tonga-Hunga Haʻapai eruption, scored a 6 on the index.
Pinatubo produced an eruption column of gas and ash that rose 40 km into the atmosphere, whereas “preliminary data on the Tongan eruption is that the gas and ash column was at least 20 km high,” said Raymond Cas, a volcanologist at Monash University in Australia.
All maps and satellite imagery reprojected to orthographic projections for accuracy of measurements
PRQL is a modern language for transforming data — a simpler and more powerful SQL.
I’ve used NixOS as the only OS on my laptop for around three years at this point. Installing it has felt sort of like a curse: on the one hand, it’s so clearly the only operating system that actually gets how package management should be done. After using it, I can’t go back to anything else. On the other hand, it’s extremely complicated, constantly changing software that requires configuration with the second-worst homegrown config programming language I’ve ever used1.
I don’t think that NixOS is the future, but I do absolutely think that the ideas in it are, so I want to write about what I think it gets right and what it gets wrong, in the hopes that other projects can take note. As such, this post will not assume knowledge of NixOS — if you’ve used NixOS significantly, there probably isn’t anything new in here for you.
The fundamental thing that NixOS gets right is that software is never installed globally. All packages are stored in a content-addressable store — for instance, my editor is stored in the directory “/nix/store/frlxim9yz5qx34ap3iaf55caawgdqkip-neovim-0.5.1/” — the binary, global default configuration, libraries, and everything else included in the neovim package exists in that directory. Just downloading that doesn’t “install” it, though — there isn’t really such a thing as “installation” in the traditional sense. Instead, I can open a shell that has a $PATH variable set so that it can see neovim. This is quite simple to do — I can run nix-shell -p neovim, and I’ll get dropped into a shell that has neovim in the $PATH.
Crucially, this doesn’t affect any software that doesn’t have its $PATH changed. This means that it’s possible to have as many different versions of the same software package coexisting at the same time as you want, which is impossible with most distributions! You can have one shell with Python 3.7, another with Python 3.9, and install a different set of libraries on both of them. If you have two different pieces of software that have the same dependency, you don’t need to make sure they’re compatible with the same version, since each one can use the version of the dependency that it wants to.
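For example (a sketch; the commands are real nix-shell usage, but the exact package attribute names vary between nixpkgs revisions):

$ nix-shell -p python37 --run "python3 --version"   # this shell sees Python 3.7
$ nix-shell -p python39 --run "python3 --version"   # this one sees Python 3.9

Neither command touches any global state; each just constructs a $PATH that points into the store.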
Almost all of the good things about NixOS are natural consequences of this single decision.
For instance, once you have this, rollbacks are trivial — since multiple versions of the same software can coexist, rolling back just means changing which version of the software is used by default. As long as you save the information about what versions you used to be on (which is a tiny amount of information), rolling back is essentially just changing some symlinks. Since the kernel is a package like any other, you can have the bootloader remember the list of different versions, and let people boot into previous conﬁgurations just by selecting an older version on the boot menu.
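Concretely, with NixOS’s standard tooling:

$ nixos-rebuild switch              # build and activate a new system generation
$ nixos-rebuild switch --rollback   # point the symlinks back at the previous generation
$ nix-env --list-generations --profile /nix/var/nix/profiles/system   # list the generations you can return to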
This also makes running patched versions of software much simpler — I don’t need to worry about fucking up my system by patching something like the Python interpreter, since I know that my patched version will only run when I specifically want it. But at the same time, I can patch the Python interpreter and then have some software running on my system actually use the patched version, since all of this stuff is conﬁgured through the same conﬁguration system.
Another advantage to this system is that it makes zero-downtime deploys significantly simpler, since you can have multiple versions of the same software running at the same time. You don’t need to take down the current version of the software before you install the new one; instead you can install the new version, run both at the same time, and then cut over once you’re confident that the new version works2.
Mobile phones and embedded devices have had to build a less general version of this in order to avoid occasionally bricking themselves when they update, in the form of an A/B partitioning scheme. So far, desktop computers, and particularly Linux distributions have largely accepted that occasionally bricking themselves on update is basically ﬁne, but it doesn’t have to be this way! Using a NixOS-style system eliminates this problem in a clean, uniﬁed manner3.
One clear reason to believe that this is the future is that language package managers (which are more plentiful and can iterate faster) have largely converged on essentially this solution — virtualenv, Poetry, Yarn, Cargo, and many others use basically this model. Most use version numbers instead of content-addressable storage, due to the language ecosystems that they’re built around, but the fundamentals are the same, and it’s pretty clear from looking at trends in package managers that this model tends to be successful.
There are essentially two fundamental design mistakes in NixOS that lead to the problems with it.
The first is relatively simple: they developed their own programming language to do configuration, which is not very good and is extremely difficult to learn. The vast majority of people using NixOS do not understand the language, and simply copy/paste example configurations, which mostly works until you need to do something complicated, at which point you’re left high and dry. There seem to be a handful of people with a deep understanding of the language who do most of the infrastructural work, and then a long tail of people with no clue what’s going on. This is exacerbated by poor documentation — there are docs for learning Nix as a language, and docs for using NixOS, but the connection between those two things is essentially undocumented. One of the things that’s theoretically nice about having everything defined in the Nix language is that it’s easily understandable once you learn Nix. Unfortunately, Nix is difficult enough to learn that I couldn’t tell you if this is true or not. Nix needs more docs explaining deeply how practical applications of the Nix language actually work. It could also do with less ugly syntax, but I think that ship has sailed.
There are many other minor complaints about NixOS that stem from this — patching packages is theoretically easy, but annoying to ﬁgure out how to do in practice, for instance, and conﬁguration tends to have a lot of spooky action-at-a-distance.
The second flaw is that NixOS does not actually provide real isolation. Running bash -c 'type $0' will get you bash is /nix/store/90y23lrznwmkdnczk1dzdsq4m35zj8ww-bash-interactive-5.1-p8/bin/bash — bash knows that it’s running from the Nix store. This means that all software needs to be recompiled to work on NixOS, often with some terrifying hacks involved. It also means that it’s impossible to statically know what other packages a given package might depend on. Currently, the way this is implemented is essentially grepping a package for /nix/store/ to try to figure out what the dependencies are, which is obviously… not great. It also means that binaries that link against /lib/ld-linux.so.2 or scripts that use #!/bin/bash won’t work without patching.
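The result of that scan is queryable for any store path, e.g. for the neovim path above:

$ nix-store --query --references /nix/store/frlxim9yz5qx34ap3iaf55caawgdqkip-neovim-0.5.1
# prints one line per store path that this package was found to reference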
Unfortunately, the tools for fixing this are not really there yet. Last fall, I prototyped a Linux distribution trying to combine a nix-store style package repository with overlayfs4. Unfortunately, overlayfs becomes very unhappy when you try to overlay too many different paths (with three distinct failure modes, interestingly), which severely limits this approach. I still think that there’s a lot of potential here — overlayfs could be fast for arbitrary numbers of paths if that was a design goal — but it’s not there yet. This means that trying to build a content-addressable store that is transparent to the apps installed in it requires essentially building a container image for every composition of packages (this is the approach that Silverblue takes), which is fundamentally unsatisfying to me.
The advantage to this approach is that you can piggyback off of existing package repositories. One of the main barriers to adoption of new Linux distributions is packaging, but a distribution taking a content-addressable store + overlay approach could automatically get all the benefits of NixOS along with all of the packages from Debian, Ubuntu, RedHat, Arch, NixOS, and any other distributions it fancies.
NixOS very clearly has the correct way of thinking about dependency management, but is hampered by a few poor technical decisions made long ago. I’m going to keep using it, since I can’t stand anything else after having a taste of NixOS, but I’m rooting for something new to rise up and take its place, that learns from the lessons of NixOS and implements its features in a more user-friendly way.
I’ve been a Linux (or GNU/Linux, for the purists) user since 1996. I’ve been a FreeBSD user since 2002. I have always successfully used both operating systems, each for speciﬁc purposes. I have found, on average, BSD systems to be more stable than their Linux equivalents. By stability, I don’t mean uptime (too much uptime means too few kernel security updates, which is wrong). I mean that things work as they should, that they don’t “break” from one update to the next, and that you don’t have to revise everything because of a missing or modiﬁed basic command.
I’ve always been for development and innovation as long as it doesn’t (necessarily, automatically and unreasonably) break everything that is already in place. And the road that the various Linux distributions are taking seems to be that of modifying things that work just for the sake of it, or to follow the diktats of the kernel and those who manage it (though not only them).
Some time ago we started a complex, continuous, and not always linear operation: migrating, where possible, most of the servers we manage (ours and our customers’) from Linux to FreeBSD.
There are many alternative operating systems to Linux, and the *BSD family is varied and complete. FreeBSD, in my opinion, is today the “all-rounder” system par excellence, i.e. well refined and suitable both for use on large servers and on small embedded systems. The other BSDs have strengths that, in some fields, make them particularly suitable, but FreeBSD, in my humble opinion, is suitable for (almost) every purpose.
So back to the main topic of this article, why am I migrating many of the servers we manage to FreeBSD? The reasons are many, I will list some of them with corresponding explanations.
One of the fundamental problems with Linux is that (we should remember) it is a kernel; everything else is created by different people and companies. On more than one occasion Linus Torvalds, as well as other leading Linux kernel developers, has remarked that they care about the development of the kernel itself, not how users will use it. In their technical decisions, therefore, they don’t take into account the real use of the systems; the kernel goes its own way. This is a good thing, as the development of the Linux kernel is not “held back” by the struggle between distributions and software solutions, but at the same time it is also a disadvantage. In FreeBSD, the kernel and its userland (i.e. all the components of the base operating system) are developed by the same team, and there is, therefore, a strong cohesion between the parts. In many Linux distributions it became necessary to deprecate ifconfig in favor of ip, because new kernel developments could not be supported by ifconfig without breaking compatibility with other (previous) kernel versions, or without having functions (on the same network interface) managed by different tools. In FreeBSD, with each release of the operating system, there are both kernel and userland updates, so these changes are consistently incorporated and documented, keeping the tools compatible with their kernel-side updates.
In other words, in FreeBSD there is no need to “revolutionise” everything every few years, and changes are made primarily in the form of additions that enrich (and do not break) each update. If a modification were to change the way the system interacts with network devices, ifconfig would be modified to take advantage of it while remaining compatible with the “old” syntax. In the long term, this kind of approach is definitely appreciated by system administrators, who find themselves with a linear, consistent, and always well-documented update path.
Linux and related distributions now have contributions from many companies, many of which (e.g. Red Hat) push (justifiably) in the direction of what is convenient for them, their products, and their services. Being big contributors to the project, they have considerable clout, so, indeed, their solutions often become de facto standards. Consider systemd: was there really a need for such a system? While it brought some advantages, it added some complexity to an otherwise extremely simple and functional system. It remains divisive to this day, with many asking, “but was it really necessary? Did the advantages it brought balance the disadvantages?”. 70 binaries just for initialising and logging, and a million and a half lines of code just for that? But Red Hat threw the rock… and many followed along. Because sometimes it’s nice to follow the trend, the hype of a specific solution.
Even FreeBSD has big companies behind it, collaborating in a more or less direct way. The license is more permissive, so not everyone who uses it commercially contributes to it, but knowing that FreeBSD is at the base of Netflix’s CDN, WhatsApp’s servers (while we wait for Meta to replace them, for internal-coherence reasons, with Linux servers), Sony PlayStations and, in part, macOS, iOS, iPadOS, etc. surely gives confidence in its quality. These companies, however, do not have enough clout to dictate the direction of the core team.
FreeBSD jails are very powerful tools for jailing and separating services. There is controversy about Docker not running on FreeBSD, but I believe (like many others) that FreeBSD has a more powerful tool. Jails are older and far more mature than any containerization solution on Linux. Jails are efficient and well integrated throughout the operating system. All major commands (ps, kill, top, etc.) are able to display jail information as well. There are many management tools but, in fact, they all do the same thing: they interact with the FreeBSD base and create custom configuration files. Personally I’m very comfortable with BastilleBSD, but there are a lot of very good tools, and manual management is simple enough as well. When I need Docker I launch a Linux machine, often Alpine (which I think is a great minimalist distribution) or Debian. But I’m moving a lot of services from Docker to dedicated jails on FreeBSD. Docker containers are a great tool for rapid (and consistent) software deployment, but it’s not all fun and games. Containers, for example, rely on images that sometimes age and are no longer updated. This is a security issue that should not be overlooked.
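To give a flavor of the workflow (a sketch using BastilleBSD; the release, jail name, and address here are made up):

$ bastille bootstrap 13.1-RELEASE                  # fetch a base system, once
$ bastille create web01 13.1-RELEASE 10.17.89.50   # create a jail from it
$ bastille start web01
$ bastille console web01                           # a root shell inside the jail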
UFS2 is still a very good and efficient file system and, when configured to use soft updates, is capable of performing live snapshots of the file system. This is great for backups. Ext4 and XFS do not support snapshots except through external tools (like DattoBD, or snapshots through the volume manager). This works, of course, but it is not native. Btrfs is great in its intentions but still not as stable as it should be after all these years of development. FreeBSD supports ZFS natively in the base system, and this brings many advantages: separate datasets for jails, as well as Boot Environments, which let you take snapshots before upgrades/changes and boot (from the bootloader) into a different BE, etc.
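Boot Environments are driven from the base system with a handful of commands (the BE name here is made up):

$ bectl create pre-upgrade     # snapshot the current root as a new boot environment
$ bectl list                   # show available boot environments
$ bectl activate pre-upgrade   # if the upgrade goes badly, boot back into this BE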
Linux has always used excellent tools such as GRUB, LILO (now outdated), etc. FreeBSD has always used a very linear and consistent boot system, with its own bootloader and dedicated boot partition. Whether on MBR, GPT, etc., things are very similar and consistent. I’ve never had a problem getting a FreeBSD system to boot after a move or recovery from backup. On Linux, however, GRUB has sometimes given me problems, even after a simple kernel security update.
Meta has been trying for years to bring the performance of the Linux network stack up to the level of FreeBSD’s. Many will ask why, then, not move services to FreeBSD. Large companies with huge datacenters can’t change solutions overnight, and their engineers, at every level, are Linux experts. They have invested heavily in btrfs, in Linux, in their own specific solutions. Clearly, upon acquiring WhatsApp, they preferred to migrate the “few” WhatsApp servers to Linux and move them to their own datacenters. Regarding real system performance (i.e. disregarding benchmarks, which are useful only up to a point), FreeBSD shines, especially under high load. Where Linux starts to gasp (e.g. waiting for I/O) at 100% CPU, FreeBSD has a lower processor load and room for more work. In the real world (of my servers and load types), I have sometimes experienced severe system slowdowns due to high I/O, even when the data to be processed was not read/write dependent. On FreeBSD this does not happen, and if something blocks, it blocks THAT operation, not the rest of the system. When performing backups or other important operations this factor becomes extremely important for ensuring proper (and stable) system performance.
FreeBSD, in the base system, has all the tools needed to analyze possible problems and system loads. “vmstat”, in a single line, tells me if the machine is struggling for CPU, for I/O, or for RAM. “gstat -a” shows me how active the storage is, disk by disk and partition by partition, including as a percentage of its capacity. “top” also has support for figuring out, process by process, how much I/O is being used (the “m” option). On Linux, to get the same results, you have to install specific applications, which differ from distribution to distribution.
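In practice:

$ vmstat 1    # one summary line per second: CPU, memory and I/O pressure at a glance
$ gstat -a    # live per-disk, per-partition activity, with a %busy column
$ top -m io   # per-process I/O statistics instead of CPU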
For my purposes, bhyve is a great virtualization tool. KVM is definitely more complete, but since I don’t have any special or specific needs not covered by bhyve on FreeBSD, I found (on average) better performance with this combination. FreeBSD, however, lacks KSM (kernel samepage merging), which in some cases can be very useful.
Will I abandon Linux for FreeBSD? Obviously not, just as I haven’t for the last 20 years. Both have their uses, their space, their strengths. But if up to now I have run 80% Linux and 20% FreeBSD, the plan is to invert those percentages and, where possible, directly implement solutions based on FreeBSD.
NOTE: this article has been translated from its Italian original version. Even if it’s been reviewed and adapted, there might be some errors.
GitHub Actions by Example is an introduction to the service through annotated examples.
On Friday January 21, 2022 I received this email. I tweeted about it and it took off like crazy.
The email comes from a Fortune 500 multi-billion dollar company that apparently might be using a product that contains my code, or maybe they have customers who do. Who knows?
My guess is that they do this for some compliance reasons and they “forgot” that their open source components are not automatically provided by “partners” they can just demand this information from.
I answered the email very brieﬂy and said I will be happy to answer with details as soon as we have a support contract signed.
I think maybe this serves as a good example of the open source pyramid and users in the upper layers not at all thinking of how the lower layers are maintained. Building a house without a care about the ground the house stands on.
In my tweet and here in my blog post I redact the name of the company. I most probably have the right to tell you who they are, but I still prefer to not. (Especially if I manage to land a profitable business contract with them.) I suspect we can ﬁnd this level of entitlement in many companies.
The level of ignorance and incompetence shown in this single email is mind-boggling.
They don’t even specifically say which product they are using. No code I’ve ever been involved with or hold the copyright to uses log4j, and any rookie or better engineer could easily verify that.
In the picture version of the email I padded the name ﬁelds to better anonymize the sender, and in the text below I replaced them with NNNN.
Continue down for the reply.
Dear Haxx Team Partner,
You are receiving this message because NNNN uses a product you developed. We request you review and respond within 24 hours of receiving this email. If you are not the right person, please forward this message to the appropriate contact.
As you may already be aware, a newly discovered zero-day vulnerability is currently impacting Java logging library Apache Log4j globally, potentially allowing attackers to gain full control of affected servers.
The security and protection of our customers’ conﬁdential information is our top priority. As a key partner in serving our customers, we need to understand your risk and mitigation plans for this vulnerability.
Please respond to the following questions using the template provided below.
1. If you utilize a Java logging library for any of your application, what Log4j versions are running?
2. Have there been any conﬁrmed security incidents to your company?
3. If yes, what applications, products, services, and associated versions are impacted?
4. Were any NNNN product and services impacted?
5. Has NNNN non-public or personal information been affected?
6. If yes, please provide details of affected information NNNN immediately.
7. What is the timeline (MM/DD/YY) for completing remediation? List the NNNN steps, including dates for each.
8. What action is required from NNNN to complete this remediation?
In an effort to maintain the integrity of this inquiry, we request that you do not share information relating to NNNN outside of your company and to keep this request to pertinent personnel only.
Thank you in advance for your prompt attention to this inquiry and your partnership!
NNNN Information Security
The information contained in this message may be CONFIDENTIAL and is for the intended addressee only. Any unauthorized use, dissemination of the information, or copying of this message is prohibited. If you are not the intended addressee, please notify the sender immediately and delete this message.
On January 24th I received this response, from the same address and it quotes my reply so I know they got it ﬁne.
Thank you for your reply. Are you saying that we are not a customer of your organization?
/ [a ﬁrst name]
I replied again (22:29 CET on Jan 24) to this mail that identiﬁed me as “David”. Now there’s this great story about a David and some giant so I couldn’t help myself…
No, you have no established contract with me or anyone else at Haxx whom you addressed this email to, asking for a lot of information. You are not our customer, we are not your customer. Also, you didn’t detail what product it was regarding.
So, we can either establish such a relationship or you are free to search for answers to your questions yourself.
I can only presume that you got our email address and contact information into your systems because we produce a lot of open source software that are used widely.
The image version of the initial email
What you read here is my personal opinions and views. You may think differently. Organizations I’m involved with may have different stand-points and people I work with or know may think differently.
Our AWS CloudFront bill spiked to $2,457 in October 2021 from $370 in September. When we dug into the bill, we saw that egress in the EU region accounted for most of this increase, with egress in the US making up the rest.
This wasn’t an indication of some misconfiguration on our end, but rather a symptom of success. Our primary product is Fleet, an open core platform for device management built on osquery. We offer an update server for agent updates that is freely accessible to both community users and our paying customers. Getting these costs under control became a priority so that we could continue to offer free access.
Our needs for this server are pretty simple. We generate and sign static metadata ﬁles with The Update Framework, then serve those along with the binary artifacts. We don’t have any strict requirements around latency, as these are background processes being updated.
At first we looked at Cloudflare’s free tier; free egress is pretty appealing. Digging into Cloudflare’s terms, we found that they only allow free-tier caching to be used on website assets. To avoid risking a production outage by violating these terms, we got in touch with them for a quote. This came out to about a 2x savings over AWS. But we knew we needed orders-of-magnitude savings in order to expand our free offering.
Having heard of Hetzner’s low egress costs (20TB free + €1.19/TB/month), we investigated what it would take to run our own server. We stood up a Caddy ﬁle server with automatic HTTPS via Let’s Encrypt over the course of a few hours.
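The whole server comes down to a few lines of configuration (a sketch; the domain and path are hypothetical):

$ cat Caddyfile
updates.example.com {
        root * /srv/updates    # TUF metadata and binary artifacts
        file_server
}
$ caddy run --config Caddyfile   # HTTPS via Let's Encrypt is provisioned automatically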
Our December Hetzner bill came out to €36.75 ($41.63). This represents a savings of 59x over our prior AWS bill, putting us solidly in the range to continue offering the free update server. We can still double our egress with Hetzner before incurring additional charges, which would yield savings of over 118x compared to AWS. Beyond that, the additional egress costs should remain reasonable.
DIYing it does come with additional maintenance burden, but so far we’ve found this manageable. Caddy on Hetzner has proved exceptionally reliable, with well over 99% uptime in the last two months and no manual interventions required.
Can a native Mac code editor really be that much better?
If we’re being honest, Mac apps are a bit of a lost art.
There are great reasons to make cross-platform apps — to start, they’re cross-platform — but it’s just not who we are. Founded as a Mac software company in 1997, our joy at Panic comes from building things that feel truly, well, Mac-like.
Long ago, we created Coda, an all-in-one Mac web editor that broke new ground. But when we started work on Nova, we looked at where the web was today, and where we needed to be. It was time for a fresh start.
It all starts with our first-class text editor.
It’s new, hyper-fast, and ﬂexible, with all the features you want: smart autocomplete, multiple cursors, a Minimap, editor overscroll, tag pairs and brackets, and way, way more.
It’s also very expandable, with a robust API and a built-in extension browser.
But even the best text engine in the world means nothing unless you actually enjoy spending your time in the app. So, how does Nova look?
You can make Nova look exactly the way you want, while still feeling Mac-like. Bright, dark, cyberpunk, it’s all you. Plus, themes are CSS-like and easy to write. Nova can even automatically change your theme when your Mac switches from light to dark mode.
Nova doesn’t just help you code. It helps your code run.
You can easily create build and run tasks for your projects. We didn’t have them in Coda, but boy do we have them now. They’re custom scripts that can be triggered at any time by toolbar buttons or keyboard shortcuts.
Imagine building content, and with the single click of a button watching as Nova ﬁres up your local server, grabs the appropriate URL, and opens a browser for you, instantly. Just think of the time you’ll save.
Nova supports separate Build, Run, and Clean tasks. It can open a report when run. And the scripts can be written in a variety of languages.
A Nova extension can do lots of things, like add support for new languages, extend the sidebar, draw beautiful new themes and syntax colors, validate different code, and much more.
Check out some of this week’s popular extensions…
Displays tags, such as TODO and FIXME, in a sidebar.
Automatically format PHP ﬁles using php-cs-ﬁxer with HTML, Blade and Twig supp…
And we’re here to help. Nova has a whole host of settings. We have easily customizable key bindings. We have custom, quickly-switchable workspace layouts. And we have loads of editor tweaks, from matching brackets to overscroll.
Consider the following problem: Given two sequences of natural numbers, compute the sums of the square roots of the respective lists, and then return which of the sums was larger. So if your lists were [1,4] and [9,9], you’d get 1 + 2 compared to 3 + 3, and you’d say that the second was larger.
How quickly can we compute this, as a function of the length of the input encoded in binary? You might enjoy taking a second to think about this.
The best known algorithm for this problem is in PSPACE. That is, it takes exponential time but only polynomial space. That means that as far as we know, this problem is crazily hard—among other things, PSPACE is believed to be strictly harder than NP.
When I ﬁrst heard this claim, I thought it was unbelievable. After all, isn’t there an obvious algorithm that works? Namely, in Haskell:
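-- a sketch of the obvious approach: sum the square roots at Double
-- (i.e. finite) precision and compare the two totals
compareSumSqrts :: [Integer] -> [Integer] -> Ordering
compareSumSqrts xs ys = compare (sumSqrts xs) (sumSqrts ys)
  where
    sumSqrts :: [Integer] -> Double
    sumSqrts = sum . map (sqrt . fromInteger)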
And that looks to me like it takes linear time.
But it doesn’t work. The problem is that this algorithm is assuming finite precision, which isn’t necessarily enough to answer this question. Suppose that I can find some lists of numbers whose sums of square roots are equal for the first ten million decimal places and then start being different. If we run my Haskell algorithm, we’ll erroneously be told that the sums are equal when actually they’re not. So we need to do something smarter.
The correct algorithm acknowledges that square roots (and therefore sums of square roots) aren’t finite-precision numbers; they’re infinite streams of decimals. So it tells us to take our two lists and start iteratively computing more and more digits of their sums-of-square-roots until we find a place where they disagree. And then we’re done. (If the sums of square roots are equal, this algorithm of course won’t halt. There’s a known, reasonably fast algorithm (in BPP) for checking equality of sums of square roots, so we shouldn’t worry about that too much.)
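Here is roughly what that could look like (a sketch, not a canonical implementation: it compares scaled integer square roots at ever-higher precision, and it never halts when the sums are equal):

-- integer square root: floor of sqrt n, via Newton's method
isqrt :: Integer -> Integer
isqrt 0 = 0
isqrt n = go n
  where
    go x = let y = (x + n `div` x) `div` 2
           in if y >= x then x else go y

-- floor (10^p * sqrt n): sqrt n to p decimal digits
scaledSqrt :: Int -> Integer -> Integer
scaledSqrt p n = isqrt (n * 10 ^ (2 * p))

-- each scaled root undercounts by less than 1, so a sum of k roots
-- undercounts by less than k; once the two sums differ by more than that
-- slack, the ordering is provably correct
compareExactly :: [Integer] -> [Integer] -> Ordering
compareExactly xs ys = go 1
  where
    slack = fromIntegral (length xs + length ys)
    go p
      | a - b > slack = GT
      | b - a > slack = LT
      | otherwise     = go (p + 1)   -- need more digits
      where
        a = sum (map (scaledSqrt p) xs)
        b = sum (map (scaledSqrt p) ys)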
But how long will we need to look through these sequences of digits before we ﬁnd the disagreeing digit? It feels intuitively like we should be able to establish some kind of bound on this. Like, maybe we should be able to say “if you add two lists of n numbers, each of which has d digits, then they can’t disagree for more than k * n * d digits” for some k. But no-one’s been able to prove anything like this.
This comes down to “are you able to embed complicated relationships inside of sums of square roots”. Like, we’re basically asking whether you can construct lists with absurdly close sums of square roots. This feels to me like a pretty deep question about square roots. There are other domains where this kind of problem is obviously hard, and others where it’s obviously easy. For example, suppose you want to know whether two programs encode the same bit string, and all you can do is run them a step at a time and see what they output. It’s really easy for me to construct short programs that take an extremely long time before they disagree: for example “always print 0” and “output Graham’s number of zeros, then ones”. On the other hand, comparing sums of fractions is pretty easy, because division is nice and well behaved. So the question is how complicated square roots are.
My guess is that this problem is actually in P, and we’re just stuck on proving it because irrational numbers are confusing and hard to prove things about, and most of the people who would be good at working on this are doing something more useful instead.
But if it turns out that I’m wrong, and that sums of square roots can get very close indeed, I’m going to update towards thinking that integers and square roots are much scarier, richer objects than I’d thought. I’ve updated to being more scared of real numbers than I used to be—they have all these sketchy properties like “almost none of them have finite descriptions”. Real numbers, and sets, and logical statements, have all started feeling to me like Cthulhuesque monstrosities whose appearances are only tolerable because we only look at the pretty parts of them and don’t let ourselves look at the horrors that lurk below.
Incidentally, comparing sums of square roots is actually kind of a common subtask in e.g. computational geometry, so a bunch of their problems (including “shortest path through a graph in Euclidean space”!) are as-far-as-we-know extremely hard.
Like all great mind-blowing facts, this one sounded initially preposterous and then after a few minutes seemed completely obvious. I love it.
To read more about this, see here.
EDIT: I think that Edward Kmett and Paul Crowley might have ﬁgured out how to solve this problem in the comments on my Facebook post; see here. I’ll investigate further and update.
EDIT 2: actually we didn’t solve the problem, but it might still be a good direction for future research.