10 interesting stories served every morning and every evening.
10 interesting stories served every morning and every evening.
Writing this makes me irrationally sad, but Ghostty will be leaving GitHub1.
I’m GitHub user 1299, joined Feb 2008.
Since then, I’ve opened GitHub every single day. Every day, multiple times per
day, for over 18 years. Over half my life. A handful of exceptions in there
(I’d love to see the data), but I can’t imagine more than a week per year.
GitHub is the place that has made me the most happy. I always made time for
it. When I went through tough breakups? I lost myself in open source… on
GitHub. During college at 4 AM when everyone is passed out? Let me get one
commit in. During my honeymoon while my wife is still asleep? Yeah, GitHub.
It’s where I’ve historically been happiest and wanted to be.
Even the annoying stuff! Some people doom scroll social media. I’ve been doom
scrolling GitHub issues since before that was a word. On vacations I’d have
bookmarks of different projects on GitHub I wanted to study. Not just source
code, but OSS processes, how other maintainers react to difficult situations.
Etc. Believe it or not, I like this.
Some might call this sick, but my hobby and work and passion all align and for
most of my life they got to also live in one place on the internet: GitHub.
Did you know I started Vagrant (my first successful open source project) in
large part because I hoped it would get me a job at GitHub? It’s no secret,
I’ve said this repeatedly, and in my first public talk about Vagrant, when I
was a mere 20 years old, I joked “maybe GitHub will hire me if it’s good!”
GitHub was my dream job. I didn’t ever get to work there (not their fault).
But it was the perfect place I wanted to be. The engineers were incredible,
the product was incredible, and it was something I lived and breathed every
day. I still do and consistently have… for these 18 years. Enough time for
an entire human to become an adult, all on GitHub.
Lately, I’ve been very publicly critical of GitHub. I’ve been mean about it.
I’ve been angry about it. I’ve hurt people’s feelings. I’ve been lashing out.
Because GitHub is failing me, every single day, and it is personal. It is
irrationally personal. I love GitHub more than a person should love a thing,
and I’m mad at it. I’m sorry about the hurt feelings to the people working on
it.
I’ve felt this way for a long time, but for the past month I’ve kept a journal
where I put an “X” next to every date where a GitHub outage has negatively
impacted my ability to work2. Almost every day has an X. On the day I am
writing this post, I’ve been unable to do any PR review for ~2 hours because
there is a GitHub Actions outage3. This is no longer a place for serious
work if it just blocks you out for hours per day, every day.
It’s not a fun place for me to be anymore. I want to be there but it doesn’t
want me to be there. I want to get work done and it doesn’t want me to get
work done. I want to ship software and it doesn’t want me to ship software.
I want it to be better, but I also want to code. And I can’t code with GitHub
anymore. I’m sorry. After 18 years, I’ve got to go. I’d love to come back one
day, but this will have to be predicated on real results and improvements,
not words and promises.
I’ll share more details about where the Ghostty project will be moving to in
the coming months. We have a plan but I’m also very much still in discussions
with multiple providers (both commercial and FOSS).
It’ll take us time to remove all of our dependencies on GitHub and we have a
plan in place to do it as incrementally as possible. We plan on keeping a
read-only mirror available on GitHub at the current URL.
My personal projects and other work will remain on GitHub for now.
Ghostty is where I, our maintainers, and our open source community are
most impacted so that is the focus of this change. We’ll see where it
goes after that.
Footnotes
The timing of this is coincidental with the large outage on April 27, 2026.
We’ve been discussing and putting together a plan to leave GitHub
for months, and this blog post was written over a week ago. We only
made the final decision this week. ↩
The timing of this is coincidental with the large outage on April 27, 2026.
We’ve been discussing and putting together a plan to leave GitHub
for months, and this blog post was written over a week ago. We only
made the final decision this week. ↩
To the “Git is distributed!” crowd: the issue isn’t Git, it’s the
infrastructure we rely on around it: issues, PRs, Actions, etc. ↩
To the “Git is distributed!” crowd: the issue isn’t Git, it’s the
infrastructure we rely on around it: issues, PRs, Actions, etc. ↩
This is not the large Elasticsearch outage they had on April 27, 2026.
This blog post was written a week before that, so this was a different
outage. ↩
This is not the large Elasticsearch outage they had on April 27, 2026.
This blog post was written a week before that, so this was a different
outage. ↩
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world’s top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
🤗 Open Weights: https://huggingface.co/collections/deepseek-ai/deepseek-v4
DeepSeek-V4-Pro
🔹 Enhanced Agentic Capabilities: Open-source SOTA in Agentic Coding benchmarks.
🔹 Rich World Knowledge: Leads all current open models, trailing only Gemini-3.1-Pro.
🔹 World-Class Reasoning: Beats all current open models in Math/STEM/Coding, rivaling top closed-source models.
DeepSeek-V4-Flash
🔹 Reasoning capabilities closely approach V4-Pro.
🔹 Performs on par with V4-Pro on simple Agent tasks.
🔹 Smaller parameter size, faster response times, and highly cost-effective API pricing.
Structural Innovation & Ultra-High Context Efficiency
🔹 Novel Attention: Token-wise compression + DSA (DeepSeek Sparse Attention).
🔹 Peak Efficiency: World-leading long context with drastically reduced compute & memory costs.
🔹 1M Standard: 1M context is now the default across all official DeepSeek services.
Dedicated Optimizations for Agent Capabilities
🔹 DeepSeek-V4 is seamlessly integrated with leading AI agents like Claude Code, OpenClaw & OpenCode.
🔹 Already driving our in-house agentic coding at DeepSeek.
The figure below showcases a sample PDF generated by DeepSeek-V4-Pro.
API is Available Today!
🔹 Keep base_url, just update model to deepseek-v4-pro or deepseek-v4-flash.
🔹 Supports OpenAI ChatCompletions & Anthropic APIs.
🔹 Both models support 1M context & dual modes (Thinking / Non-Thinking): https://api-docs.deepseek.com/guides/thinking_mode
⚠️ Note: deepseek-chat & deepseek-reasoner will be fully retired and inaccessible after Jul 24th, 2026, 15:59 (UTC Time). (Currently routing to deepseek-v4-flash non-thinking/thinking).
🔹 Amid recent attention, a quick reminder: please rely only on our official accounts for DeepSeek news. Statements from other channels do not reflect our views.
🔹 Thank you for your continued trust. We remain committed to longtermism, advancing steadily toward our ultimate goal of AGI.
The DeepSeek API uses an API format compatible with OpenAI/Anthropic. By modifying the configuration, you can use the OpenAI/Anthropic SDK or softwares compatible with the OpenAI/Anthropic API to access the DeepSeek API.
* The model names deepseek-chat and deepseek-reasoner will be deprecated on 2026/07/24. For compatibility, they correspond to the non-thinking mode and thinking mode of deepseek-v4-flash, respectively.
Invoke The Chat API
Once you have obtained an API key, you can access the DeepSeek model using the following example scripts in the OpenAI API format. This is a non-stream example, you can set the stream parameter to true to get stream response.
For examples using the Anthropic API format, please refer to Anthropic API.
curl
python
nodejs
curl https://api.deepseek.com/chat/completions \ -H “Content-Type: application/json” \ -H “Authorization: Bearer ${DEEPSEEK_API_KEY}” \ -d ‘{ “model”: “deepseek-v4-pro”, “messages”: [ {“role”: “system”, “content”: “You are a helpful assistant.“}, {“role”: “user”, “content”: “Hello!“} ], “thinking”: {“type”: “enabled”}, “reasoning_effort”: “high”, “stream”: false }’
Your phone is about to stop being yours.
125 days until lockdown
Starting September 2026, a silent update, nonconsensually pushed by Google, will block every Android app whose developer hasn’t registered with Google, signed their contract, paid up, and handed over government ID.
Every app and every device, worldwide, with no opt-out.
Post on X Post on Mastodon Post on Bluesky LinkedIn Facebook
What Google is doing
In August 2025, Google announced a new requirement: starting September 2026, every Android app developer must register centrally with Google before their software can be installed on any device. Not just Play Store apps: all apps. This includes apps shared between friends, distributed through F-Droid, built by hobbyists for personal use. Independent developers, church and community groups, and hobbyists alike will all be frozen out of being able to develop and distribute their software.
Registration requires:
Paying a fee to Google
Agreeing to Google’s Terms and Conditions
Surrendering your government-issued identification
Providing evidence of your private signing key
Listing all current and all future application identifiers
If a developer does not comply, their apps get silently blocked on every Android device worldwide.
Who this hurts
You
You bought an Android phone because Google told you it was open. You could install what you wanted, and that was the deal.
Google is now rewriting that deal, retroactively, on hardware you already own. After the update lands, you can only run software that Google has pre-approved. On your phone: your property, that you paid for.
Independent developers
A teenager’s first app, a volunteer’s privacy tool, or a company’s confidential internal beta. It doesn’t matter. After September 2026, none of these can be installed without Google’s blessing.
F-Droid, home to thousands of free and open-source Android apps, has called this an “existential” threat. Cory Doctorow calls it “Darth Android”.
Governments & civil society
Google has a documented track record of complying when authoritarian regimes demand app removals. With this program, the software that runs your country’s institutions will exist at the pleasure of a single unaccountable foreign corporation.
The EFF calls app gatekeeping “an ever-expanding pathway to internet censorship.”
Google’s “escape hatch” is a trap door
Google says “power users” can “still install” unverified apps. Here’s what that actually looks like:
Delve into System Settings, find Developer Options
Tap the build number seven times to enable Developer Mode
Dismiss scare screens about coercion
Enter your PIN
Restart the device
Wait 24 hours
Come back, dismiss more scare screens
Pick “allow temporarily” (7 days) or “allow indefinitely”
Confirm, again, that you understand “the risks”
Nine steps. A mandatory 24-hour cooling-off period. For installing software on a device you own.
Worse: this flow runs entirely through Google Play Services, not the Android OS. Google can change it, tighten it, or kill it at any time, with no OS update required and no consent needed. And as of today, it hasn’t shipped in any beta, preview, or canary build. It exists only as a blog post and some mockups.
This is bigger than Android
If Google can retroactively lock down billions of devices that were sold as open platforms, every hardware manufacturer on the planet is watching.
The principle being established: the company that made your device gets to decide, after you’ve bought it, what software you’re allowed to run. In software, this is called a “rug pull”; but at least you could always install competing software. In hardware, it is a fait accompli that strips you of your agency and renders you powerless to the whims of a single unaccountable gatekeeper and convicted monopolist.
Android’s openness was never just a feature. It was the promise that distinguished it from iPhone. Millions chose Android for exactly that reason. Google is now revoking that promise unilaterally, on devices already in people’s pockets, because they’ve decided they have enough market dominance and regulatory capture to get away with it.
Ars Technica: “Google’s Apple envy threatens to dismantle Android’s open legacy.”
But wait, isn’t this…
″…just about security?”
The security rationale is a smokescreen. Google Play Protect already scans for malware independent of developer identity. Requiring a government ID doesn’t make code safer. It makes developers identifiable and controllable. Malware authors can register. Indie developers and dissidents often can’t. The EFF is blunt: identity-based gatekeeping is a censorship tool, not a security one.
″…still sideloading if you use the advanced flow?”
Nine steps, 24-hour wait, buried in Developer Options, delivered through a proprietary service that Google can revoke whenever they want. That’s not sideloading. That’s a deterrence mechanism built to ensure almost nobody completes it. And since it runs through Play Services rather than the OS, Google can tighten or kill it silently.
″…only a problem if you have something to hide?”
Whistleblowers, journalists, and activists under authoritarian governments will be the first victims. People in domestic abuse situations are next. All these groups have legitimate reasons to distribute or use software without putting their legal identity in a Google database. Anonymous open-source contribution is a tradition older than Google itself. This policy ends it on Android.
″…the same thing Apple does?”
Apple has been a walled garden from day one. People chose Android because it was different. “Apple does it too” is a race to the bottom and a weak tu quoque argument. And under regulatory pressure (the EU’s Digital Markets Act), even Apple is being forced to open up. Google is moving in the opposite direction: attempting to further entrench its gatekeeping status.
″…just $25 and some paperwork?”
Maybe, if you’re a developer in the US with a credit card and a driver’s license. Try being a student in sub-Saharan Africa, or a dissident in Myanmar, or a volunteer maintaining a community health app. The cost isn’t only financial: you’re surrendering government ID and evidence of your signing keys to a company that routinely complies with government demands to remove apps and expose developers.
Fight back
Everyone
Install F-Droid on every Android device you own. Alternative stores only survive if people actually use them.
Contact your regulators. Regulators worldwide are genuinely concerned about monopolies and the centralization of power in the tech sector, and want to hear directly from individuals who are affected and concerned.
Share this page. Link to keepandroidopen.org everywhere.
Push back on astroturfers. The “well, actually…” crowd is out in force. Don’t let them set the narrative.
Sign the change.org petition and join the over 100,000 signatories who have made their voices heard.
Read and share our open letter
Tell Google what you think of this through their own developer verification survey (for all the good that will do).
Developers
Do not sign up. Don’t join the program by signing up for the Android Developer Console and agreeing to their irrevocable Terms and Conditions. Don’t verify your identity. Don’t play ball.
Google’s plan only works if developers comply. Don’t.
Talk other developers and organizations out of signing up.
Add the FreeDroidWarn library to your apps to warn users.
Run a website? Add the countdown banner.
Google employees
If you know something about the program’s technical implementation or internal rationale, contact tips@keepandroidopen.org from a non-work machine and a non-Gmail account. Strict confidence guaranteed.
All those opposed…
69 organizations from 21 countries have signed the open letter
Read the full open letter and thank the signatories →
What they’re saying
Tech press
“Google plans to block side-loading like Apple, declaring war on Android freedom” Tuta Blog
“Google plans to block side-loading like Apple, declaring war on Android freedom”
“Google’s developer registration ‘decree’ means the end for alternative app stores” Cybernews
“Google’s developer registration ‘decree’ means the end for alternative app stores”
“Open-Source Android Apps at Risk Under Google’s New Decree” TechRepublic
“Open-Source Android Apps at Risk Under Google’s New Decree”
“F-Droid Slams Google for Misleading Users About Android’s App Verification” Android Headlines
“F-Droid Slams Google for Misleading Users About Android’s App Verification”
“Android, Epic, and What’s Really Behind Google’s ‘Existential’ Threat to F-Droid” Slashdot
“Android, Epic, and What’s Really Behind Google’s ‘Existential’ Threat to F-Droid”
“Google is restricting one of Android’s most important features, and users are outraged” SlashGear
“Google is restricting one of Android’s most important features, and users are outraged”
“Resistance to Google’s Android verification grows among developers” Techzine EU
“Resistance to Google’s Android verification grows among developers”
“Google says it’s making Android sideloading ‘high-friction’ to better warn users about potential risks” XDA Developers
“Google says it’s making Android sideloading ‘high-friction’ to better warn users about potential risks”
“Google’s dev registration plan ‘will end the F-Droid project’” The Register
“Google’s dev registration plan ‘will end the F-Droid project’”
“Keep Android Open — Abwehr gegen Verbot anonymer Apps von Google” heise online
“Keep Android Open — Abwehr gegen Verbot anonymer Apps von Google”
“Google will make you wait 24 hours to sideload Android apps” How-To Geek
“Google will make you wait 24 hours to sideload Android apps”
“Sideloading is dead for all intents and purposes. The Android you know and love is slowly disappearing.” Android Police
“Sideloading is dead for all intents and purposes. The Android you know and love is slowly disappearing.”
“Sideloading on Android? Soon It’ll Be Like a TSA Check for Apps” Android Headlines
In 2023, Raytheon’s president stood at the Paris Air Show and described what it took to restart Stinger missile production. They brought back engineers in their 70s to teach younger workers how to build a missile from paper schematics drawn during the Carter administration. Test equipment had been sitting in warehouses for years. The nose cone still had to be attached by hand, exactly as it was forty years ago.
The Pentagon hadn’t bought a new Stinger in twenty years. Then Russia invaded Ukraine, and suddenly everyone needed them. The production line was shut down. The electronics were obsolete. The seeker component was out of production. An order placed in May 2022 wouldn’t deliver until 2026. Four years. Not because of money. Because the people who knew how to build them retired a decade earlier and nobody replaced them.
I run engineering teams in Ukraine. My people lived the other side of this equation. Not the factory floor. The receiving end. While Raytheon was struggling to restart production from forty-year-old blueprints, the US was shipping thousands of Stingers to Ukraine. RTX CEO Greg Hayes: ten months of war burned through thirteen years’ worth of Stinger production. I’ve seen this pattern before. It’s happening in my industry right now.
In March 2023, the EU promised Ukraine one million artillery shells within twelve months. European production capacity sat at 230,000 shells per year. Ukraine was consuming 5,000 to 7,000 rounds per day. Anyone with a calculator could see this wouldn’t work.
By the deadline, Europe delivered about half. Macron called the original promise reckless. An investigation by eleven media outlets across nine countries found actual production capacity was roughly one-third of official EU claims. The million-shell mark wasn’t hit until December 2024, nine months late.
It wasn’t one bottleneck. It was all of them. France had halted domestic propellant production in 2007. Seventeen years of nothing. Europe’s single major TNT producer was in Poland. Germany had two days of ammunition stored. A Nammo plant in Denmark was shut down in 2020 and had to be restarted from scratch. The entire continent’s defense industry had been optimized for making small batches of expensive custom products. Nobody planned for volume. Nobody planned for crisis.
The U.S. wasn’t much better. One plant in Scranton, one facility in Iowa for explosive fill, no domestic TNT production since 1986. Billions of investment later, production still hadn’t hit half the target.
This wasn’t an accident. In 1993, the Pentagon told defense CEOs to consolidate or die. Fifty-one major defense contractors collapsed into five. Tactical missile suppliers went from thirteen to three. Shipbuilders from eight to two. The workforce fell from 3.2 million to 1.1 million. A 65% cut.
The ammunition supply chain had single points of failure everywhere. One manufacturer for 155mm shell casings, sitting in Coachella, California, on the San Andreas Fault. One facility in Canada for propellant charges. Optimized for minimum cost with zero margin for surge. On paper, efficient. In practice, one bad day away from collapse.
Then there’s Fogbank. A classified material used in nuclear warheads. Produced from 1975 to 1989, then the facility was shut down. When the government needed to reproduce it for a warhead life extension program, they discovered they couldn’t. A GAO report found that almost all staff with production expertise had retired, died, or left the agency. Few records existed.
After $69 million in cost overruns and years of failed attempts, they finally produced viable Fogbank. Then discovered the new batch was too pure. The original process had relied on an unintentional impurity that was critical to the material’s function. Nobody knew. Not the engineers trying to reproduce it. Not even the original workers who made it decades earlier. Los Alamos called it an unknowing dependency in the original process.
A nuclear weapons program lost the ability to make a material it invented. The knowledge didn’t just leave with people. It was never fully understood by anyone.
(Correction: the original version stated that the workers who made Fogbank knew about the impurity. They didn’t. The dependency was unwitting, which makes the knowledge-loss argument stronger, not weaker. Thanks to John F. in the comments for catching this.)
I read the Fogbank story and recognized it immediately. Not the nuclear material. The pattern. Build capability over decades. Find a cheaper substitute. Let the human pipeline atrophy. Enjoy the savings. Then watch it all collapse when a crisis demands what you optimized away.
In defense, the substitute was the peace dividend. In software, it’s AI.
I wrote about the talent pipeline collapse before. The hiring numbers and the junior-to-senior problem are documented. So is the comprehension crisis. What I didn’t have was the right historical parallel. Now I do.
And it tells you something the hiring data doesn’t: how long rebuilding actually takes.
Every major defense production ramp-up took three to five years for simple systems. Five to ten for complex ones. Stinger: thirty months minimum from order to delivery. Javelin: four and a half years to less than double production. 155mm shells: four years and still not at target despite five billion dollars invested. France only restarted propellant production in 2024, seventeen years after shutting it down.
Money was never the constraint. Knowledge was. RAND found that 10% of technical skills for submarine design need ten years of on-the-job experience to develop, sometimes following a PhD. Apprenticeships in defense trades take two to four years, with five to eight years to reach supervisory competence.
Now map that onto software. A junior developer needs three to five years to become a competent mid-level engineer. Five to eight years to become senior. Ten or more to become a principal or architect. That timeline can’t be compressed by throwing money at it. It can’t be compressed by AI either.
A METR randomized controlled trial found that experienced developers using AI coding tools actually took 19% longer on real-world open source tasks. Before starting, they predicted AI would make them 24% faster. The gap between prediction and reality was 43 percentage points. When researchers tried to run a follow-up, a significant share of developers refused to participate if it meant working without AI. They couldn’t imagine going back.
The software industry is in year three of the same optimization. Salesforce said it won’t hire more software engineers in 2025. A LeadDev survey found 54% of engineering leaders believe AI copilots will reduce junior hiring long-term. A CRA survey of university computing departments found 62% reported declining enrollment this year.
I see it in code review. Review is now the bottleneck. AI generates code fast. Humans review it slow. The industry’s answer is predictable: let AI review AI’s code. I’m not doing that. I’ve reworked our pull request templates instead. Every PR now has to explain what changed, why, what type of change it is, screenshots of before and after. Structured context so the reviewer isn’t guessing. I’m adding dedicated reviewers per project. More eyes, more chances to catch what the model missed.
But even that doesn’t solve the deeper problem. The skills you need to be effective now are different. Technical expertise alone isn’t enough anymore. You need people who can take ownership, communicate tradeoffs, push back on bad suggestions from a machine that sounds very confident. Leadership qualities. Our last hiring round tells you how rare that is: 2,253 candidates, 2,069 disqualified, 4 hired. A 0.18% conversion rate. The combination of technical skill and the judgment to know when the AI is wrong barely exists in the market anymore.
We document everything. Site Books, SDDs, RVS reports, boilerplate modules with full coverage. It works today, because the people reading those docs have the engineering expertise to act on them. What happens when they don’t? Honestly, I don’t know. Maybe AI in five years is good enough that it won’t matter. Maybe the problem stays manageable. I can’t predict the capabilities of models in 2031.
But crises don’t send calendar invites. Nobody expected a full-scale land war in Europe in 2022. The defense industry had thirty years to prepare and didn’t. Even Fogbank had records. There weren’t enough. The original workers didn’t fully understand their own process.
Five to ten years from now, we’ll need senior engineers. People who understand systems end to end, who can debug distributed failures at 2 AM, who carry institutional knowledge that exists nowhere in the codebase. Those engineers don’t exist yet because we’re not creating them. The juniors who should be learning right now are either not being hired or developing what a DoD-funded workforce study calls “AI-mediated competence.” They can prompt an AI. They can’t tell you what the AI got wrong.
It’s Fogbank for code. When juniors skip debugging and skip the formative mistakes, they don’t build the tacit expertise. And when my generation of engineers retires, that knowledge doesn’t transfer to the AI.
It just disappears.
The West already made this mistake once. The bill came due in Ukraine.
I know how this sounds. I know I’ve written about the talent pipeline before. The defense example isn’t about repeating the argument. It’s about showing what happens if the industry’s expectations don’t work out. Stinger, Javelin, Fogbank, a million shells nobody could make. That’s the cost of betting wrong on optimization. We’re making the same bet with software engineering right now.
Maybe AI gets good enough, and the bet pays off. Maybe it doesn’t. The defense industry thought peace would last forever, too.
No posts
I am building a cloud
2026 – 04-22
Today is fundraising announcement day. As is the nature of writing for a larger audience, it is a formal, safe announcement. As it should be. Writing must necessarily become impersonal at scale. But I would like to write something personal about why I am doing this. What is the goal of building exe.dev? I am already the co-founder of one startup that is doing very well, selling a product I love as much as when I first helped design and build it.
What could possess me to go through all the pain of starting another company? Some fellow founders have looked at me with incredulity and shock that I would throw myself back into the frying pan. (Worse yet, experience tells me that most of the pain is still in my future.) It has been a genuinely hard question to answer because I start searching for a “big” reason, a principle or a social need, a reason or motivation beyond challenge. But I believe the truth is far simpler, and to some I am sure almost equally incredulous.
I like computers.
In some tech circles, that is an unusual statement. (“In this house, we curse computers!”) I get it, computers can be really frustrating. But I like computers. I always have. It is really fun getting computers to do things. Painful, sure, but the results are worth it. Small microcontrollers are fun, desktops are fun, phones are fun, and servers are fun, whether racked in your basement or in a data center across the world. I like them all.
So it is no small thing for me when I admit: I do not like the cloud today.
I want to. Computers are great, whether it is a BSD installed directly on a PC or a Linux VM. I can enjoy Windows, BeOS, Novell NetWare, I even installed OS/2 Warp back in the day and had a great time with it. Linux is particularly powerful today and a source of endless potential. And for all the pages of products, the cloud is just Linux VMs. Better, they are API driven Linux VMs. I should be in heaven.
But every cloud product I try is wrong. Some are better than others, but I am constantly constrained by the choices cloud vendors make in ways that make it hard to get computers to do the things I want them to do.
These issues go beyond UX or bad API design. Some of the fundamental building blocks of today’s clouds are the wrong shape. VMs are the wrong shape because they are tied to CPU/memory resources. I want to buy some CPUs, memory, and disk, and then run VMs on it. A Linux VM is a process running in another Linux’s cgroup, I should be able to run as many as I like on the computer I have. The only way to do that easily on today’s clouds is to take isolation into my own hands, with gVisor or nested virtualization on a single cloud VM, paying the nesting performance penalty, and then I am left with the job of running and managing, at a minimum, a reverse proxy onto my VMs. All because the cloud abstraction is the wrong shape.
Clouds have tried to solve this with “PaaS” systems. Abstractions that are inherently less powerful than a computer, bespoke to a particular provider. Learn a new way to write software for each compute vendor, only to find half way into your project that something that is easy on a normal computer is nearly impossible because of some obscure limit of the platform system buried so deep you cannot find it until you are deeply committed to a project. Time and again I have said “this is the one” only to be betrayed by some half-assed, half-implemented, or half-thought-through abstraction. No thank you.
Consider disk. Cloud providers want you to use remote block devices (or something even more limited and slow, like S3). When remote block devices were introduced they made sense, because computers used hard drives. Remote does not hurt sequential read/write performance, if the buffering implementation is good. Random seeks on a hard drive take 10ms, so 1ms RTT for the Ethernet connection to remote storage is a fine price to pay. It is a good product for hard drives and makes the cloud vendor’s life a lot easier because it removes an entire dimension from their standard instance types.
But then we all switched to SSD. Seek time went from 10 milliseconds to 20 microseconds. Heroic efforts have cut the network RTT a bit for really good remote block systems, but the IOPS overhead of remote systems went from 10% with hard drives to more than 10x with SSDs.
It is a lot of work to configure an EC2 VM to have 200k IOPS, and you will pay $10k/month for the privilege. My MacBook has 500k IOPS. Why are we hobbling our cloud infrastructure with slow disk?
Finally networking. Hyperscalers have great networks. They charge you the earth for them and make it miserable to do deals with other vendors. The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center. At moderate volume the multiplier is even worse. Sure, if you spend $XXm/month with a cloud the prices get much better, but most of my projects want to spend $XX/month, without the little m. The fundamental technology here is fine, but this is where limits are placed on you to make sure whatever you build cannot be affordable.
Finally, clouds have painful APIs. This is where projects like K8S come in, papering over the pain so engineers suffer a bit less from using the cloud. But VMs are hard with Kubernetes because the cloud makes you do it all yourself with lumpy nested virtualization. Disk is hard because back when they were designing K8S Google didn’t really even do usable remote block devices, and even if you can find a common pattern among clouds today to paper over, it will be slow. Networking is hard because if it were easy you would private link in a few systems from a neighboring open DC and drop a zero from your cloud spend. It is tempting to dismiss Kubernetes as a scam, artificial make work designed to avoid doing real product work, but the truth is worse: it is a product attempting to solve an impossible problem: make clouds portable and usable. It cannot be done.
You cannot solve the fundamental problems with cloud abstractions by building new abstractions on top. Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.
We have been muddying along with these miserable clouds for 15 years now. We make do, in the way we do with all the unpleasant parts of our software stack, holding our nose whenever we have to deal with and trying to minimize how often that happens.
This however, is the moment to fix it.
This is the moment because something has changed: we have agents now. (Indeed my co-founder Josh and I started tinkering because we wanted to use LLMs in programming. It turns out what needs building for LLMs are better traditional abstractions.) Agents, by making it easiest to write code, means there will be a lot more software. Economists would call this an instance of Jevons paradox. Each of us will write more programs, for fun and for work. We need private places to run them, easy sharing with friends and colleagues, minimal overhead.
With more total software in our lives the cloud, which was an annoying pain, becomes a much bigger pain. We need a lot more compute, we need it to be easier to manage. Agents help to some degree. If you trust them with your credentials they will do a great job driving the AWS API for you (though occasionally it will delete your production DB). But agents struggle with the fundamental limits of the abstractions as much as we do. You need more tokens than you should and you get a worse result than you should. Every percent of context window the agent spends thinking about how to contort classic clouds into working is context window is not using to solve your problem.
So we are going to fix it. What we have launched on exe.dev today addresses the VM resource isolation problem: instead of provisioning individual VMs, you get CPU and memory and run the VMs you want. We took care of a TLS proxy and an authentication proxy, because I do not actually want my fresh VMs dumped directly on the internet. Your disk is local NVMe with blocks replicated off machine asynchronously. We have regions around the world for your machines, because you want your machines close. Your machines are behind an anycast network to give all your global users a low latency entrypoint to your product (and so we can build some new exciting things soon).
There is a lot more to build here, from obvious things like static IPs to UX challenges like how to give you access to our automatic historical disk snapshots. Those will get built. And at the same time we are going right back to the beginning, racking computers in data centers, thinking through every layer of the software stack, exploring all the options for how we wire up networks.
So, I am building a cloud. One I actually want to use. I hope it is useful to you.
Please make sure your browser supports JavaScript and cookies and that you are not blocking them from loading. For more information you can review our Terms of Service and Cookie Policy.
Prefix Note
As this post is gaining more attention than expected on HN, I want to make some things clear - although I thought they are obvious.
It sounds like a rant about Claude’s quality in general - if you don’t read to the end. But my concerns are more focused on the support performance and token issues - fully aware of the challenges a company this size faces and assuming the people at Anthropic are working hard to make things better. I am pointing at some “bad design decisions” they probably made. The “quality issues” are just the cherry on top of the cake.
After all, Claude Code is delivering and I use it to build stuff. Still, I experienced a degradation in quality. It just takes longer. Which is a relative observation. However I know that this is highly subjective, too. That’s what that comment in a paragraph means: “The failure usually appears in front of the screen”. Of course the agent is only as good as the operator and their instructions.
I am coding for a couple of decades now, I like to get my hands dirty. Three years ago I implemented AI into my workflow. It started with code completion and now I am at the point where I barely write code. For me “software engineeering” is not constituted by the simple act of writing code. It’s about conducting tools, being creative, understanding the problem and delivering a solution. Still - and that what this comment in another paragraph means: “While I was browsing the model’s thinking log - which I strongly suggest doing not only occasionally” - I am escorting the agent while it’s working. I still have to figure out a concept, think about a data model and verify it’s implementation. That’s what software engineering is about. It’s not n LOC…
Having said that - enjoy this post and stay happy, fellow developers.
First enthusiasm
A couple of weeks ago I subscribed to Claude Code, and during the first few weeks I had a really nice experience. It was fast, the token allowance was fair, and the quality was good.
I learned they had
raised the token allowance for non-rush hours
, and since they opposed some governmental rules, it felt good to support the right cause.
(づ  ̄ ³ ̄)づ
However… for about three weeks now my initial enthusiasm has been rapidly waning.
It began with an issue three weeks ago. I started working in the morning after about a ten-hour break; enough time for my tokens to refresh.
I sent two small questions to Claude Haiku. They were simple questions, not even related to the repository.
Suddenly, token usage spiked to 100%.
Have a nice break…
I contacted their “AI support bot”, which returned some default support nonsense and didn’t really understand the problem. So I asked for human support. A couple of days later a - what appeared to be - human support person sent a reply. It began like this:
“Our systems are detecting your inquiry is regarding usage limits on your Pro or Max plan.”
Yeah, well — it’s the Pro plan. Seems like your systems weren’t actually queried; it was just a default intro and probably a default answer, because:
This was followed by an extensive what seems to be copy-and-paste answer from their docs explaining how daily and weekly limits work.
And it closed with the typically frustrating line, that no customer likes to read at the end of an e-mail and which is just the classical middle-finger of customer support - we don’t care if your problem is solved or not, we declared it closed.
“Note that further replies to this ticket may not be monitored. If your request is not regarding usage limits on your Pro or Max plan, or you need additional support, please visit our help page at”
Great! Sending an automated e-mail that does not refer to the actual problem and then closing the channel. Thanks for nothing, I guess? Or was I wrong. I asked Claude Haiku:
@Haiku:
See the customer’s request here and the response from the AI and later W***** - did they answer the concern/question of the customer?
See the customer’s request here and the response from the AI and later W***** - did they answer the concern/question of the customer?
(╯°_°)╯︵ ┻━┻
Declining quality
In the following days and weeks, the quality was far from satisfying my needs or matching my initial experience. While I used to be able to work on up to three projects at once, now the token limit was exhausted after two hours on a single project.
And the quality was degrading. I am fully aware this is quite subjective and that the quality of the agent is always heavily impacted by the operator. The failure usually appears in front of the screen. But hey, I also develop using Github’s Copilot, OpenAI’s Codex and I am running my own inference with OMLX and Continue using Qwen3.5 – 9B. I’m not the expert, I’m lazy sometimes but I probably know a thing or two.
Let me give you this wonderful example: yesterday I asked Claude Opus to refactor a project.
While I was browsing the model’s thinking log - which I strongly suggest doing not only occasionally - I found this:
Rather than editing every slider in JSX, I’ll add a generic initializer in ui-events.js that auto-injects value displays for all range inputs that lack one.
Rather than editing every slider in JSX, I’ll add a generic initializer in ui-events.js that auto-injects value displays for all range inputs that lack one.
This is clearly bad practice. It’s a cheap workaround you wouldn’t expect even from a junior dev; it reads like someone who just doesn’t want to deliver a good result. My response:
“you can’t be serious — is this how you fix things? just WORKAROUNDS????”
At least Opus admitted:
“You’re right, that was lazy. Let me do it properly — add the labels directly in the JSX and wire them explicitly.”
Needless to say, this shortcut cost me around 50% of my five-hour token allowance.
(ง •̀_•́)ง
And even more…
Now this cache topic comes up
-
among others
. at least they are talking about it openly. The problem was: when you get back to work after some time, your conversation cache is gone and the model starts reading your codebase again. Cost-wise this is smart. But experience-wise? It means you paid tokens for the initial load and, after a forced break because the five-hour token window hit its limit, you pay again for the same load.
Think that’s all? Wait, I also got this funny anecdote: all of a sudden the weekly window changed from today to Monday. OK, I was thankful because it came with a reset to zero. But still: what is going on, Anthropic? Not only that — while I was working on my project, watching token usage with Argus-eyed vigilance, this little warning popped up:
Wait, what? I’m neither part of an organization nor do I see any hint why I suddenly have to worry about a “monthly usage limit” — also the hourly and weekly limits were still not exceeded. What is happening right now?
Turns out — two hours later - it allowed me to continue working. The warning was gone.
At least
this documentation
does not mention a monthly usage limit. And the settings page only lists the limits for the current session and week.
So… what is this monthly limit all about, Anthropic?
Sorry to let you down, Anthropic
I am a huge fan of the product. Theoretically everything just works like a charm; it offers so many opportunities. I built my
To add this web app to your iOS home screen tap the share button and select "Add to the Home Screen".
10HN is also available as an iOS App
If you visit 10HN only rarely, check out the the best articles from the past week.
If you like 10HN please leave feedback and share
Visit pancik.com for more.