10 interesting stories served every morning and every evening.
I am building a cloud
2026-04-22
Today is fundraising announcement day. As is the nature of writing for a larger audience, it is a formal, safe announcement. As it should be. Writing must necessarily become impersonal at scale. But I would like to write something personal about why I am doing this. What is the goal of building exe.dev? I am already the co-founder of one startup that is doing very well, selling a product I love as much as when I first helped design and build it.
What could possess me to go through all the pain of starting another company? Some fellow founders have looked at me with incredulity and shock that I would throw myself back into the frying pan. (Worse yet, experience tells me that most of the pain is still in my future.) It has been a genuinely hard question to answer because I start searching for a “big” reason, a principle or a social need, a reason or motivation beyond challenge. But I believe the truth is far simpler, and, I am sure, almost as hard for some to believe.
I like computers.
In some tech circles, that is an unusual statement. (“In this house, we curse computers!”) I get it, computers can be really frustrating. But I like computers. I always have. It is really fun getting computers to do things. Painful, sure, but the results are worth it. Small microcontrollers are fun, desktops are fun, phones are fun, and servers are fun, whether racked in your basement or in a data center across the world. I like them all.
So it is no small thing for me when I admit: I do not like the cloud today.
I want to. Computers are great, whether it is a BSD installed directly on a PC or a Linux VM. I can enjoy Windows, BeOS, and Novell NetWare; I even installed OS/2 Warp back in the day and had a great time with it. Linux is particularly powerful today and a source of endless potential. And for all the pages of products, the cloud is just Linux VMs. Better, they are API-driven Linux VMs. I should be in heaven.
But every cloud product I try is wrong. Some are better than others, but I am constantly constrained by the choices cloud vendors make in ways that make it hard to get computers to do the things I want them to do.
These issues go beyond UX or bad API design. Some of the fundamental building blocks of today’s clouds are the wrong shape. VMs are the wrong shape because they are tied to CPU/memory resources. I want to buy some CPUs, memory, and disk, and then run VMs on that hardware. A Linux VM is a process running in another Linux’s cgroup; I should be able to run as many as I like on the computer I have. The only way to do that easily on today’s clouds is to take isolation into my own hands, with gVisor or nested virtualization on a single cloud VM, paying the nesting performance penalty, and then I am left with the job of running and managing, at a minimum, a reverse proxy onto my VMs. All because the cloud abstraction is the wrong shape.
Clouds have tried to solve this with “PaaS” systems: abstractions that are inherently less powerful than a computer, bespoke to a particular provider. Learn a new way to write software for each compute vendor, only to find halfway into your project that something that is easy on a normal computer is nearly impossible because of some obscure limit of the platform buried so deep you cannot find it until you are deeply committed. Time and again I have said “this is the one” only to be betrayed by some half-assed, half-implemented, or half-thought-through abstraction. No thank you.
Consider disk. Cloud providers want you to use remote block devices (or something even more limited and slow, like S3). When remote block devices were introduced they made sense, because computers used hard drives. Remote does not hurt sequential read/write performance, if the buffering implementation is good. Random seeks on a hard drive take 10ms, so 1ms RTT for the Ethernet connection to remote storage is a fine price to pay. It is a good product for hard drives and makes the cloud vendor’s life a lot easier because it removes an entire dimension from their standard instance types.
But then we all switched to SSD. Seek time went from 10 milliseconds to 20 microseconds. Heroic efforts have cut the network RTT a bit for really good remote block systems, but the IOPS overhead of remote systems went from 10% with hard drives to more than 10x with SSDs.
It is a lot of work to configure an EC2 VM to have 200k IOPS, and you will pay $10k/month for the privilege. My MacBook has 500k IOPS. Why are we hobbling our cloud infrastructure with slow disk?
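To sanity-check the latency argument, here is a quick back-of-envelope script using the round figures from this post (illustrative numbers, not benchmarks):

```python
# Rough check of the remote-disk overhead argument: how much does a ~1 ms
# network round trip add on top of a local random access?
hdd_seek_us = 10_000   # ~10 ms random seek on a spinning disk
ssd_seek_us = 20       # ~20 us on local NVMe
net_rtt_us = 1_000     # ~1 ms round trip to a remote block device

def overhead(local_us: float, extra_us: float) -> float:
    """Total access time as a multiple of the local access time."""
    return (local_us + extra_us) / local_us

print(f"HDD + network: {overhead(hdd_seek_us, net_rtt_us):.2f}x local")  # 1.10x, i.e. ~10% overhead
print(f"SSD + network: {overhead(ssd_seek_us, net_rtt_us):.0f}x local")  # 51x, i.e. well over 10x
```

The exact multiples depend on the network and drive, but the shape of the problem is clear: an overhead that was noise next to a hard drive seek dominates an NVMe access.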
Next, networking. Hyperscalers have great networks. They charge you the earth for them and make it miserable to do deals with other vendors. The standard price for a GB of egress from a cloud provider is 10x what you pay racking a server in a normal data center. At moderate volume the multiplier is even worse. Sure, if you spend $XXm/month with a cloud the prices get much better, but most of my projects want to spend $XX/month, without the little m. The fundamental technology here is fine, but this is where limits are placed on you to make sure whatever you build cannot be affordable.
Finally, clouds have painful APIs. This is where projects like K8S come in, papering over the pain so engineers suffer a bit less from using the cloud. But VMs are hard with Kubernetes because the cloud makes you do it all yourself with lumpy nested virtualization. Disk is hard because back when they were designing K8S Google didn’t really even do usable remote block devices, and even if you can find a common pattern among clouds today to paper over the differences, it will be slow. Networking is hard because if it were easy you would private link in a few systems from a neighboring open DC and drop a zero from your cloud spend. It is tempting to dismiss Kubernetes as a scam, artificial make-work designed to avoid doing real product work, but the truth is worse: it is a product attempting to solve an impossible problem: make clouds portable and usable. It cannot be done.
You cannot solve the fundamental problems with cloud abstractions by building new abstractions on top. Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.
We have been muddling along with these miserable clouds for 15 years now. We make do, in the way we do with all the unpleasant parts of our software stack, holding our nose whenever we have to deal with them and trying to minimize how often that happens.
This, however, is the moment to fix it.
This is the moment because something has changed: we have agents now. (Indeed my co-founder Josh and I started tinkering because we wanted to use LLMs in programming. It turns out what needs building for LLMs are better traditional abstractions.) Agents, by making it easier to write code, mean there will be a lot more software. Economists would call this an instance of Jevons paradox. Each of us will write more programs, for fun and for work. We need private places to run them, easy sharing with friends and colleagues, minimal overhead.
With more total software in our lives, the cloud, which was already an annoying pain, becomes a much bigger one. We need a lot more compute, and we need it to be easier to manage. Agents help to some degree. If you trust them with your credentials they will do a great job driving the AWS API for you (though occasionally they will delete your production DB). But agents struggle with the fundamental limits of the abstractions as much as we do. You need more tokens than you should and you get a worse result than you should. Every percent of context window the agent spends thinking about how to contort classic clouds into working is context window it is not using to solve your problem.
So we are going to fix it. What we have launched on exe.dev today addresses the VM resource isolation problem: instead of provisioning individual VMs, you get CPU and memory and run the VMs you want. We took care of a TLS proxy and an authentication proxy, because I do not actually want my fresh VMs dumped directly on the internet. Your disk is local NVMe with blocks replicated off machine asynchronously. We have regions around the world for your machines, because you want your machines close. Your machines are behind an anycast network to give all your global users a low latency entrypoint to your product (and so we can build some new exciting things soon).
There is a lot more to build here, from obvious things like static IPs to UX challenges like how to give you access to our automatic historical disk snapshots. Those will get built. And at the same time we are going right back to the beginning, racking computers in data centers, thinking through every layer of the software stack, exploring all the options for how we wire up networks.
So, I am building a cloud. One I actually want to use. I hope it is useful to you.
12:13 PM PDT · April 22, 2026
Apple released a software update on Wednesday for iPhones and iPads fixing a bug that allowed law enforcement to extract messages that had been deleted, or had automatically disappeared, from messaging apps. The messages were recoverable because notifications that displayed their content were also cached on the device for up to a month.
In a security notice on its website, Apple said that the bug meant “notifications marked for deletion could be unexpectedly retained on the device.”
This is a clear reference to an issue revealed by 404 Media earlier this month. The independent news outlet reported that the FBI had been able to extract deleted Signal messages from someone’s iPhone using forensic tools, due to the fact that the content of the messages had been displayed in a notification and then stored inside a phone’s database — even after the messages were deleted inside Signal.
After the news, Signal president Meredith Whittaker said the messaging app maker asked Apple to address the issue. “Notifications for deleted messages shouldn’t remain in any OS notification database,” Whittaker wrote in a post on Bluesky.
It’s unclear why the notifications’ content was logged to begin with, but today’s fix suggests it was a bug.
Apple did not immediately respond to a request for comment asking why the notifications were being retained. The company also backported the fix to iPhone and iPad owners running the older iOS 18 software.
Privacy activists expressed alarm when they learned that the FBI had found a way around a security feature that is used daily by at-risk users. Signal, like other messaging apps such as WhatsApp, allows users to set up a timer that instructs the app to automatically delete messages after a set amount of time. This feature can be helpful for anyone who wants to keep their conversations secret in the event that authorities seize their devices.
Lorenzo Franceschi-Bicchierai is a Senior Writer at TechCrunch, where he covers hacking, cybersecurity, surveillance, and privacy.
Socket researchers discovered that the Bitwarden CLI was compromised as part of the ongoing Checkmarx supply chain campaign. The open source password manager serves more than 10 million users and over 50,000 businesses, and ranks among the top three password managers by enterprise adoption.
The affected package version appears to be @bitwarden/cli@2026.4.0, and the malicious code was published in bw1.js, a file included in the package contents. The attack appears to have leveraged a compromised GitHub Action in Bitwarden’s CI/CD pipeline, consistent with the pattern seen across other affected repositories in this campaign.
What we know so far:
Bitwarden CLI builds were affected
The compromise follows the same GitHub Actions supply chain vector identified in the broader Checkmarx campaign
This is an ongoing investigation. Socket’s security research team is conducting a full technical analysis and will publish detailed findings, including affected versions, indicators of compromise, and remediation guidance.
If you use Bitwarden CLI, we recommend reviewing your CI logs and rotating any secrets that may have been exposed to the compromised workflow. At this time, the compromise involves only the npm package for the CLI. Bitwarden’s Chrome extension, MCP server, and other legitimate distributions have not been affected.
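As a quick first pass, a small script along these lines can scan a repository’s package-lock.json for the affected version. This is a sketch: the version comes from the report above, and the script assumes the npm lockfile v2/v3 `packages` layout.

```python
# Hypothetical triage helper: scan a package-lock.json for the affected
# @bitwarden/cli version reported above. Assumes the lockfile v2/v3
# "packages" layout; extend AFFECTED if more versions are published.
import json

AFFECTED = {"@bitwarden/cli": {"2026.4.0"}}

def affected_entries(lockfile_path: str):
    with open(lockfile_path) as f:
        lock = json.load(f)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # lockfile keys look like "node_modules/@scope/name"
        name = path.rsplit("node_modules/", 1)[-1]
        if meta.get("version") in AFFECTED.get(name, ()):
            hits.append((name, meta["version"]))
    return hits
```

A hit means the compromised build was resolvable from that project and its CI secrets should be treated as exposed.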
Technical analysis#
The malicious payload was in a file named bw1.js, which shares core infrastructure with the Checkmarx mcpAddon.js we analyzed yesterday:
Same C2 endpoint: Uses identical audit.checkmarx[.]cx/v1/telemetry endpoint, obfuscated via __decodeScrambled with seed 0x3039. Exfiltration also occurs through GitHub API (commit-based) and npm registry (token theft/republishing)
Embedded payloads: Same gzip+base64 structure containing a Python memory-scraping script targeting GitHub Actions Runner.Worker, a setup.mjs loader for republished npm packages, a GitHub Actions workflow YAML, hardcoded RSA public keys, and an ideological manifesto string
Credential harvesting: GitHub tokens via Runner.Worker memory scraping and environment variables, AWS credentials via ~/.aws/ files and environment, Azure tokens via azd, GCP credentials via gcloud config config-helper, npm configuration files (.npmrc), SSH keys, environment variables, and Claude/MCP configuration files
GitHub exfiltration: Public repositories created under victim accounts using Dune-themed naming ({word}-{word}-{3digits}), with encrypted results committed and tokens embedded in commit messages using the marker LongLiveTheResistanceAgainstMachines
Supply chain propagation: npm token theft to identify writable packages and republish with injected preinstall hooks, GitHub Actions workflow injection to capture repository secrets
Russian locale kill switch: Exits silently if system locale begins with “ru”, checking Intl.DateTimeFormat().resolvedOptions().locale and environment variables LC_ALL, LC_MESSAGES, LANGUAGE, and LANG
Runtime: Bun v1.3.13 interpreter downloaded from GitHub releases
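The internals of `__decodeScrambled` are not reproduced here, but the general technique is common enough to sketch: a tiny PRNG keyed by the seed XORs every byte, so the C2 string never appears in the source as plaintext. The following is an illustrative reimplementation under that assumption, not the malware’s actual code:

```python
# Hypothetical illustration of seed-keyed string scrambling, in the style
# of the __decodeScrambled(..., 0x3039) call described above. NOT the real
# algorithm: a deterministic PRNG keyed by the seed XORs each byte.
def _keystream(seed: int):
    state = seed & 0xFFFFFFFF
    while True:
        # xorshift32, a common tiny PRNG
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        yield state & 0xFF

def scramble(text: str, seed: int) -> bytes:
    ks = _keystream(seed)
    return bytes(b ^ next(ks) for b in text.encode())

def decode_scrambled(blob: bytes, seed: int) -> str:
    ks = _keystream(seed)
    return bytes(b ^ next(ks) for b in blob).decode()

# Round-trip demo with the defanged endpoint string from this report.
blob = scramble("audit.checkmarx[.]cx/v1/telemetry", 0x3039)
assert decode_scrambled(blob, 0x3039) == "audit.checkmarx[.]cx/v1/telemetry"
```

The practical consequence for defenders is that grepping package contents for the C2 hostname will miss it; hunting should key on the decoder function and the seed constant instead.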
This payload (bw1.js) also includes several indicators not documented in the Checkmarx incident:
Lock file: Hardcoded path /tmp/tmp.987654321.lock prevents multiple instances from running simultaneously
Shell profile persistence: Injects payload into ~/.bashrc and ~/.zshrc
Explicit branding: Repository description Shai-Hulud: The Third Coming replaces the deceptive “Checkmarx Configuration Storage”, and debug strings include “Would be executing butlerian jihad!”
The shared tooling strongly suggests a connection to the same malware ecosystem, but the operational signatures differ in ways that complicate attribution. The Checkmarx attack was claimed by TeamPCP via the @pcpcats social media account after discovery, and the malware itself attempted to blend in with legitimate-looking descriptions. This payload takes a different approach: the ideological branding is embedded directly in the malware, from the Shai-Hulud repository names to the “Butlerian Jihad” manifesto payload to commit messages proclaiming resistance against machines. This suggests either a different operator using shared infrastructure, a splinter group with stronger ideological motivations, or an evolution in the campaign’s public posture.
Recommendations#
Organizations that installed the malicious Bitwarden npm package should treat this incident as a credential exposure and CI/CD compromise event.
Immediately remove the affected package from developer systems and build environments. Rotate any credentials that may have been exposed to those environments, including GitHub tokens, npm tokens, cloud credentials, SSH keys, and CI/CD secrets. Review GitHub for unauthorized repository creation, unexpected workflow files under .github/workflows/, suspicious workflow runs, artifact downloads, and public repositories matching the observed Dune-themed staging pattern ({word}-{word}-{3digits}). Check for the following keywords in newly published repositories if you believe you may be impacted:
atreides
cogitor
fedaykin
fremen
futar
gesserit
ghola
harkonnen
heighliner
kanly
kralizec
lasgun
laza
melange
mentat
navigator
ornithopter
phibian
powindah
prana
prescient
sandworm
sardaukar
sayyadina
sietch
siridar
slig
stillsuit
thumper
tleilaxu
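As a starting point for that hunt, a hypothetical matcher for the observed naming pattern might look like the following. The pattern and keyword list come from this report; the classification labels are our own simplification.

```python
# Flag repository names matching the observed {word}-{word}-{3digits}
# staging pattern, with extra weight when a word is from the Dune-themed
# keyword list above.
import re

DUNE_WORDS = {
    "atreides", "cogitor", "fedaykin", "fremen", "futar", "gesserit",
    "ghola", "harkonnen", "heighliner", "kanly", "kralizec", "lasgun",
    "laza", "melange", "mentat", "navigator", "ornithopter", "phibian",
    "powindah", "prana", "prescient", "sandworm", "sardaukar", "sayyadina",
    "sietch", "siridar", "slig", "stillsuit", "thumper", "tleilaxu",
}
PATTERN = re.compile(r"^([a-z]+)-([a-z]+)-(\d{3})$")

def classify(repo_name: str) -> str:
    m = PATTERN.match(repo_name.lower())
    if not m:
        return "no-match"
    word1, word2, _digits = m.groups()
    return "dune-keyword" if {word1, word2} & DUNE_WORDS else "pattern-only"

print(classify("sietch-thumper-042"))  # dune-keyword
print(classify("alpha-beta-123"))      # pattern-only
print(classify("my-normal-repo"))      # no-match (last segment is not 3 digits)
```

Feeding your organization’s recently created repository names through a filter like this narrows the manual review to a handful of candidates.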
Audit npm for unauthorized publishes, version changes, or newly added install hooks. In cloud environments, review access logs for unusual secret access, token use, and newly issued credentials.
On endpoints and runners, hunt for outbound connections to the observed exfiltration infrastructure (audit[.]checkmarx[.]cx), execution of Bun where it is not normally used, access to files such as .npmrc, .git-credentials, .env, cloud credential stores, gcloud, az, or azd. Check for the lock file /tmp/tmp.987654321.lock and shell profile modifications in ~/.bashrc and ~/.zshrc. For GitHub Actions, review whether any unapproved workflows were created on transient branches and whether artifacts such as format-results.txt were generated or downloaded.
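A minimal host-level triage sketch for the file-system indicators above: the lock file path is taken from this report, while the shell-profile search strings are placeholders you should replace with the published IoCs.

```python
# Check a host for the file-system indicators described above: the
# hardcoded lock file and shell-profile persistence. MARKERS are assumed
# example strings, not confirmed IoCs.
import os

IOC_LOCK = "/tmp/tmp.987654321.lock"
PROFILES = ["~/.bashrc", "~/.zshrc"]
MARKERS = ["checkmarx", "bw1.js"]

def check_host() -> list:
    findings = []
    if os.path.exists(IOC_LOCK):
        findings.append(f"lock file present: {IOC_LOCK}")
    for profile in PROFILES:
        path = os.path.expanduser(profile)
        if not os.path.exists(path):
            continue
        with open(path, errors="replace") as f:
            text = f.read().lower()
        for marker in MARKERS:
            if marker in text:
                findings.append(f"{profile} mentions {marker!r}")
    return findings

for finding in check_host():
    print("SUSPICIOUS:", finding)
```

An empty result does not clear a host; it only means these two specific artifacts are absent.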
As a longer-term control, reduce the blast radius of future supply chain incidents by locking down token scopes, requiring short-lived credentials where possible, restricting who can create or publish packages, hardening GitHub Actions permissions, disabling unnecessary artifact access, and monitoring for new public repositories or workflow changes created outside normal release processes.
IOCs#
Malicious Package
@bitwarden/cli@2026.4.0
Network Indicators
94[.]154[.]172[.]43
https://audit.checkmarx[.]cx/v1/telemetry
File System Indicators (Victim Package Compromise)
/tmp/tmp.987654321.lock
/tmp/_tmp_<Unix Epoch Timestamp>/
package-updated.tgz
The Onion Has a New Plan to Take Over Infowars
A new deal, which would allow The Onion to use the Infowars name and website address, must be approved by a Texas judge.
The Onion, a satirical news outlet, wants to convert the right-wing Infowars site into a parody of itself. Credit: Jamie Kelter Davis for The New York Times
April 20, 2026
When Infowars, the website founded by the right-wing conspiracist Alex Jones, came up for sale two years ago, an unlikely suitor stepped up. The Onion, a satirical news outlet, planned to convert the site into a parody of itself.
That sale was scuttled by a bankruptcy court. Now, The Onion has re-emerged with a new plan: licensing the website from Gregory Milligan, the court-appointed manager of the site.
On Monday, Mr. Milligan asked Maya Guerra Gamble, a judge in Texas’ Travis County District Court overseeing the disposition of Infowars, to approve that licensing agreement in a court filing. Under the terms, The Onion’s parent company, Global Tetrahedron, would pay $81,000 a month to license Infowars.com and its associated intellectual property — such as its name — for an initial six months, with an option to renew for another six months.
The licensing deal has been agreed to by The Onion and the court-appointed administrator. But it is not effective until Judge Guerra Gamble approves it, and Mr. Jones could appeal any ruling. That means the fate of Infowars remains in limbo until the court rules, probably sometime in the next two weeks. Mr. Jones continues to operate Infowars.com and host its weekday program, “The Alex Jones Show.”
Mr. Jones had no immediate comment.
The battle over Infowars has been a long and fraught saga, and Mr. Jones — a notorious peddler of lies and invective — has used his bully pulpit for more than a year to crusade against The Onion’s efforts to take over the platform. The site is in limbo because of a series of defamation lawsuits against Mr. Jones filed by families of victims of the mass shooting in 2012 at Sandy Hook Elementary School in Connecticut, which Mr. Jones falsely claimed was a hoax.
alice pellerin • 2026-03-31
too often, i see hex editors1 that look like this:
00000000 00 00 02 00 28 00 00 00 88 15 00 00 C4 01 00 00 ⋄⋄•⋄(⋄⋄⋄ו⋄⋄ו⋄⋄
00000010 14 00 00 00 03 00 00 00 00 01 00 00 03 00 00 00 •⋄⋄⋄•⋄⋄⋄⋄•⋄⋄•⋄⋄⋄
00000020 3C 00 00 00 C4 0A 00 00 50 00 00 00 18 00 00 00 <⋄⋄⋄×⏎⋄⋄P⋄⋄⋄•⋄⋄⋄
00000030 14 00 00 10 00 00 00 00 18 00 00 20 00 00 00 00 •⋄⋄•⋄⋄⋄⋄•⋄⋄ ⋄⋄⋄⋄
00000040 20 00 00 30 00 00 00 00 51 00 00 00 48 00 00 00 ⋄⋄0⋄⋄⋄⋄Q⋄⋄⋄H⋄⋄⋄
00000050 10 00 00 80 00 00 00 00 00 00 00 A0 00 00 00 00 •⋄⋄×⋄⋄⋄⋄⋄⋄⋄×⋄⋄⋄⋄
00000060 01 00 00 A0 01 00 00 00 02 00 00 A0 02 00 00 00 •⋄⋄ו⋄⋄⋄•⋄⋄ו⋄⋄⋄
00000070 03 00 00 A0 03 00 00 00 04 00 00 A0 04 00 00 00 •⋄⋄ו⋄⋄⋄•⋄⋄ו⋄⋄⋄
00000080 05 00 00 A0 05 00 00 00 06 00 00 A0 06 00 00 00 •⋄⋄ו⋄⋄⋄•⋄⋄ו⋄⋄⋄
00000090 20 00 00 30 00 00 00 00 53 00 00 00 00 DE 00 00 ⋄⋄0⋄⋄⋄⋄S⋄⋄⋄⋄×⋄⋄
000000a0 5D FA 01 44 E1 3A 9A 0F 52 00 00 00 FC 14 00 00 ]וD×:וR⋄⋄⋄ו⋄⋄
000000b0 1B 20 2A 2B 00 80 00 00 00 80 00 00 00 80 00 00 • *+⋄×⋄⋄⋄×⋄⋄⋄×⋄⋄
000000c0 FF 7F 00 00 00 00 33 52 00 00 00 00 29 10 15 10 ╳•⋄⋄⋄⋄3R⋄⋄⋄⋄)•••
000000d0 80 00 1F 00 03 00 00 00 02 00 00 00 40 14 22 23 ×⋄•⋄•⋄⋄⋄•⋄⋄⋄@•“#
000000e0 03 00 00 00 06 00 00 00 23 00 9D 05 6B FA C0 05 •⋄⋄⋄•⋄⋄⋄#⋄וk×ו
000000f0 C8 03 00 00 14 22 23 14 05 00 00 00 2E 00 9E 06 ו⋄⋄•“#••⋄⋄⋄.⋄ו
every time i do, i feel bad for the poor person having to use it (especially if that person is me!). a plain list of bytes makes it hard to notice interesting things in the data. go ahead, try to find the single C0 in these bytes:
00000000 15 29 21 25 03 2F 2E 2B 15 11 24 3F 10 14 3B 13 •)!%•/.+••$?••;•
00000001 32 25 09 01 10 02 01 23 26 1E 25 2D 24 2F 23 3E 2%␣••••#&•%-$/#>
00000002 05 0F 33 2D 18 29 3E 1E 16 3B 29 0D 24 0B 3E 38 ••3-•)>••;)␍$•>8
00000003 33 3C 1E 2C 28 31 C0 1D 11 32 14 05 10 17 3F 01 3<•,(1ו•2••••?•
00000004 1E 32 0A 14 2B 2F 0B 14 3E 27 39 0A 17 23 1B 39 •2⏎•+/••>’9⏎•#•9
00000005 18 0B 3B 13 25 14 2C 3B 33 3C 19 10 21 0F 2C 34 ••;•%•,;3<••!•,4
00000006 2F 0C 1D 2C 2E 22 11 28 0D 0A 1F 37 27 39 35 21 /••,.“•(␍⏎•7′95!
00000007 23 39 21 2B 37 23 28 16 30 28 02 04 25 22 37 1F #9!+7#(•0(••%“7•
00000008 36 2F 2D 25 12 25 01 31 3B 39 2D 35 26 37 30 2A 6/-%•%•1;9-5&70*
00000009 06 0D 11 1F 25 0A 1E 29 15 0B 0A 2A 2E 2C 21 16 •␍••%⏎•)••⏎*.,!•
0000000a 1D 37 0F 16 12 03 2C 02 0B 22 24 11 1A 3B 0D 0B •7••••,••“$••;␍•
0000000b 0D 13 30 2D 3B 15 05 15 32 19 20 30 3C 0E 3D 0B ␍•0-;•••2• 0<•=•
0000000c 17 24 22 3E 1E 22 18 0D 21 06 29 38 3E 20 3B 12 •$“>•“•␍!•)8> ;•
0000000d 06 1F 19 17 29 35 1E 3B 1E 01 31 08 13 0C 27 20 ••••)5•;••1•••′
0000000e 08 24 2E 32 16 06 1F 3D 35 35 19 16 02 07 31 13 •$.2•••=55••••1•
0000000f 31 33 30 36 14 32 07 05 05 34 19 0B 18 16 12 3C 1306•2•••4•••••<
compare that to one with colors:
00000000 37 2D 08 13 0D 0B 18 1D 02 1A 2D 12 2A 0D 0F 27 7-••␍•••••-•*␍•′
00000001 04 2A 25 32 0F 17 32 11 2F 2A 2A 0A 0A 16 04 1D •*%2••2•/**⏎⏎•••
00000002 32 13 09 01 2B 26 1A 30 3D 26 13 39 09 0D 38 3E 2•␣•+&•0=&•9␣␍8>
00000003 0A 0D 1D 0B 36 30 02 36 0E 0B 2F 09 26 1E 33 03 ⏎␍••60•6••/␣&•3•
00000004 3C 3C 08 0A 1E 36 12 11 1B 17 05 09 0B 37 0C 0E <<•⏎•6•••••␣•7••
00000005 31 05 09 17 2D 1D 05 16 25 03 3E 0A 1A 01 0C 2B 1•␣•-•••%•>⏎•••+
00000006 13 37 17 14 37 03 18 34 2D 03 30 11 2B 19 04 0B •7••7••4-•0•+•••
00000007 04 2A 18 26 21 25 3F 23 1D 0F 2F 2B 35 0C 09 37 •*•&!%?#••/+5•␣7
00000008 25 33 19 1C 12 1E 2E 38 3A 3A 3C 28 39 0A 30 23 %3••••.8::<(9⏎0#
00000009 21 08 09 24 0B 0E 13 26 04 30 06 20 10 18 15 3C !•␣$•••&•0• •••<
0000000a 10 3C 30 34 28 28 1D 31 22 23 22 38 0E 12 25 15 •<04((•1″#“8••%•
0000000b 3B 1F 30 0D 26 0E 15 32 1C 2B 12 1A 32 1C 02 07 ;•0␍&••2•+••2•••
0000000c 35 2E 06 13 1F 33 3D 16 05 1C 2A 0F 34 34 21 26 5.•••3=•••*•44!&
0000000d 0C 17 3D 02 27 39 21 17 3F 07 1A 2F 38 0D 2D 1E ••=•’9!•?••/8␍-•
0000000e 32 0C C0 14 0E 20 25 0E 2E 2D 0D 21 27 13 2C 07 2•ו• %•.-␍!’•,•
0000000f 14 0A 20 31 15 13 2C 3B 0F 12 1A 2D 0C 11 32 11 •⏎ 1••,;•••-••2•
it’s much easier to pick out the unique byte when it’s a different color! human brains are really good at spotting visual patterns—given the right format.
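as a rough illustration (a toy sketch, not any particular editor’s scheme), a colored dump can be produced with a few lines of python and ansi escapes:

```python
# toy colored hex dump: classify each byte and wrap it in an ansi color so
# outliers pop out. the color assignments here are arbitrary choices.
def color(byte: int) -> str:
    if byte == 0x00:
        code = "90"        # dim gray for zero padding
    elif 0x20 <= byte < 0x7F:
        code = "32"        # green for printable ascii
    elif byte >= 0x80:
        code = "31"        # red for high bytes
    else:
        code = "33"        # yellow for other control bytes
    return f"\x1b[{code}m{byte:02X}\x1b[0m"

def hexdump(data: bytes, width: int = 16) -> str:
    lines = []
    for off in range(0, len(data), width):
        row = data[off:off + width]
        hexes = " ".join(color(b) for b in row)
        ascii_ = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in row)
        lines.append(f"{off:08x}  {hexes}  {ascii_}")
    return "\n".join(lines)

print(hexdump(bytes.fromhex("4B50530000000200280000C0")))
```

run in a terminal, the lone high byte at the end renders red against a wall of gray zeros, which is exactly the effect the comparison above is after.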
here are a few more examples:
example 1
00000000 4B 50 53 00 0A 00 00 00 0C 00 00 00 01 00 00 00 KPS⋄⏎⋄⋄⋄•⋄⋄⋄•⋄⋄⋄
00000010 00 00 00 00 B4 00 00 00 46 00 00 00 64 00 00 00 ⋄⋄⋄⋄×⋄⋄⋄F⋄⋄⋄d⋄⋄⋄
00000020 46 00 00 00 02 00 00 00 00 00 00 00 DC 00 00 00 F⋄⋄⋄•⋄⋄⋄⋄⋄⋄⋄×⋄⋄⋄
00000030 50 00 00 00 A0 00 00 00 50 00 00 00 03 00 00 00 P⋄⋄⋄×⋄⋄⋄P⋄⋄⋄•⋄⋄⋄
00000040 00 00 00 00 FA 00 00 00 5A 00 00 00 B4 00 00 00 ⋄⋄⋄⋄×⋄⋄⋄Z⋄⋄⋄×⋄⋄⋄
00000050 5A 00 00 00 04 00 00 00 00 00 00 00 18 01 00 00 Z⋄⋄⋄•⋄⋄⋄⋄⋄⋄⋄••⋄⋄
00000060 64 00 00 00 C8 00 00 00 64 00 00 00 05 00 00 00 d⋄⋄⋄×⋄⋄⋄d⋄⋄⋄•⋄⋄⋄
00000070 00 00 00 00 4A 01 00 00 78 00 00 00 F0 00 00 00 ⋄⋄⋄⋄J•⋄⋄x⋄⋄⋄×⋄⋄⋄
00000080 78 00 00 00 06 00 00 00 00 00 00 00 90 01 00 00 x⋄⋄⋄•⋄⋄⋄⋄⋄⋄⋄ו⋄⋄
00000090 8C 00 00 00 18 01 00 00 8C 00 00 00 07 00 00 00 ×⋄⋄⋄••⋄⋄×⋄⋄⋄•⋄⋄⋄
000000a0 00 00 00 00 F4 01 00 00 B4 00 00 00 68 01 00 00 ⋄⋄⋄⋄ו⋄⋄×⋄⋄⋄h•⋄⋄
000000b0 B4 00 00 00 08 00 00 00 00 00 00 00 58 02 00 00 ×⋄⋄⋄•⋄⋄⋄⋄⋄⋄⋄X•⋄⋄
000000c0 DC 00 00 00 B8 01 00 00 DC 00 00 00 09 00 00 00 ×⋄⋄⋄ו⋄⋄×⋄⋄⋄␣⋄⋄⋄
000000d0 E7 03 00 00 E7 03 00 00 00 00 00 00 E7 03 00 00 ו⋄⋄ו⋄⋄⋄⋄⋄⋄ו⋄⋄
000000e0 E7 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ו⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 ⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄⋄
this file starts with the magic bytes KPS, then a bunch of (little-endian) 32-bit integers that range from 0 to 999 (0x3E7). the colors make it quick to recognize that every 32-bit integer is relatively small, as the two high bytes are always 00 00. if you look closely, you may notice other patterns, like the numbers counting up every 0x18 bytes starting at 0xC
if you’re curious about this particular file format, the code that parses it is pretty simple, even if you’re not a programmer. there’s even a wiki page for the data it represents, if you’re into Fossil Fighters
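for illustration, here’s a sketch of reading that structure with python’s struct module. the layout (magic bytes, little-endian 32-bit ints, a counter every 0x18 bytes starting at 0xC) is read off the dump above; field meanings beyond that are guesses, not from a spec.

```python
# parse the structure visible in example 1: "KPS" magic, then
# little-endian 32-bit integers, with a counter field every 0x18 bytes
# starting at offset 0xC.
import struct

def parse_kps(data: bytes):
    assert data[:3] == b"KPS", "bad magic"
    count = (len(data) - 4) // 4          # u32s after the 4-byte header
    return struct.unpack_from(f"<{count}I", data, 4)

def counters(data: bytes):
    """the counting-up fields the colors make easy to spot."""
    out, off = [], 0x0C
    while off + 4 <= len(data):
        out.append(struct.unpack_from("<I", data, off)[0])
        off += 0x18
    return out
```

pointing `counters` at the dump above would yield 1, 2, 3, … — the same pattern the colors let you spot by eye.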
example 2
00000000 44 41 4C 00 59 06 00 00 F4 07 00 00 F5 01 00 00 DAL⋄Y•⋄⋄ו⋄⋄ו⋄⋄
00000010 14 00 00 00 E8 07 00 00 08 08 00 00 44 08 00 00 •⋄⋄⋄ו⋄⋄••⋄⋄D•⋄⋄
00000020 84 08 00 00 C8 08 00 00 04 09 00 00 40 09 00 00 ו⋄⋄ו⋄⋄•␣⋄⋄@␣⋄⋄
00000030 7C 09 00 00 B8 09 00 00 F8 09 00 00 34 0A 00 00 |␣⋄⋄×␣⋄⋄×␣⋄⋄4⏎⋄⋄
00000040 70 0A 00 00 AC 0A 00 00 EC 0A 00 00 30 0B 00 00 p⏎⋄⋄×⏎⋄⋄×⏎⋄⋄0•⋄⋄
00000050 6C 0B 00 00 A8 0B 00 00 E8 0B 00 00 24 0C 00 00 l•⋄⋄ו⋄⋄ו⋄⋄$•⋄⋄
00000060 60 0C 00 00 9C 0C 00 00 D8 0C 00 00 14 0D 00 00 `•⋄⋄ו⋄⋄ו⋄⋄•␍⋄⋄
00000070 50 0D 00 00 8C 0D 00 00 CC 0D 00 00 08 0E 00 00 P␍⋄⋄×␍⋄⋄×␍⋄⋄••⋄⋄
00000080 48 0E 00 00 84 0E 00 00 C4 0E 00 00 08 0F 00 00 H•⋄⋄ו⋄⋄ו⋄⋄••⋄⋄
00000090 44 0F 00 00 80 0F 00 00 C0 0F 00 00 04 10 00 00 D•⋄⋄ו⋄⋄ו⋄⋄••⋄⋄
Security researchers have uncovered two separate spying campaigns that are abusing well-known weaknesses in the global telecoms infrastructure to track people’s locations. The researchers say these two campaigns are likely a small snapshot of what they believe to be widespread exploitation by surveillance vendors seeking access to global phone networks.
On Thursday, the Citizen Lab, a digital rights organization with more than a decade of experience exposing surveillance abuses, published a new report detailing the two newly identified campaigns. The surveillance vendors behind them, which Citizen Lab did not name, operated as “ghost” companies that pretended to be legitimate cellular providers and would piggyback their access to those networks to look up the location data of their targets.
The new findings reveal continued exploitation of known flaws in the technologies that underpin the global phone networks.
One of them is the insecurity of Signaling System 7, or SS7, a set of protocols for 2G and 3G networks that for years has been the backbone of how cellular networks connect to each other and route subscribers’ calls and text messages around the world. Researchers and experts have long warned that governments and surveillance tech makers can exploit vulnerabilities in SS7 to geolocate individuals’ cell phones, as SS7 requires neither authentication nor encryption, leaving the door open for rogue operators to abuse it.
The newer protocol, Diameter, designed for newer 4G and 5G communications, is supposed to replace SS7 and includes the security features that were lacking in its predecessor. But as the Citizen Lab highlights in this report, there are still ways to exploit Diameter, as cell providers do not always implement the new protections. In some cases, attackers can still fall back to exploiting the older SS7 protocol.
The two spy campaigns have at least one thing in common: Both abused access to three specific telecom providers that repeatedly acted “as the surveillance entry and transit points within the telecommunications ecosystem.” This access gave the surveillance vendors and their government customers behind the campaigns the ability to “hide behind their infrastructure,” as the researchers explained.
According to the report, the first of these providers is Israeli operator 019Mobile, which researchers said was used in several surveillance attempts. British provider Tango Networks U.K. was also used for surveillance activity over several years, the researchers say.
The third cell phone provider is Airtel Jersey, an operator on the Channel Island of Jersey now owned by Sure, a company whose networks have been linked to prior surveillance campaigns.
Sure CEO Alistair Beak told TechCrunch that the company “does not lease access to signalling directly or knowingly to organisations for the purposes of locating or tracking individuals, or for intercepting communications content.”
“Sure acknowledges that digital services can be misused, which is why we take a number of steps to mitigate this risk. Sure has implemented several protective measures to prevent the misuse of signalling services, including monitoring and blocking inappropriate signalling,” read Beak’s statement. “Any evidence or valid complaint relating to the misuse of Sure’s network results in the service being immediately suspended and, where malicious or inappropriate activity is confirmed following investigation, permanently terminated.”
Tango Networks and 019Mobile did not respond to TechCrunch’s request for comment.
Gil Nagar, the head of IT and security at 019Mobile, sent a letter to Citizen Lab. Nagar said that the company “cannot confirm” that the alleged 019Mobile infrastructure, identified by Citizen Lab as being used by the surveillance vendors, belongs to the company.
Researchers say ‘high-profile’ people targeted
According to the Citizen Lab, the first surveillance vendor facilitated spying campaigns spanning several years against different targets all over the world, and using the infrastructure of several different cell phone providers. This led researchers to conclude that different government customers of the surveillance vendor were behind the various campaigns.
“The evidence shows a deliberate and well-funded operation with deep integration into the mobile signaling ecosystem,” the researchers wrote.
Gary Miller, one of the researchers who investigated these attacks, told TechCrunch that some clues point to an “Israeli-based commercial geo-intelligence provider with specialized telecom capabilities,” but did not name the surveillance provider. Several Israeli companies are known to offer similar services, such as Circles (later acquired by spyware maker NSO Group), Cognyte, and Rayzone.
According to the Citizen Lab, the first campaign relied on trying to abuse flaws in SS7, and then switching to exploiting Diameter if those attempts failed.
The second spy campaign used different methods. In this case, the other surveillance vendor behind it — which Citizen Lab is not naming either — relied on sending a special type of SMS message to one specific “high-profile” target, as the researchers explained.
These are text-based messages designed to communicate directly with the target’s SIM card, without showing any trace of them to the user. Under normal circumstances, cell phone providers use these messages to send innocuous commands to their subscribers’ SIM cards, such as those that keep a device connected to the network. But the surveillance vendor instead sent commands that essentially turned the target’s phone into a location tracking device, according to the researchers. This type of attack was dubbed SIMjacker by mobile cybersecurity company Enea in 2019.
“I’ve observed thousands of these attacks through the years, so I would say it’s a fairly common exploit that’s difficult to detect,” said Miller. “However, these attacks appear to be geographically targeted, indicating that actors employing SIMjacker-style attacks likely know the countries and networks most vulnerable to them.”
Miller made it clear that these two campaigns are just the tip of the iceberg. “We only focused on two surveillance campaigns in a universe of millions of attacks across the globe,” he said.
Updated to include 019Mobile’s responses sent to Citizen Lab.
Over the past month, we’ve been looking into reports that Claude’s responses have worsened for some users. We’ve traced these reports to three separate changes that affected Claude Code, the Claude Agent SDK, and Claude Cowork. The API was not impacted.
All three issues have now been resolved as of April 20 (v2.1.116).
In this post, we explain what we found, what we fixed, and what we’ll do differently to ensure similar issues are much less likely to happen again.
We take reports about degradation very seriously. We never intentionally degrade our models, and we were able to immediately confirm that our API and inference layer were unaffected.
After investigation, we identified three different issues:
On March 4, we changed Claude Code’s default reasoning effort from high to medium to reduce the very long latency—enough to make the UI appear frozen—some users were seeing in high mode. This was the wrong tradeoff. We reverted this change on April 7 after users told us they’d prefer to default to higher intelligence and opt into lower effort for simple tasks. This impacted Sonnet 4.6 and Opus 4.6.
On March 26, we shipped a change to clear Claude’s older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive. We fixed it on April 10. This affected Sonnet 4.6 and Opus 4.6.
On April 16, we added a system prompt instruction to reduce verbosity. In combination with other prompt changes, it hurt coding quality, and was reverted on April 20. This impacted Sonnet 4.6, Opus 4.6, and Opus 4.7.
Because each change affected a different slice of traffic on a different schedule, the aggregate effect looked like broad, inconsistent degradation. While we began investigating reports in early March, they were challenging to distinguish from normal variation in user feedback at first, and neither our internal usage nor evals initially reproduced the issues identified.
This isn’t the experience users should expect from Claude Code. As of April 23, we’re resetting usage limits for all subscribers.
A change to Claude Code’s default reasoning effort
When we released Opus 4.6 in Claude Code in February, we set the default reasoning effort to high.
Soon after, we received user feedback that Claude Opus 4.6 in high effort mode would occasionally think for too long, causing the UI to appear frozen and leading to disproportionate latency and token usage for those users.
In general, the longer the model thinks, the better the output. Effort levels are how Claude Code lets users set that tradeoff—more thinking versus lower latency and fewer usage limit hits. As we calibrate effort levels for our models, we take this tradeoff into account in order to pick points along the test-time-compute curve that give people the best range of options. In the product layer, we then choose which point along this curve we set as our default, and that is the value we send to the Messages API as the effort parameter; we then make the other options available via /effort.
In our internal evals and testing, medium effort achieved slightly lower intelligence with significantly less latency for the majority of tasks. It also didn’t suffer from the same issues with occasional very long tail latencies for thinking, and it helped maximize users’ usage limits. As a result, we rolled out a change making medium the default effort, and explained the rationale via in-product dialog.
Soon after rolling out, users began reporting that Claude Code felt less intelligent. We shipped a number of design iterations to make the current effort setting clearer in order to alert people they could change the default (notices on startup, an inline effort selector, and bringing back ultrathink), but most users retained the medium effort default.
After hearing feedback from more customers, we reversed this decision on April 7. All users now default to xhigh effort for Opus 4.7, and high effort for all other models.
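The revised defaults can be captured as a tiny selection rule. This is a hypothetical illustration of the policy described above, not Anthropic’s actual code; the function name and model identifier strings are assumptions:

```python
def default_effort(model: str) -> str:
    """Pick the default reasoning effort to send as the Messages API
    `effort` parameter; users can still override it via /effort."""
    # After the April 7 reversal: Opus 4.7 defaults to xhigh,
    # and every other model defaults to high.
    if model == "opus-4.7":
        return "xhigh"
    return "high"
```

The point of keeping this as a product-layer decision is that the model itself exposes the full range of effort levels; only the starting point changed.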
A caching optimization that dropped prior reasoning
When Claude reasons through a task, that reasoning is normally kept in the conversation history so that on every subsequent turn, Claude can see why it made the edits and tool calls it did.
On March 26, we shipped what was meant to be an efficiency improvement to this feature. We use prompt caching to make back-to-back API calls cheaper and faster for users. Claude writes the input tokens to the cache when it makes an API request, then after a period of inactivity the prompt is evicted from cache, making room for other prompts. Cache utilization is something we manage carefully (more on our approach).
The design should have been simple: if a session has been idle for more than an hour, we could reduce users’ cost of resuming that session by clearing old thinking sections. Since the request would be a cache miss anyway, we could prune unnecessary messages from the request to reduce the number of uncached tokens sent to the API. We’d then resume sending full reasoning history. To do this we used the clear_thinking_20251015 API header along with keep:1.
The implementation had a bug. Instead of clearing thinking history once, it cleared it on every turn for the rest of the session. After a session crossed the idle threshold once, each request for the rest of that process told the API to keep only the most recent block of reasoning and discard everything before it. This compounded: if you sent a follow-up message while Claude was in the middle of a tool use, that started a new turn under the broken flag, so even the reasoning from the current turn was dropped. Claude would continue executing, but increasingly without memory of why it had chosen to do what it was doing. This surfaced as the forgetfulness, repetition, and odd tool choices people reported.
Because this would continuously drop thinking blocks from subsequent requests, those requests also resulted in cache misses. We believe this is what drove the separate reports of usage limits draining faster than expected.
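The failure mode can be reproduced in miniature. The sketch below is purely illustrative (it is not Claude Code’s implementation, and all names are invented): the intended behavior is to prune old thinking blocks exactly once when a stale session resumes, but leaving the stale flag set makes the pruning repeat on every turn, so only the newest reasoning block ever survives:

```python
class Session:
    def __init__(self):
        self.history = []      # list of ("thinking" | "message", text) tuples
        self.stale = False     # set when the session has idled > 1 hour

    def build_request(self, buggy: bool):
        """Assemble the next API request, pruning old thinking if stale."""
        if self.stale:
            # Intended: on this one request (a cache miss anyway), drop all
            # but the most recent thinking block ("keep:1"), then go back
            # to sending full reasoning history on later turns.
            messages = [m for m in self.history if m[0] != "thinking"]
            thinking = [m for m in self.history if m[0] == "thinking"]
            self.history = messages + thinking[-1:]
            if not buggy:
                self.stale = False  # correct: prune exactly once
            # bug: the flag is never cleared, so every subsequent turn
            # discards all reasoning except the newest block
        return list(self.history)

def surviving_thinking(buggy: bool) -> int:
    """Run three turns against a resumed stale session and count how
    many thinking blocks remain in the history afterwards."""
    s = Session()
    s.stale = True  # session resumed after the idle threshold
    for turn in range(3):
        s.history.append(("thinking", f"why I chose step {turn}"))
        s.history.append(("message", f"edit {turn}"))
        s.build_request(buggy)
    return sum(1 for kind, _ in s.history if kind == "thinking")
```

In the correct version all three reasoning blocks survive; in the buggy version only the last one does, which is why the model appeared to forget why it made its earlier edits.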
Two unrelated changes made it challenging for us to reproduce the issue at first: an internal-only server-side experiment related to message queuing, and an orthogonal change in how we display thinking that suppressed this bug in most CLI sessions, so we didn’t catch it even when testing external builds.
This bug was at the intersection of Claude Code’s context management, the Anthropic API, and extended thinking. The changes it introduced made it past multiple human and automated code reviews, as well as unit tests, end-to-end tests, automated verification, and dogfooding. Combined with this only happening in a corner case (stale sessions) and the difficulty of reproducing the issue, it took us over a week to discover and confirm the root cause.
As part of the investigation, we back-tested Code Review against the offending pull requests using Opus 4.7. When provided the code repositories necessary to gather complete context, Opus 4.7 found the bug, while Opus 4.6 didn’t. To prevent this from happening again, we are now landing support for additional repositories as context for code reviews.
We fixed this bug on April 10 in v2.1.101.
A system prompt change to reduce verbosity
Our latest model, Claude Opus 4.7, has a notable behavioral quirk relative to its predecessor: as we wrote about at launch, it tends to be quite verbose. This makes it smarter on hard problems, but it also produces more output tokens.
A few weeks before we released Opus 4.7, we started tuning Claude Code in preparation. Each model behaves slightly differently, and we spend time before each release optimizing the harness and product for it.
We have a number of tools to reduce verbosity: model training, prompting, and improving thinking UX in the product. Ultimately we used all of these, but one addition to the system prompt caused an outsized effect on intelligence in Claude Code:
“Length limits: keep text between tool calls to ≤25 words. Keep final responses to ≤100 words unless the task requires more detail.”
After multiple weeks of internal testing and no regressions in the set of evaluations we ran, we felt confident about the change and shipped it alongside Opus 4.7 on April 16.
As part of this investigation, we ran more ablations (removing lines from the system prompt to understand the impact of each line) using a broader set of evaluations. One of these evaluations showed a 3% drop for both Opus 4.6 and 4.7. We immediately reverted the prompt as part of the April 20 release.
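Line-by-line ablation is mechanically simple: score the full prompt on an eval suite, then re-score it with each line removed and compare the deltas. A minimal sketch, with a stand-in scoring function (everything here is illustrative; the real evaluations are far richer than a single scalar):

```python
def ablate(prompt_lines, score):
    """Return (line, delta) pairs, where delta is the eval-score change
    when that single line is removed from the prompt. A positive delta
    means the prompt scores higher *without* the line."""
    baseline = score(prompt_lines)
    return [
        (line, score(prompt_lines[:i] + prompt_lines[i + 1:]) - baseline)
        for i, line in enumerate(prompt_lines)
    ]

# Toy usage: a fake eval that penalizes a length-limit instruction,
# mimicking the ~3% drop described above.
LIMIT = "Length limits: keep text between tool calls short."
fake_score = lambda lines: 1.0 - (0.03 if LIMIT in lines else 0.0)
deltas = ablate(["You are a coding agent.", LIMIT], fake_score)
```

With real evals the cost is one full suite run per prompt line, which is why broad ablations were only run as part of the investigation rather than on every change.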
Going forward
We are going to do several things differently to avoid these issues: we’ll ensure that a larger share of internal staff use the exact public build of Claude Code (as opposed to the version we use to test new features); and we’ll make improvements to our Code Review tool that we use internally, and ship this improved version to customers.
We’re also adding tighter controls on system prompt changes. We will run a broad suite of per-model evals for every system prompt change to Claude Code, continuing ablations to understand the impact of each line, and we have built new tooling to make prompt changes easier to review and audit. We’ve additionally added guidance to our CLAUDE.md to ensure model-specific changes are gated to the specific model they’re targeting. For any change that could trade off against intelligence, we’ll add soak periods, a broader eval suite, and gradual rollouts so we catch issues earlier.
We recently created @ClaudeDevs on X to give us the room to explain product decisions and the reasoning behind them in depth. We’ll share the same updates in centralized threads on GitHub.
Finally, we’d like to thank our users: the people who used the /feedback command to share their issues with us (or who posted specific, reproducible examples online) are the ones who ultimately allowed us to identify and fix these problems. Today we are resetting usage limits for all subscribers.
We’re immensely grateful for your feedback and for your patience.
As we see LLMs churn out scads of code, folks have increasingly turned to Cognitive Debt as a metaphor for capturing how a team can lose understanding of what a system does. Margaret-Anne Storey thinks a good way of thinking about these problems is to consider three layers of system health:
Technical debt lives in code. It accumulates when implementation decisions compromise future changeability. It limits how systems can change.
Cognitive debt lives in people. It accumulates when shared understanding of the system erodes faster than it is replenished. It limits how teams can reason about change.
Intent debt lives in artifacts. It accumulates when the goals and constraints that should guide the system are poorly captured or maintained. It limits whether the system continues to reflect what we meant to build and it limits how humans and AI agents can continue to evolve the system effectively.
While I’m getting a bit bemused by debt metaphor proliferation, this way of thinking does make a fair bit of sense. The article includes useful sections to diagnose and mitigate each kind of debt. The three interact with each other, and the article outlines some general activities teams should do to keep it all under control.
❄ ❄
In the article she references a recent paper by Shaw and Nave at the Wharton School that adds LLMs to Kahneman’s two-system model of thinking.
Kahneman’s book, “Thinking Fast and Slow”, is one of my favorite books. Its central idea is that humans have two systems of cognition. System 1 (intuition) makes rapid decisions, often barely-consciously. System 2 (deliberation) is when we apply deliberate thinking to a problem. He observed that to save energy we default to intuition, and that sometimes gets us into trouble when we overlook things that we would have spotted had we applied deliberation to the problem.
Shaw and Nave consider AI as System 3
A consequence of System 3 is the introduction of cognitive surrender, characterized by uncritical reliance on externally generated artificial reasoning, bypassing System 2. Crucially, we distinguish cognitive surrender, marked by passive trust and uncritical evaluation of external information, from cognitive offloading, which involves strategic delegation of cognition during deliberation.
It’s a long paper that goes into detail on this “Tri-System theory of cognition” and reports on several experiments they’ve done to test how well the theory predicts behavior (at least within a lab).
❄ ❄ ❄ ❄ ❄
I’ve seen a few illustrations recently that use the symbols “< >” as part of an icon to illustrate code. That strikes me as rather odd; I can’t think of any programming language that uses “< >” to surround program elements. Why that and not, say, “{ }”?
Obviously the reason is that they are thinking of HTML (or maybe XML), which is even more obvious when they use “</>” in their icons. But programmers don’t program in HTML.
❄ ❄ ❄ ❄ ❄
Ajey Gore asks: if coding agents make coding free, what becomes the expensive thing? His answer is verification.
What does “correct” mean for an ETA algorithm in Jakarta traffic versus Ho Chi Minh City? What does a “successful” driver allocation look like when you’re balancing earnings fairness, customer wait time, and fleet utilisation simultaneously? When hundreds of engineers are shipping into ~900 microservices around the clock, “correct” isn’t one definition — it’s thousands of definitions, all shifting, all context-dependent. These aren’t edge cases. They’re the entire job.
And they’re precisely the kind of judgment that agents cannot perform for you.
Increasingly I’m seeing a view that agents do really well when they have good, preferably automated, verification for their work. This encourages such things as Test Driven Development. That’s still a lot of verification to do, which suggests we should see more effort to find ways to make it easier for humans to comprehend larger ranges of tests.
While I agree with most of what Ajey writes here, I do have a quibble with his view of legacy migration. He thinks it’s a delusion that “agentic coding will finally crack legacy modernisation”. I agree with him that agentic coding is overrated in a legacy context, but I have seen compelling evidence that LLMs help a great deal in understanding what legacy code is doing.
The big consequence of Ajey’s assessment is that we’ll need to reorganize around verification rather than writing code:
If agents handle execution, the human job becomes designing verification systems, defining quality, and handling the ambiguous cases agents can’t resolve. Your org chart should reflect this. Practically, this means your Monday morning standup changes. Instead of “what did we ship?” the question becomes “what did we validate?” Instead of tracking output, you’re tracking whether the output was right. The team that used to have ten engineers building features now has three engineers and seven people defining acceptance criteria, designing test harnesses, and monitoring outcomes. That’s the reorganisation. It’s uncomfortable because it demotes the act of building and promotes the act of judging. Most engineering cultures resist this. The ones that don’t will win.
❄ ❄ ❄ ❄ ❄
One of the questions that comes up when we think of LLMs-as-programmers is whether there is a future for source code. David Cassel on The New Stack has an article summarizing several views of the future of code. Some folks are experimenting with entirely new languages built with the LLM in mind; others think that existing languages, especially strictly typed languages like TypeScript and Rust, will be the best fit for LLMs. It’s an overview article, one that has lots of quotations but not much analysis in itself - but it’s worth a read as a good overview of the discussion.
I’m interested to see how all this will play out. I do think there’s still a role for humans to work with LLMs to build useful abstractions in which to talk about what the code does - essentially the DDD notion of Ubiquitous Language. Last year Unmesh and I talked about growing a language with LLMs. As Unmesh put it
Programming isn’t just typing coding syntax that computers can understand and execute; it’s shaping a solution. We slice the problem into focused pieces, bind related data and behaviour together, and—crucially—choose names that expose intent. Good names cut through complexity and turn code into a schematic everyone can follow. The most creative act is this continual weaving of names that reveal the structure of the solution that maps clearly to the problem we are trying to solve.
It took just a few months of President Donald Trump’s second term for Palantir employees to question their company’s commitments to civil liberties. Last fall, Palantir seemed to become the technological backbone of Trump’s immigration enforcement machinery, providing software for identifying, tracking, and helping deport immigrants on behalf of the Department of Homeland Security (DHS), when current and former employees started sounding the alarm.
Around that time, two former employees reconnected by phone. Right as they picked up the call, one of them asked, “Are you tracking Palantir’s descent into fascism?”
“That was their greeting,” the other former employee says. “There’s this feeling not of ‘Oh, this is unpopular and hard,’ but, ‘This feels wrong.’”
Palantir was founded—with initial venture capital investment from the CIA—at a moment of national consensus following the September 11, 2001 attacks, when many saw fighting terrorism abroad as the most critical mission facing the US. The company, which was cofounded by tech billionaire Peter Thiel, sells software that acts as a high-powered data aggregation and analysis tool powering everything from private businesses to the US military’s targeting systems.
For the last 20 years, employees could accept the intense external criticism and awkward conversations with family and friends about working for a company named after J. R. R. Tolkien’s corrupting all-seeing orb. But a year into Trump’s second term, as Palantir deepens its relationship with an administration many workers fear is wreaking havoc at home, employees are finally raising these concerns internally, as the US’s war on immigrants, war in Iran, and even company-released manifestos have forced them to rethink the role they play in it all.
“We hire the best and brightest talent to help defend America and its allies and to build and deploy our software to help governments and businesses around the world. Palantir is no monolith of belief, nor should we be,” a Palantir spokesperson said in a statement. “We all pride ourselves on a culture of fierce internal dialogue and even disagreement over the complex areas we work on. That has been true from our founding and remains true today.”
“The broad story of Palantir as told to itself and to employees was that coming out of 9/11 we knew that there was going to be this big push for safety, and we were worried that that safety might infringe on civil liberties,” one former employee tells WIRED. “And now the threat’s coming from within. I think there’s a bit of an identity crisis and a bit of a challenge. We were supposed to be the ones who were preventing a lot of these abuses. Now we’re not preventing them. We seem to be enabling them.”
Palantir has always had a secretive reputation, forbidding employees from speaking to the press and requiring alumni to sign non-disparagement agreements. But throughout the company’s history, management has always at least appeared to be open to engagement and internal criticism, multiple employees say. Over the last year, however, much of that feedback has been met by philosophical soliloquies and redirection. “It’s never been really that people are afraid of speaking up against Karp. It’s more a question of what it would do, if anything,” one current employee tells WIRED.
While internal tensions within Palantir have grown over the last year, they reached a boiling point in January after the killing of Alex Pretti, a nurse who was shot by federal agents during protests against Immigration and Customs Enforcement (ICE) in Minneapolis. Employees from across the company commented in a Slack thread dedicated to the news, demanding more information about the company’s relationship with ICE from management and CEO Alex Karp.
“Our involvement with ice has been internally swept under the rug under Trump2 too much,” one person wrote in a Slack message WIRED reported at the time. “We need an understanding of our involvement here.”
Around this time, Palantir started wiping Slack conversations after seven days in at least one channel where most of the internal debate takes place, #palantir-in-the-news. Because the decision wasn’t formally announced before the policy rolled out, one worker who noticed the deletions asked in the channel why the company was removing “relevant internal discourse on current events.”
A member of Palantir’s cybersecurity team responded, writing that the decision was made in response to leaks.
This period led Palantir management to release an updated wiki, or a collection of blog posts explaining the ICE contract, where the company defended its work with DHS. Management wrote that the technology the company provides “is making a difference in mitigating risks while enabling targeted outcomes.”
Palantir management ran defense by holding a handful of AMA (ask me anything) forums across the company with leadership like chief technology officer Shyam Sankar and members of its privacy and civil liberties (PCL) teams.
At least one of these AMAs was organized independently of PCL leadership by two team leads, including one who worked directly on the ICE contract for a period of time. “This was very rogue,” a PCL employee who worked on the ICE contract said in a February AMA, a recording of which was obtained by WIRED. “Courtney [Bowman, head of the privacy and civil liberties team] doesn’t know that I’m spending three hours this week talking to IMPLs [Palantir terminology for its client-facing product teams], but I think this is the only real way to start going in the right direction.”
Throughout the lengthy call, employees working on a variety of Palantir’s defense projects posed hard questions. Could ICE agents delete audit logs in Palantir’s software? Could agents create harmful workflows on their own without the company’s help? What is the most malicious thing that could come out of this work?
Answering these questions, the PCL employee who worked on the ICE contract said that “a sufficiently malicious customer is, like, basically impossible to prevent at the moment” and could only be controlled through “auditing to prove what happened” and legal action after the fact if the customer breached the company’s contract.
At one point during the call, one of the employees tried to level with the group, explaining that Palantir’s work with ICE was a priority for Karp and something that likely wouldn’t change any time soon.
“Karp really wants to do this and continuously wants this,” they said. “We’re largely at the role of trying to give him suggestions and trying to redirect him, but it was largely unsuccessful and we seem to be on a very sharp path of continuing to expand this workflow.”
Around the time of these forums, Karp sat down for a prerecorded interview with Bowman, seemingly to discuss Palantir’s contracts with ICE, but refused to broach the topic directly. Instead, Karp suggested that employees interested in the work sign nondisclosure agreements before receiving more detailed information.
Then came the deadly February 28 missile strike on an Iranian elementary school on the first full day of the Trump administration and Israel’s war in Iran. The US is the only known country in the conflict to use that specific type of missile. More than 120 children were killed when a Tomahawk missile struck the school, kicking off a series of investigations that concluded that the US was responsible and that surveillance tools like Palantir’s Maven system had been used during that day’s strikes. For a company full of employees already reeling over its work with ICE, possible involvement in the death of children was a breaking point.
“I guess the root of what I’m asking is … were we involved, and are doing anything to stop a repeat if we were,” one employee asked in the Palantir news Slack channel. Some employees posed similar questions in the thread, while others criticized them for discussing what could be considered classified information in a Slack channel open to the entire company. The investigation is ongoing.
The Palantir spokesperson said the company was “proud” to support the US military “across Democratic and Republican administrations.”
In March, Karp gave an interview to CNBC claiming that AI could undermine the power of “humanities-trained—largely Democratic—voters” and increase the power of working-class male voters. While critics reacted to the piece, calling the statements concerning, so did employees internally: “Is it true that AI disruption is going to disproportionately negatively affect women and people who vote Democrat? and if it is, why are we cool with that?” one worker asked on Slack in a channel dedicated to news about Palantir.
Palantir’s leadership incensed workers yet again this week after the company posted a Saturday afternoon manifesto reducing Karp’s recent book, The Technological Republic, to 22 points. The post—which includes many of Karp’s long-standing beliefs on how Silicon Valley could better serve US national interests—goes as far as suggesting that the US should consider reinstating the draft. Critics called the manifesto fascist.
Internally, the post alarmed some workers who huddled in a Slack thread on Monday morning, questioning leadership over its decision to post it in the first place.
“I’m curious why this had to be posted. Especially on the company account. On the practical level every time stuff like that gets posted it gets harder for us to sell the software outside of the US (for sure in the current political climate), and I doubt we need this in the US?” wrote one frustrated employee. The message received more than 50 “+1” emojis.
“Wether [sic] we acknowledge it or not, this impacts us all personally,” another worker wrote on Monday. “I’ve already had multiple friends reach out and ask what the hell did we post.” This message received nearly two dozen “+1” emoji reactions.
“Yeah it turns out that short-form summaries of the book’s long-form ideas are easy to misrepresent. It’s like we taped a ‘kick me’ sign on our own backs,” a third worker wrote. “I hope no one who decided to put this out is surprised that we are, in fact, getting kicked.”
Conversations like these, marked by shame and uncertainty, have popped up in internal channels whenever Palantir has been in the news over the last year. “I think the only thing not different is a lot of folks are still incredibly wary about leaks and talking to the press,” one current employee tells WIRED, describing how the internal company culture has evolved over the last year.
None of this dissent seems to bother Karp, who recently told workers that the company is “behind the curve internally” when it comes to popularity. On this he has been consistent; in March 2024, Karp told a CNBC reporter that “if you have a position that does not cost you ever to lose an employee, it’s not a position.”
But for employees, the culture shift feels intentional. “I don’t want to assert that I have knowledge of what’s going on in their internal mind,” one former worker tells WIRED. “But maybe it’s gotten to a place where encouraging independent thought and questioning leads to some bad conclusions.”