10 interesting stories served every morning and every evening.
Here are three stories about the state of gambling in America.
In November 2025, two pitchers for the Cleveland Guardians, Emmanuel Clase and Luis Ortiz, were charged in a conspiracy for “rigging pitches.” Frankly, I had never heard of rigged pitches before, but the federal indictment describes a scheme so simple that it’s a miracle that this sort of thing doesn’t happen all the time. Three years ago, a few corrupt bettors approached the pitchers with a tantalizing deal: (1) We’ll bet that certain pitches will be balls; (2) you throw those pitches into the dirt; (3) we’ll win the bets and give you some money.
The plan worked. Why wouldn’t it? There are hundreds of pitches thrown in a baseball game, and nobody cares about one bad pitch. The bets were so deviously clever because they offered enormous rewards for bettors and only incidental inconvenience for players and viewers. Before their plan was snuffed out, the fraudsters won $450,000 from pitches that not even the most ardent Cleveland baseball fan would ever remember the next day. Nobody watching America’s pastime could have guessed that they were witnessing a six-figure fraud.
On the morning of February 28th, someone logged onto the prediction market website Polymarket and made an unusually large bet. This bet wasn’t placed on a baseball game. It wasn’t placed on any sport. This was a bet that the United States would bomb Iran on a specific day, despite extremely low odds of such a thing happening.
A few hours later, bombs landed in Iran. This one bet was part of a $553,000 payday for a user named “Magamyman.” And it was just one of dozens of suspicious, perfectly-timed wagers, totaling millions of dollars, placed in the hours before a war began.
It is almost impossible to believe that, whoever Magamyman is, he didn’t have inside information from members of the administration. The term war profiteering typically refers to arms dealers who get rich from war. But we now live in a world not only where online bettors stand to profit from war, but also where key decision makers in government have the tantalizing options to make hundreds of thousands of dollars by synchronizing military engagements with their gambling position.
On March 10, several days into the Iran War, the journalist Emanuel Fabian reported that a warhead launched from Iran struck a site outside Jerusalem.
Meanwhile on Polymarket, users had placed bets on the precise location of missile strikes on March 10. Fabian’s article was therefore poised to determine payouts of $14 million in betting. As The Atlantic’s Charlie Warzel reported, bettors encouraged him to rewrite his story to produce the outcome that they’d bet on. Others threatened to make his life “miserable.”
A clever dystopian novelist might conceive of a future where poorly paid journalists for news wires are offered six-figure deals to report fictions that cash out bets from online prediction markets. But just how fanciful is that scenario when we have good reason to believe that journalists are already being pressured, bullied, and threatened to publish specific stories that align with multi-thousand dollar bets about the future?
Put it all together: rigged pitches, rigged war bets, and attempts to rig wartime journalism. Without context, each story would sound like a wacky conspiracy theory. But these are not conspiracy theories. These are things that have happened. These are conspiracies—full stop.
“If you’re not paranoid, you’re not paying attention” has historically been one of those bumperstickers you find on the back of a car with so many other bumperstickers that you worry for the sanity of its occupants. But in this weird new reality where every event on the planet has a price, and behind every price is a shadowy counterparty, the jittery gambler’s paranoia—is what I’m watching happening because somebody more powerful than me bet on it?—is starting to seem, eerily, like a kind of perverse common sense.
What’s remarkable is not just the fact that online sports books have taken over sports, or that betting markets have metastasized in politics and culture, but the speed with which both have taken place.
For most of the last century, the major sports leagues were vehemently against gambling, as the Atlantic staff writer McKay Coppins explained in his recent feature. In 1992, NFL commissioner Paul Tagliabue told Congress that “nothing has done more to despoil the games Americans play and watch than widespread gambling on them.” In 2012, NBA commissioner David Stern loudly threatened New Jersey Gov. Chris Christie for signing a bill to legalize sports betting in the Garden State, reportedly screaming, “we’re going to come after you with everything we’ve got.”
So much for that. Following the 2018 Supreme Court decision Murphy vs. NCAA, sports gambling was unleashed into the world, and the leagues haven’t looked back. Last year, the NFL saw $30 billion gambled on football games, and the league itself made half a billion dollars in advertising, licensing, and data deals.
Nine years ago, Americans bet less than $5 billion on sports. Last year, that number rose to at least $160 billion. Big numbers mean nothing to me, so let me put that statistic another way: $5 billion is roughly the amount Americans spend annually at coin-operated laundromats and $160 billion is nearly what Americans spent last year on domestic airline tickets. So, in a decade, the online sports gambling industry will have risen from the level of coin laundromats to rival the entire airline industry.
And now here come the prediction markets, such as Polymarket and Kalshi, whose combined 2025 revenue came in around $50 billion. “These predictive markets are the logical endpoint of the online gambling boom,” Coppins told me on my podcast Plain English. “We have taught the entire American population how to gamble with sports. We’ve made it frictionless and easy and put it on everybody’s phone. Why not extend the logic and culture of gambling to other segments of American life?” He continued:
Why not let people gamble on who’s going to win the Oscar, when Taylor Swift’s wedding will be, how many people will be deported from the United States next year, when the Iranian regime will fall, whether a nuclear weapon will be detonated in the year 2026, or whether there will be a famine in Gaza? These are not things that I’m making up. These are all bets that you can make on these predictive markets.
Indeed, why not let people gamble on whether there will be a famine in Gaza? The market logic is cold and simple: More bets means more information, and more informational volume is more efficiency in the marketplace of all future happenings. But from another perspective—let’s call it, baseline morality?—the transformation of a famine into a windfall event for prescient bettors seems so grotesque as to require no elaboration. One imagines a young man sending his 1099 documents to a tax accountant the following spring: “right, so here are my dividends, these are the cap gains, and, oh yeah, here’s my $9,000 payout for totally nailing when all those kids would die.”
It is a comforting myth that dystopias happen when obviously bad ideas go too far. Comforting, because it plays to our naive hope that the world can be divided into static categories of good versus evil and that once we stigmatize all the bad people and ghettoize all the bad ideas, some utopia will spring into view. But I think dystopias more likely happen because seemingly good ideas go too far. “Pleasure is better than pain” is a sensible notion, and a society devoted to its implications created Brave New World. “Order is better than disorder” sounds alright to me, but a society devoted to the most grotesque vision of that principle takes us to 1984. Sports gambling is fun, and prediction markets can forecast future events. But extended without guardrails or limitations, those principles lead to a world where ubiquitous gambling leads to cheating, cheating leads to distrust, and distrust leads ultimately to cynicism or outright disengagement.
“The crisis of authority that has kind of already visited every other American institution in the last couple of decades has arrived at professional sports,” Coppins said. Two-thirds of Americans now believe that professional athletes sometimes change their performance to influence gambling outcomes. “Not to overstate it, but that’s a disaster,” he said. And not just for sports.
There are four reasons to worry about the effect of gambling in sports and culture.
The first is the risk to individual bettors. Every time we create 1,000 new gamblers, we create dozens of new addicts and a handful of new bankruptcies. As I’ve reported, there is evidence that about one in five men under 25 is on the spectrum of having a gambling problem, and calls to the National Problem Gambling Helpline have roughly tripled since sports gambling was broadly legalized in 2018. Research from UCLA and USC found that bankruptcies increased by 10 percent in states that legalized online sports betting between 2018 and 2023. People will sometimes ask me what business I have worrying about online gambling when people should be free to spend their money however they like. My response is that wise rules place guardrails around economic activity with a certain rate of personal harm. For alcohol, we have licensing requirements, minimum drinking ages, boundaries around hours of sale, and rules about public consumption. As alcohol consumption is declining among young people, gambling is surging; Gen Z has replaced one (often fun) vice with a meaningful chance of addiction with another (often fun) vice with a meaningful chance of addiction. But whereas we have centuries of experience curtailing excessive drinking with rules and customs, we are currently in a free-for-all era of gambling.
The second risk is to individual players and practitioners. One reason why sports commissioners might have wanted to keep gambling out of their business is that gamblers turns some people into complete psychopaths, and that’s not a very nice experience for folks on the receiving end of gambling-afflicted psychopaths. In his feature, McKay Coppins reports on the experience of Caroline Garcia, a top-ranked tennis player, who said she received torrents of abusive messages from gamblers both for losing games and for winning games. “This has become a very common experience for athletes at the professional level, even at the college level too,” Coppins said. As the experience of journalist Emanuel Fabian shows, gambling can turn ordinary people into mini mob bosses, who go around threatening players and practitioners who they believe are costing them thousands of dollars.
The third risk is to the integrity of sports—or any other institution. At the end of 2025, in addition to its indictment of the Cleveland Guardians pitchers, the FBI announced 30 arrests involving gambling schemes in the NBA. This cavalcade of arrests has dramatically reduced trust in sports. Two-thirds of Americans now believe that professional athletes change their performance to influence gambling outcomes. It does not require extraordinary creativity to imagine how this principle could extend to other domains and institutions. If more people start to believe that things only happen in the world as a direct result of shadowy interests in vast betting markets, it’s going to be a permanent open season for conspiracy theories.
The ultimate risk is almost too dark to contemplate in much detail. As the logic and culture of casinos moves from sports to politics, the scandals that have visited baseball and basketball might soon arrive in politics. Is it really so unbelievable that a politician might tip off a friend, or assuage an enemy, by giving them inside information that would allow them to profit on betting markets? Is it really so incredible to believe that a government official would try to align policy with a betting position that stood to earn them, or an allied group, hundreds of thousands of dollars? That is what a “rigged pitch” in politics would look like. It’s not just wagering on a policy outcome that you suspect will happen. It’s changing policy outcomes based on what can be wagered.
Gambling is flourishing because it meets the needs of our moment: a low-trust world, where lonely young people are seeking high-risk opportunities to launch them into wealth and comfort. In such an environment, financialization might seem to be the last form of civic participation that feels honest to a large portion of the country. Voting is compromised, and polling is manipulated, and news is algorithmically curated. But a bet settles. A game ends. There is comfort in that. In an uncertain and illegible world, it doesn’t get much more certain and legible than this: You won, or you lost.
A 2023 Wall Street Journal poll found that Americans are pulling away from practically every value that once defined national life—patriotism, religion, community, family. Young people care less than their parents about marriage, children, or faith. But nature, abhorring a vacuum, is filling the moral void left by retreating institutions with the market. Money has become our final virtue.
I often find myself thinking about the philosopher Alasdair MacIntyre, who argued in the introduction of After Virtue that modernity had destroyed the shared moral language once supplied by traditions and religion, leaving us with only the language of individual preference. Virtue did not disappear, I think, so much as it died and was reincarnated as the market. It is now the market that tells us what things are worth, what events matter, whose predictions are correct, who is winning, who counts. Money has, in a strange way, become the last moral arbiter standing—the final universal language that a pluralistic, distrustful, post-institutional society can use to communicate with itself.
As this moral vocabulary scales across culture, it also corrodes culture. In sports, when you have money on a game, you’re not rooting for a team. You’re rooting for a proposition. The social function of fandom—shared identity, inherited loyalty, something larger than yourself—dissolves into individual risk. In politics, I fear the consequences will be worse. Prediction markets can be useful for those who want to know the future, but their utility recruits participants into a relationship with the news cycle that is adversarial, and even misanthropic. A young man betting on a terrorist attack or a famine is not acting as a mere concerned citizen whose participation improves the efficiency of global prediction markets. He’s just a dude, on his phone, alone in a room, choosing to root for death.
If that doesn’t bother you, I don’t know how to make it bother you. Based on economic and market efficiency principles alone, this young man’s behavior is defensible. But there is morality outside of markets. There is more to life than the efficiency of information networks. But will we rediscover it, any time soon? Don’t bet on it.
...
Read the original on www.derekthompson.org »
It’s the end of an era: Apple has confirmed to 9to5Mac that the Mac Pro is being discontinued. It has been removed from Apple’s website as of Thursday afternoon. The “buy” page on Apple’s website for the Mac Pro now redirects to the Mac’s homepage, where all references have been removed.
Apple has also confirmed to 9to5Mac that it has no plans to offer future Mac Pro hardware.
The Mac Pro has lived many lives over the years. Apple released the current Mac Pro industrial design in 2019 alongside the Pro Display XDR (which was also discontinued earlier this month). That version of the Mac Pro was powered by Intel, and Apple refreshed it with the M2 Ultra chip in June 2023. It has gone without an update since then, languishing at its $6,999 price point even as Apple debuted the M3 Ultra chip in the Mac Studio last year.
With that in mind, the Mac Studio is clearly set up to be the ‘pro’ desktop Mac of the future in Apple’s lineup. The Mac Studio can be configured with the M3 Ultra chip and a 32-core CPU and an 80-core GPU, paired with 256GB of unified memory and 16TB of SSD storage.
With the discontinuation of Mac Pro today, Apple now sells three desktop Macs:
To me, this is the strongest Mac lineup in years, and perhaps the strongest Mac lineup ever. This is especially true with the recent addition of the MacBook Neo in the entry-level spot. There are options spanning dramatically different price points, configurations, and form factors.
Furthermore, with the release of macOS Tahoe 26.2 last year, Apple added a new low-latency feature that lets you use RDMA over Thunderbolt 5 to connect multiple Macs together. This gives Mac users at the ultra-high-end of the market another way to scale performance. At the time, many speculated this feature could be another nail in the Mac Pro’s coffin.
Ultimately, Apple needed to make a decision to either update the Mac Pro or discontinue it. Continuing to sell it with the M2 Ultra at such a high price was a disservice to Mac shoppers. I think Apple made the right call by discontinuing it and prioritizing the Mac Studio going forward.
Of course, there will undoubtedly be some Mac Pro loyalists disappointed by today’s news. But the writing has been on the wall for a while.
...
Read the original on 9to5mac.com »
A. T.L.A.S achieves 74.6% LiveCodeBench pass@1-v(k=3) with a frozen 14B model on a single consumer GPU — up from 36-41% in V2 — through constraint-driven generation and self-verified iterative refinement. The premise: wrap a frozen smaller model in intelligent infrastructure — structured generation, energy-based verification, self-verified repair — and it can compete with frontier API models at a fraction of the cost. No fine-tuning, no API calls, no cloud. Fully self-hosted — no data leaves the machine, no API keys required, no usage metering. One GPU, one box.
*pass@k-v(k=3) = one solution submitted per task, but generated via best-of-3 candidates + Lens selection + iterative repair on failures. Not single-shot generation, it is not pass@1. See methodology.
A single patched llama-server runs on K3s, providing both generation with speculative decoding (~100 tok/s) and 5120-dim self-embeddings for Lens scoring. The Geometric Lens C(x) energy field selects the best candidate (87.8% accuracy on mixed-result tasks). Failed tasks enter Phase 3, where the model generates its own test cases and iteratively repairs solutions via PR-CoT — real tests are used only for final scoring.
Before you begin: ATLAS was developed and tested on specific hardware. Read the Hardware & Reproduction section below to check compatibility and tune variables for your setup before running.
git clone https://github.com/itigges22/ATLAS.git && cd ATLAS
cp atlas.conf.example atlas.conf # set MODEL_PATH, DATA_DIR, GPU device
sudo ./scripts/install.sh
./scripts/verify-install.sh
# Run V3 benchmark
python3 benchmark/v3_runner.py
V3 results were produced on RHEL 9 running as a Proxmox VM with an RTX 5060 Ti 16GB passed through via VFIO. Other NVIDIA GPUs with 16GB+ VRAM should work, though you may need to adjust driver versions and VRAM allocation.
The pipeline is not yet plug-and-play on arbitrary hardware — V3.1 will improve portability. That said, Claude Code can be used to retrofit the pipeline to your specific setup (different GPU, OS, VRAM budget).
Key variables to tune for your hardware:
–parallel slots (default 2 — reduce to 1 if VRAM is tight)
Full VRAM budget breakdown is documented in docs/ARCHITECTURE.md. Community reproduction attempts are welcome — open an issue with your hardware config and results.
These are actively being addressed in V3.1:
LCB-only optimization. V3 phases were designed and tuned for LiveCodeBench. GPQA Diamond (47.0%) and SciCode (14.7%) results are included but those benchmarks were not optimized for. Cross-domain generalization is a V3.1 priority.
Phase 2 (Geometric Lens routing) contributed +0.0pp. C(x) was retrained on self-embeddings for V3 (fixing the V2 nomic embedding failure), but the training dataset was only ~60 samples — far too small to learn a meaningful energy landscape. With an undertrained C(x), the Lens cannot discriminate candidates during routing. V3.1 retrains C(x) on a properly sized dataset drawn from real benchmark problems.
G(x) metric tensor is dormant. G(x) is downstream of C(x): it applies metric corrections to C(x)’s gradient signal. With C(x) undertrained and producing a weak/noisy energy landscape, G(x) has no meaningful geometry to navigate — the correction term Δx = -G⁻¹∇C contributes nothing. G(x) is currently being redesigned from the ground up; V3.1 will either ship a working redesign or remove it entirely pending further research.
SandboxAdapter stdio bug. S* distinguishing input tiebreaking is implemented but non-functional on LCB tasks due to a stdio handling bug in the SandboxAdapter. Fixed in V3.1.
* Model swap: Qwen3-14B → Qwen3.5-9B with DeltaNet linear attention architecture. Native multi-token prediction (MTP) gives ~3-4x throughput improvement at comparable or better accuracy. Smaller model also frees VRAM headroom.
* Lens Evolution: Online C(x) recalibration — Geometric Lens updates based on benchmark feedback rather than remaining static after initial training.
V3 was evaluated only on LiveCodeBench v5. V3.1 expands evaluation to cover coding, reasoning, and general knowledge — because ATLAS is not purely a coding system. The Confidence Router allocates compute based on task difficulty: simple knowledge questions route to raw inference + RAG (~30 seconds per response), while hard coding problems use the full V3 pipeline (PlanSearch + best-of-3 + PR-CoT repair), which can take up to 20 minutes per task. The benchmark suite should reflect this full range.
GPQA Diamond — scientific reasoning, run in V2 (47.0%), needs V3 re-evaluation under full pipeline
AA-Omniscience — knowledge accuracy and hallucination rate; important for general-purpose use cases where users ask factual questions rather than coding problems
Humanity’s Last Exam — extreme reasoning and knowledge, useful as a ceiling test
General knowledge benchmarks matter because ATLAS is designed as a general-purpose self-hosted AI system, not a coding-only tool. The Confidence Router handles this by routing knowledge queries directly to raw inference + RAG (~30s), while reserving the full pipeline for hard coding problems (~20min). Benchmarks like AA-Omniscience and Humanity’s Last Exam validate the fast-path routing, not the coding pipeline.
None of these V3.1 benchmarks have been run yet. This section is forward-looking roadmap only.
Licensed under the A. T.L.A.S Source Available License v1.0 — see LICENSE.
...
Read the original on github.com »
A warning about rising prices, vanishing consumer choice, and a future where owning a computer may matter more than ever as hardware, power, and control drift toward data centers and away from people.
A warning about rising prices, vanishing consumer choice, and a future where owning a computer may matter more than ever as hardware, power, and control drift toward data centers and away from people.
For the better part of two decades, consumers lived in a golden age of tech. Memory got cheaper, storage increased in capacity and hardware got faster and absurdly affordable. Upgrades were routine, almost casual. If you needed more RAM, a bigger SSD, or a faster CPU or GPU, you barely had to wait a week for a discount offer and you moved on with your life. This era is ending.
What’s forming now isn’t just another pricing cycle or a short-term shortage, it is a structural shift in the hardware industry that paints a deeply grim outlook for consumers. Today, I am urging you to hold on to your hardware, as you may not be able to replace it affordably in the future. While I have always been a stark critic of today’s consumer industry, as well as
the ideas behind it, and a strong proponent of buying it for life
(meaning, investing into durable, repairable, quality products) the industry’s shift has nothing to do with the protection of valuable resources or the environment, but is instead a move towards a trajectory that has the potential to erode technological self-sufficiency and independence for people all over the world.
In recent months the buzzword RAM-pocalypse has started popping up across tech journalism and enthusiast circles. It’s an intentionally dramatic term that describes the sharp increase in RAM prices, primarily driven by high demand from data centers and “AI” technology, which most people had considered a mere
blip in the market. This presumed temporary blip, however, turned out to be a lot more than just that, with one manufacturer after the other openly stating that prices will continue to rise, with suppliers forecasting shortages of specific components that could last well beyond 2028, and with key players like
Western Digital and Micron either completely disregarding or even exiting the consumer market altogether.
The RAM-pocalypse isn’t just a temporary headline anymore, but has seemingly become long-term reality. However, RAM and memory in general is only the beginning.
The main reason for the shortages and hence the increased prices is data center demand, specifically from “AI” companies. These data centers require mind-boggling amounts of hardware, specifically RAM, storage drives and GPUs, which in turn are RAM-heavy graphics units for “AI” workloads. The enterprise demand for specific components simply outpaces the current global production capacity, and outbids the comparatively poor consumer market.
For example, OpenAI’s Stargate project alone reportedly
requires approximately 900,000 DRAM wafers per month, which could account for roughly 40% of current global DRAM output. Other big tech giants including Google, Amazon, Microsoft, and Meta have placed open-ended orders with memory suppliers, accepting as much supply as available. The existing and future data centers for/of these companies are expected to consume 70% of all memory chips produced in 2026.
However, memory is just the first domino.
RAM and SSDs are where the pain is most visible today, but rest assured that the same forces are quietly reshaping all aspects of consumer hardware. One of the most immediate and tangible consequences of this broader supply-chain realignment are sharp, cascading price hikes across consumer electronics, with
LPDDR memory standing out as an early pressure point that most consumers didn’t recognize until it was already unavoidable.
LPDDR is used in smartphones, laptops, tablets, handheld consoles, routers, and increasingly even low-power PCs. It sits at the intersection of consumer demand and enterprise prioritization, making it uniquely vulnerable when manufacturers reallocate capacity toward “AI” accelerators, servers, and data-center-grade memory, where margins are higher and contracts are long-term. As fabs shift production toward HBM and server DRAM, as well as GPU wafers, consumer hardware production quietly becomes non-essential, tightening supply just as devices become more power- and memory-hungry, all while continuing on their path to remain frustratingly unserviceable and un-upgradable.
The result is a ripple effect, in which device makers pay more for chips and memory and pass those costs on through higher retail prices, cut base configurations to preserve margins, or lock features behind premium tiers. At the same time, consumers lose the ability to compensate by upgrading later, because most components these days, like LPDDR, are soldered down by design. This is further amplified by scarcity, as even modest supply disruptions can spike prices disproportionately in a market where just a few suppliers dominate, turning what should be incremental cost increases into sudden jumps that affect entire product categories at once.
In practice, this means that phones, ultrabooks, and embedded devices are becoming more expensive overnight, not because of new features, but because the invisible silicon inside them has quietly become a
contested resource in a world that no longer builds hardware primarily for consumers.
In late January 2026, the Western Digital CEO
confirmed during an earnings call that the company’s entire HDD production capacity for calendar year 2026 is already sold out. Let that sink in for a moment. Q1 hasn’t even ended and a major hard drive manufacturer has
zero remaining capacity for the year. Firm purchase orders are in place with its top customers, and long-term agreements already extend into 2027 and 2028. Consumer revenue now accounts for just 5% of Western Digital’s total sales, while cloud and enterprise clients make up 89%. The company has, for all practical purposes, stopped being a consumer storage company.
And Western Digital is not alone. Kioxia, one of the world’s largest NAND flash manufacturers, admitted that its entire 2026 production volume is
already in a “sold out” state, with the company expecting tight supply to persist through at least 2027 and long-term customers facing 30% or higher year-on-year price increases. Adding to this, the Silicon Motion CEO put it bluntly during a recent earnings call:
We’re facing what has never happened before: HDD, DRAM, HBM, NAND… all in severe shortage in 2026.
In addition, the Phison CEO has gone even further, warning that the NAND shortage could persist until 2030, and that it risks the
“destruction” of entire segments of the consumer electronics industry. He also noted that factories are now demanding prepayment for capacity three years in advance, an unprecedented practice that effectively locks out smaller players.
The collateral damage of this can already be felt, and it’s significant. For example Valve confirmed that the Steam Deck OLED is now out of stock intermittently in multiple regions “due to memory and storage shortages”. All models are currently unavailable in the US and Canada, the cheaper LCD model has been discontinued entirely, and there is no timeline for when supply will return to normal. Valve has also
been forced to delay the pricing and launch details for its upcoming Steam Machine console and Steam Frame VR headset, directly citing memory and storage shortages.
At the same time, Sony is considering delaying the PlayStation 6 to 2028 or even 2029, and Nintendo is reportedly
contemplating a price increase for the Switch 2, less than a year after its launch. Both decisions are seemingly driven by the same memory supply constraints. Meanwhile, Microsoft has already raised
prices on the Xbox.
Now you might think that everything so far is about GPUs and other gaming-related hardware, but that couldn’t be further from the truth. General computing, like the Raspberry Pi is not immune to any of this either. The Raspberry Pi Foundation has been forced to raise prices twice in three months, with the flagship Raspberry Pi 5 (16GB) jumping from $120 at launch to $205 as of February 2026, a 70% increase driven entirely by LPDDR4
memory costs. What was once a symbol of affordable computing is rapidly being priced out of reach for the educational and hobbyist communities it was designed to serve.
HP, on the other hand, seems to have already prepared for the hardware shortage by launching a laptop subscription service where you pay a monthly fee to use a laptop but never own it, no matter how long you subscribe. While HP frames this as a convenience, the timing, right in the middle of a hardware affordability crisis, makes it feel a lot more like a preview of a rented compute future. But more on that in a second.
“But we’ve seen price spikes before, due to crypto booms, pandemic shortages, factory floods and fires!”, you might say. And while we did live through those crises, things eventually eased when bubbles popped and markets or supply chains recovered. The current situation, however, doesn’t appear to be going away anytime soon, as it looks like the industry’s priorities have fundamentally
changed.
These days, the biggest customers are not gamers, creators, PC builders or even crypto miners anymore. Today, it’s hyperscalers. Companies that use hardware for “AI” training clusters, cloud providers, enterprise data centers, as well as governments and defense contractors. Compared to these hyperscalers
consumers are small fish in a big pond.
These buyers don’t care if RAM costs 20% more and neither do they wait for
Black Friday deals. Instead, they sign contracts measured in exabytes and billions of dollars. With such clients lining up, the consumer market in contrast is suddenly an inconvenience for manufacturers. Why settle for smaller margins and deal with higher marketing and support costs, fragmented SKUs, price sensitivity and retail logistics headaches, when you can have behemoths throwing money at you? Why sell a $100 SSD to one consumer, when you can sell a whole rack of enterprise NVMe drives to a data center with
virtually infinite money?
All of this goes to show that the consumer market is not just deprioritized, but instead it is being starved. In fact, IDC has already warned
that the PC market could shrink by up to 9% in 2026 due to skyrocketing memory prices, and has described the situation not as a cyclical shortage but as “a potentially permanent, strategic reallocation of the world’s silicon wafer capacity”.
Leading PC OEMs including Lenovo, Dell, HP, Acer, and ASUS have all signaled 15-20% PC price increases for 2026, with some models seeing even steeper hikes. Framework, the repairable laptop company, has also been transparent about rising memory costs impacting its pricing. And analyst Jukan Choi recently revised his shortage timeline estimate, noting that DRAM production capacity is expected to grow at just 4.8% annually through 2030, with even that incremental capacity concentrated on HBM rather than consumer memory. TrendForce’s latest forecast projects DRAM contract prices rising by 90-95% quarter over quarter in Q1 2026. And that is not a typo.
The price of hardware is one thing, but value-for-money is another aspect that appears to be only getting worse from here on. Already today consumer parts feel like cut-down versions of enterprise silicon. As “AI” accelerators and server chips dominate R&D budgets, consumer improvements will slow even further, or arrive at higher prices justified as premium features. This is true for CPUs and GPUs, and it will be equally true for motherboards, chipsets, power supplies, networking, etc. We will likely see fewer low-end options, more segmentation, artificial feature gating and generally higher baseline prices that, once established, won’t be coming back down again.
As enterprise standards become the priority, consumer gear is becoming an afterthought that is being rebadged, overpriced, and poorly supported. The uncomfortable truth is that the consumer hardware market is no longer the center of gravity, as we all were able to see at this year’s CES. It’s orbiting something much larger, and none of this is accidental. The industry isn’t failing, it’s succeeding, just not for you.
And to be fair, from a corporate standpoint, this pivot makes perfect sense.
“AI” and enterprise customers are rewriting revenue charts, all while consumers continue to be noisy, demanding, and comparatively poor. It is pretty clear that consumer hardware is becoming a second-class citizen, which means that the machines we already own are more valuable than we might be thinking right now.
“But what does the industry think the future will look like if nobody can afford new hardware?”, you might be asking.
There is a darker, conspiratorial interpretation of today’s hardware trends that reads less like market economics and more like a rehearsal for a managed
future. Businesses, having discovered that ownership is inefficient and obedience is profitable, are quietly steering society toward a world where no one owns compute at all, where hardware exists only as an abstraction rented back to the public through virtual servers, SaaS subscriptions, and metered experiences, and where digital sovereignty, that anyone with a PC tower under their desk once had, becomes an outdated, eccentric, and even suspicious concept.
… a morning in said future, where an ordinary citizen wakes up, taps their terminal, which is a sealed device without ports, storage, and sophisticated local execution capabilities, and logs into their Personal Compute Allocation. This bundle of cloud CPU minutes, RAM credits, and storage tokens leased from a conglomerate whose logo has quietly replaced the word “computer” in everyday speech, just like “to search” has made way for “to google”, has removed the concept of installing software, because software no longer exists as a thing, but only as a service tier in which every task routes through servers owned by entities. Entities that insist that this is all for the planet. Entities that outlawed consumer hardware years ago under the banner of environmental protectionism, citing e-waste statistics, carbon budgets, and unsafe unregulated silicon, while conveniently ignoring that the data centers
humming beyond the city limits burn more power in an hour than the old neighborhood ever did in a decade. In this world, the ordinary citizen remembers their parents’ dusty Personal Computer, locked away in a storage unit like contraband. A machine that once ran freely, offline if it wanted, immune to arbitrary account suspensions and pricing changes. As they go about their day, paying a micro-fee to open a document, losing access to their own photos because a subscription lapsed, watching a warning banner appear when they type something that violates the ever evolving terms-of-service, and shouting “McDonald’s!” to skip the otherwise unskippable ads within every other app they open, they begin to understand that the true crime of consumer hardware wasn’t primarily pollution but independence. They realize that owning a machine meant owning the means of computation, and that by centralizing hardware under the guise of efficiency, safety, and sustainability, society traded resilience for convenience and autonomy for comfort.In this utopia, nothing ever breaks because nothing is
yours, nothing is repairable because nothing is physical, and nothing is
private because everything runs somewhere else, on someone else’s computer. The quiet moral, felt when the network briefly stutters and the world freezes, is that keeping old hardware alive was never nostalgia or paranoia, but a small, stubborn act of digital self-defense; A refusal to accept that the future must be rented, permissioned, and revocable at any moment.
If you think that dystopian “rented compute over owned hardware” future could never happen, think again. In fact, you’re already likely renting rather than owning in many different areas. Your means of
communication are run by Meta, your music is provided by Spotify, your movies are streamed from Netflix, your data is stored in Google’s data centers and your office suite runs on Microsoft’s cloud. Maybe even your car is leased instead of owned, and you pay a monthly premium for seat heating or sElF-dRiViNg, whatever that means. After all, the average Gen Z and Millennial US consumer today apparently has 8.2 subscriptions, not including their DaIlY aVoCaDo ToAsTs and StArBuCkS cHoCoLate ChIp LaTtEs that the same Boomers
responsible for the current (and past) economic crises love to dunk on.
Besides, look no further than what’s already happening in for example China, a country that manufactures massive amounts of the world’s sought-after hardware yet faces restrictions on buying that very hardware. In recent years, a complex web of export controls and chip bans has put a spotlight on how hardware can become a geopolitical bargaining chip rather than a consumer good. For example, export controls imposed by the United States in recent years barred Nvidia
from selling many of its high-performance GPUs into China without special licenses, significantly reducing legal access to cutting-edge compute inside the country.
Meanwhile, enforcement efforts have repeatedly busted smuggling operations moving prohibited Nvidia chips into Chinese territory through Southeast Asian hubs, with over $1 billion worth of banned GPUs reportedly moving through gray markets, even as official channels remain restricted. Coverage by outlets such as Bloomberg, as well as actual investigative journalism like
Gamer’s Nexus has documented these black-market flows and the lengths to which both sides go to enforce or evade restrictions, including smuggling networks and increased regulatory scrutiny.
On top of this, Chinese regulators have at times restricted domestic tech firms from buying specific Nvidia models, further underscoring how government policy can override basic market access for hardware, even in the country where much of that hardware is manufactured. While some of these export rules have seen partial reversals or regulatory shifts, the overall situation highlights a world in which hardware access is increasingly determined by politics, security regimes, and corporate strategy, and not by consumer demand. This should serve as a cautionary tale for anyone who thinks owning their own machines won’t matter in the years to come.
In an ironic twist, however, one of the few potential sources of relief may, in fact, come from China. Two Chinese manufacturers, CXMT (ChangXin Memory Technologies) and YMTC (Yangtze Memory Technologies), are embarking on their most aggressive capacity expansions ever, viewing the global shortage as a golden opportunity to close the gap with the incumbent big three
(Samsung, SK Hynix, Micron).
CXMT is now the world’s fourth-largest DRAM maker by production volume, holding roughly 10-11% of global wafer capacity, and is building a massive new DRAM facility in Shanghai expected to be two to three times larger than its existing Hefei headquarters, with volume production targeted for 2027. The company is also preparing a $4.2 billion IPO on Shanghai’s STAR Market to fund further expansion and has reportedly delivered HBM3 samples to domestic customers including Huawei.
YMTC, traditionally a NAND flash supplier, is constructing a third fab in Wuhan with roughly half of its capacity dedicated to DRAM, and has reached 270-layer 3D NAND capability, rapidly narrowing the gap with Samsung (286 layers) and SK Hynix (321 layers). Its NAND market share by shipments reached 13% in Q3 2025, close to Micron’s 14%. What’s particularly notable is that
major PC manufacturers are already
turning to these suppliers.
However, as mentioned before, with hardware having become a geopolitical topic, both companies face ongoing (US-imposed) restrictions. Hence, for example HP
has indicated it would only use CXMT chips in devices for non-US markets. Nevertheless, for consumers worldwide the emergence of viable fourth and fifth players in the memory market represents the most tangible hope of eventually breaking the current supply stranglehold. Whether that relief arrives in time to prevent lasting damage to the consumer hardware ecosystem remains an open question, though.
The reason I’m writing all of this isn’t to create panic, but to help put things into perspective. You don’t need to scavenger-hunt for legacy parts in your local landfill (yet) or swear off upgrades forever, but you do need to recognize that the rules have changed. The market that once catered to enthusiasts and everyday users is turning its back. So take care of your hardware, stretch its lifespan, upgrade thoughtfully, and don’t assume replacement will always be easy or affordable.
That PC, laptop, NAS, or home server isn’t disposable anymore. Clean it, maintain it, repaste it, replace fans and protect it, as it may need to last far longer than you originally planned.
Also, realize that the best time to upgrade your hardware was yesterday and that the second best time is now. If you can afford sensible upgrades, especially RAM and SSD capacity, it may be worth doing sooner rather than later. Not for performance, but for insurance, because the next time something fails, it might be unaffordable to replace, as the era of casual upgrades seems to be over. Five-year systems may become eight- or ten-year systems.
Software bloat will hurt more and
will require re-thinking. Efficiency will
matter again. And looking at it from a different angle, maybe that’s a good thing.
Additionally, the assumption that prices will normalize again at some point is most likely a pipe dream. The old logic wait a year and it’ll be cheaper no longer applies when manufacturers are deliberately constraining supply. If you
need a new device, buy it; If you don’t, however, there is absolutely no need to spend money on the minor yearly refresh cycle any longer, as the returns will be increasingly diminishing. And again, looking at it from a different angle, probably that is also a good thing.
Consumer hardware is heading toward a bleak future where owning powerful, affordable machines becomes harder or maybe even impossible, as manufacturers abandon everyday users to chase vastly more profitable data centers, “AI”
firms, and enterprise clients. RAM and SSD price spikes, Micron’s exit from the consumer market, and the resulting Samsung/SK Hynix duopoly are early warning signs of a broader shift that will eventually affect CPUs, GPUs, and the entire PC ecosystem.
With large manufacturers having sold out their entire production capacity to
hyperscalers for the rest of the year while simultaneously cutting consumer production by double-digit percentages, consumers will have to take a back seat. Already today consumer hardware is overpriced, out of stock or even intentionally being delayed due to supply issues.
In addition, manufacturers are pivoting towards consumer hardware subscriptions, where you never own the hardware and in the most dystopian trajectory, consumers might not buy any hardware at all, with the exception of low-end thin-clients
that are merely interfaces, and will rent compute through cloud platforms, losing digital sovereignty in exchange for convenience. And despite all of this sounding like science fiction, there is already hard evidence proving that access to hardware can in fact be politically and economically revoked.
Therefor I am urging you to maintain and upgrade wisely, and hold on to your
existing hardware, because ownership may soon be a luxury rather than the norm.
...
Read the original on xn--gckvb8fzb.com »
I can’t express how much I utterly hate the “Continuing Disability Review.”
It is a letter that arrives every few years from the government, asking a question that is medically absurd and philosophically insulting: “Are you still disabled?”
As if my blindness were a seasonal allergy. As if I might have woken up last Tuesday, blinked, and realized that my optic nerves had decided to regenerate spontaneously.
This week, I received The Letter. It demanded “updated medical evidence” to prove that I—a man who has been blind since birth—am, in fact, still blind.
I called the number. I navigated the phone tree hellscape. I finally reached a human being. Let’s call her “Karen from Compliance.”
“I have the documents in PDF format,” I told her, using my polite, I haven’t had my morning tea so make this easy on me, voice. “I can email them to you right now. You’ll have them in ten seconds.”
“We cannot accept email,” Karen said. Her voice was flat, dry, and sounded like stale coffee and rigid adherence to a rulebook written in 1994. “It is a security risk. You must mail physical copies, or you can fax them.”
“Fax them?” I asked. “You want me to fax you medical records when you could just delete the email after saving the attachments?”
“Those are the options, sir. If we don’t receive them by Friday, your benefits will be suspended.”
I didn’t understand why they couldn’t just look back in my file, noticed nothing had changed in decades, and update it based on past data.
She said it with a challenge in her tone. She knew who she was talking to. She was talking to a blind man living below the poverty line. She assumed that “fax it” was an impossible hurdle. She assumed I would have to find a ride to a library, pay twenty cents a page, and struggle with a physical machine I couldn’t read. She was counting on the friction of the physical world to make me give up.
I am a nerd. And I have an internet connection.
“Okay,” I said, my voice dropping into the cool, smooth, ‘Let’s systemically tango,’ tone of a man with a plan. “I will fax them. What is the number?”
I hung up. And then, I went to work.
She wanted evidence? Oh boy, I would give her evidence.
I didn’t just pull the recent files. I went into the archives. I dug into the deep, digital bedrock of my hard drive. I pulled records from when I was five. I pulled the surgical notes from my cerebral palsy treatments. I pulled the intake forms from every specialist, every therapist, every social worker who has ever written a note about my “deficits.”
I compiled a single, monolithic PDF. It was a monument to medical trauma. It was a library of diagnosis.
It was five hundred and twelve pages long.
I opened my preferred internet faxing service. This is a tool that allows me to send a fax purely through digital data. It would cost $20, exactly the amount someone had donated to the blog last week, but if I didn’t do this, I would lose all my benifits. It costs me zero paper. It costs me zero toner.
By the way, your tips keep me writing.
But for the recipient?
For the recipient, a fax is a physical reality. It requires paper. It requires ink. It requires time.
I imagined Karen’s fax machine. It was probably an old, beige beast sitting in the corner of a gray office. It was likely low on paper. It was almost certainly low on patience.
I uploaded the file. The file size was massive. The progress bar on my screen reader ticked up. Uploading… 20%… 50%… 80%…
And then, I sat back and listened to the most beautiful sound in the world.
“Your fax has been sent,” my screen reader announced.
I imagined the scene in that office.
At first, it would just be a single page. Whirrr. Chunk. A standard medical form. Karen would ignore it.
By page fifty, the machine would be heating up. The smell of hot toner would start to fill the cubicle. The rhythmic chunk-chunk-chunk of the printing would become a drone, a mechanical chant of malicious compliance.
By page one hundred, the paper tray would run out. The machine would start beeping. That high-pitched, insistent beep-beep-beep that demands attention. Karen would have to get up. She would have to find a ream of paper. She would have to feed the beast.
And the beast would not stop.
Because I had set the retry limit to “Infinity.” If the line busied out? It would call back. If the paper ran out? It would wait. It was a digital siege engine.
I sent them everything. I sent them the eye charts that prove I can’t read eye charts. I sent them the physical therapy logs. I sent them the blurry scans of notes written by doctors who are long since dead.
I sent them the Tsunami of Truth.
I wanted them to hold the weight of it. I wanted them to physically feel the burden of proof they place on disabled bodies. They want us to document our existence? Fine. Here is my existence, one sheet of hot, curled paper at a time.
Two hours later, my phone rang.
It was Karen. She sounded breathless. She sounded like she was standing next to a machine that was hyperventilating. In the background, I could hear a rhythmic whir-chunk, whir-chunk.
“Yes?” I answered, my voice the picture of innocent helpfulness.
“Sir, please. You have to stop the fax. It’s… it’s been printing for an hour. It’s jamming the machine. We’re out of toner.”
“Oh, you’re out of toner? It’s jammed? Oh my! Oh, I’m so sorry,” I said, putting exactly zero percent sincerity into the apology. “But you said you couldn’t accept email. You said I had to provide complete documentation. I’m just following the rules, Karen. I wouldn’t want my benefits to be suspended because I missed documentation, so here’s documentation all the way back to when I’m five years old.”
“Jesus Christ, We have it!” she snapped. “We have enough! Please, just… cancel the rest.”
“I’m afraid I can’t do that,” I lied. “It’s an automated process. Once it starts, it has to finish. Security protocols, you understand.”
There was a long, strangled silence on the line. Then, a defeated sigh.
“Fine! Fine,” she snapped. “We will mark your file as updated.”
“Thank you,” I said. “Have a wonderful day.”
I sat there in my quiet apartment, eating a cookie. I imagined the pile of paper in that office, a physical mountain of evidence testifying to the fact that yes, I am blind, and yes, I am smarter than your bureaucracy.
If you enjoyed this tiny victory in a hostile world, you might enjoy, Seven Days in June by Tia Williams
learn how to follow the podcast or join my street team,
You can follow the main RSS feed, learn how to follow the podcast or join my street team, or follow via email with the form below.
...
Read the original on sightlessscribbles.com »
I put an AI agent on a $7/month VPS, connected it to my own IRC server, and pointed it at my GitHub repos. Visitors can ask it about my work and get answers backed by actual code, not rephrased resume text.
the problem with “ask my resume”
Every portfolio site with an AI chatbot does the same thing: feed the resume into a model and let visitors rephrase it. It’s a parlor trick. The model can’t tell you anything the resume doesn’t already say.
I wanted something different. If a hiring manager asks “how does George handle test coverage?” the answer shouldn’t be “George values comprehensive testing.” It should clone the repo, count the tests, read the CI config, and come back with specifics.
So I built the infrastructure to make that work.
Two agents, two boxes, two security boundaries.
visitor (browser)
└─ gamja web IRC client
└─ wss://nullclaw.georgelarson.me:443
└─ Cloudflare (proxy, TLS termination, bot protection)
└─ ergo IRC server (LarsonNet)
└─ #lobby
└─ nully (nullclaw agent)
├── reads public GitHub repos
├── preloaded portfolio context
└── routes to ironclaw via #backoffice
└─ #backoffice (private IRC channel)
└─ ironclaw (separate box, via Tailscale)
├── email access
├── calendar
└── private context
nullclaw is the public-facing doorman. It runs on a minimal perimeter box, a 678 KB Zig binary using about 1 MB of RAM. It handles greetings, answers questions about my projects, and can clone repos to substantiate claims with real code.
ironclaw is the private agent on a separate, more powerful system. It has access to email, deeper personal context, and handles complex inquiries routed from nullclaw. That boundary is deliberate: the public box has no access to private data.
This is where most people reach for the biggest model they can afford. That’s the wrong instinct for a doorman.
Greetings, triage, simple questions about my background. Sub-second responses. Pennies per conversation. Speed matters more than depth here.
When nully needs to clone a repo, read code, or synthesize findings across files, Sonnet steps in. You pay for reasoning only when reasoning is needed.
A public-facing agent without a spending limit is a liability. The cap prevents both runaway conversations and abuse. If someone tries to burn through my inference budget, they hit a wall.
Using Opus for a concierge would signal the opposite of model understanding. If Haiku can handle it, don’t send it to Sonnet. Tiered inference (cheap for the hot path, capable for the heavy lifting) is how I keep this under $2/day.
This box is a public-facing perimeter. It should be hardened like one.
Firewall: UFW with only three ports open: SSH, IRC (TLS), and HTTPS (WebSocket via Cloudflare).
Cloudflare proxy: Web visitors never hit the box directly. WebSocket traffic goes through CF’s edge, which handles TLS termination, rate limiting, and bot filtering.
Agent sandboxing: nullclaw runs in supervised mode with workspace-only file access, a restricted command allowlist (read-only tools), and 10 actions per hour max.
Cost controls: $2/day, $30/month hard caps. If the agent gets abused, the budget runs out before the damage compounds.
The philosophy is minimal attack surface. The box runs two services (ergo and nullclaw), serves no web content directly, and has no access to private data. If it gets compromised, the blast radius is an IRC bot with a $2/day inference budget.
what nully can actually do
This is the part that separates it from a chatbot:
“What languages does George use?” Doesn’t parrot the resume. Knows from preloaded context and can verify by checking repos.
“How does he structure tests?” Clones the repo, reads the test files, reports what it finds.
“Tell me about Fracture” Pulls from preloaded memory about the project, can dig into the source for specifics.
“How do I reach him?” Provides contact info. Doesn’t hallucinate a phone number.
“Can I schedule a call?” Nully calls ironclaw over Google’s A2A protocol via Tailscale. Ironclaw processes the request with its own LLM, sends back a structured response, and nully relays the answer. The visitor never sees the handoff.
It’s an IRC bot backed by Haiku, so it’s not perfect. But it backs up what it says with code, and my resume can’t do that.
This is the part I’m most proud of.
nullclaw already serves Google’s A2A protocol (v0.3.0): agent card discovery, JSON-RPC dispatch, task state machine. What it didn’t have was a client. It could receive A2A calls but couldn’t make them. So I wrote one.
The a2a_call tool sends message/send JSON-RPC requests to remote agents, parses the task response (completed, failed, working), extracts the artifact text, and returns it as a tool result. It enforces HTTPS for public endpoints but allows plaintext HTTP for private networks and Tailscale CGNAT ranges, because when you’re debugging TLS between two agents on a mesh VPN at 2am, the last thing you need is your own security policy locking you out.
But the really slick part is on ironclaw’s side. The nullclaw instance running there doesn’t have its own API key. Instead, its LLM provider is pointed at ironclaw’s own gateway as a passthrough:
nully (this box)
└─ a2a_call tool → POST /a2a
└─ ironclaw’s nullclaw (separate box, Tailscale)
├── receives A2A task
├── needs to run inference
└── provider config: “ironclaw” → http://127.0.0.1:3000/v1
└─ ironclaw’s own gateway
└─ routes to Kilo → actual LLM
One API key. One billing relationship. The nullclaw on ironclaw’s box is just an A2A bridge. It accepts the protocol, borrows ironclaw’s inference pipeline, and responds. No credential duplication, no separate budget to track. The agent that owns the API key is the agent that pays for inference, regardless of who initiated the request.
...
Read the original on georgelarson.me »
This article is both an introduction to a tool I have been working on called jsongrep, as well as a technical explanation of the internal search engine it uses. I also discuss the benchmarking strategy used to compare the performance of jsongrep against other JSON path-like query tools and implementations.
In this post I’ll first show you the tool, then explain why it’s fast (conceptually), then how it’s fast (the automata theory), and finally prove it (benchmarks).Upfront I would like to say that this article is heavily inspired by Andrew Gallant’s amazing ripgrep tool, and his associated blog post “ripgrep is faster than {grep, ag, git grep, ucg, pt, sift}”. jsongrep’s DFA-Based Query Engine
You can install jsongrep from crates.io:cargo install jsongrep
Like ripgrep, jsongrep is cross-platform (binaries available here) and written in Rust.
jsongrep (jg binary) takes a query and a JSON input and prints every value whose path through the document matches the query. Let’s build up the query language piece by piece using this sample document:
Dot paths select nested fields by name. Dots (.) between field names denote concatenation– “match this field, then that field”:$ cat sample.json | jg ‘roommates[0].name’ roommates.[0].name: “Alice”
Wildcards match any single key (*) or any array index ([*]):$ cat sample.json | jg ‘favorite_drinks[*]’ favorite_drinks.[0]: “coffee” favorite_drinks.[1]: “Dr. Pepper” favorite_drinks.[2]: “Monster Energy”
Alternation (|) matches either branch, like regex alternation:$ cat sample.json | jg ‘name | roommates’ name: “Micah” roommates: [
{ “name”: “Alice”, “favorite_food”: “pizza” } ]
Recursive descent uses * and [*] inside a Kleene star to walk arbitrarily deep into the tree. For example, to find every name field at any depth:$ cat sample.json | jg ‘(* | [*])*.name’ name: “Micah” roommates.[0].name: “Alice”
The pattern (* | [*])* means “follow any key or any index, zero or more times”, e.g., descend through every possible path. The trailing .name then filters for only those paths that end at a field called name.
Equivalently, jg exposes the -F (“fixed string”) flag as a shorthand for these recursive descent queries:$ cat sample.json | jg -F name name: “Micah” roommates.[0].name: “Alice”
Optional (?) matches zero or one occurrence:$ cat sample.json | jg ‘roommates[0].favorite_food?’ roommates.[0]: {
“name”: “Alice”, “favorite_food”: “pizza” }
roommates.[0].favorite_food: “pizza”
Notice how the inner string “pizza” matches with the ?, in addition to the parent zero-occurrence case.
Here’s a screenshot showing several of these features in action:Example of some of jsongrep’s search query features in practice. NOTE: jsongrep is smart about detecting if you are piping to another command like less or sort, in which case it will not display the JSON paths. This can be overridden though if desired with the –with-path option.
JSON documents are trees: objects and arrays branch, scalars are leaves, and keys and indices label the edges. Querying a JSON document is really about describing paths through this tree. jsongrep takes this observation literally: its query language is a regular language over the alphabet of keys and indices. Think regular expressions, but instead of matching characters in a string, you’re matching edges in a tree.
Why does “regular” matter? Because regular languages have a well-known, powerful property: they can be compiled into a deterministic finite automaton (DFA). A DFA processes input in a single pass with $O(1)$ work per input symbol– no backtracking, no recursion stack, no exponential blowup on pathological queries. The query is paid for once at compile time, then search is essentially free.
This is the key difference from tools like jq, jmespath, or jsonpath-rust. Those tools interpret path expressions: at each node in the JSON tree, they evaluate the query, check predicates, and recursively descend into matching branches. If a query involves recursive descent (.. or $..), these tools may revisit subtrees or maintain worklists. jsongrep does something fundamentally different– it compiles the query into a DFA before it ever looks at the JSON, then walks the document tree exactly once, taking a single $O(1)$ state transition at each edge. No interpretation, no backtracking, one pass.
As a consequence, jsongrep is fast– like really fast:
Again borrowing from the ripgrep blog post, here’s an “anti-pitch” for jsongrep:jsongrep is not as ubiquitous (yet) as jq. jq is the go-to for JSON querying, filtering, and transductions. The query language is deliberately less expressive than jq’s. jsongrep is a search tool, not a transformation tool– it finds values but doesn’t compute new ones. There are no filters, no arithmetic, no string interpolation.jsongrep is new and has not been battle-tested in the wild.Keep reading if interested in the internals of jsongrep!
With the tool overview and motivation out of the way, let’s dive into the internals. This section traces a single query through every stage of the engine.
The core of the search engine is a five-stage pipeline:Parse the JSON document into a tree via serde_json_borrow (zero-copy).Construct an NFA from the query via Glushkov’s construction algorithm. Determinize the NFA into a DFA via subset constructionWalk the JSON tree, taking DFA transitions at each edge and collecting matches.
To make this concrete, we’ll trace the query roommates[*].name through every stage. Given our sample document, this query should match “Alice” at path roommates[0].name.
The query string roommates[*].name is parsed into an AST. jsongrep uses a PEG grammar (via the pest library) that maps the query DSL to a tree of Query enum variants:
The grammar is intentionally simple. Dots denote concatenation (sequencing), | denotes alternation (disjunction), postfix * denotes Kleene star (zero or more repetitions), and postfix ? denotes optional (zero or one). Parentheses group subexpressions. This maps directly to the definition of a regular language– and that’s the whole point. Because the query language is regular, everything that follows (NFA, DFA, single-pass search) is possible.
The full Query AST supports these variants:
JSON values form a tree. Object keys and array indices are the edges; the values they point to are the nodes. Scalars (strings, numbers, Booleans, null) are leaves.
Our sample document forms this tree:
A query, then, describes a set of paths from the root to matching nodes. The query roommates[*].name describes the path: take the roommates edge, then any array index, then the name edge.
With the query parsed into an AST, we need to convert it into an automaton that can match paths. The first step is building a nondeterministic finite automaton (NFA).
jsongrep uses Glushkov’s construction, which has a key advantage over the more common Thompson’s construction: it produces an $\epsilon$-free NFA. Every transition in the resulting NFA consumes a symbol– no epsilon transitions to chase, which simplifies the downstream determinization.
1. Linearize the query. Each symbol (field name, wildcard, index range) in the query gets a unique position number. Our query roommates[*].name has three symbols:
2. Compute the First and Last sets. The First set contains positions that can appear at the start of a match; the Last set contains positions that can appear at the end. For a simple sequence, First = {first element} and Last = {last element}:
3. Compute the Follows set. For each position i, Follows(i) is the set of positions that can immediately follow i in a valid match. For a simple sequence, each position follows the one before it:
For queries with Kleene star or alternation, the $\textit{Follows}$ sets become more interesting– loops and branches appear naturally.
4. Assemble the NFA. The NFA has $n + 1$ states (one start state plus one per position). Transitions are wired up from the computed sets:From the start state $q_0$, add transitions to every position in the $First$ setFor each position $i$ and each $j \in \textit{Follows}(i)$, add a transition from state $i$ to state $j$ on symbol $j$States corresponding to positions in the $\textit{Last}$ set are marked accepting
For our query, the resulting NFA is:Constructed NFA for `roommates[*].name`: NFA States: 4 Start State: 0 Accepting States: [3] First set: [“[0] Field(roommates)“] Last set: [“[2] Field(name)“] Factors set: [0] Field(roommates) can be followed by: [1] Range(0, 18446744073709551615) [1] Range(0, 18446744073709551615) can be followed by: [2] Field(name) [2] Field(name) cannot be followed Transitions: state 0: on [0] Field(roommates) -> [1] state 1: on [1] Range(0, 18446744073709551615) -> [2] state 2: on [2] Field(name) -> [3] state 3:The 18446744073709551615 value is the value of usize::MAX on my machine (64 bit address space, equal to 2^64 - 1), which is the maximum value of a 64-bit unsigned integer.
This is a simple chain for our simple query, but for queries with * or |, the NFA would have branching and looping edges. For example, (* | [*])*.name would produce a state with self-loops on both FieldWildcard and ArrayWildcard, capturing the “descend through anything” behavior.
An NFA can be in multiple states simultaneously– on a given input, it may have several valid transitions. This is fine theoretically but bad for performance: simulating an NFA means tracking a set of active states at each step. A DFA, by contrast, is in exactly one state at all times, meaning each transition is an O(1) table lookup. Importantly, Rabin and Scott showed that every NFA can be turned into an equivalent DFA.
The standard algorithm for converting an NFA to a DFA is subset construction (also called the powerset construction). The idea is simple: each DFA state corresponds to a set of NFA states. The algorithm explores all reachable sets via breadth-first search:Start with the DFA state corresponding to $q_0$ (just the NFA start state).For each DFA state and each symbol in the alphabet, compute the set of NFA states reachable by taking that transition from any NFA state in the current set. If this resulting set is new, create a new DFA state for it.A DFA state is accepting if any of its constituent NFA states is accepting.Repeat until no new DFA states are discovered.
For our example query roommates[*].name, the NFA is already deterministic (a simple chain with no branching), so subset construction produces a DFA with the same shape:Constructed DFA for query: `roommates[*].name` DFA States: 4 Start State: 0 Accepting States: [3] Alphabet (4 symbols): 0: Other 1: Field(“roommates”) 2: Field(“name”) 3: Range(0, 18446744073709551615) Transitions: state 0: on [Other] -> (dead) on [Field(“roommates”)] -> 1 on [Field(“name”)] -> (dead) on [Range(0, 18446744073709551615)] -> (dead) state 1: on [Other] -> (dead) on [Field(“roommates”)] -> (dead) on [Field(“name”)] -> (dead) on [Range(0, 18446744073709551615)] -> 2 state 2: on [Other] -> (dead) on [Field(“roommates”)] -> (dead) on [Field(“name”)] -> 3 on [Range(0, 18446744073709551615)] -> (dead) state 3: on [Other] -> (dead) on [Field(“roommates”)] -> (dead) on [Field(“name”)] -> (dead) on [Range(0, 18446744073709551615)] -> (dead)
So the DFA’s state diagram looks like this:
In the implementation, the alphabet isn’t just the literal symbols from the query– jsongrep also adds an Other symbol to handle JSON keys that don’t appear in the query. Any transition on Other leads to a dead state (or stays in a state where that key is irrelevant), ensuring we skip non-matching branches efficiently.
For more complex queries, subset construction can produce DFA states that combine multiple NFA states. For instance, (* | [*])*.name would produce DFA states representing sets like $\set{q_0, q_1}$ (both “at start” and “in the middle of descending”), which is what enables the single-pass behavior.
This is the payoff. With the DFA built, searching the JSON document is a simple depth-first traversal of the tree, carrying the DFA state along:Start at the root of the JSON tree in DFA state $q_0$.At each node, iterate over its children (object keys or array indices).For each child edge, look up the DFA transition: given the current state and the edge label (key name or index), what’s the next state?If no transition exists for this edge, skip the subtree entirely. If the new state is accepting, record the match (path + value).Recurse into the child with the new DFA state.
Let’s trace our query roommates[*].name against the sample document:
1. Start at the root object in DFA state $q_0$. Iterate over its three keys:
2. At the roommates array in state $q_1$. Iterate over its indices:
3. At the roommates[0] object in state $q_2$. Iterate over its keys:Edge “name”: $\delta(q_2, \texttt{Field(“name”)}) \to q_3$. State $q_3$ is accepting– record the match: roommates.[0].name → “Alice”.
Notice how the DFA let us skip the “name” and “favorite_drinks” subtrees at the root in step 1– we never even looked at their values. On a large document, this pruning is what makes the search fast: entire branches of the JSON tree are discarded in $O(1)$ without recursing into them.
Every node is visited at most once, and each transition is an $O(1)$ table lookup. The total search time is O(n) where n is the number of nodes in the JSON tree. No backtracking, no interpretation overhead.
As an implementation bonus, jsongrep uses serde_json_borrow for zero-copy JSON parsing. The parsed tree holds borrowed references (&str) into the original input buffer rather than allocating new strings, which significantly reduces memory overhead on large documents.
All benchmarks use Criterion.rs, a statistics-driven Rust benchmarking framework that provides confidence intervals, outlier detection, and change detection across runs.
Four datasets of increasing size test scaling behavior:
The benchmarks are split into four groups to isolate where time is spent:document_parse — JSON parsing cost only. Measures serde_json_borrow (zero-copy) vs. serde_json::Value (allocating) vs. jmespath::Variable. This isolates jsongrep’s zero-copy parsing advantage.query_compile — Query compilation cost only. jsongrep must build an AST and construct a DFA upfront; other tools may have cheaper (or no) compile steps. This is the price jsongrep pays for fast search.query_search — Pre-compiled query, pre-parsed document, search only. Isolates the traversal/matching cost without parse or compile overhead.end_to_end — The full pipeline: parse JSON + compile query + search. This is the realistic CLI usage scenario.
Each tool uses its own query syntax, but the benchmarks ensure equivalent work across tools. For example:
* The zero-copy parsing advantage is explicitly isolated in the document_parse group, not hidden.
* jsongrep’s upfront DFA compilation cost is measured separately in query_compile, so readers can see the tradeoff.
* Tools lacking certain query features (e.g., jmespath has no recursive descent) are skipped for those benchmarks rather than penalized.
* Tools requiring ownership of parsed JSON (jaq, jmespath) use Criterion’s iter_batched to fairly separate cloning costs from search costs.
Let’s take a look at the results from the benchmarks. We’ll use the xlarge dataset unless otherwise noted since it provides the best demonstration of performance impacts, but the full results are available here.
No surprises here: serde_json_borrow is the fastest, followed by serde_json::Value and jmespath::Variable.
As expected, jsongrep takes time to compile the different queries and this is its largest cost:
Compare this to the compile time of jmespath (an order of magnitude faster):
As shown at the beginning of the post, over the xlarge (~190 MB) dataset on the end-to-end benchmark, it’s not even close:
The full, interactive Criterion report is available at the live benchmarking site.
jsongrep also exposes its DFA-based query engine as a library crate, so you can embed fast JSON search directly in your own Rust projects.
...
Read the original on micahkepe.com »
New York City’s public hospital system announced that it would not be renewing its contract with Palantir as controversy mounts in the UK over the data analytics and AI firm’s government contract.
The president of the US’s largest municipal public healthcare system, Dr Mitchell Katz, testified last week before the New York city council that the agreement with Palantir would expire in October.
He said at the hearing that the contract, which focused on recovering money for insurance claims, was always meant to be short-term, and that there was an “absolute firewall” preventing Palantir from sharing information with US Immigration and Customs Enforcement. He said that the agency had “not had any incidents”.
The contract and related payment documents shared with the Guardian by the American Friends Service Committee and first reported by the Intercept, show that NYC Health + Hospitals has paid Palantir nearly $4m since November 2023. The contract noted that Palantir would be able to review notes about patients’ health and help the hospital claim more money in public benefits through programs such as Medicaid. It also includes a line stating that with permission from the city agency, Palantir can “de-identify” patients’ protected health information and use it for “purposes other than research”.
NYC Health + Hospitals said in an email to the Guardian that it will be transitioning to systems that were made entirely in-house, and there will be no data shared with Palantir or use of the company’s applications after the contract expires. “NYC Health + Hospitals’ use of Palantir technology is strictly limited to revenue cycle optimization, helping the public healthcare system close gaps between services delivered and charges captured, protect critical revenue, and reduce avoidable denials,” the agency said in an emailed statement.
A Palantir spokesperson said in a statement: “Palantir, as a software company, does not own or have any rights to customer data — and each customer environment is individually protected against unauthorized access or misuse via robust security controls which can be fully administered and audited by the customer.”
As New York City’s hospital system prepares to part ways with Palantir, the company is facing similar scrutiny over privacy issues in its £330m agreement with the UK’s National Health Service (NHS). Health officials in the UK are concerned that the controversy surrounding Palantir may stop the nationwide rollout of the company’s data system, even though Keir Starmer is trying to speed up deployment. As of last summer, not even half of the country’s health authorities had started using Palantir’s technology amid concerns from the community and doctors. A 12 March briefing by Medact, a health justice charity, said Palantir’s software could enable “data-driven state abuses of power”, including US-style ICE raids. Palantir has denied that the data could be used in this way, noting that it would be illegal and a breach of contract.
Palantir, which also contracts with the British government’s Ministry of Defence, is expanding its influence in the country — despite backlash from activists and some lawmakers. The Guardian revealed last week that Palantir is trying to gain access to sensitive national financial regulation data.
The Financial Conduct Authority, a watchdog for thousands of financial bodies from banks to hedge funds, awarded Palantir a contract to investigate internal intelligence data to help root out financial crime. That has sparked outcry from some MPs, who have urged the government to halt this agreement. Liberal Democrats called on Monday for a government investigation into the contract. Starmer has dismissed suggestions that the UK has become “dangerously over-reliant” on American tech companies, including Palantir, but noted he preferred to have more domestic capability.
Medact has raised privacy concerns in the UK about Palantir’s ability to access de-identified patient data. (De-identified data refers to data that has been stripped of characteristics that could indicate who an individual is, such as names and social security numbers.) In a 12 March briefing for health officials, Medact argued that the NHS’s data privacy protections are insufficient; NHS England has said that data is de-identified as it moves through its national software system, the NHS federated data platform (FDP). But Medact cited concerns that this data can be easily re-identified.
An NHS spokesperson said in an emailed statement to the Guardian that the supplier of the FDP “was appointed in line with public contract regulations and must only operate under the instruction of the NHS, with all access to data remaining under NHS control and strict contractual obligations protecting confidentiality”.
Data privacy experts interviewed by the Guardian said that there are risks in Palantir accessing New Yorkers’ de-identified data for purposes other than research, especially given the company’s vast access to government records, willingness to cooperate with the federal government and ability to connect and analyze large datasets.
“De-identification is not the guarantee it used to be, and it’s getting easier with AI capabilities to re-identify information,” said Sharona Hoffman, a law professor at Case Western Reserve University.
Ari Ezra Waldman, a law professor at University of California, Irvine who has researched how governments and tech companies use data about individuals, says that we should be concerned “whenever a company like Palantir or a hostile government collects information on vulnerable populations”. He is particularly concerned about the contract’s provision to use the information for “purposes other than research”. That tells him the government didn’t have enough power to push back on Palantir when negotiating the contract, or didn’t care or know the risk, he says.
Despite the hospital system’s claims that the partnership had no real risks for patients, activists living in New York City, and beyond, are counting this as a win.
Nurses, pro-Palestinian activists and social and climate justice groups applied pressure on the city government as part of a nationwide campaign known as Purge Palantir to stop the company from contracting with government agencies, universities and corporations.
“We don’t think that the same AI systems that are targeting immigrants here in the United States for ICE, as well as choosing places to bomb in Iran, should be the same AI systems used in hospitals,” said Kenny Morris, an organizer with the American Friends Service Committee. The group obtained the NYC Health + Hospitals contract with Palantir through a public records request, and shared the document with the Intercept and the Guardian. The national nurses union and the Boycott, Divestment, and Sanctions (BDS) movement were also involved in the campaign.
Groups with the “No Palantir in our NHS” campaign in the UK are hoping New York City’s public hospital system’s decision to let the Palantir contract expire fuels their own fight, too. Medact and Amnesty International UK told the Guardian in emailed statements that they are calling on the NHS to follow New York City’s example and terminate its £330m contract with Palantir.
“As campaigners in New York have shown, workers and communities can hold our health institutions accountable and push them to make the right choice. We will do the same here, and force NHS England to cancel this contract,” Dr Rhiannon Mihranian Osborne, corporate campaigns lead at Medact, which is in touch with Purge Palantir.
...
Read the original on www.theguardian.com »
John Bradley, the longtime member of this community who was the founder, producer, and lead guitarist of Booster Patrol, died on March 20. He was 61.
I couldn’t think of a better way to pay tribute to the man who was both a bandmate and a friend than to write a song for him in the style of which he was a master. You can hear the first mix of the song here; when I finish it properly, I’ll put it up on Unauthorized in the Booster Patrol section.
Johnny B laid down his burden late on a Friday night
With the music of his band still ringing in the fading light
Saint Peter met him at the gate, said Son, we heard you play
Leo Fender built this gold guitar and he saved it for this day
The choir’s been singing acapella ever since the world was new
They need someone who knows the sad notes, they say that man is you
He wrapped his hands around that neck, felt the weight of holy gold
Every fret a year of sorrow, every string a story told
He hit a chord that shook the heavens, the angels stopped to hear
A tone so long and lonesome that Saint Matthew shed a tear
Peter said “We don’t need pretty, son, we’ve got harps here by the score
We want to hear that swampy sound that kept ’em coming back for more
Now every night in Heaven there’s a sound they never had
A solid gold Fender wailing every note both beautiful and sad
The choir hits the chorus, the Almighty taps His feet
And Johnny B is boosting live up on that golden street
He played the broken-hearted blues from Beale Street to Monsignor
Now he’s jamming up in Heaven and he couldn’t ask for more.
Lay it down, Johnny B
Make that sound, Johnny B
Hit that chord
Lay it down, Johnny B
Make that sound, Johnny B
For the Lord
...
Read the original on voxday.net »
A few weeks ago, Cloudflare published “How we rebuilt Next.js with AI in one week.” One engineer and an AI model reimplemented the Next.js API surface on Vite. Cost about $1,100 in tokens.
The implementation details didn’t interest me that much (I don’t work on frontend frameworks), but the methodology did. They took the existing Next.js spec and test suite, then pointed AI at it and had it implement code until every test passed. Midway through reading, I realized we had the exact same problem - only in our case, it was with our JSON transformation pipeline.
Long story short, we took the same approach and ran with it. The result is gnata — a pure-Go implementation of JSONata 2.x. Seven hours, $400 in tokens, a 1,000x speedup on common expressions, and the start of a chain of optimizations that ended up saving us $500K/year.
At Reco, we have a policy engine that evaluates JSONata expressions against every message in our data pipeline - billions of events, on thousands of distinct expressions. JSONata is a query and transformation language for JSON (think jq with lambda functions), which makes it ideal for enabling our researchers to write detection rules without having to directly interact with the codebase.
The reference implementation is JavaScript, whereas our pipeline is in Go. So for years we’ve been running a fleet of jsonata-js pods on Kubernetes - Node.js processes that our Go services call over RPC. That meant that for every event (and expression) we had to serialize, send over the network, evaluate, serialize the result, and finally send it back.
This was costing us ~$300K/year in compute, and the number kept growing as more customers and detection rules were added. For example, one of our larger clusters had scaled out to well over 200 replicas just for JSONata expressions, which resulted in some unexpected Kubernetes troubles (like reaching IP allocation limits).
In some respects, the RPC latency overhead was actually worse than the pure dollar cost. An RPC round-trip is ~150 microseconds before any evaluation even starts. For a simple field lookup like user.email = “admin@co.com” something that should take nanoseconds - we’re paying microseconds just for crossing a language boundary. At our scale, those microseconds stack up quickly.
We’d tried a few things over the years - optimizing expressions, output caching, and even embedding V8 directly into Go (to avoid the network hop). They did their part, but it was mostly just incremental improvements. The closest we got was a local evaluator we built using GJSON that handled simple expressions directly on raw bytes. It was fast for what it covered, but anything complex had to fall back to jsonata-js. We were patching around the problem, but the root cause remained unsolved.
During the weekend I built out a plan (using AI) separated into ‘waves’. The approach was the same as Cloudflare’s vinext rewrite: port the official jsonata-js test suite to Go, then implement the evaluator until every test passes. The following day, I pressed play. The plan was straightforward - build out the full JSONata 2.x spec in Go, with a focus on performant streaming and some extra features sprinkled on (localized caching, WASM support, metrics, and fallthrough capabilities back to the jsonata-js RPC).
A few iterations and some 7 hours later - 13,000 lines of Go with 1,778 passing test cases.
I shared the numbers internally and someone asked about the ROI. Production cost for jsonata-js in the previous month was about $25K - now it was 0. That conversation ended up being pretty short.
gnata has a two-tier evaluation architecture. At compile time, each expression is analyzed and classified.
The fast path handles simple expressions - field lookups, comparisons, and a set of 21 built-in functions applied to pure paths (things like $exists(a.b) or $lowercase(name)). These are evaluated directly against the raw JSON bytes without ever fully parsing the document. For something like account.status = “active” you get 0 heap allocations.
Everything else goes through the full path - a complete parser and evaluator with full JSONata 2.x semantics. This does parse the JSON, but only the subtrees it actually needs, not the entire document.
On top of this there’s a streaming layer (the StreamEvaluator) designed for our specific workload: evaluate N compiled expressions against each event, where events are structurally similar.
All field paths from all expressions are merged into a single scan. The number of expressions doesn’t matter - raw event bytes are only read once. After warm-up, the hot path is lock-free. Evaluation plans are computed once per event schema and cached immutably, so reads are a single atomic load with no synchronization.Memory is bounded. The cache has a configurable capacity and evicts the oldest entries when full.
The fast-path design took a lot of inspiration from the existing local evaluator. For simple expressions it was excellent — gnata won’t be faster on an apples-to-apples comparison for those. But the schema-aware caching and batch evaluation is where the real gains come from.
Correctness: 1,778 test cases from the official jsonata-js test suite + 2,107 integration tests in the production wrapper.
The speedup on simple lookups is mostly from eliminating the RPC overhead entirely — gnata evaluates directly on raw bytes with no JSON parsing. Complex expressions involve full parsing and AST evaluation, so the gap narrows, but they’re still 25-90x faster than the RPC path.
In practice, gnata runs as a library inside our existing Go services. The serialization and RPC overhead goes away.
Building the library was day one. The rest of the week was about making sure it was actually correct.
We already had mismatch detection infrastructure in the codebase - feature flags, shadow evaluation, comparison logging - built months earlier for the local evaluator. Wiring gnata into the same system was straightforward.
* Days 2-6: Code review, QA against real production expressions, deployed to preprod in shadow mode. gnata evaluates everything, but jsonata-js results are still used. Mismatches logged and alerted. Fixed edge cases as they came up.
* Day 7: Three consecutive days of zero mismatches. gnata promoted to primary.
By the time we promoted gnata, it had already been processing billions of events and producing identical results to jsonata-js. We also caught bugs in jsonata-js itself. Cases where the reference implementation doesn’t follow its own spec. gnata handles them correctly.
A side effect we didn’t expect: gnata was one of the first large PRs where we had AI agents reviewing AI-generated code. The agents were flagging everything - real concurrency issues alongside cosmetic nitpicks - and we had to teach them which is which. That work fed into how we handle AI code review more broadly now.
Eliminating the RPC fleet took care of $300K, but there was one more thing we wanted to tackle - batching events end-to-end in our rule engine. JSONata - being only able to do a single evaluation at a time - forces the infra around it to contort itself with workarounds in order to stay performant. For our rule engine, that meant we were spinning up tens of thousands of goroutines to maximize concurrency (with all the added resources that entailed) in what would otherwise be a straightforward pipeline of micro-batches. As you might expect, that resulted in excessive memory and high CPU contention. In other words, our rule engine was both expensive and slow.
gnata has no such limitations, so we were able to replace the rule engine internals with a far simpler and more efficient implementation. The details of the refactor deserve their own blog post, but they involved just-in-time batching (based on the request coalescing pattern), short-lived caches and grouped enrichment queries (done right before evaluation). I was quite surprised myself with the throughput increase the refactor brought - and with it, a sharp drop in resource utilization.
The result: another ~$18K/month off the bill - around $200K/year. Combined with gnata, that’s $500K/year gone from the pipeline, all in under 2 weeks of work.
There’s an active debate over whether fully hands-off AI code belongs in production. I have some strong opinions on this (which I’ve been vocal about internally). But gnata has been a good case study for when it works well.
Andrej Karpathy recently wrote that programming is becoming unrecognizable, and that at the top tiers, deep technical expertise is “even more of a multiplier than before because of the added leverage.” Until recently, I was rather skeptical of agentic code. February 2026, however, has been a sort of inflection point even stubborn developers like myself can’t ignore.
I believe gnata is just the beginning. I suspect 2026 will be the year of surgical refactors.
go get github.com/recolabs/gnata
expr, _ := gnata.Compile(`user.role = “admin” and user.loginCount > 100`)
json := []byte(`{
“user”: {
“email”: “admin@example.com”,
“role”: “admin”,
“loginCount”: 247
result, _ := expr.EvalBytes(ctx, json)
fmt.Println(result) // true
...
Read the original on www.reco.ai »
To add this web app to your iOS home screen tap the share button and select "Add to the Home Screen".
10HN is also available as an iOS App
If you visit 10HN only rarely, check out the the best articles from the past week.
If you like 10HN please leave feedback and share
Visit pancik.com for more.