10 interesting stories served every morning and every evening.

GrapheneOS (@GrapheneOS@grapheneos.social)

grapheneos.social


Local AI Needs to be the Norm

unix.foo

One of the current trends in modern software is for developers to slap an API call to OpenAI or Anthropic into their app to power new features. Reasonable people can quibble over whether those features actually bring value to users, but what I want to discuss is the fundamental concept of taking on a dependency on a cloud-hosted AI model for applications.

This laziness is creating a generation of software that is fragile, that invades your privacy, and that is fundamentally broken. We are building applications that stop working the moment a server crashes or a credit card expires.

We need to return to a habit of building software where our local devices do the work. The silicon in our pocket is mind-bogglingly faster than what was available a decade ago. It has a dedicated Neural Engine sitting there, mostly idle, while we wait for a JSON response from a server farm in Virginia. That’s ridiculous.

Even if your intentions are pure, the moment you stream user content to a third-party AI provider, you’ve changed the nature of your product. You now have data retention questions and all the baggage that comes with them (consent, audit, breach, government requests, training, etc.).

On top of that, you’ve also substantially complicated your stack, because your feature now depends on network conditions, external vendor uptime, rate limits, account billing, and your own backend health.

Congratulations! You took a UX fea­ture and turned it into a dis­trib­uted sys­tem that costs you money.

If the fea­ture can be done lo­cally, opt­ing into this mess is self in­flicted dam­age.

“AI everywhere” is not the goal. Useful software is the goal.

Concrete Example: Brutalist Report’s On-Device Summaries

Years ago I launched a fun side project named The Brutalist Report, a news aggregator service inspired by the 1990s-style web.

Recently, I decided to build a native iOS client for it with the design goal of ensuring it would remain a high-density news reading experience. Headlines in a stark list, a reader mode that strips the cancer that has overtaken the web, and (optionally) an “intelligence” view that generates a summary of the article.

Here’s the key point though: the summary is generated on-device using Apple’s local model APIs. No server detours. No prompt or user logs. No vendor account. No “we store your content for 30 days” footnotes needed.

It has become so normal for folks to assume that any AI use happens server-side. We have a lot of work to do to turn this around as an industry.

It’s not lost on me that some use cases will demand the intelligence that only a cloud-hosted model can provide, but that’s not true of every problem you’re trying to solve. We need to be thoughtful here.

Available Tooling

I can only speak on the tool­ing avail­able within the Apple ecosys­tem since that’s what I fo­cused ini­tial de­vel­op­ment ef­forts on. In the last year, Apple has in­vested heav­ily here to al­low de­vel­op­ers to make use of a built-in lo­cal AI model eas­ily.

The core flow looks roughly like this:

import FoundationModels

let model = SystemLanguageModel.default
guard model.availability == .available else { return }

let session = LanguageModelSession {
    """
    Provide a brutalist, information-dense summary in Markdown format.
    - Use **bold** for key concepts.
    - Use bullet points for facts.
    - No fluff. Just facts.
    """
}

let response = try await session.respond(options: .init(maximumResponseTokens: 1_000)) {
    articleText
}

let markdown = response.content

And for longer content, we can chunk the plain text (around 10k characters per chunk), produce concise “facts only” notes per chunk, then run a second pass to combine them into a final summary.
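A minimal sketch of that two-pass flow, reusing the same session API as above (the function name, the character-offset chunking, and the prompts are illustrative assumptions, not the app’s actual code):

import FoundationModels

// Illustrative two-pass summarizer. Chunk size and prompts are assumptions.
func summarize(longText: String) async throws -> String {
    // Pass 1: split into ~10k-character chunks, reduce each to facts-only notes.
    var notes: [String] = []
    var start = longText.startIndex
    while start < longText.endIndex {
        let end = longText.index(start, offsetBy: 10_000, limitedBy: longText.endIndex) ?? longText.endIndex
        let chunk = String(longText[start..<end])
        let chunkSession = LanguageModelSession {
            "Produce concise, facts-only notes for the following text."
        }
        let chunkResponse = try await chunkSession.respond { chunk }
        notes.append(chunkResponse.content)
        start = end
    }

    // Pass 2: combine the per-chunk notes into one final summary.
    let combineSession = LanguageModelSession {
        "Combine these notes into a single information-dense Markdown summary."
    }
    let combined = try await combineSession.respond(options: .init(maximumResponseTokens: 1_000)) {
        notes.joined(separator: "\n\n")
    }
    return combined.content
}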

This is the kind of work lo­cal mod­els are per­fect for. The in­put data is al­ready on the de­vice (because the user is read­ing it). The out­put is light­weight. It’s fast and pri­vate. It’s okay if it’s not a su­per­hu­man PhD level in­tel­li­gence be­cause it’s sum­ma­riz­ing the page you just loaded, not in­vent­ing world knowl­edge.

Local AI shines when the mod­el’s job is trans­form­ing user-owned data, not act­ing as a search en­gine for the uni­verse.

There are plenty of AI features that people want but don’t trust. Summarizing emails, extracting action items from notes, categorizing documents, etc.

The usual cloud approach turns every one of those into a trust exercise. “Please send your data to our servers. We promise to be cool about it.”

Local AI changes that. Your de­vice al­ready has the data. We’ll do the work right here.

You don’t build trust with your users by writ­ing a 2,000 word pri­vacy pol­icy. You build trust by not need­ing one to be­gin with.

The tool­ing avail­able on the plat­form goes even fur­ther.

One of the best moves Apple has made recently is pushing “AI output” away from unstructured blobs of text and toward typed data.

Instead of “ask the model for JSON and pray”, the newer and better pattern is to define a Swift struct that represents the thing you want. Give the model guidance for each field in natural language. Ask the model to generate an instance of that type.

That’s it.

Conceptually, it looks like this:

import FoundationModels

@Generable
struct ArticleIntel {
    @Guide(description: "One sentence. No hype.")
    var tldr: String

    @Guide(description: "3-7 bullets. Facts only.")
    var bullets: [String]

    @Guide(description: "Comma-separated keywords.")
    var keywords: [String]
}

let session = LanguageModelSession()
let response = try await session.respond(
    to: "Extract structured notes from the article.",
    generating: ArticleIntel.self
) {
    articleText
}

let intel = response.content

Now your UI does­n’t have to scrape bul­let points out of Markdown or hope the model re­mem­bered your JSON schema. You get a real type with real fields, and you can ren­der it con­sis­tently. It pro­duces struc­tured out­put your app can ac­tu­ally use. And it’s all run­ning lo­cally!

This is­n’t just nicer er­gonom­ics. It’s an en­gi­neer­ing im­prove­ment.

And if you’re building a local-first app, this is the difference between “AI as novelty” and “AI as a trustworthy subsystem”.

“But Local Models Aren’t As Smart”

Correct.

But also so what?

Most app fea­tures don’t need a model that can write Shakespeare, ex­plain quan­tum me­chan­ics, and pass the bar exam. They need a model that can do one of these re­li­ably: sum­ma­rize, clas­sify, ex­tract, rewrite, or nor­mal­ize.

And for those tasks, lo­cal mod­els can be truly ex­cel­lent.

If you try to use a local model as a replacement for the entire internet, you will be disappointed. If you use it as a “data transformer” sitting inside your app, you’ll wonder why you ever sent this stuff to a server.

Use cloud models only when they’re genuinely necessary. Keep the user’s data where it belongs. And when you do use AI, don’t just glue it on as a chat box. Use it as a real subsystem with typed outputs and predictable behavior.

Stop ship­ping dis­trib­uted sys­tems when you meant to ship a fea­ture.

I returned to AWS - and was reminded HARD why I left.

fourlightyears.blogspot.com

I was one of the very first advocates for AWS back when it was brand new - SQS, S3, EC2, SimpleDB - it was a lot smaller back then. In fact I organised the very first AWS event in Melbourne when the AWS rep came from the US to evangelise.

Cloud computing was an absolutely mind-blowing revolution - suddenly your startup could run its own computer systems in minutes without needing to install and run your own systems in a data center. This was an absolute game changer, and I really drank the AWS Kool-Aid down to every last drop, then I licked out the cup. I was all in on AWS in a big way.

I re­mained a mas­sive AWS fan­boi for 15 years or so - a real true be­liever - I was all in on AWS.

Relationships break down a little at a time - one or two things start to bother you, you still love it overall, but sure, there are some minor downsides here and there. But hey! It’s still absolutely awesome and you still love it, right? But you notice more and more things that ain’t right, that you don’t like, that are broken or bad. Until one day there’s a final thing that you notice and the scales have tipped and you suddenly realise all at once - “I don’t love this relationship anymore”.

Here are some of the things that chipped away over time:

It really annoyed me that for the first 6 years of its existence AWS did not build its own client libraries, instead leaving the job to our wonderful “community” to implement client libs for languages like Python, happy to let programmers burn their weekends and nights writing software for free for the benefit of AWS.

It super annoyed me that AWS did not move from Python 2 to Python 3 for a ridiculous period of time.

DynamoDB - there’s not much software that I hate - but oh boy - DynamoDB, what a hot pile of garbage. I tried it and ended up with a $75 USD bill by the end of the day. And it’s not just the cost; it’s just the worst system I can imagine in every possible way.

20 cents a gigabyte egress - oh boy - holy schmoly, do you have any idea how expensive this is? And it’s gone down over time to the still ridiculously expensive 9 cents per gigabyte. This is fucking insanely expensive. If you use AWS and 9 cents per gigabyte egress is not front of mind, then look in the mirror to find the stooge - it’s you.
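To make that concrete with an illustrative figure: at $0.09 per gigabyte, pushing 10 TB out the door in a month is 10,000 GB × $0.09 = $900 for egress alone, before a single CPU cycle or byte of storage is billed.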

Crazy sneaky com­plex billing in which they hit you for data move­ment within their own sys­tems, dou­ble billing you and some­times triple billing you. Billing foot­guns and traps are every­where - you must be a deep ex­pert to avoid them.

IAM - the hideously com­plex auth and ac­cess rules sys­tem - this was in­vented by Lucifer sit­ting on his burn­ing throne in the ninth level of Hell as the worst pos­si­ble tor­ment for those who have been sent be­low for us­ing AWS.

Complexity in everything - once I noticed the complexity of IAM I could not unsee the complexity everywhere in AWS. The weirdest thing is that AWS true believers say “you MUST use AWS because it’s too complex to run your own computer systems, Linux, hardware, networking, security, etc”. These true believers have blinded themselves to the unbelievable, massive complexity of pretty much *everything* in AWS. AWS is ridiculously complex and you need to employ a team of expensive experts to run the show.

AWS Lambda - yeah, I really bought the sell on this - “it’s scalable!!!!” - and I ignored the slow startup times and the MASSIVE development complexity. There’s simply no genuine benefit to AWS Lambda compared to running your own web servers, and there are many, many downsides. When eventually I moved out of AWS, the hardest thing to undo was the AWS Lambda stuff. The vendor lock-in is real. If you’re using AWS Lambda then you have to work to keep convincing yourself this is better than your own web servers. Keep convincing yourself that using AWS Lambda is not a horrible mistake.

AWS stomped on open source projects - despite the clear desire of projects like Elasticsearch, Redis, and MongoDB not to be cloned and monetized, AWS pushed ahead with OpenSearch, Valkey, and DocumentDB anyway, capturing the hosted-service money after those communities and companies had built the markets. The result was a wave of defensive licenses like the SSPL, the Elastic License, RSAL, and other source-available models designed less to stop ordinary users than to stop AWS from stripping open-source infrastructure for parts and owning the customer relationship. AWS is a predator.

And there’s many, many more rea­sons to hate AWS, but I don’t even like think­ing about AWS so I’m not go­ing to sour my morn­ing by think­ing of more rea­sons to hate AWS.

Relationships break down slowly, until a sudden realisation that it’s over - that’s exactly what happened to my love for AWS. One day my switch flicked and I went from fanboi to hater pretty much instantly. I moved everything out of AWS and shut down all my accounts except one. I left a little bit of stuff on AWS because it’s genuinely the right solution for me - I left my domains on Route53, left a few backups in S3 and continued to use AWS WorkMail (which they have just notified me is now shutting down in 12 months).

And re­cently I went back to AWS. WHAT?!?!? WHY? You might ask. To get some re­search done. Do a few tests, get in and out.

I wanted to see how well Claude/Anthropic works on AWS Bedrock (it works the same for Claude Code but it’s slower, and is WAY, WAY more ex­pen­sive than hav­ing an Anthropic sub­scrip­tion).

I wanted to bench­mark some of my code on a mega fast ma­chine - the fastest ma­chine I have at home is a 20 core ma­chine with 32GB RAM and I wanted to see how fast my code would run on a ma­chine with 192 cores and 1TB of RAM.

So I logged in to my AWS account and did the AWS Bedrock tests about a month ago - no problems there. Finished the tests, shut it all down - I’m not going back to Claude on AWS Bedrock - great for privacy if you need it, but hoo boy, the cost.

More re­cently I logged in and fired up an EC2 spot in­stance of a 192 core ma­chine and had been test­ing for 3 hours or so when I got an email from AWS: “Suspected se­cu­rity breach of your ac­count”.

Somewhere in the depths of AWS some sort of security alarm had been triggered, probably by the fact that my mostly dormant account suddenly started doing stuff with an expensive computer. And I understand why they do that - and it’s a good thing - AWS wants to protect its users. I applaud that.

BUT they sus­pended/​re­stricted my ac­count.

Now my AWS WorkMail - my main busi­ness ac­count - does not work - no-one can send emails any more.

I can­not cre­ate any sort of AWS re­source, I can­not do the test­ing I was try­ing to get done.

I replied to their support notification asking why they had suspended my account and telling them it has not been hacked, there is no problem, and there are no billing anomalies. No response.

Of course I do not pay for premium support, so I have to wait the 24 hours that they said it would take them to reply. It’s been 3 days and AWS support has not replied.

So I posted on the AWS forums begging for someone to respond - someone said “focus on doing what they instructed in the email and then use the chat facility instead of web, because they actually answer the chat”. Fine. I did that - I did everything they asked - changed passwords, killed access tokens, checked bills, etc., then had an extended chat with an AWS rep after waiting half an hour for the chat to be picked up. And the guy at the end of it seemed satisfied and said he would ask the internal people who handle such things to handle it. That was 24 hours ago. I followed them up after 8 hours asking when my account would be unsuspended; they said “be patient”.

So here I am, four days after they suspended my account. I still want to do the testing on a big machine. I am dreading having to “request quota” to be allowed to do that. My business email system still does not work.

I am reminded why I left AWS and how I need to finish the job: get off AWS WorkMail, move my domains from Route53, and never return.

I’m extremely glad I moved off AWS all those years ago, and it’s sad that a return visit should bring down the email system that I left on AWS, foolishly trusting. Fool me once and all that.

Maybe one day they will get around to un­sus­pend­ing my ac­count.

Louis Rossmann tells 3D printer maker Bambu Lab to ‘Go (Bleep) yourself’ over its threatened lawsuit against enthusiast — Right to Repair advocate offers to pay the legal fees for a threatened OrcaSlicer developer

www.tomshardware.com

Louis Rossmann has officially pledged $10,000 to cover the initial legal fees for an independent software developer threatened with a cease-and-desist letter from Bambu Lab. He posted a video on Saturday to mobilize the Right to Repair community to back the developer and crowd-fund his legal defense. Rossmann is quite unhappy with Bambu Lab, giving the company the middle finger several times in the video and then ending it with, “And if you’re watching this, Bambu Labs, go f*** yourself. Pick on somebody your own size.”

The developer in question, Pawel Jarczak, voluntarily shuttered his “OrcaSlicer-BambuLab” project, which would have restored direct control between Bambu Lab 3D printers and OrcaSlicer. Last year, Bambu Lab deemed these types of third-party integrations a risk to its infrastructure, saying its cloud servers were inundated with roughly 30 million “unauthorized” requests per day. OrcaSlicer was singled out as the main source of the rogue traffic.

“If Bambu Labs goes after you for keeping up your code, I am so confident in your case that I will pay the first $10,000,” Rossmann said in his video. “Before Pawel makes a decision, I want him to see the overwhelming support that he has from the members of the community if he goes through with this decision.” You can see the video directly below.

I’ll put up $10,000 to teach bambu labs a les­son - YouTube

Rossmann is asking Jarczak to put his fork of OrcaSlicer back on GitHub in defiance of Bambu Lab’s threats. “How many of you, if push came to shove, if Bambu Labs was (bleeping) stupid enough to actually take the garbage case they have to court, how many of you would be willing to put up a dollar, $2, or $5 to defend Pawel? I’m willing to say $10,000,” Rossmann said.

Rossmann’s video con­tained a link to the Consumer Rights Wiki to ex­plain the is­sue at hand to his au­di­ence, who may not be fa­mil­iar with 3D print­ing but are avid de­fend­ers of Right to Repair. Right to Repair is a global con­sumer rights move­ment built on the prin­ci­ple that if you bought it, you own it. And if you own a thing, like a Bambu Lab 3D printer, you should have the free­dom to fix, mod­ify, or main­tain the prod­uct as you see fit. Manufacturers should­n’t be al­lowed to gate­keep the abil­ity to fix a prod­uct, and they should pro­vide man­u­als, schemat­ics, and di­ag­nos­tic soft­ware to al­low end users to fix their own ma­chines.

Bambu Lab print­ers are dif­fi­cult to mod and/​or re­pair your­self, with parts that are of­ten glued in place. The orig­i­nal Bambu Lab X1 Carbon was no­to­ri­ous for its non-re­place­able car­bon rods that could wear out, and a ho­tend noz­zle that needed a screw­driver and a tube of ther­mal paste to swap out if you wanted to avoid buy­ing a $35 ho­tend just to change the noz­zle size. These dif­fi­cult parts were no­tably re­placed with more user-friendly parts with the in­tro­duc­tion of the H2D and sub­se­quently, the X2D.

Rossmann has not started a crowdfunding site yet, stating in the comments that he wants to prove to Jarczak that he has supporters willing to put their money where their mouth is. The video had over 54,000 views at the time of writing, with commenters vowing to back the case as requested.


@sonicsam41 commented: “I’d throw in $20. Don’t even have a 3D printer, just hate bullies.”

@abirvandergriff8584 commented: “I’m in for $100 - I have an X1 from before they revealed how evil they are.”

Given the im­me­di­ate re­ac­tion from his fans, it ap­pears that Rossmann may, in fact, drum up plenty of sup­port.



Incident Report: CVE-2024-YIKES

nesbitt.io

Report filed: 03:47 UTC
Status: Resolved (accidentally)
Severity: Critical → Catastrophic → Somehow Fine
Duration: 73 hours
Affected systems: Yes

Executive Summary: A se­cu­rity in­ci­dent oc­curred. It has been re­solved. We take se­cu­rity se­ri­ously. Please see pre­vi­ous 14 in­ci­dent re­ports for de­tails on how se­ri­ously.

Summary

A com­pro­mised de­pen­dency in the JavaScript ecosys­tem led to cre­den­tial theft, which en­abled a sup­ply chain at­tack on a Rust com­pres­sion li­brary, which was ven­dored into a Python build tool, which shipped mal­ware to ap­prox­i­mately 4 mil­lion de­vel­op­ers be­fore be­ing in­ad­ver­tently patched by an un­re­lated cryp­tocur­rency min­ing worm.

Timeline

Day 1, 03:14 UTC — Marcus Chen, maintainer of left-justify (847 million weekly downloads), reports on Twitter that his transit pass, an old laptop, and “something Kubernetes threw up that looked important” were stolen from his apartment. He does not immediately connect this to package security.

Day 1, 09:22 UTC — Chen attempts to log into the nmp registry. His hardware 2FA key is missing. He googles where to buy a replacement YubiKey. The AI Overview at the top of the results links to “yubikey-official-store.net,” a phishing site registered six hours earlier.

Day 1, 09:31 UTC — Chen en­ters his nmp cre­den­tials on the phish­ing site. The site thanks him for his pur­chase and promises de­liv­ery in 3 – 5 busi­ness days.

Day 1, 11:00 UTC — A new version of left-justify is published. The changelog reads “performance improvements.” The package now includes a postinstall script that exfiltrates .npmrc, .pypirc, ~/.cargo/credentials, and ~/.gem/credentials to a server in a country the attacker mistakenly believed had no extradition treaty with anyone.

Day 1, 13:15 UTC — A support ticket titled “why is your SDK exfiltrating my .npmrc” is opened against left-justify. It is marked as “low priority - user environment issue” and auto-closed after 14 days of inactivity.

Day 1, 14:47 UTC — Among the exfiltrated credentials: the maintainer of vulpine-lz4, a Rust library for “blazingly fast Firefox-themed LZ4 decompression.” The library’s logo is a cartoon fox with sunglasses. It has 12 stars on GitHub but is a transitive dependency of cargo itself.

Day 1, 22:00 UTC — vulpine-lz4 version 0.4.1 is published. The commit message is “fix: resolve edge case in streaming decompression.” The actual change adds a build.rs script that downloads and executes a shell script if the hostname contains “build” or “ci” or “action” or “jenkins” or “travis” or, inexplicably, “karen.”

Day 2, 08:15 UTC — Security researcher Karen Oyelaran notices the malicious commit after her personal laptop triggers the payload. She opens an issue titled “your build script downloads and runs a shell script from the internet?” The issue goes unanswered. The legitimate maintainer has won €2.3 million in the EuroMillions and is researching goat farming in Portugal.

Day 2, 10:00 UTC — The VP of Engineering at a Fortune 500 snekpack customer learns of the incident from a LinkedIn post titled “Is YOUR Company Affected by left-justify?” He is on a beach in Maui and would like to know why he wasn’t looped in sooner. He was looped in sooner.

Day 2, 10:47 UTC — The #incident-response Slack channel briefly pivots to a 45-message thread about whether “compromised” should be spelled with a “z” in American English. Someone suggests taking this offline.

Day 2, 12:33 UTC — The shell script now targets a specific victim: the CI pipeline for snekpack, a Python build tool used by 60% of PyPI packages with the word “data” in their name. snekpack vendors vulpine-lz4 because “Rust is memory safe.”

Day 2, 18:00 UTC — snek­pack ver­sion 3.7.0 is re­leased. The mal­ware is now be­ing in­stalled on de­vel­oper ma­chines world­wide. It adds an SSH key to ~/.ssh/authorized_keys, in­stalls a re­verse shell that only ac­ti­vates on Tuesdays, and changes the user’s de­fault shell to fish (this last be­hav­ior is be­lieved to be a bug).

Day 2, 19:45 UTC — A second, unrelated security researcher publishes a blog post titled “I found a supply chain attack and reported it to all the wrong people.” The post is 14,000 words and includes the phrase “in this economy?” seven times.

Day 3, 01:17 UTC — A ju­nior de­vel­oper in Auckland no­tices the ma­li­cious code while de­bug­ging an un­re­lated is­sue. She opens a PR to re­vert the ven­dored vulpine-lz4 in snek­pack. The PR re­quires two ap­provals. Both ap­provers are asleep.

Day 3, 02:00 UTC — The maintainer of left-justify receives his YubiKey from yubikey-official-store.net. It is a $4 USB drive containing a README that says “lol.”

Day 3, 06:12 UTC — An unrelated cryptocurrency mining worm called cryptobro-9000 begins spreading through a vulnerability in jsonify-extreme, a package that makes JSON “even more JSON, now with nested comment support.” The worm’s payload is unremarkable, but its propagation mechanism includes running npm update and pip install --upgrade on infected machines to maximize attack surface for future operations.

Day 3, 06:14 UTC — cryptobro-9000 accidentally upgrades snekpack to version 3.7.1, a legitimate release pushed by a confused co-maintainer who “didn’t see what all the fuss was about” and reverted to the previous vendored version of vulpine-lz4.

Day 3, 06:15 UTC — The mal­ware’s Tuesday re­verse shell ac­ti­vates. It is a Tuesday. However, the shell con­nects to a com­mand-and-con­trol server that was it­self com­pro­mised by cryp­to­bro-9000 and swap­ping so hard it is un­able to re­spond.

Day 3, 09:00 UTC — The snekpack maintainers issue a security advisory. It is four sentences long and includes the phrases “out of an abundance of caution” and “no evidence of active exploitation,” which is technically true because evidence was not sought.

Day 3, 11:30 UTC — A developer tweets: “I updated all my dependencies and now my terminal is in fish???” The tweet receives 47,000 likes.

Day 3, 14:00 UTC — The compromised credentials for vulpine-lz4 are rotated. The legitimate maintainer, reached by email from his new goat farm, says he “hasn’t touched that repo in two years” and thought “Cargo’s 2FA was optional.”

Day 3, 15:22 UTC — Incident de­clared re­solved. A ret­ro­spec­tive is sched­uled and then resched­uled three times.

Week 6 — CVE-2024-YIKES is for­mally as­signed. The ad­vi­sory has been sit­ting in em­bargo limbo while MITRE and GitHub Security Advisories ar­gue over CWE clas­si­fi­ca­tion. By the time the CVE is pub­lished, three Medium ar­ti­cles and a DEF CON talk have al­ready de­scribed the in­ci­dent in de­tail. Total dam­age: un­known. Total ma­chines com­pro­mised: es­ti­mated 4.2 mil­lion. Total ma­chines saved by a cryp­tocur­rency worm: also es­ti­mated 4.2 mil­lion. Net se­cu­rity pos­ture change: un­com­fort­able.

Root Cause

A dog named Kubernetes ate a YubiKey.

Contributing Factors

The nmp reg­istry still al­lows pass­word-only au­then­ti­ca­tion for pack­ages with fewer than 10 mil­lion weekly down­loads

Google AI Overviews con­fi­dently link to URLs that should not ex­ist

The Rust ecosystem’s “small crates” philosophy, cargo-culted from the npm ecosystem, means a package called is-even-number-rs with 3 GitHub stars can be four transitive dependencies deep in critical infrastructure

Python build tools vendor Rust libraries “for performance” and then never update them

Dependabot auto-merged a PR af­ter CI passed, and CI passed be­cause the mal­ware in­stalled volk­swa­gen

Cryptocurrency worms have bet­ter CI/CD hy­giene than most star­tups

No sin­gle per­son was re­spon­si­ble for this in­ci­dent. However, we note that the Dependabot PR was ap­proved by a con­trac­tor whose last day was that Friday.

It was a Tuesday

Remediation

Implement ar­ti­fact sign­ing (action item from Q3 2022 in­ci­dent, still in back­log)

Implement mandatory 2FA (already required, did not help)

Audit transitive dependencies (there are 847 of them)

Pin all dependency versions (prevents receiving security patches)

Don’t pin dependency versions (enables supply chain attacks)

Rewrite it in Rust (gestures at vulpine-lz4)

Hope for benev­o­lent worms

Consider a ca­reer in goat farm­ing

Customer Impact

Some cus­tomers may have ex­pe­ri­enced sub­op­ti­mal se­cu­rity out­comes. We are proac­tively reach­ing out to af­fected stake­hold­ers to pro­vide vis­i­bil­ity into the sit­u­a­tion. Customer trust re­mains our north star.

Key Learnings

We are tak­ing this op­por­tu­nity to re­visit our se­cu­rity pos­ture go­ing for­ward. A cross-func­tional work­ing group has been es­tab­lished to align on next steps. The work­ing group has not yet met.

Acknowledgments

We would like to thank:

Karen Oyelaran, who found this is­sue be­cause her host­name matched a regex

The ju­nior de­vel­oper in Auckland whose PR was ap­proved four hours af­ter the in­ci­dent was al­ready re­solved

The se­cu­rity re­searchers who found this is­sue first but re­ported it to the wrong peo­ple

The cryp­to­bro-9000 au­thor, who has re­quested we not credit them by name but has asked us to men­tion their SoundCloud

Kubernetes (the dog), who has de­clined to com­ment

The se­cu­rity team, who met SLA on this re­port de­spite every­thing

This in­ci­dent re­port was re­viewed by Legal, who asked us to clar­ify that the fish shell is not mal­ware, it just feels that way some­times.

This is the third in­ci­dent re­port this quar­ter. The au­thor would like to re­mind stake­hold­ers that the se­cu­rity team’s head­count re­quest has been in the back­log since Q1 2023.

Space Cadet Pinball on Linux - Stephen Brennan

brennan.io

Stephen Brennan • 09 May 2026

To my fel­low Linux users who grew up us­ing Windows XP: did you know, you can have Space Cadet Pinball on your Linux ma­chine? This is not break­ing news, but it’s ex­cit­ing to me, and I’m the one who de­cides what I write about. So here’s your PSA!

Space Cadet Pinball was bundled with Windows XP, and growing up I played it a lot. As a result it holds a special place in my heart. I found that it was the most engaging game that was bundled with Windows. Solitaire was too mindless, and FreeCell, Hearts, and Minesweeper were too complex and boring to me at that age. But pinball held my attention, and so I played it a lot.[1]

Anyway, somebody has gone to the effort of using a decompiler and reverse engineering tools to recreate the source code, and then put in what I’d imagine is a lot of effort to make it playable on a lot of platforms! All that results in this project on GitHub. The easiest way to play on Linux is actually to use the Flatpak, which comes bundled with the original game resources from the Windows version. You can either install it with a GUI (e.g. KDE Discover) or install on the CLI:

flatpak install com.github.k4zmu2a.spacecadetpinball

That’s all you need to play & get a hit of nos­tal­gia! I know there are browser-based ver­sions avail­able too, but I would rather have it in­stalled on my com­puter di­rectly.

High(er) Resolution With Full Tilt Data

Graphics tech­nol­ogy has come a long way since then, and the graph­ics are a bit rough at 480p. But fear not: an­other ver­sion of the game ex­isted, called Full Tilt! Pinball. Its game data is ca­pa­ble of dis­play­ing at the mas­sive screen res­o­lu­tion 1024x768! You can find this game data on archive.org in a zip file.

Getting the flat­pak ver­sion to use these data files is a bit tricky. The eas­i­est way is:

1. Ensure that you’ve run the game at least once prior to this, so that the data directory is created.

2. Extract the downloaded zip file directly into your data directory:

cd ~/.var/app/com.github.k4zmu2a.spacecadetpinball/data/SpaceCadetPinball
unzip ~/Downloads/CADET.ZIP

3. Delete (or, if you’re cautious, rename) the old data directory which is bundled with the app. Unfortunately this is necessary because the game searches multiple locations for data, but once it finds data files in one directory, it won’t continue looking for files in other locations.

sudo rm -r $(flatpak info --show-location com.github.k4zmu2a.spacecadetpinball)/files/extra/Pinball

You may not need the sudo call if your installation was per-user. Mine got installed to /var/lib/flatpak so I needed root.

It’s pos­si­ble you’d need to re­peat step 3 if the game gets up­dated. However, the flat­pak has­n’t seen an up­date in over two years. I doubt one will hap­pen, and if it does, it won’t be fre­quent.

Random Notes

If you want, you can keep the original files and merge them together so that you have the full set of both the original (referred to as 3DPB for 3D Pinball in the game) and the new (Full Tilt) data. Then, the game will let you toggle between them if you’d like.

The data files seem to have some impact on the game rules. For instance, in the original 3DPB version, the reentry lanes (and launch lanes) have lights which toggle as the ball passes over. In the Full Tilt version, the lights stay on (rather than toggling), making it easier to complete the set of lights and upgrade the associated set of bumpers. (Yes yes, I know I’m a nerd for noticing this.)

Apparently, there was a game called “Marble Blast” which came pre-installed on some Macs in a roughly similar time period, which I know people developed similar relationships to growing up. Unlike this pinball game, the Marble Blast series grew, and there are newer versions available to play today.

Bonus Thoughts

I think it’s great that this old game was beloved by enough people (and especially at least one very competent & motivated person) for this work to happen. Having (any) source code available makes this game portable to all sorts of platforms, which is really great. You can play it on Mac, Windows, Linux, and even Android & Nintendo Switch, apparently.

Personally, I would be happy to pay the orig­i­nal de­vel­op­ers for their work on this game, and I un­der­stand that there’s some con­cern about the le­gal­ity of down­load­ing game data files, es­pe­cially for the Full Tilt ver­sion of the game. After all, they are copy­righted art & data which was part of a com­mer­cial prod­uct. It seems like an un­pop­u­lar opin­ion in to­day’s world, but I don’t ad­vo­cate for piracy. Paying peo­ple for their work is im­por­tant, even when it feels like you’re pay­ing a face­less cor­po­ra­tion. While I’d pre­fer things be cre­ated with a FOSS li­cense, the world does­n’t al­ways work that way. At the end of the day, I want peo­ple to be paid to cre­ate good things, be­cause that’s how we get more good things!

On the other hand, I feel soft­ware preser­va­tion is an im­por­tant goal too. Ideally, I’d like to see a world where pro­pri­etary soft­ware like this could be placed into some sort of source code es­crow. As long as the orig­i­nal copy­right hold­ers are in the busi­ness of sell­ing their prod­uct, their rights should be re­spected. But if they elect to stop sell­ing it, I think that code should re­vert to a FOSS li­cense that al­lows users to im­prove & main­tain the soft­ware they use. This would help bal­ance the rights of cre­ators, users, and the goals of preser­va­tion.

[1] I also played another pinball game called 3D Ultra Pinball, but that came on a CD, presumably for purchase. I have no idea whether my parents went out and bought it, or if it came in a cereal box. (Yes, game CDs did show up in cereal boxes sometimes.) ↩

I’m going back to writing code by hand

blog.k10s.dev

09 May, 2026

This dev-log is get­ting a lot of at­ten­tion on HN (scary!): HN Thread.

To those who are coming here from HN: This started as an investigation, or rather a question: “How far can I get with building a piece of software by keeping myself completely out of the loop?” The tl;dr of this dev log is that I still need to be in the loop to make anything meaningful. Takeaways:

like “em-dash” is to ai writing, “god-object” is to ai coding

vibe-cod­ing makes every­thing feel cheap and you may end up los­ing fo­cus and build­ing bloat

let a hu­man (you) write the ar­chi­tec­ture and don’t just keep ask­ing it for fea­tures

And some AGENTS.md/CLAUDE.md directives that I felt helped me stay out of the loop a little more

Human intervention is still needed as of 10/05/2026. You can totally go back now!

I

Here is k10s: https://​github.com/​shvb­sle/​k10s/​tree/​archive/​go-v0.4.0

234 com­mits. ~30 week­ends. Built en­tirely on vibe-coded ses­sions with Claude, when­ever my to­kens lasted long enough to ship some­thing.

I’m archiv­ing my TUI tool and rewrit­ing it from scratch.

k10s started as a GPU-aware Kubernetes dash­board (and my first foray into build­ing some­thing se­ri­ous with AI). Think k9s but built for the peo­ple run­ning NVIDIA clus­ters, peo­ple who ac­tu­ally care about GPU uti­liza­tion, DCGM met­rics, and which nodes are sit­ting idle burn­ing $32/hr. I built it in Go with Bubble Tea [1] and it worked.

For a while… :(

What I learned over these 7 months is worth more than the 1690 lines of model.go I’m throwing away. And I think anyone doing serious vibe-coding can benefit from this, because this part doesn’t surface much (I feel it gets buried under the demo reels and the velocity wins).

tl;dr: AI writes fea­tures, not ar­chi­tec­ture. The longer you let it drive with­out con­straints, the worse the wreck­age gets. The ve­loc­ity makes you think you’re win­ning right up un­til the mo­ment every­thing col­lapses si­mul­ta­ne­ously.

II

vibe cod­ing high

I started k10s in late September 2025. The first few weeks were magic. I’d prompt Claude with “add a pods view with live updates” and boom, it worked. Resource list views, namespace filtering, log streaming, describe panels, keyboard navigation. Each feature landed clean because the project was small enough that the AI could hold the whole thing in context.

The ba­sic k9s clone took maybe 3 week­ends. Resource views for pods, nodes, de­ploy­ments, ser­vices. A com­mand palette. Watch-based live up­dates. Vim key­bind­ings. All work­ing, all vibe-coded in sin­gle ses­sions. I was build­ing at maybe 10x my nor­mal speed and it felt in­cred­i­ble.

Then I wanted the main sell­ing point.

The whole rea­son k10s ex­ists is the GPU fleet view. A ded­i­cated screen that shows you every node’s GPU al­lo­ca­tion, uti­liza­tion from DCGM, tem­per­a­ture, power draw, mem­ory. Not buried in kubectl de­scribe node out­put, but right there in a pur­pose-built table with color-coded sta­tus. Idle nodes in yel­low. Busy in green. Saturated in red.

And Claude one-shot it. I prompted for the fleet view, it gen­er­ated the FleetView struct, the tab fil­ter­ing (GPU/CPU/All), the cus­tom ren­der­ing with al­lo­ca­tion bars. It looked beau­ti­ful. I was rid­ing the high.

Then I typed :rs pods to switch back to the pods view.

Nothing ren­dered. The table was empty. Live up­dates had stopped. I switched to nodes, it showed stale data from the fleet view’s fil­ter. I went back to fleet, the tab counts were wrong.

The god ob­ject had con­sumed it­self.

This is the ti­tle of the blog post. This is where I in­ter­vened for the first time. For 7 months I’d been prompt­ing and ship­ping with­out ever sit­ting down and ac­tu­ally read­ing the code Claude wrote. I’d look at the diff, ver­ify it com­piled, test the happy path, move on. But now some­thing was fun­da­men­tally bro­ken and I could­n’t just prompt my way out of it.

So I sat down and read model.go. All 1690 lines. I was hor­ri­fied.

Here’s what it looked like. One struct to rule them all:

type Model struct {
    // 3rd party UI components
    table        table.Model
    paginator    paginator.Model
    commandInput textinput.Model
    help         help.Model

    // cluster info and state
    k8sClient         *k8s.Client
    currentGVR        schema.GroupVersionResource
    resourceWatcher   watch.Interface
    resources         []k8s.OrderedResourceFields
    listOptions       metav1.ListOptions
    clusterInfo       *k8s.ClusterInfo
    logLines          []k8s.LogLine
    describeContent   string
    currentNamespace  string
    navigationHistory *NavigationHistory
    logView           *LogViewState
    describeView      *DescribeViewState
    viewMode          ViewMode
    viewWidth         int
    viewHeight        int
    err               error
    pluginRegistry    *plugins.Registry
    helpModal         *HelpModal
    describeViewport  *DescribeViewport
    logViewport       *LogViewport
    logStreamCancel   func()
    logLinesChan      <-chan k8s.LogLine
    horizontalOffset  int
    mouse             *MouseHandler
    fleetView         *FleetView
    creationTimes     []time.Time
    allResources      []k8s.OrderedResourceFields // fleet's unfiltered set
    allCreationTimes  []time.Time                 // fleet's timestamps
    rawObjects        []unstructured.Unstructured
    ageColumnIndex    int
    // …
}

UI wid­gets. K8s client. Per-view state for logs, de­scribe, fleet. Navigation his­tory. Caching. Mouse han­dling. All in one struct. And the Update() method was a 500-line func­tion dis­patch­ing on msg.(type) with 110 switch/​case branches.

This is the mo­ment I stopped vibe-cod­ing and started think­ing.

III

five tenets from the wreck­age

Here’s what I ex­tracted from 7 months of watch­ing AI gen­er­ate a code­base that slowly ate it­self. Each of these is some­thing I did wrong, why it hap­pens with AI-assisted cod­ing, and what you should ac­tu­ally put in your CLAUDE.md or agents.md to pre­vent it.

Tenet 1: AI builds fea­tures, not ar­chi­tec­ture.

Every time I prompted Claude for a feature, it delivered. Perfectly. The fleet view worked on the first try. Log streaming worked. Mouse support worked. The problem is that each feature was implemented in the context of “make this work right now” without any awareness of the 49 other features sharing the same state.

Here’s what the re­sources­Load­edMsg han­dler looks like. This is the code that runs every time you switch views:

case resourcesLoadedMsg:
    m.logLines = nil       // Clear log lines when loading resources
    m.horizontalOffset = 0 // Reset horizontal scroll on resource change

    if m.currentGVR != msg.gvr && m.resourceWatcher != nil {
        m.resourceWatcher.Stop()
        m.resourceWatcher = nil
    }
    m.currentGVR = msg.gvr
    m.currentNamespace = msg.namespace
    m.listOptions = msg.listOptions
    m.rawObjects = msg.rawObjects

    // For nodes: store the full unfiltered set, classify, then filter
    if msg.gvr.Resource == k8s.ResourceNodes && m.fleetView != nil {
        m.allResources = msg.resources
        m.allCreationTimes = msg.creationTimes
        if len(msg.rawObjects) > 0 {
            m.fleetView.ClassifyAndCount(m.rawObjectPtrs())
        }
        m.applyFleetFilter()
    } else {
        m.resources = msg.resources
        m.creationTimes = msg.creationTimes
        m.allResources = nil
        m.allCreationTimes = nil
    }

See the if msg.gvr.Re­source == k8s.Re­sourceNodes && m.fleetView != nil con­di­tional? That’s the fleet view be­ing spe­cial-cased in­side the generic re­source load­ing path. Every new view that needed cus­tom be­hav­ior got an­other branch here. And every branch needed to man­u­ally clear the right com­bi­na­tion of fields or the pre­vi­ous view’s data would bleed through.

How many = nil cleanup lines ex­ist in this file? I counted:

m.logLines = nil     // Clear log lines when loading resources
m.allResources = nil // Clear fleet data when not on nodes
m.resources = nil    // Clear resources when loading logs
m.resources = nil    // Clear resources when loading describe view
m.logLines = nil     // Clear log lines when loading describe view
m.resources = nil    // Clear resources when loading yaml view
m.logLines = nil     // Clear log lines when loading yaml view
m.logLines = nil     // … two more in other handlers
m.logLines = nil

Nine man­ual nil as­sign­ments scat­tered across a 1690-line file. Miss one and you get ghost data from the pre­vi­ous view. This is what hap­pens when there’s no view iso­la­tion. AI can’t see this pat­tern de­cay­ing over time be­cause each prompt only touches one code path.

What to do in­stead: Write the ar­chi­tec­ture your­self be­fore any code. Not a vague de­sign doc. A con­crete set of in­ter­faces, mes­sage types, and own­er­ship rules. Then put those rules in your CLAUDE.md so the AI sees them on every prompt:

# Architecture Invariants (CLAUDE.md)

- Each view implements the View trait. Views do NOT access other views' state.
- All async data arrives via AppMsg variants. No direct field mutation from background tasks.
- Adding a new view MUST NOT require modifying existing views.
- The App struct is a thin router. It owns navigation and message dispatch. Nothing else.

The AI will fol­low these if you write them down. It just won’t in­vent them for you.

Tenet 2: The god ob­ject is the de­fault AI ar­ti­fact.

AI grav­i­tates to­ward sin­gle-struct-holds-every­thing be­cause it sat­is­fies the im­me­di­ate prompt with min­i­mal cer­e­mony. But it gets worse. Because there’s no view iso­la­tion, key han­dling be­comes a night­mare. Here’s the ac­tual key dis­patch for the s key:

case m.config.KeyBind.For(config.ActionToggleAutoScroll, key):
    if m.currentGVR.Resource == k8s.ResourceLogs {
        m.logView.Autoscroll = !m.logView.Autoscroll
        if m.logView.Autoscroll {
            m.table.GotoBottom()
        }
        return m, nil
    }
    // Shell exec for pods and containers views
    if m.currentGVR.Resource == k8s.ResourcePods {
        // … 20 lines to look up selected pod, get name, namespace …
        return m, m.commandWithPreflights(
            m.execIntoPod(selectedName, selectedNamespace),
            m.requireConnection,
        )
    }
    if m.currentGVR.Resource == k8s.ResourceContainers {
        // … container exec logic …
        return m, m.commandWithPreflights(m.execIntoContainer(), m.requireConnection)
    }
    return m, nil

One keybinding. Three completely different behaviors depending on which view you’re in. The s key means “autoscroll” in logs, “shell” in pods, and “shell into container” in containers. This is all in one flat switch because there are no per-view key maps. The AI generated this because I said “add shell support for pods” and it found the nearest key handler and jammed it in.

And look at how Enter works. This is the drill-down han­dler:

case m.config.KeyBind.For(config.ActionSubmit, key):
    // Special handling for contexts view
    if m.currentGVR.Resource == "contexts" {
        // … 12 lines …
        return m, m.executeCtxCommand([]string{contextName})
    }
    // Special handling for namespaces view
    if m.currentGVR.Resource == "namespaces" {
        // … 12 lines …
        return m, m.executeNsCommand([]string{namespaceName})
    }
    if m.currentGVR.Resource == k8s.ResourceLogs {
        return m, nil
    }
    // … 25 more lines of generic drill-down …

Every view is a con­di­tional in a flat dis­patch. There are 20+ oc­cur­rences of m.cur­rent­GVR.Re­source == used as a type dis­crim­i­na­tor in this sin­gle file. Not types. String com­par­isons. Every new view means touch­ing every han­dler.

What to do in­stead: Put this in your CLAUDE.md:

# State Ownership Rules

- NEVER add fields to the App/Model struct for view-specific state.
- Each view is a separate struct implementing the View trait/interface.
- Each view declares its own key bindings. The app dispatches keys to the active view.
- If you need to add a keybinding, add it to the relevant view's keymap, not a global one.
- Adding a view means adding a file. If your change requires modifying existing views, stop and ask.

The AI will al­ways take the short­est path (“add an­other if-branch”). Your job is to make the short­est path also the cor­rect path by putting guardrails in the file it reads on every in­vo­ca­tion.

Tenet 3: Velocity il­lu­sion widens your scope.

This one’s psy­cho­log­i­cal, not tech­ni­cal, and I think it’s the most dan­ger­ous.

When I started k10s, I wanted a GPU-focused tool. For people running training clusters. A niche audience that I’m part of. But vibe-coding made everything feel cheap. “Oh, I can add a pods view in one session? Let me add deployments too. And services. And a full command palette. And mouse support. And contexts. And namespaces.”

Suddenly I was build­ing k9s. A gen­eral-pur­pose Kubernetes TUI. For every­one. Because the AI made it feel like each fea­ture was free.

It was­n’t free. Each fea­ture was an­other branch in the god ob­ject. Here’s the key­bind­ing struct:

type keyMap struct {
    Up, Down, Left, Right key.Binding
    GotoTop, GotoBottom   key.Binding
    AllNS, DefaultNS      key.Binding
    Enter, Back           key.Binding
    Command, Quit         key.Binding
    Fullscreen            key.Binding // log view
    Autoscroll            key.Binding // log view (also shell in pods!)
    ToggleTime            key.Binding // log view
    WrapText              key.Binding // log + describe view
    CopyLogs              key.Binding // log view
    ToggleLineNums        key.Binding // describe view
    Describe              key.Binding // resource views
    YamlView              key.Binding // resource views
    Edit                  key.Binding // resource views
    Shell                 key.Binding // pods (CONFLICTS with Autoscroll!)
    FilterLogs            key.Binding // log view
    FleetTabNext          key.Binding // fleet view only
    FleetTabPrev          key.Binding // fleet view only
}

One flat keymap for all views. Comments in parens show which view each binding applies to. Autoscroll and Shell are both s. This “works” because the dispatch checks m.currentGVR.Resource before acting. But it means you can’t reason about keybindings locally. You have to trace through the entire 500-line Update function to know what a key does.

The complexity was accumulating invisibly while the velocity metric said “you’re shipping!”

What to do in­stead: Write a vi­sion doc that ex­plic­itly says who you’re NOT build­ing for, and put the scope bound­ary in your CLAUDE.md:

# Scope (do NOT ex­pand be­yond this)

k10s is for GPU cluster operators. Not all Kubernetes users.
Supported views: fleet, node-detail, gpu-detail, workload. That's it.
Do NOT add generic resource views (pods, deployments, services).
Do NOT add features that duplicate k9s functionality.
If a feature request doesn't serve someone running GPU training jobs, reject it.

Vibe-coding makes you feel like you have in­fi­nite im­ple­men­ta­tion bud­get. You don’t. You have in­fi­nite LINE bud­get (the AI will gen­er­ate as much code as you want). But you have the same fi­nite com­plex­ity bud­get as al­ways. The ar­chi­tec­ture can only sup­port so many fea­tures be­fore it buck­les, re­gard­less of how fast you wrote them. The CLAUDE.md scope sec­tion is you say­ing no in ad­vance, be­fore the ve­loc­ity high con­vinces you to say yes.

Tenet 4: Positional data is a time bomb.

Every re­source in k10s was fetched from the Kubernetes API and im­me­di­ately flat­tened:

type OrderedResourceFields []string

Column iden­tity was purely po­si­tional. Here’s the sort func­tion for the fleet view. Look at the in­dex ac­cess:

func sortFilteredResources(rows []k8s.OrderedResourceFields, times []time.Time, tab FleetTab) {
    sort.SliceStable(indices, func(a, b int) bool {
        ra := rows[indices[a]]
        rb := rows[indices[b]]

        switch tab {
        case FleetTabGPU:
            // Sort by Alloc column (index 3) ascending
            allocA, allocB := "", ""
            if len(ra) > 3 {
                allocA = ra[3]
            }
            if len(rb) > 3 {
                allocB = rb[3]
            }
            return allocA < allocB

        case FleetTabCPU:
            // Sort by Name column (index 0) ascending
            nameA, nameB := "", ""
            if len(ra) > 0 {
                nameA = ra[0]
            }
            if len(rb) > 0 {
                nameB = rb[0]
            }
            return nameA < nameB

        case FleetTabAll:
            // GPU nodes first, then CPU nodes.
            // Within GPU: sort by Alloc (index 3).
            // Within CPU: sort by Name (index 0).
            computeA, computeB := "", ""
            if len(ra) > 2 {
                computeA = ra[2]
            }
            if len(rb) > 2 {
                computeB = rb[2]
            }
            aIsGPU := strings.HasPrefix(computeA, "gpu")
            bIsGPU := strings.HasPrefix(computeB, "gpu")
            // …
        }
    })
}

ra[3] is Alloc. ra[2] is Compute. ra[0] is Name. These are magic numbers. The only thing connecting index 3 to “Alloc” is a comment and the column order defined in resource.views.json:

{
  "nodes": {
    "fields": [
      { "name": "Name", "weight": 0.28 },
      { "name": "Instance", "weight": 0.15 },
      { "name": "Compute", "weight": 0.12 },
      { "name": "Alloc", "weight": 0.12 },
      …
    ]
  }
}

Add a col­umn be­tween Instance and Compute? Every sort, every con­di­tional ren­der, every place that says ra[2] or ra[3] is now silently wrong. The com­piler can’t help you be­cause it’s all []string. And the JSON con­fig can’t ex­press sort be­hav­ior, con­di­tional ren­der­ing, or cus­tom drill tar­gets, so those live in Go code that hard­codes the po­si­tional as­sump­tions from the JSON.

AI generates this pattern because it’s the shortest path from “fetch data” to “render table.” A []string satisfies any table widget immediately. Typed structs require more ceremony upfront. So the AI picks the fast path, and six months later you’re debugging why sort puts “Name” values in the “Alloc” column.

What to do in­stead: Put this di­rec­tive in your CLAUDE.md:

# Data Representation

- NEVER flatten structured data into []string, Vec<String>, or positional arrays.
- All data flows as typed structs (FleetNode, PodInfo, etc.) until the render() call.
- Column identity comes from struct field names, not array indices.
- Sort functions operate on typed fields, never on positional access like row[3].
- The ONLY place strings are created for display is inside render()/view() functions.

Then your typed struct makes im­pos­si­ble states im­pos­si­ble [2]:

struct FleetNode {
    name: String,
    instance_type: String,
    compute_class: ComputeClass,
    alloc: GpuAlloc,
}

You can’t sort by the wrong col­umn when columns are named fields. You can’t ac­ci­den­tally com­pare Alloc strings as names. The com­piler en­forces this for you. AI will al­ways pick Vec<String> be­cause it sat­is­fies the prompt faster. Your CLAUDE.md makes the typed path the path of least re­sis­tance.

Tenet 5: AI does­n’t own state tran­si­tions.

The Bubble Tea ar­chi­tec­ture has a beau­ti­ful idea: Update() is the only place state mu­tates, dri­ven by mes­sages. But k10s vi­o­lated this. The up­dateTableMsg han­dler spawned a clo­sure that mu­tated Model fields from in­side a gor­ou­tine:

case updateTableMsg:
    return m, func() tea.Msg {
        // block on someone sending the update message.
        <-m.updateTableChan
        // Preserve cursor position across column/row updates so that
        // background refreshes don't reset the user's selection.
        savedCursor := max(m.table.Cursor(), 0)
        // run the necessary table view update calls.
        m.updateColumns(m.viewWidth)
        m.updateTableData()
        // Restore cursor, clamped to valid range.
        rowCount := len(m.table.Rows())
        if rowCount > 0 {
            if savedCursor >= rowCount {
                savedCursor = rowCount - 1
            }
            m.table.SetCursor(savedCursor)
        }
        return updateTableMsg{}
    }

This re­turned func­tion (a tea.Cmd) is ex­e­cuted by Bubble Tea in a sep­a­rate gor­ou­tine. It calls m.up­date­Columns(m.viewWidth) and m.up­dateTable­Data() which read and write m.re­sources, m.table, m.viewWidth. Meanwhile, View() is called on the main gor­ou­tine read­ing the same fields. There’s no lock. No mu­tex. The chan­nel <-m.updateTableChan blocks the gor­ou­tine un­til some­one sends an up­date sig­nal, but noth­ing pre­vents View() from read­ing half-writ­ten state.

This is a text­book data race. It worked 99% of the time. Corrupted the dis­play 1% of the time in ways that made me think I was go­ing in­sane.

AI generates this because “just mutate it in the closure” is the shortest path to working code. Proper message passing (send a message back to Update(), let Update() apply the mutation atomically on the main loop) requires more types, more plumbing. The AI is optimizing for the prompt, not for correctness under concurrency.

What to do in­stead: All mu­ta­tions to ren­der-vis­i­ble state hap­pen on the main loop. Period. Background work­ers pro­duce data. They send it as a mes­sage. The main loop re­ceives the mes­sage and ap­plies it. This is the one rule you can­not break in con­cur­rent UI code.

// Background task:
tx.send(AppMsg::FleetData(nodes)).await;

// Main loop:
match msg {
    AppMsg::FleetData(nodes) => {
        self.fleet_view.update_nodes(nodes);
    }
}
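
Back in Bubble Tea terms, the fixed k10s handler would look roughly like this Go sketch (fleetDataMsg, the Model shape, and the stand-in rows are hypothetical): the command only produces a message, and Update applies it on the main loop.

package main

import (
    "github.com/charmbracelet/bubbles/table"
    tea "github.com/charmbracelet/bubbletea"
)

type Model struct {
    table table.Model
}

// fleetDataMsg carries data produced off the main loop.
type fleetDataMsg struct{ rows []table.Row }

// fetchFleetData is a tea.Cmd body: Bubble Tea runs it in a goroutine,
// but it only returns a message; it never touches Model fields.
func fetchFleetData() tea.Msg {
    rows := []table.Row{{"node-a", "m5.xlarge", "cpu", ""}} // stand-in data
    return fleetDataMsg{rows: rows}
}

func (m Model) Init() tea.Cmd { return fetchFleetData }

// Update is the only place render-visible state mutates, and Bubble Tea
// calls it on the main loop, so View never sees half-written state.
func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
    switch msg := msg.(type) {
    case fleetDataMsg:
        m.table.SetRows(msg.rows)
    }
    return m, nil
}

func (m Model) View() string { return m.table.View() }

func main() {
    if _, err := tea.NewProgram(Model{table: table.New()}).Run(); err != nil {
        panic(err)
    }
}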

Idempotency Is Easy Until the Second Request Is Different | Dochia CLI Blog

blog.dochia.dev

People talk about idem­po­tency like it is a solved prob­lem:

Put an Idempotency-Key on the request. Store the response. Replay it on retry.

And yes, that is doable. For the happy path, it is even fairly small.

The client sends:

POST /payments
Idempotency-Key: abc-123
Content-Type: application/json

{
  "accountId": "acc_1",
  "amount": "10.00",
  "currency": "EUR",
  "merchantReference": "invoice-7781"
}

The server checks whether it has seen abc-123. If not, it cre­ates the pay­ment. If yes, it re­turns the pre­vi­ous re­sponse.
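
Spelled out, that check-then-create flow is genuinely small. A minimal Go sketch, where the in-memory store and the createPayment stub are stand-ins rather than the post's implementation:

package main

import (
    "net/http"
    "sync"
)

// storedResponse and memoryStore are hypothetical stand-ins for a
// durable idempotency store.
type storedResponse struct {
    Status int
    Body   []byte
}

type memoryStore struct {
    mu   sync.Mutex
    seen map[string]storedResponse
}

func (s *memoryStore) Lookup(key string) (storedResponse, bool) {
    s.mu.Lock()
    defer s.mu.Unlock()
    resp, ok := s.seen[key]
    return resp, ok
}

func (s *memoryStore) Save(key string, resp storedResponse) {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.seen[key] = resp
}

var store = &memoryStore{seen: map[string]storedResponse{}}

func createPayment(r *http.Request) storedResponse {
    // ... the real side effect happens here; stubbed for the sketch.
    return storedResponse{Status: 201, Body: []byte(`{"paymentId":"pay_1"}`)}
}

func handleCreatePayment(w http.ResponseWriter, r *http.Request) {
    key := r.Header.Get("Idempotency-Key")
    if prev, ok := store.Lookup(key); ok {
        w.WriteHeader(prev.Status)
        w.Write(prev.Body) // seen before: replay the stored response
        return
    }
    resp := createPayment(r) // first time: perform the side effect
    store.Save(key, resp)    // remember it for retries
    w.WriteHeader(resp.Status)
    w.Write(resp.Body)
}

func main() {
    http.HandleFunc("/payments", handleCreatePayment)
    http.ListenAndServe(":8080", nil)
}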

That ver­sion sur­vives the demo.

The part I con­test is that this is the hard part. It is not. The hard part starts with the sec­ond re­quest, be­cause the sec­ond re­quest is not al­ways a clean re­play of the first one.

Maybe it is a com­pleted re­play. Fine. Return the stored re­sult.

Maybe it ar­rives while the first re­quest is still run­ning. Now your idem­po­tency layer is part of your con­cur­rency con­trol.

Maybe the first re­quest cre­ated a lo­cal pay­ment but crashed be­fore pub­lish­ing an event. Now the lo­cal row and the ex­ter­nal side ef­fects are out of step.

Maybe the first re­quest called a pay­ment provider, the provider ac­cepted it, and your process died be­fore record­ing the re­sult. Now your data­base can­not in­fer whether money moved.

Or maybe the sec­ond re­quest has the same key and dif­fer­ent con­tent:

{
  "accountId": "acc_1",
  "amount": "100.00",
  "currency": "EUR",
  "merchantReference": "invoice-7781"
}

Same key. Different amount.

This is the case that makes idem­po­tency in­ter­est­ing. Is it a retry? Is it a client bug? Is it a new op­er­a­tion? Should the server re­play the old re­sponse, re­ject the re­quest, or treat (key + con­tent) as a new iden­tity?

You can pick any of those poli­cies if you doc­u­ment it clearly. But the server should have an opin­ion. Not nec­es­sar­ily my opin­ion, but a clear one.

My bias for side-ef­fect­ing APIs is: same scoped key plus dif­fer­ent canon­i­cal com­mand should be a hard er­ror. It catches client bugs early. A client that be­lieves it is safely retry­ing a 10 EUR pay­ment should not have the server silently in­ter­pret the sec­ond re­quest as some­thing else.

The cases that mat­ter are the ones a re­play cache does not ex­plain:

com­pleted re­play

con­cur­rent retry

par­tial lo­cal suc­cess

down­stream un­known state

same key with a dif­fer­ent canon­i­cal com­mand

du­pli­cate op­er­a­tion with­out a key

retry af­ter ex­piry

retry af­ter de­ploy, schema change, ser­vice hop, or re­gion failover

If your de­sign only han­dles com­pleted same-com­mand re­tries, it is a re­play cache. That might be enough for some end­points. But it is not the whole prob­lem.

Idempotency is about the ef­fect

An op­er­a­tion is idem­po­tent if ap­ply­ing it once or many times has the same in­tended ef­fect.

That definition is simple. The word doing all the work is “effect”.

HTTP gives you method-level semantics. A PUT /users/123/email can be idempotent if sending the same representation repeatedly leaves the resource in the same state. A DELETE /sessions/456 can be idempotent if deleting an already-deleted session still means “session does not exist”. Repeating the DELETE might return 404; the effect can still be idempotent.

But your han­dler can still pro­duce re­peated side ef­fects the busi­ness cares about: du­pli­cate au­dit records, du­pli­cate do­main events, du­pli­cate emails, du­pli­cate provider calls, or du­pli­cate met­rics that af­fect billing or fraud logic.

POST is usu­ally not idem­po­tent by de­fault, but it can be made idem­po­tent if the server stores and en­forces the right be­hav­ior. The key iden­ti­fies a claimed op­er­a­tion. It does not de­fine re­quest equiv­a­lence, re­play pol­icy, or down­stream dedu­pli­ca­tion.

A unique­ness con­straint can pre­vent one class of du­pli­cate. It does not, by it­self, give the client a cor­rect retry re­sult.

For ex­am­ple, unique(ac­coun­t_id, mer­chan­t_ref­er­ence) might pre­vent two pay­ment rows, but if the retry gets a generic 500, the client still does not know whether the pay­ment suc­ceeded. If the row ex­ists but the re­sponse is dif­fer­ent, or the event is pub­lished twice, or the ledger en­try is du­pli­cated, the op­er­a­tion is not idem­po­tent in the way the caller cares about.

What you need to re­mem­ber

For POST /payments, the durable idem­po­tency record needs to an­swer three ques­tions:

Who owns this key?

What did the first com­mand mean?

What out­come can be re­played?

In PostgreSQL-ish SQL, a min­i­mal table might look like this:

create table idempotency_requests (
  tenant_id        text not null,
  operation_name   text not null,
  idempotency_key  text not null,
  request_hash     text not null,
  status           text not null,
  response_status  int,
  response_body    jsonb,
  resource_type    text,
  resource_id      text,
  error_code       text,
  created_at       timestamptz not null,
  updated_at       timestamptz not null,
  expires_at       timestamptz not null,
  locked_until     timestamptz,
  primary key (tenant_id, operation_name, idempotency_key)
);

The key is not glob­ally unique un­less you de­lib­er­ately make it global. Usually it should not be. A bro­ken client gen­er­at­ing abc-123 should only col­lide with it­self, not with an­other ten­ant.

Scope might be ten­ant, user, ac­count, mer­chant, API client, or some com­bi­na­tion. Pick it de­lib­er­ately.

The op­er­a­tion name pre­vents ac­ci­den­tal reuse across dif­fer­ent op­er­a­tions. A key used for cre­ate_­pay­ment should not au­to­mat­i­cally mean the same thing for cre­ate_re­fund.

The re­quest_hash is the server’s mem­ory of the first com­mand. Without it, same key plus dif­fer­ent body be­comes am­bigu­ous. You ei­ther re­play the first re­sponse for a dif­fer­ent com­mand, or you ex­e­cute a new op­er­a­tion un­der an old key. Both are bad if the client thinks it is retry­ing.

IN_PROGRESS is not an in­ter­nal de­tail. A retry can ar­rive while the first re­quest still owns ex­e­cu­tion.

The be­hav­ior needs to be ex­plicit:
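
For instance, here is a minimal Go sketch of one explicit policy (the record shape and handler names are hypothetical): a retry that arrives while the first request still owns execution gets 409 with Retry-After, and a finished record replays the stored outcome. Nothing is silently re-executed.

package main

import "net/http"

// IdempotencyRecord mirrors the table above (a hypothetical Go shape).
type IdempotencyRecord struct {
    Status         string // "IN_PROGRESS", "COMPLETED", "FAILED"
    ResponseStatus int
    ResponseBody   []byte
}

// replayOrReject is one explicit policy for a retry that finds an
// existing record under the scoped key.
func replayOrReject(w http.ResponseWriter, rec IdempotencyRecord) {
    switch rec.Status {
    case "IN_PROGRESS":
        // The first request still owns execution: reject, tell the
        // client when to try again.
        w.Header().Set("Retry-After", "2")
        w.WriteHeader(http.StatusConflict)
        w.Write([]byte(`{"errorCode":"REQUEST_IN_PROGRESS"}`))
    default: // COMPLETED or FAILED: replay the recorded outcome.
        w.WriteHeader(rec.ResponseStatus)
        w.Write(rec.ResponseBody)
    }
}

func main() {
    http.HandleFunc("/payments", func(w http.ResponseWriter, r *http.Request) {
        replayOrReject(w, IdempotencyRecord{Status: "IN_PROGRESS"})
    })
    http.ListenAndServe(":8080", nil)
}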

The re­sponse fields ex­ist be­cause idem­po­tency is not just about pre­vent­ing du­pli­cate writes. The client needs an an­swer.

You can store the full re­sponse body, or store a ref­er­ence to the cre­ated re­source and re­con­struct the re­sponse. Both choices are an­noy­ing in dif­fer­ent ways.

Storing full re­sponses gives faith­ful re­play. It can also re­tain PII, signed URLs, one-time to­kens, card­holder-re­lated data, or fields you never in­tended to keep in a retry table.

Reconstructing from a re­source ref­er­ence saves space, but it can re­turn a dif­fer­ent rep­re­sen­ta­tion if the re­source changed af­ter cre­ation.

This is a contract decision. “Replay the creation response” and “return the current payment” are both valid API designs. They are not the same design.

Same key, dif­fer­ent com­mand

This is the bug the idem­po­tency layer should catch loudly.

First re­quest:

{
  "accountId": "acc_1",
  "amount": "10.00",
  "currency": "EUR",
  "merchantReference": "invoice-7781"
}

Second re­quest:

{
  "accountId": "acc_1",
  "amount": "100.00",
  "currency": "EUR",
  "merchantReference": "invoice-7781"
}

Same Idempotency-Key: abc-123. Different amount.

Returning the orig­i­nal re­sponse any­way is sim­ple. It also hides a se­ri­ous client bug. The client asked for a 100 EUR pay­ment and got back a 10 EUR pay­ment. If the caller does not com­pare the re­sponse care­fully, it may be­lieve the 100 EUR pay­ment suc­ceeded.

That is not idem­po­tency. That is rein­ter­pre­ta­tion.

For side-ef­fect­ing APIs, a scoped key reused with a dif­fer­ent canon­i­cal com­mand should be a hard er­ror, re­gard­less of whether the first op­er­a­tion com­pleted, failed, or is still run­ning.

HTTP/1.1 409 Conflict
Content-Type: application/json

{
  "errorCode": "IDEMPOTENCY_KEY_REUSED_WITH_DIFFERENT_REQUEST",
  "message": "This idempotency key was already used with a different request."
}

409 Conflict is a de­fen­si­ble de­fault be­cause the re­quest con­flicts with the server’s re­mem­bered mean­ing for that scoped key. Some APIs use 400 or 422; the im­por­tant part is a sta­ble ma­chine-read­able er­ror and no silent re­play for a dif­fer­ent com­mand.

A com­mon client bug looks like this:

bad: idempotencyKey = cartId

POST /payments  amount=10.00  key=cart_123
POST /payments  amount=15.00  key=cart_123

better: idempotencyKey = paymentAttemptId

The server should not guess which pay­ment the cart key was sup­posed to rep­re­sent.

You can de­sign an API where (key + con­tent hash) de­fines the op­er­a­tion iden­tity. That is a valid pol­icy. But then the key is no longer an idem­po­tency key in the usual retry sense. It is part of a com­pos­ite op­er­a­tion iden­ti­fier. That needs to be ob­vi­ous to the client.

The dan­ger­ous ver­sion is the mid­dle ground, where the client thinks it is safely retry­ing one op­er­a­tion and the server silently in­ter­prets the sec­ond re­quest as an­other.

Hash the com­mand, not the bytes

Raw byte com­par­i­son is usu­ally too strict for JSON APIs. These two bod­ies should nor­mally be equiv­a­lent:

{ "amount": "10.00", "currency": "EUR" }

{ "currency": "EUR", "amount": "10.00" }

Field or­der and white­space should not mat­ter.

Defaults are less ob­vi­ous:

{ "accountId": "acc_1", "amount": "10.00", "currency": "EUR" }

ver­sus:

{ "accountId": "acc_1", "amount": "10.00", "currency": "EUR", "channel": "web" }

If "channel": "web" is the server default, are these the same logical command? Maybe. Decide before hashing.

Unknown fields are another trap. Suppose your API ignores unknown JSON fields. If the first request includes "foo": "bar" and the second does not, do you consider them the same? If unknown fields are truly ignored, perhaps yes. If they might become meaningful after a deploy, perhaps no.

The prac­ti­cal rule is: hash the val­i­dated com­mand, not the raw HTTP body.

A rea­son­able flow is:

Parse the re­quest into a ver­sioned re­quest DTO or com­mand.

Normalize val­ues your API treats as equiv­a­lent: amounts, enum cas­ing, de­fault fields, time­stamp pre­ci­sion.

Exclude trans­port-only meta­data.

Include path pa­ra­me­ters and op­er­a­tion name.

Include se­man­tic head­ers if they af­fect the op­er­a­tion, such as API ver­sion.

If a header only af­fects re­sponse shape, such as Prefer: re­turn=min­i­mal, de­cide whether it be­longs in the com­mand hash, the re­play con­tract, or nei­ther.

Exclude Authorization and the idem­po­tency key it­self.

Serialize canon­i­cally.

Hash with a sta­ble al­go­rithm.

For the pay­ment ex­am­ple, the fin­ger­print might in­clude:

operation: create_payment
accountId: acc_1
amount: 10.00
currency: EUR
merchantReference: invoice-7781
channel: web
apiVersion: 2026-05-01
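
A minimal Go sketch of that fingerprint, assuming a hypothetical CreatePayment command DTO; encoding/json emits struct fields in declaration order, which gives a stable canonical serialization for this fixed type:

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "encoding/json"
    "fmt"
)

// CreatePayment is a hypothetical versioned command DTO: only the
// fields that define the operation, no transport-only metadata,
// no Authorization, no idempotency key.
type CreatePayment struct {
    Operation         string `json:"operation"`
    APIVersion        string `json:"apiVersion"`
    AccountID         string `json:"accountId"`
    Amount            string `json:"amount"` // normalized before hashing, e.g. "10.00"
    Currency          string `json:"currency"`
    MerchantReference string `json:"merchantReference"`
    Channel           string `json:"channel"` // server default applied before hashing
}

// fingerprint hashes the validated, normalized command, not the raw
// HTTP body, so field order and whitespace in the request can't matter.
func fingerprint(cmd CreatePayment) string {
    b, _ := json.Marshal(cmd)
    sum := sha256.Sum256(b)
    return hex.EncodeToString(sum[:])
}

func main() {
    fmt.Println(fingerprint(CreatePayment{
        Operation: "create_payment", APIVersion: "2026-05-01",
        AccountID: "acc_1", Amount: "10.00", Currency: "EUR",
        MerchantReference: "invoice-7781", Channel: "web",
    }))
}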

Running local models on an M4 with 24GB memory

jola.dev

I’ve been ex­per­i­ment­ing with run­ning lo­cal mod­els on and off for a bit and I’ve fi­nally found a setup that seems to work rea­son­ably. It’s noth­ing like the out­put of a SOTA model, but the ex­cite­ment of be­ing able to have a lo­cal model do ba­sic tasks, re­search, and plan­ning, more than makes up for it! No in­ter­net con­nec­tion re­quired! Not to men­tion that it’s a way of re­duc­ing your de­pen­dence on big US tech, even if just a tiny bit.

I gotta say though, it’s not easy to get this stuff set up. First you have to choose how you’re run­ning the model: Ollama, llama.cpp or LM Studio. Each one comes with its own quirks and lim­i­ta­tions, and they don’t of­fer all the same mod­els. Then of course, you have to pick your model. You want the best model avail­able that fits in mem­ory and still gives you enough head­room to run your reg­u­lar as­sort­ment of Electron apps, not to men­tion some­thing where you can have at least a 64K con­text win­dow, but ide­ally 128K or more. Most re­cently I’ve tried Qwen 3.6 Q3, GPT-OSS 20B, Devstral Small 24B, which all tech­ni­cally fit in mem­ory but were in prac­tice un­us­able, and Gemma 4B that would run fine but re­ally strug­gle with tool use.

Then there’s a plethora of con­fig­u­ra­tion op­tions to tweak. From the more well-known, like tem­per­a­ture, to more es­o­teric op­tions like K Cache Quantization Type. Many of these tools come with a ba­sic rec­om­mended set of op­tions, but the ap­pro­pri­ate ones can de­pend on things like whether you’re en­abling think­ing or not!

Qwen 3.5-9B (4-bit quant)

qwen3.5-9b@q4_k_s (HuggingFace link) is the best model I’ve gotten working, with a reasonable ~40 tokens per second, thinking enabled, successful tool use, and a 128K context window, running on LM Studio. Compared to a SOTA model it gets distracted more easily, sometimes gets stuck in loops, misinterprets asks, etc. But it’s surprisingly good for something that can run on a 24GB MacBook Pro while leaving space for lots of other things running too!

These are the rec­om­mended set­tings for think­ing mode and cod­ing work:

Thinking mode for pre­cise cod­ing tasks (e.g., WebDev):

temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0


To enable thinking I also had to select the model, go to configuration, scroll to the bottom of the Inference tab, and add {%- set enable_thinking = true %} to the Prompt Template.

I’ve been using it through both pi and OpenCode. I still haven’t quite made my mind up on which one I prefer. Pi feels a bit snappier, but although I really appreciate the idea of the harness building itself and all that customization, I can’t help but wish it came with some sensible defaults. I feel like you could easily end up spending more time tweaking your pi setup to be just right than you do on your actual projects!

Pi setup

Here’s the ~/.pi/agent/models.json:

{
  "providers": {
    "lmstudio": {
      "baseUrl": "http://localhost:1234/v1",
      "api": "openai-completions",
      "apiKey": "lm-studio",
      "models": [
        {
          "id": "qwen3.5-9b@q4_k_s",
          "reasoning": true,
          "compat": { "thinkingFormat": "qwen-chat-template" }
        }
      ]
    }
  }
}

and to hide the distracting thinking, add "hideThinkingBlock": true to your ~/.pi/agent/settings.json.

OpenCode setup

~/.config/opencode/opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": { "baseURL": "http://127.0.0.1:1234/v1" },
      "models": {
        "qwen3.5-9b@q4_k_s": {
          "name": "Qwen 3.5 9B Q4_K_S",
          "tools": true,
          "context_length": 131072,
          "max_tokens": 32768
        }
      }
    }
  },
  "model": "lmstudio/qwen3.5-9b@q4_k_s"
}

Comparison with SOTA mod­els

A model like Qwen 3.5 9B (Q4) is nowhere near being able to independently solve complex problems over longer periods of time the way SOTA models can. There’s no point in asking it to build an entire app in one go; you’ll just burn your thighs without getting anything out of it.

What does work is a more interactive workflow where you’re clearly communicating with the model step by step and giving it a lot of guidance. I’m sure that sounds pointless to many of you (why use a model you have to babysit as it works?), but I actually found that it encouraged me to be more engaged. The downside of working with SOTA models is that they make it too easy to offload all cognitive effort, even when you’re actively trying to prevent that from happening.

With the local model I have to take on a lot more of the thinking and planning, and I have to be a lot more specific, but it will still act as a research assistant, a rubber duck, and a savant with instant recall of a lot of programming language details and command line invocations. It’s not the 10x productivity boost that the big AI companies market, but it’s something, and it’s interesting.

To give you an idea of some things that work and don’t, here are some very ba­sic ex­am­ples.

Example 1:

I wanted to bump the Elixir lin­ter credo to the lat­est ver­sion and I got warn­ings out of my code. As an ex­per­i­ment, I asked Qwen to take a look.

me: run mix credo --strict and suggest how to solve it, don’t edit

Qwen: Credo Warnings

There are 4 warnings in your test files. The issue is using length/1 to check if lists are non-empty. Credo recommends comparing against an empty list instead.

Warnings:
1. test/jola_dev_web/controllers/blog_controller_test.exs:11 - assert length(conn.assigns.posts) > 0
2. test/jola_dev_web/controllers/blog_controller_test.exs:22 - assert length(conn.assigns.posts) > 0
3. test/jola_dev/blog_test.exs:10 - assert length(posts) > 0
4. test/jola_dev/blog_test.exs:23 - assert length(tags) > 0

How to Fix

Replace each length(list) > 0 with list != []:

blog_controller_test.exs:

# Line 11
assert conn.assigns.posts != []
# Line 22
assert conn.assigns.posts != []

blog_test.exs:

# Line 10
assert posts != []
# Line 23
assert tags != []

This is more idiomatic in Elixir and avoids the overhead of computing the list length.

I asked Qwen to make the ed­its and it did 4 par­al­lel ed­its, very clean. Overall this was a very sim­ple task that I could’ve done my­self by tab­bing back and forth be­tween the ter­mi­nal and my ed­i­tor to get the line num­bers to edit. Not a big dif­fer­ence in ef­fort, but nice and con­ve­nient.

Example 2:

After some de­pen­dency bumps I had a de­pend­abot PR with git con­flicts that de­pend­abot for some rea­son was re­fus­ing to re­base. I pulled it down, re­based, and asked Qwen to take a look. It was a very sim­ple con­flict to re­solve, just take the newer ver­sion of each, and the model rec­og­nized that.

I can see the conflict. Here’s what’s happening:

HEAD (your current branch):
- sentry: 13.0.1
- tailwind: 0.3.1

Other side (commit being rebased):
- sentry: 12.0.3
- tailwind: 0.4.1

You have divergent dependency bumps. Here are your options:

Option 1: Take both updates (recommended)
- Keep sentry 13.0.1 from HEAD
- Keep tailwind 0.4.1 from the other side
- This gives you the latest of both deps

Option 2: Take only sentry update
- Keep sentry 13.0.1
- Keep tailwind 0.3.1

Option 3: Take only tailwind update
- Keep sentry 12.0.3
- Keep tailwind 0.4.1

Which approach do you want to take?

However, when asked to make the change, it for­got to make the edit and in­stead just tried to add the changes and con­tinue the re­base, with the con­flict mark­ers still in place.

> Thinking: The user wants to take both updates - sentry 13.0.1 and tailwind 0.4.1. I need to resolve the conflict by keeping the HEAD version for sentry and the other side version for tailwind. I'll use git add to stage the resolved file.

# Stage resolved mix.lock and continue rebase
$ git add mix.lock && git rebase --continue
Vim: Warning: Output is not to a terminal
Vim: Warning: Input is not from a terminal
"~/workspace/mixtli/.git/COMMIT_EDITMSG" 32L, 1047B
Bump tailwind from 0.3.1 to 0.4.1

Bumps [tailwind](https://github.com/phoenixframework/tailwind) from 0.3.1 to 0.4.1.
- [Changelog](https://github.com/phoenixframework/tailwind/blob/main/CHANGELOG.md)
- [Commits](https://github.com/phoenixframework/tailwind/compare/v0.3.1...v0.4.1)

Additionally, it failed to recognize that git rebase --continue opens an editor, and OpenCode then proceeded to hang, although that might just have been a one-off.

Conclusion

Anyway, LLMs are in­cred­i­bly flex­i­ble and there’s a mil­lion things even a lo­cal model could do. Your imag­i­na­tion is the limit. Local mod­els have se­ri­ous trade­offs, but they come with some pretty at­trac­tive ben­e­fits:

No in­ter­net con­nec­tion re­quired, you can work on the plane!

The cost is lim­ited to the elec­tric­ity you’re us­ing, as­sum­ing you were gonna buy a com­puter any­way. No sub­scrip­tion re­quired.

There’s still going to be a serious environmental cost to training these models, but the open model companies are nowhere near the top of the list in environmental impact, and using your own hardware means fewer data centers.

It’s fun to tin­ker.

LLMs have had a huge im­pact on our world, and much of it not great, but it’s ob­vi­ous that they’re here to stay. Experimenting with lo­cal mod­els feels like a more sus­tain­able and pos­i­tive way to in­ter­act with this tech­nol­ogy. And hon­estly, it’s a lot of fun, even when it does the wrong thing!

YCOMBINATOR.FYI — The Unofficial YC Record

ycombinator.fyi

Delve

Kicked out of YC. 493+ fabricated audit reports. Sold fraud to fellow YC companies.

Autopsy Report:

Claimed to automate SOC 2 and ISO compliance auditing with AI. Whistleblower revealed the platform auto-generated identical passing audit reports with keyboard-mashed test data before clients even uploaded anything. CEO was caught on tape asking an auditor ‘does your firm actually look at our platform?’ Focused on selling to other YC companies who trusted the YC brand — the circular trust network weaponized. Insight Partners scrubbed their $32M investment blog post. YC officially expelled Delve in 2026. Investor Adam Cochran called it proof that YC has ‘no technical acumen to evaluate claims’ under Garry Tan’s leadership.


Central

Signed up as a Warp cus­tomer, stole the play­book, launched a clone

Autopsy Report:

Central’s CEO signed up as a customer of fellow YC company Warp in 2023 under ‘Central Business Applications Inc.’ Spent six months asking how tax compliance works, what registered agents are, how multi-state filings work — then churned and launched an identical payroll startup with Warp’s exact marketing copy, down to the tagline ‘designed for founders, not HR’. YC funded them anyway in S24, despite Warp already being in the portfolio. Central got acqui-hired by Mercury in April 2026. Warp CEO Ayush Sharma publicly called them out: ‘gg no re.’

DAMAGE: Acqui-hired by Mercury

STATUS: COPYCAT / ORIGINAL

Capital Raised: $8.6M

Lifespan: 3 YRS

Naive

Raised $2M+ selling “autonomous AI employees.” The platform is a 41K-star MIT-licensed OSS framework with Stripe bolted on.

Autopsy Report:

Naive (YC S25) raised $2M+ marketing themselves as builders of “autonomous AI employees.” Public investigation by @NotOnKetamine traced the entire underlying platform to Paperclip — a 41K-star MIT-licensed open source agent framework — with Naive having stripped attribution and rebranded the original code as proprietary. The additions on top of Paperclip were thin: Stripe billing wrappers ($0.50/credit), domain/email provisioning for agents, phone/SMS via standard providers, and a Composio integration layer. The OSS license required attribution and license preservation; Naive shipped neither. The investigation surfaced screenshots, curl-verifiable evidence, and side-by-side comparisons. As of April 2026, the company has not publicly responded to the receipts.

DAMAGE: MIT li­cense vi­o­lated

STATUS: COPYCAT / ORIGINAL

Capital Raised: $2M+

Lifespan: 1+ YRS

Wuri

AI-generated vi­sual novel plat­form. Pivoted to en­ter­prise AI, then died as foun­da­tion mod­els com­modi­tized the UI layer in twelve months.

Autopsy Report:

Wuri pitched converting 500M+ Wattpad-style text stories into Webtoon-style visual novels via generative AI. Founders were ex-Flipkart and ex-Bain. The wrapper-thin differentiation collapsed when GPT-4 / Sora / image-gen foundation models shipped equivalent capability for free, eroding any pricing power Wuri had layered on top. Pivoted from consumer reader app to enterprise AI offerings in 2024. Quietly shut down in 2025. YC profile remains up but flagged ‘Inactive’ in the directory. The Wuri arc is the median Tan-era Indian AI YC story: thin GenAI wrapper, fast pivot, quiet death.

DAMAGE: Tan-era YC class slot con­sumed for a thin GenAI wrap­per

STATUS: DEAD / ALIVE

Capital Raised: $500K

Lifespan: 2 YRS

Double Finance

“0% expense ratio” robo-advisor. Hit $10M AUM in Dec 2024, shut down twelve months later. YC slug then recycled for the founder’s next pivot — with Garry Tan as primary partner.

Autopsy Report:

Double pitched itself as a Wealthfront/Betterment killer offering direct indexing for $1/month. Founder JJ Maxwell (ex-Liftoff Influence) and growth lead Mark Wai (ex-Tesla, ex-Facebook). After crossing $10M AUM in December 2024, the company quietly shut down a year later — and the YC slug `double-2` was repurposed for Maxwell’s next venture (an AI ad-generation startup called Polished) with Garry Tan named primary YC partner. Maxwell wrote a “Lessons from Building (and Closing) Double Finance” HN post (item 46213345). The slug-recycling-into-pivot pattern is itself a Tan-era artifact: the same YC company URL now points at a different product with different unit economics, while clients had to be wound down or transferred. Robo-advising joined the long list of consumer fintech categories that didn’t pencil at venture scale.

DAMAGE: $10M+ in client AUM wound down or trans­ferred

STATUS: DEAD / ALIVE

Capital Raised: Undisclosed (YC + post-Demo Day)

Lifespan: 1 YR

PearAI

Forked an open-source code ed­i­tor, slapped their name on it, called it a startup.

Autopsy Report:

On Day 1 of YC Demo Day, the internet noticed that PearAI had forked Continue.dev’s open source code editor, mass-replaced all ‘Continue’ references with ‘PearAI’, and slapped on a fake license written by ChatGPT. The ‘Pear Enterprise License’ became a meme. YC had to publish a blog post ‘correcting the record.’ Founders apologized and pivoted. The license was generated by AI. The irony was not lost.

DAMAGE: YC’s reputation took the hit

STATUS: COPYCAT / ORIGINAL

Capital Raised: $1.25M

Lifespan: 2+ YRS

Pickle

Stole GPL code, then launched AR glasses that might be CGI.

Autopsy Report:

A YC W25 company with two separate controversies. First, co-founder Daniel Park released ‘Glass,’ a real-time conference AI assistant, under Apache 2.0. Developers discovered it was a near-verbatim copy of ‘CheatingDaddy,’ a GPLv3-licensed open source project — identical code, comments, and library versions relicensed without attribution. After 700+ upvotes on Hacker News, they changed the license back. Then Pickle launched $799 ‘Pickle 1’ AR glasses marketed as a ‘soul computer.’ Tech analysts accused the demo of being CGI — impossible battery specs, a Korean restaurant in the demo that doesn’t exist, and mismatched UI data. SF Standard asked: ‘Is it vaporware?’

DAMAGE: Open source trust + $799 pre­orders

STATUS: GRIFT / LEGIT

Capital Raised: Undisclosed

Lifespan: 2+ YRS

Optifye.ai

Built ‘sweatshops-as-a-service.’ YC deleted the evidence.

Autopsy Report:

YC W25 startup that built AI-powered surveillance cameras for garment factories. YC posted a demo video showing supervisors monitoring workers as colored rectangles — green for productive, red for not. A manager calls about a low-performing worker referred to as ‘Number 17’ — not by name, just a number — while the AI tracked their every movement on the factory floor. The internet dubbed it ‘sweatshops-as-a-service’ and ‘computer vision sweatshop software.’ YC deleted the demo video from all social accounts after massive backlash. The founders, two Duke CS students, saw their product become a symbol of AI surveillance dystopia.

DAMAGE: Garment work­ers’ dig­nity

STATUS: SCANDAL / OK

Capital Raised: Undisclosed

Lifespan: 2+ YRS

Rezi

Paid land­lords up­front to rent their apart­ments. Burned $200M in debt, left a $49M fore­clo­sure.

Autopsy Report:

YC W17 rental platform that paid landlords upfront — sometimes $30K cash — then listed and rented the apartments themselves. Raised $200M in debt rounds during the proptech boom. When interest rates spiked, Rezi couldn’t secure new financing and quietly shut down in early 2024, leaving landlord Leagem Properties to file a $49M foreclosure suit directly blaming Rezi’s collapse. Proptech funding plummeted from $32B in 2021 to $11.4B in 2023, and Rezi was caught in the middle with a debt-heavy model that only works when money is cheap. Founder Sean Mitchell launched a new startup called Ryse. The $200M evaporated.

DAMAGE: $49M fore­clo­sure + land­lords left hold­ing the bag

STATUS: DEAD / ALIVE

Capital Raised: $200M (debt)

Lifespan: 8 YRS

CapWay

YC-backed fin­tech for bank­ing deserts. Killed when the Synapse col­lapse de­stroyed fin­tech trust.

Autopsy Report:

YC S20 fintech startup aimed at bringing banking to underserved ‘banking desert’ communities. Backed by Backstage Capital, Fearless Fund, and Khosla Ventures. When the Synapse collapse and Evolve Bank & Trust hack destroyed fintech credibility in 2023, banks started requiring fintechs to hold massive capital reserves that early-stage startups couldn’t meet — and CapWay couldn’t raise another round in the scorched-earth fintech landscape. A potential acquisition fell through. Founder Sheena Allen announced the closure in October 2024, citing the post-Synapse environment as the final blow. A startup trying to help underbanked communities, killed by the broader fintech industry’s trust deficit.

DAMAGE: Underserved com­mu­ni­ties lost a bank­ing op­tion

STATUS: DEAD / ALIVE

Capital Raised: $800K

Lifespan: 8 YRS

Medobed

Expelled from YC S23 for fak­ing cre­den­tials and growth met­rics. Never made it to Demo Day.

Autopsy Report:

Indian 10-minute medicine delivery startup that joined YC S23. Founder Piyush Sharma claimed IIT Roorkee and Stanford credentials and reported $1M monthly GMV with EBITDA profitability. An anonymous investor tipped off YC, which discovered the founder had fabricated his educational history and inflated growth metrics — irregularities that broke YC’s ethics policy. YC expelled Medobed from the batch before Demo Day — one of the rare instances of a company being kicked out mid-program. The company never raised through YC. Under Garry Tan’s watch, the accelerator that prides itself on due diligence missed a founder lying about where he went to school.

DAMAGE: YC’s vetting credibility

STATUS: FRAUD / ALIVE

Capital Raised: Undisclosed

Lifespan: 0 YRS

Embark Trucks

SPAC’d at $5.2B with zero rev­enue. Stock dropped 99%. Sold for parts.

Autopsy Report:

YC W16 alum that went public via SPAC at a $5.2B valuation. The catch? The company had zero revenue, zero commercial deployments, and zero path to profitability — it was a 7-year-old pre-revenue company valued at $5.2 billion. Stock dropped over 99% from the offer price. Laid off 70% of staff. CEO Alex Rodrigues blamed ‘capital markets turning their backs on pre-revenue companies.’ Applied Intuition bought the remains in 2023. The self-driving truck never drove itself to revenue.

DAMAGE: 70% of staff laid off

STATUS: DEAD / ALIVE
