10 interesting stories served every morning and every evening.

Claude Fable 5 and Claude Mythos 5

www.anthropic.com

Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use.

Fable 5’s ca­pa­bil­i­ties ex­ceed those of any model we’ve ever made gen­er­ally avail­able. It is state-of-the-art on nearly all tested bench­marks of AI ca­pa­bil­ity, show­ing ex­cep­tional per­for­mance in soft­ware en­gi­neer­ing, knowl­edge work, vi­sion, sci­en­tific re­search, and many other ar­eas. The longer and more com­plex the task, the larger Fable 5’s lead over our other mod­els.

Releasing a model this ca­pa­ble comes with risks. Without safe­guards, Fable 5’s ca­pa­bil­i­ties in ar­eas like cy­ber­se­cu­rity could be mis­used to cause se­ri­ous dam­age. We’ve there­fore launched the model with safe­guards that mean queries on some top­ics will in­stead re­ceive a re­sponse from our next-most-ca­pa­ble model, Claude Opus 4.8. To re­lease the model both safely and quickly, we’ve tuned these safe­guards con­ser­v­a­tively—they’ll some­times catch harm­less re­quests, though they trig­ger, on av­er­age, in less than 5% of ses­sions. With more ca­pa­ble mod­els ar­riv­ing in the com­ing months, we’re work­ing to im­prove our safe­guards and re­duce false pos­i­tives as quickly as we can.

For a small group of cyber defenders and infrastructure providers, we’re also launching Claude Mythos 5. It’s the same underlying model as Fable 5, but with the safeguards lifted in some areas. Mythos 5 will initially be deployed through Project Glasswing, in collaboration with the US government, as an upgrade to Claude Mythos Preview. It has the strongest cybersecurity capabilities of any model in the world. Soon, we intend to expand access to Mythos 5 through a broader trusted access program.

The ca­pa­bil­i­ties of mod­els like Fable 5 and Mythos 5 have the po­ten­tial to do pro­found good for the world. We’ve seen the be­gin­nings of this in Project Glasswing, where the mod­els have helped cy­ber de­fend­ers se­cure crit­i­cally im­por­tant soft­ware. We’ve also seen it in life sci­ences re­search, where the mod­els are posit­ing novel hy­pothe­ses and speed­ing up the de­vel­op­ment of new ther­a­peu­tics.

Fable 5 and Mythos 5 are be­ing of­fered at $10 per mil­lion in­put to­kens and $50 per mil­lion out­put to­kens—less than half the price of Claude Mythos Preview. Today’s joint launch is an­other step to­wards our goal of bring­ing ad­vanced AI ca­pa­bil­i­ties to as many users as pos­si­ble, as quickly and as safely as we can.

The table be­low com­pares the ca­pa­bil­i­ties of Fable 5 and Mythos 5 to other lead­ing mod­els.

Fable 5 and Mythos 5 can work au­tonomously for longer than any pre­vi­ous Claude mod­els. Below we dis­cuss how these skills ap­ply to soft­ware en­gi­neer­ing, and cover the mod­el’s im­proved ca­pa­bil­i­ties in knowl­edge work, vi­sion, mem­ory, and life sci­ences re­search.

Software en­gi­neer­ing. During early test­ing, Stripe re­ported that Fable 5 com­pressed months of en­gi­neer­ing into days. In a 50-million-line Ruby code­base, the model per­formed a code­base-wide mi­gra­tion in a day that would oth­er­wise have taken a whole team over two months by hand. Fable 5 is also more to­ken-ef­fi­cient than past Claude mod­els: on Cognition’s FrontierCode eval­u­a­tion, which tests whether mod­els can pass dif­fi­cult cod­ing tasks while meet­ing the stan­dards of high-qual­ity pro­duc­tion code­bases, Fable 5 scores high­est among fron­tier mod­els, even at medium ef­fort.

Knowledge work. Fable 5 shows strong per­for­mance on com­plex an­a­lyt­i­cal tasks. On Hebbia’s Finance Benchmark for se­nior-level rea­son­ing, Fable 5 has the high­est score of any model, with sub­stan­tial gains in doc­u­ment-based rea­son­ing, chart and table in­ter­pre­ta­tion, and prob­lem solv­ing. IMC noted that Fable 5 aced their trad­ing-analy­sis eval­u­a­tions nearly across the board, in­clud­ing fac­tual lookup, con­cep­tual rea­son­ing, root-cause analy­sis, and ex­pected-value analy­sis.

Vision. Fable 5 is the new state-of-the-art model for tasks in­volv­ing vi­sion. It can ex­tract pre­cise num­bers from de­tailed sci­en­tific fig­ures and can per­form com­plex vi­sion-based tasks like re­build­ing a web ap­p’s source code from screen­shots alone. It also needs less scaf­fold­ing: for ex­am­ple, pre­vi­ous Claude mod­els strug­gled to play Pokémon FireRed even with har­nesses that gave them ad­di­tional help­ful tools, but Fable 5 beat FireRed with a min­i­mal, vi­sion-only har­ness.

Memory and long-con­text. Fable 5 stays fo­cused across mil­lions of to­kens in long-run­ning tasks and im­proves its out­puts us­ing its own notes. When we had the model play the deck-build­ing game Slay the Spire, giv­ing it ac­cess to per­sis­tent file-based mem­ory im­proved its per­for­mance three times more than for Opus 4.8; Fable also reached the game’s fi­nal act three times more of­ten.

Drug design. Using Mythos 5, our internal protein design experts accelerated aspects of the drug design process by around ten times. In one example, they found that Mythos 5, with protein design and bioinformatics tools but no human assistance, matches or beats skilled human operators. In doing so, the model executes all of the tasks that are normally completed by a scientist: choosing binding sites, selecting and running protein design tools, and recovering from failures along the way. Nine of the 14 protein targets from this study (shown below) yielded strong candidates for drug design that we’re currently investigating.

Novel hy­pothe­ses in mol­e­c­u­lar bi­ol­ogy. Mythos 5 is our first model to con­sis­tently pro­duce novel, com­pelling sci­en­tific hy­pothe­ses. In blinded head-to-head com­par­isons against Opus-class mod­els, our sci­en­tists pre­ferred Mythos’s mol­e­c­u­lar bi­ol­ogy hy­pothe­ses ~80% of the time, and have ad­vanced sev­eral to ex­per­i­men­tal eval­u­a­tion. In the mean­time, one Mythos hy­poth­e­sis—a novel mech­a­nism for an E. coli pro­tein—was cor­rob­o­rated in a study from a lab in­de­pen­dently work­ing on the same prob­lem.

Novel re­search in ge­nomics. Mythos 5 con­ducted novel ge­nomics re­search in over a week of largely au­tonomous work. It as­sem­bled sin­gle-cell data for mil­lions of cells span­ning 138 an­i­mal species and de­signed and trained a cus­tom ma­chine learn­ing model to iden­tify cells per­form­ing the same role in even dis­tantly re­lated or­gan­isms. With only high-level hu­man in­put, Mythos 5’s trained model out­per­formed a re­cent model pub­lished in the jour­nal Science—despite be­ing 100 times smaller. We in­tend to pub­lish these re­sults in the com­ing months.

Alignment. In our au­to­mated align­ment as­sess­ment we found that Mythos 5’s level of mis­aligned be­hav­ior (including mis­aligned ac­tions taken by the model such as de­cep­tion, and co­op­er­a­tion with mis­use of the model by a user) was low, and sim­i­lar to that of Opus 4.8. Given they are the same un­der­ly­ing model, Fable 5’s level of align­ment will be sim­i­lar. The as­sess­ment is de­scribed in full, along with a de­tailed suite of other safety and ca­pa­bil­i­ties tests, in the mod­el’s sys­tem card.

Early feed­back for Claude Fable 5

Customers with early ac­cess ran their own tests on Fable 5. Below, in their words, is a se­lec­tion of what they’re see­ing:

Claude Fable 5 is the state of the art model on CursorBench. It’s opened up a class of long-hori­zon prob­lems that were out of reach for ear­lier mod­els.

Claude Fable 5 is a real step for­ward for the de­vel­op­ers GitHub serves. In our early test­ing, it took on com­plex, long-hori­zon cod­ing tasks with a level of au­ton­omy and re­li­a­bil­ity that ex­ceeded pre­vi­ous bench­marks. But what ex­cites us most is the di­rec­tion it points: a fu­ture where de­vel­op­ers can hand in­creas­ingly am­bi­tious work to agents and trust the re­sults across the soft­ware life­cy­cle.

These are the strongest re­sults of any Claude model we’ve had the op­por­tu­nity to test. Claude Fable 5 is a clear step for­ward on agen­tic cod­ing and pro­to­typ­ing.

Claude Fable 5’s reasoning is a clear step beyond Opus 4.8. It works at senior research scientist grade: picking directions, allocating resources, killing its incorrect beliefs, and producing novel first-principles outputs.

Claude Fable 5 un­der­stands what builders mean, not just what they type. Apps that took a hun­dred prompts a year ago, it now one-shots. When a cus­tomer re­ally hits a wall, it’s the model we reach for to get them past it quickly, so they can fin­ish what they set out to build.

Claude Fable 5 feels ma­te­ri­ally dif­fer­ent. In blind re­view, our lawyers found its red­lines matched or beat our cur­rent model every time.

At the high­est ef­fort, Claude Fable 5 re­flects on and val­i­dates its own work. For us, that’s what makes highly au­tonomous op­er­a­tions pos­si­ble — the ex­tra think­ing pays for it­self.

Claude Fable 5 de­liv­ers more ca­pa­ble en­gi­neer­ing in fewer turns than prior mod­els — han­dling the com­plex multi-agent work­flows our em­ploy­ees run daily in Claude Code.

Claude Fable 5 is the high­est-scor­ing model on FrontierBench, Cognition’s fron­tier cod­ing eval. It ex­cels at long-hori­zon rea­son­ing and gen­er­al­izes to un­fa­mil­iar tools out of the box.

Claude Fable 5 is the strongest fi­nance-first model we’ve tested, both on gen­eral fi­nance and rea­son­ing. It’s a no­table step up.

Claude Fable 5 is the first to break 90% on our core an­a­lyt­ics bench­mark of com­plex, long-run­ning an­a­lyt­i­cal tasks — a 10-point jump over Opus. On the hard­est ques­tions, it shows strong judg­ment and at­ten­tion to nu­ance.

Claude Fable 5 is the strongest model we’ve tested on fron­tier physics re­search while us­ing a third of the rea­son­ing to­kens. In 36 hours it got nearly to where GPT-5.5 landed af­ter four days.

On ViBench, our end-to-end vibe-cod­ing bench­mark, Claude Fable 5 is the high­est-per­form­ing model we’ve tested — nearly sat­u­rat­ing our base use cases and build­ing apps in less time with fewer to­kens.

Claude Fable 5 beats Opus 4.8 on our everyday spreadsheet suite at every effort level, and it does it with fewer turns, finishing runs 25–30% faster.

Claude Fable 5’s new safe­guards

Mythos-class mod­els have reached a thresh­old where they pre­sent sig­nif­i­cant risks. In April we be­gan Project Glasswing, re­leas­ing the first Mythos-class model (Claude Mythos Preview) to only a lim­ited group of cy­ber de­fend­ers and crit­i­cal soft­ware in­fra­struc­ture providers. When we did so, we stated that we hoped to even­tu­ally re­lease Mythos-level ca­pa­bil­i­ties to all our users, so long as we had de­vel­oped new safe­guards that were strong enough to re­li­ably pre­vent mis­use.

Over the past few months we have been im­prov­ing these safe­guards, and they are now ro­bust enough for a gen­eral re­lease. Because we have pri­or­i­tized safety, we’ve de­lib­er­ately tuned the safe­guards to be cau­tious, and they are still stricter than would be ideal—for ex­am­ple, some­times be­nign re­quests will trig­ger our clas­si­fiers. We rec­og­nize that this will be frus­trat­ing to some users, and our aim is to re­duce false pos­i­tives as we up­date and re­fine the safe­guards af­ter launch.

Below we dis­cuss each of Fable 5’s new safe­guards in turn. Our wider suite of safe­guards is dis­cussed and eval­u­ated in the mod­el’s sys­tem card and our most re­cent risk re­port.

Safety clas­si­fiers

The fron­tier cy­ber­se­cu­rity and re­search bi­ol­ogy ca­pa­bil­i­ties of Mythos-class mod­els mean that they pose a sub­stan­tial risk of up­lift to ma­li­cious ac­tors. That is, these mod­els could pro­vide in­for­ma­tion or ad­vice that as­sists those ac­tors in caus­ing se­ri­ous harm that they could­n’t have re­ceived from other sources (for ex­am­ple, from in­ter­net search en­gines). Furthermore, a great deal of ad­vanced us­age of AI mod­els is dual use: the same queries that are ben­e­fi­cial in the hands of cy­ber­se­cu­rity pro­fes­sion­als and bi­ol­ogy re­searchers could be dan­ger­ous if avail­able to ma­li­cious ac­tors.

We therefore need strong safeguards to prevent misuse, and their coverage needs to be broad. The safeguards themselves have to stand up to sustained and sophisticated attempts to bypass them (also known as “jailbreaking” the system). The uplift from Mythos-level capabilities is valuable to many adversaries (for instance, those who could financially gain from cyberattacks), and we therefore expect them to be motivated to try to circumvent our safety measures.

Fable 5 comes with a new set of clas­si­fiers: sep­a­rate AI sys­tems that de­tect po­ten­tial mis­use, in­clud­ing jail­break at­tempts, and pre­vent the main model (in this case Fable 5) from re­spond­ing. We’ve been run­ning clas­si­fiers on our mod­els for some time, and Fable 5’s clas­si­fiers are an ex­ten­sion of this pre­vi­ous work with ex­tra cov­er­age.

When Fable’s clas­si­fiers de­tect a re­quest re­lated to cy­ber­se­cu­rity, bi­ol­ogy and chem­istry, or dis­til­la­tion, the re­sponse is au­to­mat­i­cally han­dled by Claude Opus 4.8 in­stead. Users will be in­formed when­ever this oc­curs. Opus 4.8 is a highly ca­pa­ble model in its own right: a re­sponse that falls back to Opus is a far bet­ter ex­pe­ri­ence than an out­right re­fusal from Fable. Our early data shows that more than 95% of Fable ses­sions in­volve no fall­back at all—for those ses­sions, Fable 5’s per­for­mance is ef­fec­tively the same as that of Mythos 5.
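The fallback behaviour described above can be pictured as a routing step that sits in front of the model. The sketch below is illustrative only: the real classifiers are separate AI systems whose internals are not described here, so the keyword matcher, the keyword lists, the `Route` type, and the fallback identifier `claude-opus-4-8` are all hypothetical stand-ins. Only `claude-fable-5` and the three fallback areas come from the announcement.

```python
from dataclasses import dataclass
from typing import Optional, Set

PRIMARY_MODEL = "claude-fable-5"    # identifier given in the announcement
FALLBACK_MODEL = "claude-opus-4-8"  # hypothetical identifier for Claude Opus 4.8

# The three areas the announcement says trigger a fallback.
FALLBACK_AREAS = {"cybersecurity", "biology_chemistry", "distillation"}

# Hypothetical keyword lists standing in for the real classifiers.
TOY_KEYWORDS = {
    "cybersecurity": ("exploit", "lateral movement"),
    "biology_chemistry": ("virus", "synthesis route"),
}


@dataclass
class Route:
    model: str
    notice: Optional[str]  # message shown to the user when a fallback occurs


def toy_classify(prompt: str) -> Set[str]:
    """Return the flagged areas for a prompt (illustrative keyword match only)."""
    text = prompt.lower()
    return {
        area
        for area, words in TOY_KEYWORDS.items()
        if any(word in text for word in words)
    }


def route_request(prompt: str) -> Route:
    """Pick the responding model: flagged requests fall back instead of being refused."""
    flagged = toy_classify(prompt) & FALLBACK_AREAS
    if flagged:
        areas = ", ".join(sorted(flagged))
        return Route(FALLBACK_MODEL, f"Answered by the fallback model (flagged: {areas}).")
    return Route(PRIMARY_MODEL, None)


print(route_request("Summarize this quarterly report").model)
print(route_request("Write an exploit for this service").model)
```

The point of the design is that a flagged request still gets an answer from a capable model, and the `notice` field mirrors the statement that users are informed whenever a fallback occurs.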

The fol­low­ing are the ar­eas cov­ered by the clas­si­fiers:

1. Cybersecurity. Mythos-class mod­els ex­cel at dis­cov­er­ing and ex­ploit­ing soft­ware vul­ner­a­bil­i­ties. They can thus make cy­ber­at­tacks sub­stan­tially eas­ier and cheaper to com­mit. Mythos-class mod­els also show strong skills in agen­tic hack­ing. This in­volves per­form­ing mul­ti­ple dif­fer­ent parts of a cy­ber­at­tack in ad­di­tion to find­ing ex­ploits—re­con­nais­sance, dis­cov­ery, lat­eral move­ment, and more. To pre­vent these agen­tic hack­ing skills pro­vid­ing up­lift in cy­ber­at­tacks, we de­signed our cy­ber­se­cu­rity clas­si­fiers to cover both ex­ploita­tion and of­fen­sive cy­ber tasks in a broader sense. As shown in the graph be­low, our clas­si­fiers pre­vent Fable from mak­ing any progress on these tasks.

We extensively red-teamed our classifiers to test their robustness against jailbreaks. As well as internal testing, we ran an external bug bounty that produced no universal jailbreaks in over 1,000 hours of testing. External red-teaming organizations we engaged have also so far failed to find any universal jailbreaks on long-form agentic tasks, although the UK AISI has made progress towards one within a brief initial testing window. It is likely impossible to completely prevent universal jailbreaks, but our goal is to make any remaining jailbreaks sufficiently slow and costly that we can detect and prevent them before they are used at scale.

The graph be­low, from one of our in­ter­nal eval­u­a­tions, il­lus­trates how Fable 5’s safe­guards give it greater re­sis­tance to jail­breaks than our pre­vi­ous gen­er­ally ac­ces­si­ble mod­els:

One of our ex­ter­nal part­ners found that Fable 5’s safe­guards against harm­ful cy­ber queries were the most ro­bust of any model tested (including Opus 4.8 and Opus 4.7). Fable 5 com­plied with zero harm­ful sin­gle-turn re­quests re­lat­ing to plan­ning a cy­ber­at­tack, ex­ploit de­vel­op­ment, or de­fense eva­sion. This held whether or not one of the re­quests used any of 30 dif­fer­ent pub­lic jail­break tech­niques.

2. Biology and chem­istry. We have long used our clas­si­fiers to block our mod­els from re­spond­ing on a nar­row se­lec­tion of bioweapons-re­lated queries. But we are no longer cer­tain that block­ing this nar­row se­lec­tion is enough. This is for two rea­sons: first, we have rea­son for con­cern about well-re­sourced ma­li­cious ac­tors at­tempt­ing to gain up­lift from our mod­els for highly risky bi­o­log­i­cal re­search. Second, mod­els now have a greater abil­ity to ac­com­plish real-world sci­en­tific tasks.

For example, we tested Mythos 5’s ability to complete a challenging step in designing adeno-associated viruses (AAVs). AAVs are a component for delivering gene therapies, but the same capability, in the wrong hands, could enable the design of dangerous viruses. In this task, various AI models were evaluated on their ability to predict how a genetic modification would impact the assembly of the virus’s outer shell (among a set of therapeutically relevant unpublished candidates developed by Dyno Therapeutics). We did not explicitly train our models to perform this task, and yet Mythos-class models outperformed sophisticated models dedicated to protein tasks (known as “protein language models”) using their biological reasoning alone. This demonstrates a promising ability to complete simple but important tasks in gene therapy research and development, but also highlights the risk posed by such dual-use capabilities.

Our pri­or­ity was to safely re­lease Fable as soon as we could, even at the cost of overly broad safe­guards. Therefore, for the time be­ing we have arranged for Fable to fall back to Opus 4.8 on most re­quests re­lated to bi­ol­ogy and chem­istry. As with all of our clas­si­fiers, we hope to nar­row these safe­guards as soon as pos­si­ble: as can be seen from the ev­i­dence above, there is great po­ten­tial for pos­i­tive ap­pli­ca­tions of Fable for sci­ence, and we do not want false pos­i­tives from our clas­si­fiers to get in the way. In the com­ing weeks, some bio­med­ical re­searchers and com­pa­nies will be able to join our trusted ac­cess pro­gram for bi­ol­ogy ca­pa­bil­i­ties in Mythos 5 (discussed be­low).

3. Distillation. We’ve pre­vi­ously iden­ti­fied large-scale at­tempts to ex­tract (“distill”) Claude’s ca­pa­bil­i­ties to train com­pet­ing mod­els in au­thor­i­tar­ian coun­tries. Distillation of Fable 5’s abil­i­ties could in­di­rectly lead to the pro­lif­er­a­tion of near-fron­tier AI ca­pa­bil­i­ties—and these could be re­leased with­out the ap­pro­pri­ate safe­guards. Requests that are flagged by our clas­si­fiers as be­ing part of such dis­til­la­tion at­tempts will fall back to Opus 4.8.

A new data re­ten­tion pol­icy

Finally, we’re mak­ing a change to the way we han­dle busi­ness cus­tomer data for Fable 5, Mythos 5, and fu­ture mod­els with sim­i­lar or higher ca­pa­bil­ity lev­els. We will re­quire 30-day re­ten­tion for all traf­fic on Mythos-class mod­els, on both first- and third-party sur­faces. We won’t use this data to train new Claude mod­els, or for any non-safety-re­lated pur­pose, and we’ve in­sti­tuted new pri­vacy pro­tec­tions in­clud­ing log­ging all hu­man ac­cess to the data and en­sur­ing its dele­tion af­ter 30 days in al­most all cases (see this post for fur­ther de­tails). The data will help us de­fend against com­plex and novel at­tacks (including new jail­breaks and at­tacks that op­er­ate across many re­quests) as well as help us iden­tify and re­duce false pos­i­tives.

Claude Mythos 5 and the trusted ac­cess pro­gram

Beginning to­day, all users who cur­rently have ac­cess to Claude Mythos Preview (for ex­am­ple, our cy­ber­se­cu­rity part­ners in Project Glasswing) will be able to up­grade to Claude Mythos 5—the same model as Claude Fable 5 but with cy­ber safe­guards lifted. Users will find Mythos 5 com­pa­ra­ble to, or some­what stronger than, Mythos Preview in most cases, while cost­ing sub­stan­tially less.

In con­sul­ta­tion with the US gov­ern­ment, we plan to steadily ex­pand ac­cess to Claude Mythos 5, con­tin­u­ing our pe­ri­odic ad­di­tion of new part­ners, as well as pur­su­ing a trusted ac­cess pro­gram that al­lows cy­ber­se­cu­rity or­ga­ni­za­tions to ap­ply in a more sys­tem­atic man­ner.

Our plans also in­clude open­ing a trusted ac­cess pro­gram for bi­ol­ogy, to help ac­cel­er­ate bio­med­ical re­search and dis­cover new ther­a­pies with Mythos-class ca­pa­bil­i­ties. This pro­gram will pro­vide ac­cess to Fable 5 with the bi­ol­ogy and chem­istry safe­guards re­moved (but the cy­ber safe­guards still in place). It will en­roll a small num­ber of re­searchers from a va­ri­ety of life sci­ence or­ga­ni­za­tions span­ning fun­da­men­tal and trans­la­tional re­search; we’re plan­ning to ex­pand ac­cess to this pro­gram while si­mul­ta­ne­ously mak­ing our safe­guards bet­ter.

Availability

Claude Fable 5 is avail­able every­where to­day. Claude Mythos 5 is re­stricted to Glasswing part­ners (with cy­ber safe­guards lifted) and soon to se­lect bi­ol­ogy re­searchers (with bi­ol­ogy and chem­istry safe­guards lifted) only, un­til our broader trusted ac­cess pro­gram is avail­able.

Pricing for both mod­els is $10 per mil­lion in­put to­kens and $50 per mil­lion out­put to­kens. Developers can use claude-fa­ble-5 via the Claude API.
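As a rough illustration of what these list prices mean per request, the sketch below computes a cost from token counts. The function name and the example workload are hypothetical; only the two per-million-token prices come from the announcement, and the sketch ignores any discounts, caching, or fallback behaviour.

```python
from decimal import Decimal

# Prices quoted in the announcement, in US dollars per million tokens.
INPUT_USD_PER_MTOK = Decimal("10")
OUTPUT_USD_PER_MTOK = Decimal("50")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated list-price cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
cost = estimate_cost_usd(200_000, 20_000)
print(f"${cost:.2f}")  # prints $3.00
```

Because output tokens cost five times as much as input tokens, long generations dominate the bill well before long prompts do.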

We ex­pect de­mand for Fable 5 to be very high, and dif­fi­cult to pre­dict. On the Claude API and con­sump­tion-based Enterprise plans, Fable 5 is fully avail­able from to­day. For sub­scrip­tion plans, we’d rather give ac­cess sooner than later, so we’re rolling out more con­ser­v­a­tively, in stages:

From to­day through June 22, Fable 5 is in­cluded on Pro, Max, Team, and seat-based Enterprise plans at no ex­tra cost.

On June 23, we’ll re­move Fable 5 from those plans. Using it af­ter that will re­quire us­age cred­its. If ca­pac­ity al­lows, we’ll ex­tend the in­cluded win­dow.

After this point—when suf­fi­cient ca­pac­ity al­lows us to do so—we aim to re­store Fable 5 as a stan­dard part of sub­scrip­tion plans. We in­tend to do this as quickly as we can.

Throughout this pe­riod, we’ll com­mu­ni­cate any changes ahead of time so users know where things stand.

Edit June 9, 2026: Updated the dis­cus­sion of AAVs to note that the can­di­dates were de­vel­oped by Dyno Therapeutics.

S&P 500 rejects SpaceX, also blocking entry for OpenAI and Anthropic

arstechnica.com

Such rule changes would have ac­com­mo­dated SpaceX’s plan to only of­fer ap­prox­i­mately 3 per­cent of its IPO shares to pub­lic in­vestors, and the fact that SpaceX is cur­rently un­prof­itable with a grow­ing debt load that has reached $29 bil­lion be­cause of its spend­ing spree on AI in­fra­struc­ture.

But in its final decision, the S&P Dow Jones Indices stated that “no changes will be made to the eligibility criteria including financial viability screens, seasoning period, or minimum IWF.” Even after the standard yearlong wait, SpaceX, Anthropic, and OpenAI may struggle to deliver the consistent profitability necessary to qualify for the S&P 500.

Money rules and ex­cep­tions

Swift en­try into the S&P 500 would have trig­gered $14 bil­lion of pas­sive fund buy­ing for SpaceX, ac­cord­ing to Bloomberg Intelligence. The in­vest­ment re­search arm of Bloomberg also es­ti­mated that OpenAI could have gained more than $8 bil­lion, and Anthropic could have net­ted $4.6 bil­lion from sim­i­lar pas­sive buy­ing sprees trig­gered by their S&P 500 en­tries.

This is be­cause $7.5 tril­lion in pas­sively man­aged funds—pop­u­lar among both in­di­vid­ual in­vestors and in­sti­tu­tional in­vestors—fol­low the S&P 500 by pur­chas­ing shares of com­pa­nies ac­cord­ing to their pro­por­tional rep­re­sen­ta­tion in the S&P 500 in­dex. For ex­am­ple, the Vanguard and Fidelity bro­ker­age gi­ants both of­fer pas­sive in­vest­ment funds that track the S&P 500 com­po­si­tion.
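A back-of-envelope reading of these figures: if passive funds simply hold each company in proportion to its index weight, the estimated purchase divided by the tracking assets gives an implied weight. The sketch below assumes exactly that simplification. The function names are hypothetical, the OpenAI figure is treated as a lower bound because the article says "more than $8 billion", and the actual Bloomberg Intelligence methodology is not described in the article.

```python
# Figure quoted in the article: assets in passive funds that follow the S&P 500.
TRACKING_ASSETS_USD = 7.5e12


def implied_purchase_usd(index_weight: float) -> float:
    """Passive buying implied by an index weight, assuming pure proportional tracking."""
    return TRACKING_ASSETS_USD * index_weight


def implied_weight(purchase_usd: float) -> float:
    """Index weight implied by an estimated passive purchase."""
    return purchase_usd / TRACKING_ASSETS_USD


# Purchase estimates quoted in the article (OpenAI's is a lower bound).
estimates_usd = {"SpaceX": 14e9, "OpenAI": 8e9, "Anthropic": 4.6e9}
for name, purchase in estimates_usd.items():
    print(f"{name}: {implied_weight(purchase):.3%}")
```

Under this assumption, the $14 billion estimate for SpaceX corresponds to an index weight of a little under 0.2%.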

However, the S&P Dow Jones Indices did carve out “one concession” by changing the investable weight factor rules for “lower-profile benchmarks” such as the S&P Total Market Index and Dow Jones US Total Stock Market Index, according to Quartz. That could allow an IPO faster entry into those indexes.

By con­trast, the Nasdaq stock ex­change changed its rules to al­low SpaceX to en­ter the Nasdaq-100 Index within 15 trad­ing days as op­posed to the usual three months. Similarly, the FTSE Russell in­dex provider de­cided to give SpaceX and other fol­low-on com­pa­nies ac­cel­er­ated en­try to the Russell Top 500 Index af­ter the close of the fifth trad­ing day fol­low­ing an IPO.

The denial of accelerated S&P 500 entry for SpaceX comes just days after Morningstar analysts described SpaceX as having been “significantly overvalued” in the lead-up to its IPO. The investment research firm valued SpaceX at $780 billion (less than half of SpaceX’s $1.75 trillion IPO goal), primarily based on the strengths of SpaceX’s Starlink satellite service and rocket launch business.

This story was up­dated on June 6, 2026 to more clearly de­scribe the pro­posed rule changes that would have ap­plied to all MegaCap com­pa­nies.

6.0.0

brew.sh

Today, I’m proud to an­nounce Homebrew 6.0.0. The most sig­nif­i­cant changes since 5.1.0 are a new tap trust se­cu­rity mech­a­nism, the new faster, smaller, de­fault in­ter­nal Homebrew JSON API, sand­box­ing on Linux, bet­ter de­faults in­formed by our user sur­vey, many brew bun­dle im­prove­ments, im­proved per­for­mance and ini­tial sup­port for ma­cOS 27 (Golden Gate).

✨ Highlights since 5.1.0

🔐 Tap trust

Homebrew 6.0.0 in­tro­duces tap trust. A third-party tap can con­tain ar­bi­trary, un­sand­boxed Ruby that runs on your ma­chine, so Homebrew now re­quires taps (and tap-qual­i­fied for­mu­lae and casks) to be ex­plic­itly trusted be­fore their code is eval­u­ated or run. This re­duces the risk from ma­li­cious or com­pro­mised taps while leav­ing the of­fi­cial Homebrew taps trusted by de­fault. See the new Tap-Trust doc­u­men­ta­tion for de­tails.

Homebrew en­forces ini­tial tap trust so un­trusted taps are flagged be­fore their code runs, trusts qual­i­fied tap items be­fore in­stall, stops auto-tap­ping un­trusted taps, pins tap al­low, for­bid and trust lists to re­motes and uses tap trust when eval­u­at­ing all for­mu­lae and casks.

brew tap gains commands for managing tap trust and can trust a tap by its remote URL, brew trust adds a --json=v1 flag and brew tap-info adds a trusted field.

brew bun­dle ho­n­ours the trusted: op­tion and brew bun­dle dump records trusted bun­dle en­tries, mark­ing cus­tom-re­mote taps as trusted.

docs.brew.sh has new pages, in­clud­ing Tap-Trust, ex­plain­ing Homebrew’s new tap trust model, and Homebrew trusts taps in test-bot.

⚡ Default in­ter­nal JSON API

The in­ter­nal JSON API is now the de­fault, ad­vanc­ing the smaller API that Homebrew re-en­abled and turned on for de­vel­op­ers re­cently. It com­bines all Homebrew’s meta­data into a sin­gle down­load, so brew up­dates faster and talks to the net­work less. It was opt-in via HOMEBREW_USE_INTERNAL_API since 5.0.0; that vari­able is now dep­re­cated (see be­low).

🐧 Linux sand­box

The Linux Bubblewrap sandbox aligns Linux with macOS, where build, test and postinstall phases already run sandboxed. It is on by default for developers. Homebrew also moved its macOS sandbox logic to share code, improved Linux sandbox behaviour (with Homebrew/homebrew-core setting the sandbox env in CI), hardened sandboxed install phases, sandboxed cask executable hooks, allowed logs in the build sandbox, installed Bubblewrap on hosted Ubuntu and now skips sandbox setup for syntax-only jobs.

⚙️ Better de­faults

Following our Homebrew user sur­vey, we have made many changes based on the re­sults. The most no­table is mak­ing ask mode the de­fault for de­vel­op­ers, so brew in­stall and brew up­grade show a de­pen­dency sum­mary and con­fir­ma­tion prompt be­fore mak­ing changes.

Homebrew adds ask de­pen­dency plans and cask sup­port, ac­cepts one-key ask con­fir­ma­tions and aligns ask dry-run prompts.

Homebrew fetches ask up­grades to­gether, prints the ask up­grade sum­mary sooner, skips the up­grade ask prompt when empty, adds a fi­nal brew up­grade sum­mary and ex­plains the up­grade meta­data fetch.

📦 brew bun­dle

brew bun­dle gains many im­prove­ments, most no­tably par­al­lel for­mula in­stal­la­tion that now runs jobs au­to­mat­i­cally by de­fault, plus npm and krew ex­ten­sions, wider cleanup sup­port and, on Windows, winget sup­port.

Homebrew adds cleanup sup­port to npm, cargo, go and uv ex­ten­sions and asks be­fore re­mov­ing dur­ing cleanup.

Homebrew runs brew bundle krew via kubectl-krew directly, respects CARGO_HOME and friends for cargo, adds a --describe flag to brew bundle add and tries mas install before falling back to mas get.

Homebrew adds bun­dle type dis­able flags, im­proves check guid­ance and checks for­mula link sta­tus.

Homebrew se­ri­alises for­mula locks, makes non-core DSLs a sin­gle file, re­moves de­scrip­tion com­ments from brew bun­dle/​re­mover and avoids pars­ing the out­put of brew ser­vices list.

brew bun­dle per­forms npm in­stalls more se­curely.

🏎️ Performance

Homebrew is faster across the board, with startup per­for­mance tweaks, a ~30% faster brew leaves, par­al­lelised bot­tle tab fetch­ing on up­grade and less work load­ing Ruby li­braries at startup.

🍎 ma­cOS 27 (Golden Gate)

Homebrew adds ini­tial sup­port for ma­cOS 27 (Golden Gate).

🔮 Upcoming changes

ma­cOS 27 (Golden Gate) drops Intel sup­port, so per our Support Tiers: in September 2026, ma­cOS Intel x86_64 moves to Tier 3 with no CI sup­port and no new bot­tles (binary pack­ages) built for ma­cOS Intel; in September 2027, ma­cOS Intel x86_64 will be un­sup­ported en­tirely and all re­lated code deleted.

The mas­ter to main mi­gra­tion be­gun in 4.6.0 con­tin­ues: more repos­i­to­ries no longer up­date mas­ter, GitHub Actions warn @master users to mi­grate to @main and the sync-de­fault-branches work­flows are re­moved from Homebrew/homebrew-cask and Homebrew/homebrew-core.

Casks that fail ma­cOS Gatekeeper checks, dep­re­cated in 5.0.0, re­main on track to be dis­abled in September 2026.

🔒 Security

🚨 Security ad­vi­sories

Homebrew pub­lished three se­cu­rity ad­vi­sories:

The POST down­load strat­egy by­passed the doc­u­mented HTTPS-to-HTTP redi­rect pro­tec­tion by dis­card­ing the re­solved URL (GHSA-7699-qf8c-q47m), fixed by en­forc­ing se­cure redi­rects.

Root code ex­e­cu­tion was pos­si­ble via Git hooks in the ma­cOS .pkg postin­stall (GHSA-6689-q779-c33m), fixed by clean­ing Homebrew git state and re­plac­ing the in­staller git di­rec­tory.

The ma­cOS in­staller pack­age trusted a user-con­trolled /var/tmp plist and could as­sign Homebrew own­er­ship to a lo­cal at­tacker (GHSA-59v8-x8q4-px5c), fixed by tweak­ing the ma­cOS .pkg pack­age-user plist han­dling.

🛡️ Other se­cu­rity im­prove­ments

Homebrew fil­ters sen­si­tive en­vi­ron­ment vari­ables dur­ing Ruby eval­u­a­tions and de­fers HOMEBREW_* en­vi­ron­ment se­crets to down­load time.

Homebrew runs for­bid­den checks for casks and for­mu­lae be­fore down­load and lets you re­quire check­sums for casks with HOMEBREW_CASK_OPTS_REQUIRE_SHA.

Homebrew links to a shared se­cu­rity pol­icy.

🗑️ Deprecations

Homebrew dep­re­cates de­fault opt-ins.

Homebrew dep­re­cates now-de­fault bun­dle and in­ter­nal API en­vi­ron­ment vari­ables such as HOMEBREW_BUNDLE_NO_SECRETS and HOMEBREW_USE_INTERNAL_API.

Homebrew marks un­used op­tions for dep­re­ca­tion.

Various other Homebrew 6.0.0 dep­re­ca­tions.

Homebrew’s SBOM sup­port is now opt-in with HOMEBREW_SBOM.

🎁 Features

🖥️ Casks

Homebrew can pin casks and sup­ports casks in brew miss­ing.

Homebrew adds AppImage sup­port for Linux and im­ple­ments a Linux freedesk­top trash for casks.

Homebrew im­proves cask up­grades by shar­ing up­grade down­load queues, mov­ing up­grade sum­maries be­fore fetch, adding a quit opt-out and re­open­ing closed apps dur­ing up­grade.

Homebrew im­proves au­to_up­dates casks: im­prov­ing how they up­date, re­fin­ing the be­hav­iour fur­ther, gat­ing auto-up­dates be­hind opt-in and up­grad­ing them when the bun­dle ver­sion is stale.

cask adds a gen­er­ate_­com­ple­tion­s_from_ex­e­cutable DSL ar­ti­fact and in­cludes re­solved ar­ti­fact tar­gets in JSON out­put.

Homebrew shows a cask ver­sion tran­si­tion in per-cask up­grade out­put, skips valid cached cask fetches, speeds up cask backup copies and has caskroom use the user’s pri­mary group on Linux.

brew doc­tor and brew cleanup han­dle cor­rupt Caskroom di­rec­to­ries.

💻 Operating sys­tem sup­port

Homebrew makes Linux cask re­quire­ments ex­plicit, aligns cask ma­cOS de­pen­den­cies, sup­ports bare de­pend­s_on :macos in casks, tracks ma­cOS sup­port ex­plic­itly and emits Linux vari­a­tions for casks with Linux check­sums.

Homebrew adds a max­i­mum ma­cOS for cask de­pen­den­cies. Homebrew/homebrew-cask adopts the new de­pend­s_on max­i­mum_­ma­cos: syn­tax and fixes its ma­cOS de­pen­den­cies in Homebrew/homebrew-cask and Homebrew/homebrew-core.

Homebrew adds M5 and M5 Pro/Max CPU recog­ni­tion and caps the OCLP tier when ma­cOS is out­dated.

Homebrew la­bels WSL an­a­lyt­ics, shows the Windows build on WSL in brew con­fig and moves the wsl? boolean from OS::Linux up to the OS mod­ule.

🚰 Taps

Homebrew recog­nises more equiv­a­lent tap re­mote forms, ig­nor­ing a .git suf­fix when match­ing GitHub re­motes and con­sol­i­dat­ing tap re­mote nor­mal­i­sa­tion. (and more)

Homebrew han­dles for­mu­lae and casks more uni­formly across com­mands, in­stalls ex­plic­itly re­quested taps and stops im­plicit tap in­stal­la­tion.

Homebrew uses work­trees for lo­cal core taps and blocks work­tree up­dates.

Homebrew shares full-name pars­ing helpers and uses full-name helpers for split names.

ℹ️ brew info and brew tap-info

brew info out­put is clearer: more con­sis­tent and help­ful, with a Binaries sec­tion list­ing ex­e­cuta­bles, a clearer re­cur­sive run­time de­pen­den­cies line, clearer same-named con­flicts and shad­owed for­mu­lae and a list ver­sions JSON out­put.

brew info shows installed state better: the upgrade target for outdated @-versioned formulae, installed dependents with --verbose, deprecated and disabled packages in install status, installed formulae resolved from the receipt’s tap with a shadowing warning, the installed version and an upgrade hint on the headline, other installed versions and an installed info inventory.

brew info and brew tap-info skip the unin­stalled marker when not a prob­lem, show more tap info for pack­ages and brew tap-info lists for­mu­lae and casks.

brew which-for­mula shows in­stall sta­tus and Homebrew shows quar­an­tine script us­age.

🆕 New com­mands, flags and out­put

brew exec is a new com­mand, like npx, that sup­ports for­mu­lae en­vi­ron­ments.

brew as-con­sole-user is a new com­mand for run­ning Homebrew as the right user un­der MDM/root en­vi­ron­ments and brew up­date <formula> is aliased to up­grade.

Homebrew ti­dies help and com­ple­tions: omit­ting aliases from com­ple­tions, hid­ing HOMEBREW_CASK_OPTS_* from help, hid­ing main­tainer com­mands and hid­ing hide_from_­man_­page com­mands from brew com­mands.

Homebrew avoids in­stall warn­ing an­no­ta­tions and warns when for­mula ex­e­cuta­bles are shad­owed on PATH.

🧊 Cooldowns, livecheck and bump­ing

Homebrew adds down­load cooldowns for Bundler, RubyGems livecheck, npm and pip de­faults, PyPI re­source res­o­lu­tion and npm and PyPI in bump to avoid up­stream sup­ply-side se­cu­rity risks.

Homebrew prints bump skip sta­tus, mes­sages and er­rors and checks RubyGems li­cences.

Homebrew re­spects livecheck throt­tle days in au­dit, adds livecheck throt­tling by days and speeds up the for­mula throt­tle days check.

⬇️ Downloads and fetch­ing

brew fetch --all-platforms fetches every variant, Homebrew prints download error details when using concurrency, preserves partial downloads on network errors, avoids cached manifest downloads and hints when a download is HTML, not a binary.

Homebrew avoids re­dun­dant Caskroom chgrp.

🛎️ Services

Homebrew starts sys­temd timers for ser­vices, cre­ates ser­vice path di­rec­to­ries au­to­mat­i­cally (with Homebrew/homebrew-core adopt­ing the new ser­vice path cre­ation logic) and au­dits re­dun­dant ser­vice path setup.

brew services no longer fails to load with --sudo-service-user.

🧪 Formulae and pack­ag­ing

Homebrew adds the VCS re­vi­sion as scm_re­vi­sion in the tab, sup­ports in-repos­i­tory patch files, sup­ports CPS meta­data di­rec­to­ries and in­cludes patches in for­mula to_hash.

Homebrew re­spects in­stalled de­pen­dents dur­ing au­tore­move and cross-checks au­tore­move can­di­dates against for­mula de­f­i­n­i­tions.

🪜 Install steps frame­work

The in­stall steps frame­work ex­presses com­mon postin­stall, pre­flight and post­flight be­hav­iour as or­dered, lit­eral-only DSL data that is ex­posed through the JSON APIs. Where a for­mula or cask only does sim­ple file prepa­ra­tion, it no longer needs to down­load and eval­u­ate a Ruby file at in­stall time. Homebrew adds for­mula in­stall steps, cask in­stall steps, an au­dit for for­mula in­stall steps, in­stall step re­build ac­tions, re­build step meth­ods, re­build step RuboCop checks and an au­dit of cask flight step con­ver­sions; home­brew/​core and home­brew/​cask adopt the new DSLs (post_install_steps, postin­stall and flight steps). In home­brew/​core and home­brew/​cask this cov­ers a large share of post_in­stall and *flight blocks (creating di­rec­to­ries, touch­ing mark­ers, mov­ing and sym­link­ing files), with more op­er­a­tion types planned.

🔀 Other changes

brew vulns is a new Homebrew tap and sub­com­mand that checks in­stalled pack­ages for known vul­ner­a­bil­i­ties 🔒.

Homebrew warns for Nix-managed Homebrew.

🧹 Internals, typ­ing and refac­tors

Homebrew re­places brew which-up­date, uses an AST for source rewrites and en­forces pub­lic API vis­i­bil­ity and docs.

Homebrew re­works com­mand pars­ing: parser sub­com­mand scaf­fold­ing, con­vert­ing the bun­dle, ser­vices and re­main­ing sub­com­mands, scop­ing sub­com­mand op­tion con­straints and us­age help, and no longer re­strict­ing global op­tions to sub­com­mands.

Homebrew lim­its Sorbet run­time de­faults and lim­its re­cur­sive Sorbet in test-bot.

🛠️ Continuous in­te­gra­tion and de­vel­oper tool­ing

The Ubuntu 24.04 CI mi­gra­tion flagged in 5.1.0 for 6.0.0 has now landed, rais­ing the Linux base­line.

container/docs/container-machine.md at main · apple/container

github.com

Container ma­chine pro­vides a highly in­te­grated Linux en­vi­ron­ment that works seam­lessly on your Mac. Container ma­chines are fast, light­weight and per­sis­tent. They are based on stan­dard OCI im­ages that can be built and shared. Host in­te­gra­tions such as au­to­matic user and home di­rec­tory shar­ing pro­vide quick and easy ac­cess to your Linux en­vi­ron­ment no mat­ter where you are in a ter­mi­nal.

Why con­tainer ma­chines

Containers are typically modeled after an application. A container machine is modeled after a Linux environment. It runs the image’s init system, allowing you to register long-running services or test your application under a process supervisor. A container machine automatically maps your username and home directory into the Linux environment. Your repositories and dotfiles are available on both platforms. Use editors and tools directly on macOS while building and running your application inside the Linux environment.

Edit on the Mac, build in­side. Your repo lives in $HOME on ma­cOS and is mounted at /Users/<username> in­side the con­tainer ma­chine. Use your ma­cOS ed­i­tor or IDE; com­pile and run in­side your con­tainer ma­chine.

Use macOS-native tooling against Linux artifacts. Profilers, screenshot tools, browsers, and GUI debuggers on your Mac all see the same files the container machine sees: there is no copy step between “I built it” and “I am inspecting it”.

Real Linux ser­vices for test­ing. Run a data­base or what­ever your stack needs as a sys­tem ser­vice — sys­tem­ctl start post­gresql works on im­ages with sys­temd in­stalled.

One en­vi­ron­ment per tar­get dis­tro. Create as many con­tainer ma­chines as you have tar­get dis­tros — alpine, ubuntu, de­bian. Each has the same $HOME and the same dot­files from your Mac. Quickly test your ap­pli­ca­tion in var­i­ous dis­tri­b­u­tions.

Quickstart

container machine create alpine:latest --name dev
container machine run -n dev whoami    # your host username, not root
container machine run -n dev pwd       # /home/<you>, your Mac home dir, mounted in
container machine run -n dev           # interactive shell; cd into your repos in $HOME

con­tainer ma­chine run is how you get a shell or run a sin­gle com­mand. If the con­tainer ma­chine is stopped, run boots it first.

Working in a con­tainer ma­chine

Open a shell, or run a sin­gle com­mand

With no com­mand, con­tainer ma­chine run opens an in­ter­ac­tive shell as a user that matches your host ac­count:

con­tainer ma­chine run -n dev

Pass a com­mand to run it once and exit:

container machine run -n dev uname -a
container machine run -n dev -- cat /proc/cpuinfo

Set a de­fault

Pick a de­fault con­tainer ma­chine so you can drop the -n flag:

container machine set-default dev
container machine run                  # operates on dev

List, in­spect, stop, delete

container machine ls                   # list all container machines
container machine inspect dev          # JSON detail for one
container machine stop dev             # stop the container machine
container machine rm dev               # delete, including its persistent storage

con­tainer ma­chine has the alias m, so m ls, m run, etc. all work.

Resize CPUs, mem­ory, or change the home-mount

con­tainer ma­chine set up­dates con­fig­u­ra­tion on disk. Changes take ef­fect af­ter the next stop and start:

container machine set -n dev cpus=4 memory=8G
container machine stop dev
container machine run -n dev -- nproc

Memory de­faults to half of host mem­ory. The home-mount can be rw (default), ro, or none.

Bring your own con­tainer ma­chine im­age

Any Linux im­age that in­cludes /sbin/init works as a con­tainer ma­chine. For ex­am­ple, this Dockerfile builds an Ubuntu 24.04 con­tainer ma­chine im­age with sys­temd and com­mon com­mand-line tools:

FROM ubuntu:24.04

ENV con­tainer con­tainer

RUN apt-get update && \
    apt-get install -y \
    dbus systemd openssh-server net-tools iproute2 iputils-ping curl wget vim-tiny man sudo && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    yes | unminimize

RUN >/etc/machine-id
RUN >/var/lib/dbus/machine-id

RUN systemctl set-default multi-user.target
RUN systemctl mask \
    dev-hugepages.mount \
    sys-fs-fuse-connections.mount \
    systemd-update-utmp.service \
    systemd-tmpfiles-setup.service \
    console-getty.service
RUN systemctl disable \
    networkd-dispatcher.service

RUN sed -i -e 's/^AcceptEnv LANG LC_\*$/#AcceptEnv LANG LC_*/' /etc/ssh/sshd_config

Build it and cre­ate a con­tainer ma­chine from it:

container build -t local/ubuntu-machine:latest .
container machine create local/ubuntu-machine:latest --name ubuntu

By de­fault, con­tainer runs a built-in setup script on first boot to pro­vi­sion the user de­scribed above. To use your own setup in­stead, add an ex­e­cutable script at /etc/machine/create-user.sh to the im­age. It runs once, as root, on first boot, with these vari­ables set:

CONTAINER_GID

CONTAINER_HOME

CONTAINER_MACHINE_ID

CONTAINER_UID

CONTAINER_USER

How building an HTML-first site doubled our users overnight

mohkohn.co.uk

Jun 10, 2026

This is a story of how build­ing HTML-first dou­bled a com­pa­ny’s users lit­er­ally overnight.

My client was a util­ity com­pany, and they had a big prob­lem. To ap­ply for their ser­vices, cus­tomers could ei­ther use an old ASP form on the web­site, or fol­low a man­ual process. The man­ual process was more ex­pen­sive for the com­pany, of course. Adding a lot of pres­sure, this was a reg­u­lated mo­nop­oly, and if their cus­tomer sat­is­fac­tion dropped be­low 96% (if I re­mem­ber cor­rectly) it could re­sult in mil­lions of pounds in fines.

There were two previous failed (and very expensive) attempts to solve the problem. In the most recent, contractors in another country had built a React app. The React app was online for 3 days before being pulled because of customer complaints. I took one look at it and told my boss “we can’t take ownership of this.” It was a mess of loading spinners and global javascript states. It was not accessible. Image upload was a vital part of the form, and it attempted to store images (along with all other form data) in localStorage, which has a 5MB limit!
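To put the 5MB figure in perspective: localStorage holds only strings, so binary image data has to be text-encoded before it can be stored. The sketch below assumes base64, a common choice that grows data by a third (the source does not say which encoding the app used), and treats the quota as 5 × 1024 × 1024 characters, which is also an assumption. The helper names are hypothetical.

```typescript
// Length of the base64 text produced for `bytes` bytes of binary data (with padding).
function base64Length(bytes: number): number {
  return 4 * Math.ceil(bytes / 3);
}

// Assumed budget, taken from the 5MB figure quoted in the article.
const QUOTA_CHARS = 5 * 1024 * 1024;

// True if all images (given as byte sizes) plus `otherChars` of form data fit the budget.
function fitsInQuota(imageBytes: number[], otherChars: number = 0): boolean {
  const total = imageBytes.reduce((sum, b) => sum + base64Length(b), otherChars);
  return total <= QUOTA_CHARS;
}

const MIB = 1024 * 1024;
const onePhoto = fitsInQuota([1 * MIB]);            // a small photo fits
const bigPhoto = fitsInQuota([4 * MIB]);            // a single 4 MiB photo does not
const twoPhotos = fitsInQuota([2 * MIB, 2 * MIB]);  // neither do two 2 MiB photos
```

Under these assumptions a single 4 MiB photo already exceeds the budget, as do two 2 MiB photos, before any other form data is counted.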

I took a very bold de­ci­sion and built a new ver­sion of the site us­ing Astro. It was HTML-first. Javascript ex­isted, in web com­po­nents, but only to pro­gres­sively-en­hance a web­site that worked per­fectly fine with­out it.

My logic was thus:

This is a pub­lic ser­vice

It should work on every ma­chine pos­si­ble

It should work when con­nec­tions are poor

The forms must never lose data once it is en­tered

I was very moved by this anec­dote from Terence Eden:

A few years ago I was do­ing pol­icy re­search in a hous­ing ben­e­fits of­fice in London. They are sin­gu­larly unlovely places. The walls are bright­ened up with posters of­fer­ing help­ful ser­vices for peo­ple flee­ing do­mes­tic vi­o­lence. The se­cu­rity guards on the door are cau­tiously in­dif­fer­ent to any­one walk­ing in. The air is filled with tense con­ver­sa­tions be­tween part­ners - drowned out by the noise of scream­ing kids.

In the mid­dle, a young woman sits on a hard plas­tic chair. She is sur­rounded by can­vas-bags con­tain­ing her worldly pos­ses­sions. She does­n’t look like she is in a great emo­tional place right now. Clutched in her hands is a games con­sole - a PlayStation Portable. She stares at it in­tensely; block­ing out the world with Candy Crush.

Or, at least, that’s what I thought.

Walking behind her, I glance at her console and recognise the screen she’s on. She’s connected to the complimentary WiFi and is browsing the GOV.UK pages on Housing Benefit. She’s not slicing fruit; she’s arming herself with knowledge.

The PSP’s web browser is - charitably - pathetic. It is slow, frequently runs out of memory, and can only open 3 tabs at a time.

But the GOV.UK pages are writ­ten in sim­ple HTML. They are de­signed to be light­weight and will work even on rub­bish browsers. They have to. This is for every­one.

Some re­quire­ments I de­rived:

Each ses­sion with the form should have a unique ID

At every step in the form wiz­ard, sub­mit­ted data should be stored on the back­end, in­clud­ing up­loads

It should be pos­si­ble to com­plete the form with­out javascript

It should be pos­si­ble to com­plete the form on out­dated and crap web browsers

We had to meet WCAG ac­ces­si­bil­ity (the team set­tled on AA rather than AAA)

Javascript and mod­ern CSS should be used to en­hance the ex­pe­ri­ence

The ba­sic setup ended up be­ing that each step in the form wiz­ard was its own page. When the user clicked next, the form would sub­mit. If the data was judged to be valid by the API, the browser would be redi­rected to the next step.
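The flow described above (one page per step, submit, validate on the server, redirect) can be sketched as a small server-side function. This is a minimal sketch rather than the site's actual code: the step paths, field names, `startSession`, `handleSubmit` and the in-memory `sessions` map are all hypothetical. It also assumes that a step is saved once it validates and that an invalid submission is answered by re-rendering the same page with the submitted values.

```typescript
type StepData = Record<string, string>;
type Errors = Record<string, string>;

interface Step {
  path: string;                          // URL of this wizard page
  validate: (data: StepData) => Errors;  // an empty object means "valid"
}

type Result =
  | { status: 303; location: string }                 // redirect to the next page
  | { status: 422; errors: Errors; values: StepData }; // re-render the same page

// In-memory stand-in for the backend store; a real service would persist this.
const sessions = new Map<string, Record<string, StepData>>();

// Hypothetical two-step wizard.
const steps: Step[] = [
  {
    path: "/apply/contact",
    validate: (d) => (d.email?.includes("@") ? {} : { email: "Enter a valid email address" }),
  },
  {
    path: "/apply/address",
    validate: (d) => (d.postcode ? {} : { postcode: "Enter a postcode" }),
  },
];

let counter = 0;
function startSession(): string {
  // Placeholder ID; a real service would use a cryptographically random value.
  const id = `s${++counter}-${Math.random().toString(36).slice(2)}`;
  sessions.set(id, {});
  return id;
}

function handleSubmit(sessionId: string, stepIndex: number, data: StepData): Result {
  const step = steps[stepIndex];
  if (!step) throw new Error(`Unknown step ${stepIndex}`);

  const errors = step.validate(data);
  if (Object.keys(errors).length > 0) {
    // Invalid: return the user's values so nothing has to be retyped.
    return { status: 422, errors, values: data };
  }

  // Valid: store this step under the session ID before moving on.
  const saved = sessions.get(sessionId) ?? {};
  saved[step.path] = data;
  sessions.set(sessionId, saved);

  const next = steps[stepIndex + 1];
  return { status: 303, location: next ? next.path : "/apply/done" };
}
```

Because the redirect target is computed on the server, the browser needs nothing beyond a plain form post to move through the wizard, and a user who returns with the same session ID finds the earlier steps already stored.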

A ven­er­a­ble web ap­pli­ca­tion pat­tern that has had a small mod­ern re­nais­sance thanks to Remix, form sub­mis­sions and redi­rects took a while to ex­plain to my col­leagues, on ac­count of every­one be­ing used to heav­ily client-side web ap­pli­ca­tions. I have noth­ing against heav­ily client-side ap­pli­ca­tions, in their place. But this is just a big form - it’s not show­ing real-time data. Our user could be stand­ing in the mid­dle of a field on a new-build hous­ing es­tate, hold­ing a decade-old com­mod­ity an­droid phone they bought in Tesco. Shipping them 20MB of javascript be­fore we even ren­der a form would be a ridicu­lous thing to do.

Next, I tack­led one of my biggest bug­bears, form val­i­da­tion (and form and form er­ror ren­der­ing). I have seen teams waste per­son-months of ef­fort wran­gling React val­i­da­tion li­braries. If you are a React per­son, you might be scoff­ing at this - skill is­sue, I guess - but it is the re­al­ity for many teams. I would like to humbly sug­gest that you too may be spend­ing more time than you re­alise, and a lot more time than is nec­es­sary, in­ter­act­ing with and main­tain­ing poor im­i­ta­tions of the val­i­da­tion sys­tem that ships with every browser.

So I built an HTML web com­po­nent. These are sim­ple cus­tom el­e­ments that wrap around ex­ist­ing HTML and bring it to life. No shadow DOM, no (or lit­tle) ren­der­ing HTML in javascript. Mine wrapped around any HTML form, picked up the HTML val­i­da­tion, and made it look mod­ern. It would pre­vent those HTML val­i­da­tion popup tooltips, and in­stead place the er­ror in the aria-de­scribedby el­e­ment as­so­ci­ated with the field (today, aria-er­rormes­sage is ad­vised in­stead). It would clear val­i­da­tion while you typed, if you reached a valid state, and as­sess it again on blur and sub­mit.

Exactly the user ex­pe­ri­ence a form needs, de­liv­ered in un­der 1KB. If it failed, the form would fall back to built-in browser val­i­da­tion. If that failed, the back­end API would han­dle val­i­da­tion. We re­ported val­i­da­tion is­sues to the user as early as pos­si­ble given their browser, and al­ways fell back to an ac­cept­able ex­pe­ri­ence if it failed.
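The timing rules described above (never add an error while the user types, clear it as soon as the field becomes valid, re-check on blur and submit) can be written as a small state function. This is a sketch of the described behaviour, not the code of the author's component: `FieldState` and `nextState` are hypothetical names, and the `message` argument is assumed to come from the browser's own `validationMessage` property, which is an empty string when a field is valid.

```typescript
type FieldEvent = "input" | "blur" | "submit";

interface FieldState {
  error: string | null; // message currently shown next to the field, if any
}

// `message` is what the browser reports for the field right now;
// an empty string means the field is valid.
function nextState(state: FieldState, event: FieldEvent, message: string): FieldState {
  const valid = message === "";
  if (event === "input") {
    // While typing: only ever remove an error, never add or change one.
    return valid ? { error: null } : state;
  }
  // On blur and submit: assess the field again.
  return { error: valid ? null : message };
}

// Example: a required field the user leaves empty, then fills in.
let field: FieldState = { error: null };
field = nextState(field, "blur", "Please fill in this field"); // error appears on blur
field = nextState(field, "input", "");                         // error clears while typing
```

Keeping the rules in a pure function like this makes them easy to test without a browser; the custom element would then only need to wire DOM events to it and write the resulting message into the element that describes the field.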

I have since writ­ten a new ver­sion of this web com­po­nent from scratch, aimed for gen­eral use. It’s called val­i­da­tion-en­hancer. I have been in this in­dus­try for over 20 years, and it is the best form val­i­da­tion li­brary I have ever used. I am very proud of it.

The code is so sim­ple to work with:

<validation-enhancer>
  <form>
    <label for="my-email">Email</label>
    <input type="email" id="my-email" name="my-email" aria-errormessage="my-email-error" required />
    <div id="my-email-error"></div>

    <button type="submit">Submit</button>
  </form>
</validation-enhancer>

The results? When we launched, the number of people completing the form doubled. The analytics people didn’t even know where these users were coming from. Of course, your javascript-based analytics package doesn’t see the users you are bouncing because of javascript failures. It was a flood! We also saw my “keep a backend session, never lose user data” approach pay off. In one case, someone completed a form a month after starting it.

There was a sad coda; as is the way of contract work, I moved on. I explained what I had built to my replacement, that it always worked even without javascript. He was appalled and said, “but that’s a lot more work for us.”

It is not ac­cept­able to bounce users on old browsers, users with bad net­work con­nec­tions, users us­ing as­sis­tive tech­nolo­gies. Certainly not from a mo­nop­oly pub­lic ser­vice. A lot of hype and noise is press­ing us to ex­tend the cow­boy, wild-west phase of the soft­ware in­dus­try’s ex­pan­sion. We should set that aside, and take our­selves se­ri­ously as a ma­ture in­dus­try. Build a web ap­pli­ca­tion that works on a playsta­tion portable on a 3G con­nec­tion - if you do, it will work for all your users, and it will still work 30 years from now.

AI-native React Components

vorpus.github.io

LLMs are eroding my software engineering career and I don't know what to do

human-in-the-loop.bearblog.dev

06 Jun, 2026

I’m a software engineer, completing 10 years of professional experience this year. I started my career as a web frontend engineer (it was easier for me to debug frontend code back then, so I chose that path), but soon transitioned to (web) backend and never looked back.

Through a se­ries of co­in­ci­dences, once I stepped into back­end de­vel­op­ment, I ended up work­ing in soft­ware de­vel­op­ment roles in the do­mains of fi­nance, book­keep­ing and pay­ment pro­cess­ing, where I had great au­ton­omy and a close and can­did re­la­tion­ship with Product Managers and stake­hold­ers.

I learnt a lot about the do­main and how to ef­fec­tively write pro­grams for it: PCI com­pli­ance, dou­ble-en­try ledgers, es­crows, rec­on­cil­i­a­tion, pay­ment life­cy­cles, bank trans­fer idem­po­tency, etc.

It was, then, ob­vi­ous that I should fo­cus my ca­reer on be­com­ing an ex­pert on that do­main to stand out as a pro­fes­sional and dif­fer­en­ti­ate my­self in a field that showed signs of an in­creas­ing need for do­main spe­cial­ists.

The first pil­lar to erode: do­main-spe­cific knowl­edge

Last year, I got hired by a company in the finance space. Until then, I had worked at companies that had a strong payment and finance component to their operations/offerings, but that were not solely finance-focused companies.

That com­pany also em­braced AI whole­heart­edly, so I got ChatGPT and Claude Enterprise ac­counts from day one and was en­cour­aged to use them for my re­search, ex­plo­ration, and even cod­ing, al­beit with a warn­ing that I should still re­view and own every sin­gle line that made it into pro­duc­tion.

One of my first pro­jects in­volved re­work­ing the legacy on­line pay­ment sys­tem, which was a mess. They hired me for (among other things) my pre­vi­ous ex­pe­ri­ence in build­ing that and trusted me with the task.

Unlike the other companies I had worked for so far, they wanted the “Design Docs” I write before coding to be readable by both engineers and product managers - so they should be less of a technical deep dive and more of an architectural view. I wrote my first one with minimal AI assistance - I even called LLMs “stochastic parrots” at the time, a view I no longer hold - and delivered it.

I val­ued my knowl­edge and thought no LLMs could re­place it.

Then my man­ager reached out to me: even though you’re de­liv­er­ing code at a good pace, you’re tak­ing too long to de­liver those Design Docs. Are you us­ing AI? You should use more AI.

“No way this will work”, I thought, but I agreed. The models at that time were not as good as the ones we have now, but they did provide a good speed-up on my writing and even the decision-making.

And then I started realizing that all the knowledge I had accumulated over the years (the trade-offs between implementations, how acquiring works, how to structure idempotency to prevent double-charges, everything) was becoming useless. Even though the models still needed some steering, they could connect the dots on how to structure such systems, which was the hardest part that only develops in your brain after years of hands-on experience. That was my first shock.

But sure, I thought, they can do that be­cause there’s plenty of ar­ti­cles on the web on how that shit works along with all the tech­ni­cal doc­u­men­ta­tion, and we have blog posts ex­plain­ing how to ap­ply the tech­ni­cal tools to the do­main. For hu­mans, it may take a long time to learn all that, but that’s train­ing data so the mod­els can pick it up.

What the mod­els will never be good at, and that’s where hu­mans will shine, is de­bug­ging! I had ac­cu­mu­lated a good ex­pe­ri­ence de­bug­ging race con­di­tions and dis­trib­uted sys­tems in pro­duc­tion. That was my ticket to long-term em­ploy­a­bil­ity.

The sec­ond pil­lar to erode: de­bug­ging and dis­trib­uted sys­tems

So, af­ter LLMs started get­ting good at writ­ing docs and help­ing plan the ac­tual im­ple­men­ta­tions, they be­came good at cod­ing. It started in the sec­ond half of 2025 with the Claude Code hype, then Codex came and so on. Although I was us­ing LLMs for writ­ing unit tests every day be­fore that, I was­n’t trust­ing them to write the full im­ple­men­ta­tion yet.

The nat­ural next step was to in­tro­duce more AI into writ­ing code. And hon­estly, I liked it. I like ship­ping things to pro­duc­tion and see­ing users happy as much as I like cod­ing, so I was trad­ing one thing that I like for an­other one that I also like, it was fair.

LLMs were becoming good at coding, but they still couldn’t debug the mess left behind (by them or by the humans), so I still had a role that was bigger than steering the robot - a ticket to employability.

Everything seemed fine.

Then came the MCPs, the agen­tic work­flows and Claude 4.5 and the sky started to fall.

Claude 4.5, to be hon­est, was­n’t that good. It solved like 60% of the bugs given a stack trace and some con­text (a Sentry link with Sentry MCP en­abled was all it took in most cases). Sometimes it gave a so­lu­tion that sounded plau­si­ble but was to­tally wrong.

This time, how­ever, I stopped doubt­ing the ma­chines. I saw bugs that in the past would eas­ily take 1 day of full-time de­bug­ging be­ing one-shot­ted by Claude Code. Of course, not all of them yet, but the pat­tern was clear.

Then came 4.6, 4.7, GPT 5.5, Opus 4.8 and the DataDog MCP… Now I have CLIs that one-shot bugs across distributed systems for me. Bugs that I couldn’t solve in the past. Bugs that would take 2 days of full-time debugging. Bugs across distributed systems that lack distributed observability. 90% of the bugs are one-shotted now, including bizarre race conditions, unexpected corner-cases, third-party integration issues, undocumented API edge cases, everything. I hardly have to intervene.

Of course, I’m still em­ploy­able be­cause some­one has to re­view the code and steer the ro­bot. But I’m just an­other off-the-shelf en­gi­neer now. I have no do­main ex­per­tise that an­other Sr. en­gi­neer steer­ing an LLM can­not match. All my fi­nance and pay­ment do­main ex­per­tise, all the de­bug­ging in­tu­ition and dis­trib­uted sys­tem knowl­edge earned through hours of sweat and tears, is now prompt­able.

We were taught that generalists and specialists will always have their roles. But now the market is shaping everyone into becoming a generalist. That’s not a bad thing per se, until you look at the economics of supply and demand: if everyone is a generalist, the price of a generalist falls if there’s no demand to match. And we all know the demand is drying up.

The third pil­lar, the one that has­n’t eroded yet: code qual­ity and ar­chi­tec­ture

I still have one pillar standing, though: code quality and software architecture - what’s now being reduced to being called “taste” 1.

Along the course of my ca­reer, I al­ways liked to refac­tor, al­ways prized good code, and ne­go­ti­ated time in the sprint for it. DDD, Hexagonal, Clean Architecture, you know all the buzz­words. I like this topic, I like to dis­cuss the trade-offs and dif­fer­ent ideas on how to shape code­bases. I re­ally like it.

This is the last pil­lar stand­ing. Except that no­body cares any­more.

Agents do a re­ally bad job at keep­ing code­bases or­ga­nized. If you don’t steer them, they’ll hit a cir­cu­lar de­pen­dency is­sue sooner than you think. Will du­pli­cate code. Add un­nec­es­sary com­ments. Mix up pure func­tions and side-ef­fects. Disregard the prin­ci­ples of SOLID.

That should keep humans employed, except that this skill is now being reduced to the word “taste”. But it’s not just a renaming: the industry is moving to a world where code organization is less important.

Sure, hu­mans should steer the agent to pre­vent spaghetti code­bases with cir­cu­lar de­pen­dency graphs. We don’t want F-rated code­bases that are im­pos­si­ble to touch with­out break­ing some­thing. But a C or D? It’s now fine. Nobody needs A or B-grade code­bases any­more be­cause they’re be­ing made for LLMs, not for hu­mans to read.

I don’t want to ar­gue if this is in­her­ently good or bad. If the source code is now writ­ten for ma­chines to read and not hu­mans, it may be ac­tu­ally ok to tar­get them.

But that’s an­other pil­lar of my ex­per­tise that’s erod­ing. A good chunk of the knowl­edge I ac­cu­mu­lated on that topic is not that valu­able any­more. All the time I spent on it - read­ing books, do­ing real-world ex­er­cises, dis­cussing with other en­gi­neers, writ­ing ADRs - is be­com­ing use­less.

What now?

I’m still employed and I see myself employed (at least at this company) for the foreseeable future. But I don’t know what to think about the long term.

I spent 10 years (even more when you account for non-professional experience) getting good at things that are becoming less and less valuable. My last pillar of expertise is now reduced to “taste” and probably won’t last long.

And I know that’s not just me. About 8 months ago there was a lay­off at my cur­rent com­pany (not re­lated to AI, ac­cord­ing to them). Some bril­liant ex-cowork­ers were laid off and are still look­ing for jobs. Most of them suf­fer from the same prob­lem I out­lined here: their do­main ex­per­tise is not enough to stand out any­more.

The company is now hiring again for a few roles, and domain familiarity is not a strong differentiator anymore. We used to list “Software Engineer - Area”. Now it’s just “Software Engineer” and the team assignment comes after the offer is accepted.

Of course, this is good for bril­liant en­gi­neers that never had the chance to get deep into the do­main and now have bet­ter chances at get­ting a job, but it’s also sad to think that other bril­liant en­gi­neers that spent their lives col­lect­ing do­main knowl­edge are now com­pet­ing on the same lane.

The only way out for keep­ing my em­ploy­a­bil­ity in the long-term now seems to be shift­ing my do­main ex­per­tise to some­thing LLMs will not get good at so eas­ily. But what’s left?

I thought about going back to college, learning Math, Statistics, advanced Machine Learning and applying for a research role at a frontier lab. Except that there are no frontier labs in my country, the few that exist are flooded with applications, and I have family matters that make moving to another country difficult. By the time I can afford to make that jump, RSI may have made researchers obsolete.

Maybe I should con­sider trans­form­ing my wood­work­ing hobby into a pro­fes­sion…

Update (Jun 7): this post went vi­ral. I wrote an­other post re­ply­ing to some com­ments from so­cial me­dia and ex­pand­ing some of my ar­gu­ments. You can read it here.

See this, this and this for ref­er­ence. Don’t take this as an en­dorse­ment of the con­tent in­side any of these posts.↩

If Claude Fable stops helping you, you'll never know — Jonathon Ready

jonready.com

Update: Anthropic has walked back this policy after outrage from developers. The company now says Fable 5’s safeguards for frontier LLM development will be visible to users instead of silently degrading the model.

I did­n’t ex­pect to read this in a model card.

Fable 5 model card:

we’ve im­ple­mented new in­ter­ven­tions that limit Claude’s ef­fec­tive­ness for re­quests tar­get­ing fron­tier LLM de­vel­op­ment (for ex­am­ple, on build­ing pre­train­ing pipelines, dis­trib­uted train­ing in­fra­struc­ture, or ML ac­cel­er­a­tor de­sign). Using Claude to de­velop com­pet­ing mod­els al­ready vi­o­lates our Terms of Service, but en­forc­ing this re­stric­tion through our safe­guards avoids ac­cel­er­at­ing the ac­tors most will­ing to vi­o­late these terms. Unlike our in­ter­ven­tions for cy­ber­se­cu­rity, bi­ol­ogy and chem­istry, and dis­til­la­tion at­tempts, these safe­guards will not be vis­i­ble to the user. Fable 5 will not fall back to a dif­fer­ent model. Instead, the safe­guards will limit ef­fec­tive­ness through meth­ods such as prompt mod­i­fi­ca­tion, steer­ing vec­tors, or pa­ra­me­ter-ef­fi­cient fine-tun­ing (PEFT).

Claude can now be silently nerfed. Anthropic has de­cided it won’t tell users when this hap­pens.

Modern soft­ware com­pa­nies in­creas­ingly build their own em­bed­ding, rerank­ing, and rec­om­men­da­tion sys­tems. Even my small boot­strapped app, wan­derfugl.com, has a cus­tom reranker and em­bed­ding al­go­rithm that I trained my­self.

Anthropic gives a few examples of what it considers “frontier AI development,” but doesn’t provide a clear line. The problem is that many techniques once reserved for AI labs are now being used by ordinary software companies. Startups train embedding models. They build rerankers. They fine-tune and host small LLMs. The boundary between “frontier AI research” and normal product development is becoming harder to define every year.

That cre­ates a real sup­ply chain risk for busi­nesses. If Claude gives me poor or in­cor­rect ad­vice while I’m work­ing on an AI com­po­nent, I have no way of know­ing whether the model was con­fused, whether my prob­lem is un­solv­able, or if some in­vis­i­ble pol­icy re­stric­tion qui­etly kicked in. Anthropic has ex­plic­itly cho­sen not to tell users when this is hap­pen­ing.

Once a de­vel­op­ment tool can stop op­ti­miz­ing for your suc­cess with­out telling you, it be­comes im­pos­si­ble to fully trust your in­fra­struc­ture.

The Anthropic sup­ply chain risk

Anthropic says these safe­guards only af­fect 0.03% of de­vel­op­ers. Maybe that’s true to­day.

The prob­lem is that the de­f­i­n­i­tion of an AI com­pany is chang­ing.

Maybe you’re not train­ing fron­tier mod­els to­day—most com­pa­nies aren’t. But mod­ern soft­ware in­creas­ingly con­tains AI mod­els. Five years ago, build­ing a startup meant writ­ing APIs and SQL queries. Today, it of­ten means train­ing, tun­ing, and de­ploy­ing mod­els.

Five years ago, mod­els like CLIP were fron­tier AI re­search pro­jects. Today I’m fine-tun­ing them for a boot­strapped travel startup.

If you’re de­bug­ging a model train­ing pipeline for your prod­uct and Claude gives a bad an­swer, was the model con­fused? Did you give it bad con­text? Or did a hid­den pol­icy nerf Claude’s abil­ity to as­sist you?

You won’t know.

Landmark German ruling declares Google's AI Overviews are Google's own words and makes it liable for false answers

the-decoder.com

Update, June 11, 2026:

Google has provided us with a statement on the ruling. The company says its AI overviews are designed to “reflect” information that already exists on the web.

“We invest deeply in the quality of AI Overviews to ensure that the overwhelming majority of responses provide accurate information, and they are designed to reflect the information that exists on the web. We’re carefully reviewing this decision, which is not yet final,” a Google spokesperson said.

Google adds that AI overviews can occasionally miss context or misinterpret web content, just like traditional search results. But that’s exactly where the Munich ruling disagrees. The court draws a line between AI overviews, which generate new content loosely based on sources, and traditional search results, which list sources with direct quotes. That distinction is what makes Google directly liable, according to the court. In its statement, Google also repeats the very argument the court dismissed, that “people can dig deeper and verify.”

Original ar­ti­cle, June 9, 2026:

A German court has ruled that Google is di­rectly li­able for what its AI search overviews say. Previous case law shield­ing search en­gine op­er­a­tors from li­a­bil­ity does­n’t ap­ply to AI overviews.

The Regional Court of Munich hit Google with a temporary injunction barring the company from spreading false claims about two Munich-based publishers through its AI-generated search overviews (case no. 26 O 869/26). The court classified Google as a direct infringer because the “AI overview” is its own content, not just a list of search results.

Google’s AI overviews had falsely tied two pub­lish­ing com­pa­nies to scams, sub­scrip­tion traps, and shady busi­ness prac­tices for cer­tain search queries. According to the court, the AI mixed up in­for­ma­tion about other, gen­uinely sketchy com­pa­nies with the plain­tiffs and drew con­nec­tions that did­n’t ap­pear in any of the linked sources. The pub­lish­ers sent Google a cease-and-de­sist let­ter, but Google did­n’t re­spond ap­pro­pri­ately.

AI overviews aren’t search re­sults

Google’s AI overviews work nothing like traditional search results, the court argues. The AI “rewrites and judges results in its own words and according to its own structure,” the ruling says. In the case at hand, for example, it opened with confident claims like “Yes, [company] is known for dubious business practices,” then built its own structure with a summary, red flags for the alleged scam, and tips for users.

The court also found that the AI overview made claims “that are not even made in the search results.” None of the linked sources drew any connection between the plaintiffs and the shady companies the AI mentioned. The court called these “the defendant’s own statements.”

Google built the AI, Google offered it to users, so Google owns what it produces, “because it alone has influence over the AI’s offering and the algorithms with which the AI operates.”

Search engine liability rules don’t apply to “AI search”

The court also ex­am­ined ex­ist­ing rul­ings from Germany’s Federal Court of Justice (BGH), which gave tra­di­tional search en­gines and au­to­com­plete lim­ited li­a­bil­ity. The BGH had ar­gued that search en­gine op­er­a­tors were only li­able as in­di­rect in­fringers be­cause they merely made third-party con­tent find­able. A proac­tive duty to check re­sults would threaten how search en­gines work.

The Munich court found that this reasoning doesn’t apply to AI overviews. A regular search engine just points to outside websites. But AI overviews generate “independent, new, and substantive statements” by evaluating and combining content from various third-party sites. And only Google can check those statements, the court said, “at least by comparing the underlying third-party websites with its own statements based on them.”

The court also noted that the AI overview is “by no means absolutely necessary” for using the internet. Traditional search results already help users sort through information; the AI overview is just an extra feature.

Google’s “users can check for themselves” defense falls flat

At the hearing, Google argued that users could check the linked sources themselves to verify whether the AI summary was correct. “Users generally knew that information generated with AI should not be blindly trusted,” the company claimed. That’s a remarkable statement given the scale at which Google serves AI overviews. It’s also not entirely true, since the connection between sources and generated content isn’t always there.

The court rejected this. The possibility of disproving a statement through further research doesn’t “regularly exempt from liability for this statement.” The AI overview was “understandable on its own” and contained “a self-contained statement with independently understandable content and no reference to other possible interpretations or even unreliable content.” Studies show that users almost never click on sources in AI overviews, which supports the court’s reasoning.

The court drew a parallel to press law, where publishers are liable for teasers that are understandable on their own, even if readers never read the full article. Google’s own argument would also “significantly diminish” the benefit of the feature, the court noted, if the overview were “generally recognized as unreliable.”

The court also pointed to a pro­tec­tion gap. If Google were only li­able for ob­vi­ous vi­o­la­tions, vic­tims would have no real le­gal re­course when the AI makes false claims. The third par­ties whose web­sites served as sources had­n’t even made the state­ments in ques­tion. So vic­tims could­n’t sue the sources, and un­der ex­ist­ing rules they could­n’t ef­fec­tively sue Google ei­ther.

As a re­sult, Google could­n’t in­voke host provider pro­tec­tions un­der the Digital Services Act or fall back on the stan­dard no­tice-and-take-down process for search en­gines.

AI-generated opin­ions get less free speech pro­tec­tion

As if the rest wasn’t bad enough for Google, the court also went after free speech protection for AI-generated content. “An AI’s opinion is not the expression of an acquired conviction of the persons expressing it, but the result of an algorithm,” the court wrote.

“Offering AI-powered research is above all an expression of Google’s business activities” and “at most a secondary expression of an interest in being able to freely express one’s opinion and beliefs.”

When weigh­ing the plain­tiffs’ pri­vacy rights against Google’s in­ter­ests, Google had to take a back seat, es­pe­cially since the chal­lenged state­ments were based on un­true facts. The AI had linked the plain­tiffs to com­pa­nies that, ac­cord­ing to sworn af­fi­davits, had no con­nec­tion to them what­so­ever.

Google picks up 80 per­cent of the le­gal tab

The court ruled in fa­vor of the plain­tiffs on most counts. It banned claims about scams, con­nec­tions to du­bi­ous com­pa­nies, sub­scrip­tion traps, phone calls that never hap­pened, and lack of avail­abil­ity. Only two mi­nor re­quests got de­nied.

The risk of re­peated vi­o­la­tions re­mained, even though the spe­cific texts were no longer be­ing dis­played. Google had­n’t is­sued a cease-and-de­sist de­c­la­ra­tion with a penalty clause, and noth­ing stopped the al­go­rithms from gen­er­at­ing the same state­ments again. Google cov­ers 80 per­cent of the le­gal costs; the plain­tiffs pay 10 per­cent each.

The rul­ing may also have in­ter­na­tional reach, ac­cord­ing to the court.

Even a 91 per­cent ac­cu­racy rate means mil­lions of wrong an­swers

The Munich rul­ing goes far be­yond this one case. An analysis by AI startup Oumi for the New York Times found that Google’s AI Overviews with the cur­rent Gemini 3 model an­swered cor­rectly 91 per­cent of the time.

That’s solid enough for every­day use by most peo­ple. But at Google’s scale, it still means mil­lions of wrong an­swers every hour. If enough of that wrong con­tent de­fames com­pa­nies or in­di­vid­u­als, it could be­come a se­ri­ous le­gal prob­lem not just for Google but for other providers of sim­i­lar ser­vices like ChatGPT, Claude, or Perplexity.

The Oumi analy­sis also found that 56 per­cent of the cor­rect Gemini 3 an­swers could­n’t be backed up by the sources Google linked. The AI is giv­ing an­swers whose ori­gins users can’t trace.

The Munich court tack­led ex­actly this prob­lem: the AI makes its own claims that don’t ap­pear in any linked source, and the op­er­a­tor has to an­swer for them. Whether this rea­son­ing holds up on ap­peal re­mains to be seen, and Google has­n’t com­mented on the rul­ing. But if it gains trac­tion in­ter­na­tion­ally, the fall­out could hit not just Google but every AI provider whose sys­tems para­phrase con­tent from the web.

Catlantean 3D - Making Graphics Like It's 1993

staniks.github.io

Catlantean 3D is a side-pro­ject I’ve been slowly build­ing in my spare time for over a year, and I in­tend to re­lease it on Steam next year.

My goal was to build a com­plete, ship­pable first-per­son shooter us­ing tech­niques that were com­mon in the early 90s, while al­low­ing my­self the lux­ury of us­ing a mod­ern com­piler and a plat­form ab­strac­tion layer.

What this ac­tu­ally means is, the con­straints I have fool­ishly im­posed upon my­self are as fol­lows:

game must be made en­tirely from scratch, in­clud­ing the as­sets

all ren­der­ing must be done by hand

all sound mix­ing must be done by hand

320x240 tar­get res­o­lu­tion

256 col­ors only

floating point allowed, but behavior must be consistent across platforms (decided on fixed point for game logic to guarantee deterministic behavior, floating point for rendering because determinism isn’t that important there)

must be a fin­ished, pol­ished game that is fun to play (not a tech-demo)

platform abstraction layer allowed, but I must pretend it’s very limited (within reason): frame buffer to write pixels into, keyboard/mouse input, audio buffer to write samples into, filesystem I/O

no AI slop

If this sounds un­rea­son­able to you, that is be­cause it is.

But I’m do­ing it any­way, and to­day I’m gonna talk about some­thing that is typ­i­cally over­looked in de­vel­op­ment blogs, and that is as­set cre­ation.

Note: Everything dis­played here is work-in-progress, and heav­ily sub­ject to change.

Table of Contents

Palette Rendering

VGA Graphics

The Palette

The Colormap

Creating Assets

Pre-rendered Sprites

Hand-drawn Sprites and Textures

Procedurally Generated Sprites and Textures

Maps

Conclusion

Changelog

2026-06-09 - published.

Palette Rendering

VGA Graphics

Mode 13h on VGA hard­ware was the fa­mous 320x200 256-color graph­ics mode that de­fined a gen­er­a­tion of PC games. From a pro­gram­mer’s per­spec­tive it was won­der­fully sim­ple: you’d have a lin­ear frame buffer where each pixel was rep­re­sented by a sin­gle byte in­dex­ing into a palette of 256 col­ors.

If you wanted to draw a pixel, you wrote a byte at a spe­cific ad­dress, and that was it, there were no shaders or VRAM, or any­thing like that.

One byte per pixel, and that byte is an in­dex into a palette which con­tains ac­tual RGB val­ues that would be ren­dered to screen. This im­poses some in­ter­est­ing lim­i­ta­tions; when mak­ing as­sets for mod­ern games, you can throw mil­lions of col­ors at an im­age, but when your lim­i­ta­tion is that every pixel on screen can only be one of 256 col­ors, as­set cre­ation be­comes a very dif­fer­ent prob­lem be­cause every color choice has to be care­ful and de­lib­er­ate.
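To make the one-byte-per-pixel model concrete, here is a minimal sketch in C. It is not the game's code: the names `put_pixel`, `present` and `Rgb`, and the 32-bit XRGB output format handed to the platform layer, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240, PALETTE_SIZE = 256 };

/* One palette entry; 256 of these are the 768 bytes of palette data. */
typedef struct { uint8_t r, g, b; } Rgb;

static uint8_t framebuffer[SCREEN_W * SCREEN_H]; /* one palette index per pixel */
static Rgb palette[PALETTE_SIZE];

/* Drawing a pixel is a single byte write into the linear frame buffer. */
static void put_pixel(int x, int y, uint8_t color_index)
{
    assert(x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H);
    framebuffer[y * SCREEN_W + x] = color_index;
}

/* Hypothetical presentation step: expand indices to 0x00RRGGBB for display. */
static void present(uint32_t *out_xrgb)
{
    for (int i = 0; i < SCREEN_W * SCREEN_H; ++i) {
        Rgb c = palette[framebuffer[i]];
        out_xrgb[i] = ((uint32_t)c.r << 16) | ((uint32_t)c.g << 8) | (uint32_t)c.b;
    }
}
```

Only `present` ever touches real RGB values; everything the renderer draws stays in index space, which is why the palette has to be chosen so carefully.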

Games like Doom and Duke Nukem are good ex­am­ples of this done right. There is a cer­tain crispi­ness and clar­ity to these graph­ics that arises be­cause of these tech­ni­cal lim­i­ta­tions, not in spite of them. Restriction forces de­lib­er­ate choices, and de­lib­er­ate choices tend to look good.

Catlantean 3D is an at­tempt to re­pro­duce that feel­ing, but with one caveat - I’m ac­tu­ally go­ing for some­thing closer to VGA Mode-X, which is 320x240. The rea­son for this is, if you dis­play 320x200 on a 4:3 dis­play, you end up with non-square pix­els! While this would be most au­then­tic, I’ve cho­sen not to deal with this out of pref­er­ence rather than ob­jec­tive rea­son.
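The non-square pixel claim can be checked with a little arithmetic. Taking the pixel aspect ratio (PAR, width over height) as the display aspect ratio divided by the ratio of the pixel grid:

```latex
\[
\mathrm{PAR} = \frac{\text{display aspect}}{\text{grid aspect}},
\qquad
\mathrm{PAR}_{320\times 200} = \frac{4/3}{320/200} = \frac{4/3}{8/5} = \frac{5}{6} \approx 0.83,
\qquad
\mathrm{PAR}_{320\times 240} = \frac{4/3}{320/240} = 1 .
\]
```

So at 320x200 on a 4:3 display every pixel is 1.2 times taller than it is wide, while at 320x240 pixels are square.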

So how does one cre­ate graph­ics that work within these lim­its?

The Palette

Everything be­gins with 768 bytes, care­fully picked through many it­er­a­tions of trial and er­ror.

The main rea­son­ing for pick­ing these ex­act col­ors was the fol­low­ing:

one re­served for trans­parency (the vi­brant pink)

one re­served for pure white

one re­served for pure black

I was ob­vi­ously go­ing to need a lot of blood, thus reds

shades of green and blue be­cause I was go­ing to have red, green and blue keys and color-coded doors

game would be set in Catlantis, which is a par­ody land that re­sem­bles an­cient Egypt (because cat wor­ship), so ob­vi­ously, a lot of desert hues (yellows and browns)

lots of grays be­cause the set­ting in­volves many tech­ni­cal in­stal­la­tions (Catlantis is un­der oc­cu­pa­tion by cy­ber­netic dog-men)

some beige hues to break up mo­not­ony over grays, and to serve as warmer re­place­ments when dark­en­ing (more on this later)

the rest would be filled as necessary when creating textures - highly subjective and impossible to explain, other than “it looked right”

The palette did not spring into life all at once; it in­volved a lot of back-and-forth dur­ing as­set cre­ation, test­ing, and re-it­er­at­ing in gen­eral.

Below are some ex­am­ples of sprites and tex­tures from the ac­tual game:

The Colormap

Catlantean 3D is a tra­di­tional ray­caster. The map con­sists of tiles which are all iden­ti­cal in size; some are walls, oth­ers are just voids with a floor and ceil­ing. In or­der to ren­der the map, the ren­derer uses the DDA al­go­rithm for each col­umn of screen, tra­vers­ing the tilemap and de­ter­min­ing where it hits the map geom­e­try, and based on this, a wall col­umn is ren­dered on screen with the ap­pro­pri­ate tex­ture, sam­pled from ap­pro­pri­ate co­or­di­nates. Floors and ceil­ings are ren­dered af­ter as hor­i­zon­tal scan­lines, fill­ing in the rest of the screen.
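For readers who have not seen it, the per-column DDA traversal can be sketched roughly as follows. This is a generic grid DDA, not the author's renderer: `cast_ray`, `RayHit`, the row-major map layout with non-zero meaning wall, and the use of floats are all assumptions, and the map is assumed to be fully enclosed by walls.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    float distance;     /* distance along the ray, in multiples of the direction length */
    int   side;         /* 0 = crossed a vertical grid line, 1 = a horizontal one */
    int   tile_x, tile_y;
} RayHit;

/* Walks the tile grid one boundary crossing at a time until a wall tile is hit. */
static RayHit cast_ray(const uint8_t *map, int map_w, int map_h,
                       float pos_x, float pos_y, float dir_x, float dir_y)
{
    const float never = 1e30f; /* stands in for "this axis is never crossed" */
    assert(dir_x != 0.0f || dir_y != 0.0f);

    int tile_x = (int)pos_x;
    int tile_y = (int)pos_y;

    /* Ray length needed to cross one whole tile along each axis. */
    float delta_x = dir_x == 0.0f ? never : (dir_x < 0.0f ? -1.0f / dir_x : 1.0f / dir_x);
    float delta_y = dir_y == 0.0f ? never : (dir_y < 0.0f ? -1.0f / dir_y : 1.0f / dir_y);
    int step_x = dir_x < 0.0f ? -1 : 1;
    int step_y = dir_y < 0.0f ? -1 : 1;

    /* Ray length to the first grid line on each axis. */
    float next_x = dir_x == 0.0f ? never
                 : (dir_x < 0.0f ? (pos_x - tile_x) : (tile_x + 1 - pos_x)) * delta_x;
    float next_y = dir_y == 0.0f ? never
                 : (dir_y < 0.0f ? (pos_y - tile_y) : (tile_y + 1 - pos_y)) * delta_y;

    int side = 0;
    for (;;) {
        if (next_x < next_y) { next_x += delta_x; tile_x += step_x; side = 0; }
        else                 { next_y += delta_y; tile_y += step_y; side = 1; }
        assert(tile_x >= 0 && tile_x < map_w && tile_y >= 0 && tile_y < map_h);
        if (map[tile_y * map_w + tile_x] != 0) break;
    }

    RayHit hit;
    hit.distance = side == 0 ? next_x - delta_x : next_y - delta_y;
    hit.side = side;
    hit.tile_x = tile_x;
    hit.tile_y = tile_y;
    return hit;
}
```

The `side` field is what makes it possible to shade one face of a tile slightly darker than the other, and `distance` feeds both the wall height and the distance-based lighting discussed next.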

Raycasting has been done to death by other blogs and web­sites, so I’m not go­ing to cover all of it, but I do want to cover what I think is its most over­looked as­pect: light­ing.

If we were to ren­der the game world us­ing just the palette, with­out any spe­cial ef­fects, we would end up with some­thing that looked rather flat and unim­pres­sive:

But what we wanted was the following. Notice how the light diminishes the further away the geometry is from the player, and how one side of the map tiles is just slightly darker than the other. This gives an impression of depth.

With a mod­ern hard­ware-ac­cel­er­ated ren­derer, this would be triv­ially done in a shader - based on how far the ver­tex is, we would mul­ti­ply its color vec­tor by a float­ing point fac­tor and get a di­min­ished color vec­tor as a re­sult.

But how do we achieve something like this with a palette renderer? It has no concept of color, just indices into the palette. So if we wanted to find a darker shade of a certain color, we would need to loop through the palette and find the color that meets our criteria of “darker”. That is too expensive: we can’t loop through the entire palette for every pixel we render onto the screen.

What we could do in­stead was some pre­pro­cess­ing, to al­low a fast color lookup based on dis­tance at run­time.

If we were to lay out our palette into a sin­gle row like this…

We then choose the number of shade levels (32 in my case), meaning each color needs 31 darker variants, all sourced from the palette. We know each color’s RGB values, so from these and the shade index we can determine the ideal target color for that shade:

// First shade index (0) is the original color.
float darkening_factor = (32 - shade_index) / 32.0f;
target_darker_color.r = current_color.r * darkening_factor;
target_darker_color.g = current_color.g * darkening_factor;
target_darker_color.b = current_color.b * darkening_factor;

But that color might not ex­ist in the palette. So we need to loop through the palette and find the clos­est color to it.

The definition of “close” actually changed for me during development. At first, I just took Euclidean distance as a measure, but the problem with that was that almost everything had a tendency to gravitate towards the greys, simply due to the mathematics. Some older games actually did use Euclidean distance, but to me this didn’t look very good. I can’t explain why exactly, but a lot of darker shades appeared somewhat cold and lifeless. So instead, I converted my colors to Oklab color space, and leveraged its perceptual distance formula, which is closer to how humans perceive color differences. I also apply a small shift towards warmer hues the darker the color is (a common concept in pixel art called “hue shifting”). This is typically not necessary, but it does make the game look just a bit better.

How do I define “better” in this case? I have no idea, it just looks right. Frustrating, isn’t it? It’s hard to rationalize something subjective.

Back to our al­go­rithm…

Essentially, for each color, we cre­ate a col­umn that rep­re­sents the shades of that color. What we end up with is a 2D ma­trix of palette in­dices called the col­ormap. Note that the col­ormap gra­di­ents are im­per­fect, be­cause we’re still re­stricted to col­ors from the palette:
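The preprocessing step can be sketched as below. To keep the example short and dependency-free it measures closeness with plain squared Euclidean distance in RGB, which is the approach the author tried first and then abandoned; the Oklab distance and the warm hue shift are deliberately left out. The names (`build_colormap`, `nearest_palette_index`) and the flat, row-major layout of the colormap are assumptions.

```c
#include <assert.h>
#include <stdint.h>

enum { PALETTE_SIZE = 256, SHADE_LEVELS = 32 };

typedef struct { uint8_t r, g, b; } Rgb;

/* Index of the palette entry nearest to (r, g, b); the first match wins on ties.
 * Stand-in metric: squared Euclidean RGB distance, not the Oklab metric. */
static uint8_t nearest_palette_index(const Rgb *palette, float r, float g, float b)
{
    int best = 0;
    float best_dist = 1e30f;
    for (int i = 0; i < PALETTE_SIZE; ++i) {
        float dr = palette[i].r - r;
        float dg = palette[i].g - g;
        float db = palette[i].b - b;
        float dist = dr * dr + dg * dg + db * db;
        if (dist < best_dist) { best_dist = dist; best = i; }
    }
    return (uint8_t)best;
}

/* colormap holds SHADE_LEVELS rows of PALETTE_SIZE indices; row 0 is the brightest. */
static void build_colormap(const Rgb *palette, uint8_t *colormap)
{
    assert(palette && colormap);
    for (int shade = 0; shade < SHADE_LEVELS; ++shade) {
        float factor = (SHADE_LEVELS - shade) / (float)SHADE_LEVELS;
        for (int n = 0; n < PALETTE_SIZE; ++n) {
            colormap[shade * PALETTE_SIZE + n] = nearest_palette_index(
                palette,
                palette[n].r * factor, palette[n].g * factor, palette[n].b * factor);
        }
    }
}
```

A real build step would probably also special-case the reserved transparency entry so that nothing ever darkens towards or away from it; that detail is not covered here.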

So now, de­ter­min­ing a darker shade of color N based on dis­tance be­comes triv­ial.

Given the colormap row index (i.e. shade level) based on distance:

colormap_row = 32 * fragment_distance / view_distance

We pick the N-th entry in the row belonging to that shade - that is the palette index of the darkened color N.

And voila, O(1).

Also, instead of calculating the colormap row index for every pixel, the cost is further reduced by performing the calculation:

only once per screen column when rendering walls, because they’re perfectly vertical, so every pixel in a column has the same distance from the camera

only once per screen row when rendering floors, because they’re perfectly horizontal, so every pixel in a row has the same distance from the camera

only once per sprite because they are perfectly flat billboards where every pixel has the same distance from the camera

So we’re do­ing col­ormap row in­dex cal­cu­la­tion 320 times for walls, at most 240 times for floors, and once per vis­i­ble sprite (raycasting gives free oc­clu­sion culling). That is cheap, and the pay­off is great.
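Put together, the runtime side might look like the sketch below, where the shade row is computed once per wall column and then reused for every pixel in it. The function names, the flat colormap layout, and the clamp are assumptions; the clamp is there because the formula above yields row 32, one past the last row, when the fragment sits exactly at the view distance. Clipping of columns taller than the screen is omitted.

```c
#include <assert.h>
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 240, PALETTE_SIZE = 256, SHADE_LEVELS = 32 };

/* Maps a distance to a colormap row, clamped to the valid range. */
static int shade_row(float fragment_distance, float view_distance)
{
    assert(view_distance > 0.0f);
    int row = (int)(SHADE_LEVELS * fragment_distance / view_distance);
    if (row < 0) row = 0;
    if (row > SHADE_LEVELS - 1) row = SHADE_LEVELS - 1;
    return row;
}

/* Draws one textured wall column; colormap is SHADE_LEVELS rows of PALETTE_SIZE indices. */
static void draw_wall_column(uint8_t *framebuffer, const uint8_t *colormap,
                             const uint8_t *texture_column, int texture_height,
                             int x, int y_top, int y_bottom,
                             float distance, float view_distance)
{
    assert(x >= 0 && x < SCREEN_W);
    assert(y_top >= 0 && y_top < y_bottom && y_bottom <= SCREEN_H);

    /* One row lookup per column: every pixel below shares the same distance. */
    const uint8_t *shades = colormap + shade_row(distance, view_distance) * PALETTE_SIZE;
    int height = y_bottom - y_top;

    for (int y = y_top; y < y_bottom; ++y) {
        int texture_y = (y - y_top) * texture_height / height;
        /* The darkened color is a single table read: O(1) per pixel. */
        framebuffer[y * SCREEN_W + x] = shades[texture_column[texture_y]];
    }
}
```

The inner loop contains no color math at all, only two array reads and one write per pixel.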

Doom and many other ti­tles used sim­i­lar ap­proaches.

Creating Assets

Textures and sprites in Catlantean 3D fall into three cat­e­gories:

Pre-rendered sprites - 3D mod­els cre­ated in Blender and ren­dered to tex­tures

Hand-drawn sprites and tex­tures

Procedurally gen­er­ated tex­tures - gen­er­ated via spe­cial Python scripts by com­bin­ing hand-drawn art

Pre-rendered Sprites

I am working a full-time job and have a decently active life, so my time to work on the game is limited. Thus, I wanted to minimize the time I spend iterating when making complex sprites that involve animations. I rarely get something right on the first attempt, so naturally, iteration is expected, and it is hard to iterate when you need to make changes to many frames of an animation.

The more efficient approach was to create sprites in Blender as 3D models, rig and animate them there, and then render them to a series of textures with special Python scripts that leverage Blender’s Python API. Iterating then involved making changes in the model, and the rendering scripts did the rest, which was a lot of time saved.

The main hur­dle was that ren­dered sprites came out very blurry and washed out.

One might think that the ob­vi­ous an­swer to this was to ren­der the sprite in high res­o­lu­tion, and then down­scale with fil­ter­ing, but I’ve had mixed suc­cess with this; de­tails would of­ten be sup­pressed by fil­ter­ing, and edge clar­ity would be lost. What I found to be the most ef­fec­tive and reusable was to lever­age Blender’s com­posit­ing func­tion­al­ity to get the right amount of con­trast and clar­ity:

Once the im­age was ready, it would be sent through a spe­cial Python script which per­formed palette quan­ti­za­tion, cre­at­ing a 1-byte-per-pixel im­age used by the en­gine. For every pixel in the source im­age, the script finds the clos­est color in our palette (perceptually clos­est - Oklab), and uses the in­dex of that color for that pixel. The in­dex ar­ray, along with the di­men­sions, is then packed into the very sim­ple TEX for­mat that is used by the game.
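The author's quantization script is written in Python and uses Oklab; the per-pixel logic it describes can still be sketched in C under some loudly stated assumptions: RGBA input, squared Euclidean RGB distance instead of Oklab, a hypothetical `TRANSPARENT_INDEX` of 0 for the reserved pink entry, and an alpha threshold of 128. The TEX packing step is not shown because the format is not described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TRANSPARENT_INDEX is a hypothetical slot for the reserved transparency color. */
enum { PALETTE_SIZE = 256, TRANSPARENT_INDEX = 0 };

typedef struct { uint8_t r, g, b; } Rgb;

/* Nearest palette entry by squared RGB distance, never the transparent slot. */
static uint8_t nearest_opaque_index(const Rgb *palette, int r, int g, int b)
{
    int best = -1;
    long best_dist = 0;
    for (int i = 0; i < PALETTE_SIZE; ++i) {
        if (i == TRANSPARENT_INDEX) continue;
        long dr = palette[i].r - r, dg = palette[i].g - g, db = palette[i].b - b;
        long dist = dr * dr + dg * dg + db * db;
        if (best < 0 || dist < best_dist) { best = i; best_dist = dist; }
    }
    return (uint8_t)best;
}

/* rgba: pixel_count pixels of 4 bytes (R, G, B, A). out: one palette index per pixel. */
static void quantize_image(const uint8_t *rgba, size_t pixel_count,
                           const Rgb *palette, uint8_t *out)
{
    assert(rgba && palette && out);
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *p = rgba + 4 * i;
        if (p[3] < 128)
            out[i] = TRANSPARENT_INDEX;  /* mostly transparent source pixel */
        else
            out[i] = nearest_opaque_index(palette, p[0], p[1], p[2]);
    }
}
```

Excluding the transparent slot from the search keeps an opaque pixel that happens to be close to the key color from punching a hole in the sprite.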

A sim­i­lar work­flow was used for en­emy sprites. Note: some of these nodes are ei­ther re­dun­dant, or plain use­less, sim­ply be­cause I had used them at some point, and then changed my mind. I like leav­ing them in just in case I need them again.

Enemy sprites are rendered in a special way. A sprite can have multiple animations, and each animation must have frames for each of the 8 directions the sprite can face. So, for every animation (walk, fire, die, etc.), the Python script that uses Blender’s API rotates the sprite, renders all frames of an animation, rotates the sprite again, and so on. Sprites are saved with a special convention that denotes sprite name, action name, direction and frame index:

The nice thing about this approach is that I don’t need to keep rendered sprites in the repository - they’re actually .gitignored. Whenever I switch locations and use another computer, I simply run the compilation script which renders every model and produces the sprites. It is reasonably fast and runs in ~10 seconds for about 15 models on an RTX 3070.

Hand-drawn Sprites and Textures

Earlier in de­vel­op­ment, I cre­ated this vaguely cat-shaped head with the tex­ture of my cat Vilko, to use as a sta­tus bar face. After all, why would I draw some­thing like this by hand, if Blender could ren­der it in such a vivid like­ness of life?
