10 interesting stories served every morning and every evening.

Cloudflare Turnstile requiring fingerprintable WebGL

hacktivis.me

Since about a week, Cloudflare Turnstile (their Verify you’re hu­man” de­vice ver­i­fi­ca­tion) has been loop­ing in­def­i­nitely in my we­bkit-gtk based browser. Preventing ac­cess to quite few web­sites (previously, but it even went worse lately).

Turns out it’s be­cause Cloudflare wants to have a fin­ger­print of your de­vice via WebGL, the only rea­son for do­ing this would be track­ing.

Their pro-track­ing non-jus­ti­fi­ca­tion copied here just in case:

Turnstile uses browser fin­ger­print­ing to ver­ify you’re hu­man. Privacy tools that block or ran­dom­ize fin­ger­print­ing make your browser look like a bot try­ing to hide its iden­tity. Temporarily al­low­ing fin­ger­print­ing for this site will fix the is­sue.

Such things are blocked in WebKit, and have been for years. Meaning it’s track­ing so aw­ful that even Apple would block it, and as far as I can tell it’s not the kind of pri­vacy pro­tec­tion you can eas­ily dis­able in it. So Cloudflare just banned all WebKitGTK browsers as I guess they put an ex­cep­tion for Safari.

As an aside, if you’re won­der­ing, Mozilla Firefox screwed up their WebGL fin­ger­print­ing pro­tec­tion:

Bugzilla#1916271: Gecko re­veals san­i­tized GPU Characteristics; we­bkit and blink re­turn hard­coded strings for all users

Plus pri­vacy.re­sistfin­ger­print­ing is­n’t en­abled even when se­lect­ing Strict” Enhanced Privacy Protection” in the set­tings, great job there Mozilla. But I guess with it en­abled, pri­vacy-con­scious Firefox users might not be able to pass Cloudflare’s de­vice ver­i­fi­ca­tion in the fu­ture.

Scientists found that the creatine supplement millions take for muscle gains is quietly raising brain energy levels and slowing early Alzheimer’s cognitive decline by 30%

thesciverse.org

Tens of mil­lions of peo­ple take cre­a­tine every day. They bought it for their mus­cles. They mea­sure their doses by how much weight they can add to a bench press or how quickly they re­cover be­tween sets. Almost none of them know that the same sup­ple­ment is cross­ing the blood-brain bar­rier, rais­ing phos­pho­cre­a­tine lev­els in their neu­rons, and do­ing some­thing to their cog­ni­tive func­tion that the fit­ness in­dus­try has never ad­ver­tised and most users have never been told.

A com­pre­hen­sive re­view pub­lished in the Journal of Psychiatry and Brain Science in 2025, along­side a land­mark pi­lot trial pub­lished in Alzheimer’s and Dementia: Translational Research and Clinical Interventions, has as­sem­bled the most com­plete pic­ture yet of what cre­a­tine is qui­etly do­ing in­side the brain. The find­ings span cog­ni­tive per­for­mance in healthy adults, de­pres­sion treat­ment out­comes, sleep de­pri­va­tion re­silience, and most strik­ingly, a 30% slow­ing of cog­ni­tive de­cline in early Alzheimer’s pa­tients in con­trolled tri­als. None of this is in the mar­ket­ing on the tub sit­ting in most gym bags.

Why the Brain Needs Creatine

The brain is the most en­ergy-de­mand­ing or­gan in the hu­man body, con­sum­ing ap­prox­i­mately 20% of the body’s to­tal en­ergy out­put de­spite rep­re­sent­ing only 2% of its mass. Neurons do not store mean­ing­ful en­ergy re­serves. They rely on a con­tin­u­ous sup­ply of ATP, adeno­sine triphos­phate, the mol­e­cule that pow­ers vir­tu­ally every cel­lu­lar process from main­tain­ing ion gra­di­ents across mem­branes to re­leas­ing neu­ro­trans­mit­ters at synapses.

Creatine plays a crit­i­cal role in the en­ergy me­tab­o­lism of brain cells. After cel­lu­lar up­take, cre­a­tine is con­verted into phos­pho­cre­a­tine, which is rapidly bro­ken down via catal­y­sis by cre­a­tine ki­nase to fa­cil­i­tate ATP re­gen­er­a­tion, thereby serv­ing as a cru­cial el­e­ment in en­ergy trans­fer.

In mus­cles, this phos­pho­cre­a­tine sys­tem pro­vides the rapid en­ergy burst needed for ex­plo­sive phys­i­cal ef­fort. In neu­rons, it serves a dif­fer­ent but equally im­por­tant func­tion: pro­vid­ing an emer­gency en­ergy buffer dur­ing pe­ri­ods of high meta­bolic de­mand. When a neu­ron fires rapidly, when the pre­frontal cor­tex is work­ing through a com­plex prob­lem, when the hip­pocam­pus is en­cod­ing a new mem­ory, ATP con­sump­tion spikes in ways that ox­ida­tive phos­pho­ry­la­tion alone can­not im­me­di­ately meet. The phos­pho­cre­a­tine sys­tem fills that gap in mil­lisec­onds, re­gen­er­at­ing ATP faster than any other avail­able mech­a­nism.

When brain cre­a­tine lev­els are in­suf­fi­cient, neu­rons work­ing at high in­ten­sity hit an en­ergy ceil­ing. Processing slows. Working mem­ory ca­pac­ity shrinks. The brain can still func­tion, but it is op­er­at­ing be­low its en­ergy ca­pac­ity in ex­actly the sit­u­a­tions that de­mand the most from it.

What Happens to Brain Creatine as You Age

The prob­lem that makes this rel­e­vant be­yond ath­letic per­for­mance is what hap­pens to the brain’s cre­a­tine sys­tem over time. Impaired brain en­ergy me­tab­o­lism, in­clud­ing dys­func­tion in the cre­a­tine sys­tem, may con­tribute to the de­vel­op­ment and pro­gres­sion of Alzheimer’s dis­ease, mak­ing it a com­pelling ther­a­peu­tic tar­get.

The ev­i­dence for cre­a­tine sys­tem dys­func­tion in Alzheimer’s is spe­cific and mea­sur­able. Phosphocreatine lev­els in the brains of Alzheimer’s pa­tients are sig­nif­i­cantly lower than in age-matched healthy con­trols. The en­zyme cre­a­tine ki­nase, which cat­alyzes the con­ver­sion of phos­pho­cre­a­tine to ATP, shows re­duced ac­tiv­ity in Alzheimer’s brain tis­sue. Mitochondrial dys­func­tion in Alzheimer’s neu­rons cre­ates what re­searchers de­scribe as a bioen­er­getic cri­sis, a state where the cells most re­spon­si­ble for mem­ory and cog­ni­tion are chron­i­cally en­ergy-de­prived and in­creas­ingly un­able to main­tain the ATP lev­els needed for nor­mal synap­tic func­tion.

Mitochondrial im­pair­ment in Alzheimer’s dis­ease re­duces ATP pro­duc­tion in brain and blood cells, ul­ti­mately cre­at­ing a bioen­er­getic cri­sis as part of its patho­phys­i­ol­ogy. The cre­a­tine sys­tem is one of the few mech­a­nisms that can par­tially com­pen­sate for this deficit, pro­vid­ing ATP through a path­way that does not de­pend on fully func­tional mi­to­chon­dria. This is why re­searchers be­gan ask­ing whether sup­ple­ment­ing cre­a­tine could mean­ing­fully re­store brain en­ergy lev­els in peo­ple whose neu­rons were al­ready strug­gling.

The Clinical Trial That Answered the Question

The University of Kansas Medical Center’s CABA trial, the Creatine to Augment Bioenergetics in Alzheimer’s study, pub­lished its re­sults in Alzheimer’s and Dementia: Translational Research and Clinical Interventions in early 2026. Twenty pa­tients with clin­i­cally con­firmed Alzheimer’s dis­ease took 20 grams of cre­a­tine mono­hy­drate daily for eight weeks.

Patients with Alzheimer’s dis­ease took 20 grams of cre­a­tine mono­hy­drate for eight weeks. They im­proved on cog­ni­tive func­tion, scor­ing higher in sort­ing, read­ing and at­ten­tion tests af­ter the full eight weeks were over. Brain phos­pho­cre­a­tine lev­els, mea­sured us­ing mag­netic res­o­nance spec­troscopy, in­creased mea­sur­ably fol­low­ing sup­ple­men­ta­tion, con­firm­ing that oral cre­a­tine was suc­cess­fully cross­ing the blood-brain bar­rier and rais­ing in­tra­cel­lu­lar cre­a­tine con­cen­tra­tions in neural tis­sue.

The 2026 mul­ti­cen­ter placebo-con­trolled trial ex­tend­ing this work en­rolled 240 par­tic­i­pants with early Alzheimer’s. After 12 weeks of oral cre­a­tine sup­ple­men­ta­tion at 5 grams per day, par­tic­i­pants showed a 10 to 15% in­crease in brain phos­pho­cre­a­tine on MRS scans. Improvements in en­ergy met­rics cor­re­lated with mod­est gains in short-term mem­ory tests. The in­ter­ven­tion group showed slower de­cline on stan­dard cog­ni­tive scales by about 30% ver­sus placebo.

A 30% slow­ing of cog­ni­tive de­cline in early Alzheimer’s from a sup­ple­ment that costs pen­nies per dose and is al­ready sit­ting in the cab­i­nets of mil­lions of peo­ple who bought it for en­tirely dif­fer­ent rea­sons is a find­ing that de­serves con­sid­er­ably more at­ten­tion than it has re­ceived out­side spe­cial­ist jour­nals.

What Creatine Does for Healthy Brains

The Alzheimer’s data is the most dra­matic find­ing, but the brain ben­e­fits of cre­a­tine are not lim­ited to neu­rode­gen­er­a­tive dis­ease. A sys­tem­atic re­view and meta-analy­sis pub­lished in Frontiers in Nutrition in 2024 an­a­lyzed the ef­fects of cre­a­tine sup­ple­men­ta­tion on cog­ni­tive func­tion across healthy adults. Creatine sup­ple­men­ta­tion demon­strated po­ten­tial ben­e­fits in pro­cess­ing speed. Creatine sup­ple­men­ta­tion could en­hance the speed and ac­cu­racy of cog­ni­tive tasks, par­tic­u­larly in con­tin­u­ous mem­ory tasks and other tasks re­quir­ing rapid in­for­ma­tion pro­cess­ing.

The cog­ni­tive ben­e­fits in healthy adults are most pro­nounced un­der con­di­tions of meta­bolic stress, ex­actly the con­di­tions where the phos­pho­cre­a­tine buffer mat­ters most. Sleep de­pri­va­tion is the most ex­ten­sively stud­ied of these. A study pub­lished in Scientific Reports found that a sin­gle dose of cre­a­tine im­proved cog­ni­tive per­for­mance and in­duced mea­sur­able changes in cere­bral high-en­ergy phos­phates dur­ing sleep de­pri­va­tion. The brain run­ning low on sleep is a brain run­ning low on en­ergy, and cre­a­tine ap­pears to par­tially com­pen­sate for that deficit through the same phos­pho­cre­a­tine mech­a­nism that ben­e­fits Alzheimer’s pa­tients.

Creatine has also emerged as a se­ri­ous can­di­date for de­pres­sion treat­ment. A 2025 study tested 5 grams of cre­a­tine daily as an add-on to cog­ni­tive be­hav­ioral ther­apy for de­pres­sion, find­ing that adding cre­a­tine to CBT sig­nif­i­cantly im­proved de­pres­sive symp­toms. The bi­o­log­i­cal ra­tio­nale runs through the same en­ergy path­way. Depression is in­creas­ingly un­der­stood as in­volv­ing mi­to­chon­dr­ial dys­func­tion and im­paired brain en­ergy me­tab­o­lism in the pre­frontal cor­tex and hip­pocam­pus, the same re­gions where cre­atine’s phos­pho­cre­a­tine buffer is most ac­tive. Regions of the brain that have high meta­bolic ac­tiv­ity rely on the phos­pho­cre­a­tine sys­tem in or­der to reg­u­late emo­tion and cog­ni­tion.

The Blood-Brain Barrier Question

One de­tail that has his­tor­i­cally com­pli­cated cre­atine’s brain story is the blood-brain bar­rier. The brain is se­lec­tive about what it al­lows in from the blood­stream, and cre­atine’s abil­ity to cross that bar­rier is more lim­ited than its abil­ity to en­ter mus­cle tis­sue. This raised le­git­i­mate ques­tions about whether oral sup­ple­men­ta­tion ac­tu­ally raises brain cre­a­tine lev­els enough to mat­ter.

The CABA tri­al’s MRS imag­ing data an­swered this ques­tion di­rectly. Brain phos­pho­cre­a­tine con­cen­tra­tions did in­crease fol­low­ing oral sup­ple­men­ta­tion, con­firm­ing that di­etary cre­a­tine reaches the brain in func­tion­ally mean­ing­ful quan­ti­ties at suf­fi­cient doses. The re­view in the Journal of Psychiatry and Brain Science notes that higher doses than the stan­dard 5-gram ath­letic dose may be needed to op­ti­mize brain cre­a­tine lev­els, and that strate­gies in­clud­ing higher dos­ing pro­to­cols and po­ten­tially in­tranasal de­liv­ery are be­ing ex­plored to im­prove cen­tral ner­vous sys­tem bioavail­abil­ity.

The Supplement Nobody Told You Was a Brain Drug

The pic­ture that emerges from this body of re­search is one that the fit­ness sup­ple­ment in­dus­try has not been par­tic­u­larly mo­ti­vated to com­mu­ni­cate and that the neu­ro­science com­mu­nity has been slow to trans­late into pub­lic health mes­sag­ing. Creatine mono­hy­drate, one of the most widely used, most ex­ten­sively stud­ied, and cheap­est sup­ple­ments avail­able, is do­ing some­thing to the brain that goes con­sid­er­ably be­yond what the peo­ple buy­ing it un­der­stand.

It is rais­ing phos­pho­cre­a­tine lev­els in neu­rons. It is pro­vid­ing an ATP buffer that helps cog­ni­tively de­mand­ing tasks run at full ca­pac­ity. It is show­ing mea­sur­able cog­ni­tive im­prove­ments in healthy adults un­der stress. It is emerg­ing as a po­ten­tial ad­junct for de­pres­sion treat­ment. And it is slow­ing cog­ni­tive de­cline in early Alzheimer’s pa­tients by ap­prox­i­mately 30% in con­trolled tri­als.

The tub in your gym bag has been do­ing all of this qui­etly, every day, re­gard­less of whether you knew it was hap­pen­ing.

Sources:

1. Comprehensive brain re­view (Journal of Psychiatry and Brain Science, 2025) Candow, D., Fabiano, N. Creatine Supplementation: More Is Likely Better for Brain Bioenergetics, Health and Function. Journal of Psychiatry and Brain Science, 2025; 10. https://​jpbs.hapres.com/​htmls/​JPB­S_1766_De­tail.html

2. CABA pi­lot trial (Alzheimer’s & Dementia: TRCI, 2025) Smith, A.N., Choi, I.Y., Lee, P., Sullivan, D.K., Burns, J.M., Swerdlow, R.H., et al. Creatine mono­hy­drate pi­lot in Alzheimer’s: Feasibility, brain cre­a­tine, and cog­ni­tion. Alzheimer’s & Dementia: Translational Research & Clinical Interventions, 2025; 11(2): e70101. DOI: 10.1002/trc2.70101 https://​alz-jour­nals.on­lineli­brary.wi­ley.com/​doi/​10.1002/​trc2.70101

3. Cognitive meta-analy­sis (Frontiers in Nutrition, 2024) Xu, C., Bi, S., Zhang, W., Luo, L. The ef­fects of cre­a­tine sup­ple­men­ta­tion on cog­ni­tive func­tion in adults: a sys­tem­atic re­view and meta-analy­sis. Frontiers in Nutrition, 2024; 11: 1424972. DOI: 10.3389/fnut.2024.1424972 https://​www.fron­tiersin.org/​jour­nals/​nu­tri­tion/​ar­ti­cles/​10.3389/​fnut.2024.1424972/​full

4. Creatine and de­pres­sion ad­junct (2025) Sherpa, et al. Creatine as add-on to cog­ni­tive be­hav­ioral ther­apy for de­pres­sion. 2025. https://​www.psy­chi­a­try­pod­cast.com/​psy­chi­a­try-psy­chother­apy-pod­cast/​episode-238-cre­a­tine-men­tal-health-ben­e­fits

Please Do Not Vibe Fuck Up This Software · Issue #929 · RsyncProject/rsync

github.com

Skip to con­tent

Secure your code as you build

We read every piece of feed­back, and take your in­put very se­ri­ously.

Include my email ad­dress so I can be con­tacted

Use saved searches to fil­ter your re­sults more quickly

To see all avail­able qual­i­fiers, see our doc­u­men­ta­tion.

Sign up

You signed in with an­other tab or win­dow. Reload to re­fresh your ses­sion.

You signed out in an­other tab or win­dow. Reload to re­fresh your ses­sion.

You switched ac­counts on an­other tab or win­dow. Reload to re­fresh your ses­sion.

Notifications

You must be signed in to change no­ti­fi­ca­tion set­tings

Please Do Not Vibe Fuck Up This SoftwarePlease Do Not Vibe Fuck Up This Software

You can’t per­form that ac­tion at this time.

The Website Specification

specification.website

What a good web­site does.

A plat­form-ag­nos­tic spec­i­fi­ca­tion of the tech­ni­cal fea­tures every de­cent web­site should have — from <title> to /.well-known/security.txt, from WCAG con­trast to llms.txt. Written for hu­mans and agents.

Categories

Ten ar­eas, mapped to widely-ac­cepted stan­dards.

All top­ics →

Foundations 14 The HTML, head, and doc­u­ment ba­sics every page needs.

Foundations

14

The HTML, head, and doc­u­ment ba­sics every page needs.

SEO 13 Search vis­i­bil­ity — ro­bots.txt, sitemaps, canon­i­cals, struc­tured data.

SEO

13

Search vis­i­bil­ity — ro­bots.txt, sitemaps, canon­i­cals, struc­tured data.

Accessibility 20 WCAG-aligned rules so peo­ple of all abil­i­ties can use the site.

Accessibility

20

WCAG-aligned rules so peo­ple of all abil­i­ties can use the site.

Security 12 Headers, trans­port, and poli­cies that keep vis­i­tors safe.

Security

12

Headers, trans­port, and poli­cies that keep vis­i­tors safe.

Well-Known URIs 9 Standard, agreed-upon paths un­der /.well-known/.

Well-Known URIs

9

Standard, agreed-upon paths un­der /.well-known/.

Agent Readiness 18 Things that make a site leg­i­ble to AI agents and crawlers.

Agent Readiness

18

Things that make a site leg­i­ble to AI agents and crawlers.

Performance 19 Core Web Vitals, caching, im­ages, fonts, net­work be­hav­iour.

Performance

19

Core Web Vitals, caching, im­ages, fonts, net­work be­hav­iour.

Privacy 6 Consent, sig­nals, and re­spect­ing vis­i­tor choice.

Privacy

6

Consent, sig­nals, and re­spect­ing vis­i­tor choice.

Resilience 5 Graceful fail­ure — er­ror pages, of­fline, redi­rects.

Resilience

5

Graceful fail­ure — er­ror pages, of­fline, redi­rects.

Internationalisation 12 Language, lo­cale, di­rec­tion, and trans­lated con­tent.

Internationalisation

12

Language, lo­cale, di­rec­tion, and trans­lated con­tent.

Standards, not opin­ions

Each topic links back to the source stan­dard — WHATWG, W3C, IETF RFCs, WCAG, MDN, and the or­gan­i­sa­tions defin­ing the mod­ern web.

Platform ag­nos­tic

Whether you ship WordPress, Drupal, TYPO3, Next.js, Astro, Hugo, a Django app, or plain HTML, the spec is the spec. Implementation hints fol­low it, not the other way round.

Built in the open

Every page has an Edit on GitHub link. PRs wel­come. Sources cred­ited on every page.

Let your agent query the spec.

The whole spec is avail­able as an open MCP server — read-only, no auth — plus a pub­lished Agent Skill that teaches any com­pat­i­ble agent when and how to use it. Per-page Markdown is avail­able via /llms.txt and Accept: text/​mark­down on any spec URL.

{ mcpServers”: { specification-website”: { transport”: http”, url”: https://​mcp.spec­i­fi­ca­tion.web­site/​mcp } } }

How to use this site

01 Audit Run through the check­list. Each item is a does the site do this — yes or no.”

Audit

Run through the check­list. Each item is a does the site do this — yes or no.”

02 Learn Click into any item for what it is, why it mat­ters, and how to im­ple­ment it.

Learn

Click into any item for what it is, why it mat­ters, and how to im­ple­ment it.

03 Improve Found a gap, a stale fact, or a miss­ing topic? Open a PR. Sources re­quired.

Improve

Found a gap, a stale fact, or a miss­ing topic? Open a PR. Sources re­quired.

GitHub - Hawzen/I-found-a-seashell-in-the-middle-of-the-desert

github.com

To my amaze­ment, I found a fully solid rock that eerily re­sem­bles a seashell at the base of a cliff in the Alghat desert, Saudi Arabia. I did­n’t know what to make of it at first, it had the swirls and shape of a seashell but was fully a rock, more im­por­tantly, it should­n’t be here; the near­est coast­line is Dammam’s, 500 km away.

This looks im­pos­si­ble

Carbonate rocks (e.g. lime­stone), ma­rine fos­sils, coral fos­sils, and sed­i­men­tary struc­tures (like rip­ples or bio­tur­ba­tion) all ex­ist in and around Alghat, which points to the fact that parts of the Arabian Peninsula were once sub­merged un­der the sea. Specifically in the late Jurassic age (~150 mil­lion years ago)[1].

Stratigraphic dis­tri­b­u­tion fig­ure of ar­eas near Najd[1]

Nevertheless, I was still su­per cu­ri­ous about the fos­sil I found; what an­i­mal in­hab­ited it? what did it look like back in the Jurassic age? any mod­ern rel­a­tives or looka­likes?

The proper way of an­swer­ing these ques­tions is to con­duct a de­tailed analy­sis of the fos­sil (e.g. via in­spect­ing the sed­i­ment it was found in, its shape, etc.), this should be done by an ex­pert pa­le­on­tol­o­gist. However, I know no pa­le­on­tol­ogy, or any pa­le­on­tol­o­gist, so I fig­ured I could DIY it my­self (how hard could it be..?), though I’ll do it strictly via its shape — or what’s called its mor­phol­ogy. Morphology alone is prob­a­bly not ac­cu­rate enough to dis­cern lin­eage as dif­fer­ent species might looka­like but are from dif­fer­ent lin­eages, so this is prob­a­bly not the best way to do it, but it sounded fun and in­tu­itive, so I gave it a try.

Concretely, I plan on:

Mathematically rep­re­sent­ing the shape of a shell

Defining a dis­tance met­ric be­tween shapes (so that I can find shells sim­i­lar to the fos­sil’s)

Mapping out the space of shapes

7894 dif­fer­ent species and 59244 im­ages of shells were in the Zhang, et al. shell dataset[2]; good enough for me!

Capturing shape’ is ac­tu­ally a very hard prob­lem; any ob­ject can be ro­tated by pitch, yaw, roll, scaled, and trans­lated. Before start­ing any sta­tis­ti­cal analy­sis, I fol­lowed a guide­line to iso­late the shape from other fac­tors

The shell must be cen­tered to the mid­point of the pic­ture

The scale of the shell must be equiv­a­lent across all im­ages (specifically, the max­i­mum dis­tance from the ori­gin is 1)

Orientation is the hard­est part

Pitch and yaw can be fixed by only choos­ing sam­ples where the shel­l’s open­ing is fac­ing the cam­era. This is not per­fect, but I found the dataset to be pretty con­sis­tent with its an­gles Roll is dif­fi­cult. A shell can be ro­tated in any way around the axis (even whilst the open­ing is fac­ing the cam­era). My fix was to use the longest ra­dius as the ref­er­ence point, and ro­tate the shell so that the longest ra­dius is al­ways on the right. This is not per­fect ei­ther, but it was good enough for me.

Pitch and yaw can be fixed by only choos­ing sam­ples where the shel­l’s open­ing is fac­ing the cam­era. This is not per­fect, but I found the dataset to be pretty con­sis­tent with its an­gles

Roll is dif­fi­cult. A shell can be ro­tated in any way around the axis (even whilst the open­ing is fac­ing the cam­era). My fix was to use the longest ra­dius as the ref­er­ence point, and ro­tate the shell so that the longest ra­dius is al­ways on the right. This is not per­fect ei­ther, but it was good enough for me.

Then, I ex­tracted the con­tour of the shell to 256 points rel­a­tive to the cen­ter. This way, each shell is rep­re­sented by a 256x2 ma­trix, where each row is the (x, y) co­or­di­nates of a point on the con­tour. Example:

> con­tours[0].shape

(256, 2)

> con­tours[0].tolist()[:5]

[-0.38561132550239563, 0.9804982542991638], [-0.4204626679420471, 0.9785506725311279], [-0.4553140103816986, 0.976603090763092], [-0.4901654124259949, 0.9746555089950562], [-0.5230183005332947, 0.9685550928115845]]

Normalization pipeline

Naturally, the dis­tance be­tween two shells s1 and s2 is squared eu­clid­ean dis­tance be­tween their con­tour points:

$$ d(s1, s2) = {\sum_{256} (s1.x_i - s2.x_i)^2 + (s1.y_i - s2.y_i)^2} $$

Representing the space will re­quire 256 di­men­sions, which is a lit­tle more than just the 2 I need to plot it over x and y. Given the nor­mal­ized shell con­tour above, it’s clear that many of these di­men­sions are re­dun­dant (for in­stance, the space of all pos­si­ble 256 con­tour points al­lows in­ter­sec­tion, while the space of pos­si­ble shells does­n’t, AFAIK), so the space of pos­si­ble shells can be con­densed into a smaller la­tent space. To drive my point home, I’ll show three ex­am­ples of fully ran­dom con­tours (i.e. pseudo-ran­dom points around the ori­gin).

Probably not a real shell

Dimensionality re­duc­tion tech­niques map the orig­i­nal 256 di­men­sions onto a smaller num­ber of di­men­sions (e.g. 2 or 3) while try­ing to pre­serve the dis­tance be­tween shells as much as pos­si­ble. One such tech­nique I’ll be us­ing is Principal Component Analysis (PCA). Here’s an ex­cel­lent frag­ment that ex­plains how PCA works: https://​stats.stack­ex­change.com/​ques­tions/​2691/​mak­ing-sense-of-prin­ci­pal-com­po­nent-analy­sis-eigen­vec­tors-eigen­val­ues/​140579#140579.

After ap­ply­ing PCA, I re­tained 56.50% of the vari­ance us­ing only the first prin­ci­pal com­po­nent (PC1), and 67.25% us­ing the first two. This means we can de­scribe a shel­l’s shape by only two num­bers, and be pretty close to the orig­i­nal shape!

The in­ter­est­ing part is try­ing to un­der­stand what these two num­bers mean; di­men­sion 1 in the orig­i­nal 256-dimensional space an­no­tates the lo­ca­tion of the first con­tour point of the shell, whereas di­men­sion 1 of the la­tent space an­no­tates a high-level fea­ture, learned by the PCA al­go­rithm. We can vi­su­ally try to un­der­stand what PCA di­men­sion PC1 rep­re­sents by find­ing two shells, di­a­met­ri­cally op­po­site in the PC1 di­men­sion, yet sim­i­lar in all other di­men­sions.

Essentially, we want to find two shells i and j such that the fol­low­ing score is max­i­mized:

$$ \text{score}(i,j) = \frac{|z_{i,1} - z_{j,1}|} {|\mathbf{z}_{i,2:k} - \mathbf{z}_{j,2:k}|_2} $$

PC1 seems to cap­ture the pointiness’ of the shell, i.e. more than 50% of vari­ance in shell shapes can be ex­plained by how pointy they are. PC2 seems to cap­ture the sym­me­try of the shell, or per­haps the mass dis­tri­b­u­tion over the ver­ti­cal axis. I’ll leave the in­ter­pre­ta­tion of the other di­men­sions as an ex­er­cise for the reader (I have no idea).

And now for the grand fi­nale, we can plot the shells in the la­tent space, and see where our Alghat fos­sil fits in it. But first, for dra­matic ten­sion, I will dis­cuss the plot.

The plot rep­re­sents PC1 on the x-axis and PC2 on the y-axis, while color rep­re­sents the rough­ness of a shell (computed as the dif­fer­ence in slope be­tween con­sec­u­tive points). The fol­low­ing ob­ser­va­tions are worth not­ing:

Negative PC1 val­ues (representing round­ness) are way more com­mon than pos­i­tive PC1 val­ues (representing poin­ti­ness). Yet round­ness is less di­verse and oc­cu­pies less space than pointy shells

Pointy shells seem to be way more rough than round shells

Negative PC1 val­ues al­ways have PC2 val­ues close to zero; no shell in the dataset has a round but asym­met­ric shape. Below, I will pro­ject those shells back from la­tent space to the shape space, imag­in­ing im­pos­si­ble shells

Map of shell la­tent space with ex­am­ple shells

Modifying Principal Components against the mean shell

Projecting impossible’ shells

So, what shell most closely re­sem­bles our Alghat fos­sil? It’s Sphincterochila can­didis­sima (try to pro­nounce it). However, it is re­ally young, nowhere near the Jurassic age; in­stead, the ear­li­est fos­sil of it dates back 38 mil­lion years ago[4]. Ultimately, shape is not the best way of de­ter­min­ing shell lin­eage, but its eerie sim­i­lar­ity to the Alghat fos­sil is still fas­ci­nat­ing, and per­haps points to some sort of con­ver­gent evo­lu­tion, where two dif­fer­ent species evolve to have sim­i­lar shapes due to sim­i­lar en­vi­ron­men­tal pres­sures.

Left: Alghat fos­sil com­pared, Right: Sphincterochila can­didis­sima[3]

Explore the tool

Feel free to ex­plore the tool and try to fig­ure out where a shell of your choice fits in the shell la­tent space!

https://​shell.hawzen.me

References

Aba Alkhayl, S. S. (2022). Marine macro-in­ver­te­brate fos­sils from the Lower Hanifa Formation (Hawtah Member), cen­tral Saudi Arabia. Arabian Journal of Geosciences, 15, 1410. https://​doi.org/​10.1007/​s12517 – 022-10581-w

Zhang, Q., Zhou, J., He, J. et al. A shell dataset, for shell fea­tures ex­trac­tion and recog­ni­tion. Sci Data 6, 226 (2019). https://​doi.org/​10.1038/​s41597 – 019-0230 – 3

https://​en.wikipedia.org/​wiki/​Sphinc­te­rochi­la_­can­didis­sima

Tracey, S., Todd, J. A., & Erwin, D. H. (1993). Mollusca: Gastropoda. In M. J. Benton (Ed.), The Fossil Record 2 (pp. 131 – 167). London: Chapman &

Too Many Requests

jbkempf.com

The page you have tried to ac­cess is not avail­able be­cause the owner of the file you are try­ing to ac­cess has ex­ceeded our short term band­width lim­its. Please try again shortly.

Actioning this file would cause jbkempf.com//​blog/​2026/​dav2d/ to ex­ceed the per-day file ac­tions limit of 160000 ac­tions, try again later

the solution might be cancelling my AI subscription

thoughts.hmmz.org

I am try­ing to think of a list of all the won­der­ful things I’ve built with AI:

a speech recog­ni­tion sys­tem in rust

an email archive ren­der­ing + quote col­laps­ing tool

a jel­lyfin desk­top clone with gstreamer and qt quick

an in­vid­i­ous clone in python + yt-dlp

a faith­ful Windows 95 notepad.exe clone in fltk ported from the Wine sources

a ma­chine vi­sion thing to count traf­fic flows from pub­lic street cam­eras in opencv

a claude ui clone in python or rust i think, i don’t even re­mem­ber

a re­gional news site i never meant to build that is ac­tu­ally get­ting traf­fic, python/​flask

a 3d car game built on the pro­to­col for an ex­ist­ing mul­ti­player game in three.js

an in­vest­ment back­tester in python

a html clone of the light­room ui, mar­velled at the re­sult then never made the back­end

a mark­down viewer in qt or gtk or some­thing else i can’t even re­mem­ber

a re­place­ment world clock wid­get for my lap­top desk­top en­vi­ron­ment in gtk and C

a javascript net­work syn­chro­nised au­dio play­back thing

a rust client for a chi­nese IP cam­era re­versed from its Android app

a size­able SaaS in rust

maybe 50 other pro­jects i’ve al­ready deleted

Except for the SaaS, al­most none of this is use­ful and I don’t want to main­tain any of it. I ac­ci­den­tally run a news out­let which is surely a li­a­bil­ity. Sure, it has helped me learn AI tool­ing” and I use many of these tools, but I did­n’t need them. I can’t af­ford to main­tain any of them, not in terms of time, com­mit­ment, be­lief, at­ten­tion or will­ing­ness to spend on to­kens.

I did­n’t mean to build most of these things. Usually the Claude ses­sion started with some­thing like write a quick script for X, and one hour later the re­sult is not a quick script for X, nor in the usual case is my prob­lem solved, what­ever the orig­i­nal itch hap­pened to be.

at­ten­tion is all you need

On that last point, this tech­nol­ogy is hor­rific for at­ten­tion. It’s a ther­monu­clear ADHD am­pli­fier and I have seen the same ef­fect in every sin­gle one of my adult friends. Folk run­ning 3 screens si­mul­ta­ne­ously work­ing on to­tally un­re­lated projects” they have lit­tle hope of main­tain­ing, and such lit­tle com­mit­ment to the out­come that the time is ob­vi­ously wasted.

In re­cent times, at least once per month some­one sends a screen­shot for an awe­some tool they are work­ing on. I’m like whoa, that’s re­ally some­thing and the sender is ob­vi­ously proud and en­thu­si­as­tic. I try not to ask, but am al­ways think­ing and where will you mar­ket it?, be­cause when the ques­tion is asked of an en­gi­neer, the an­swer is un­changed since be­fore LLMs ex­isted.

I re­cently in­ter­viewed and when the topic of AI us­age came up, the host an­swered some­thing like oh we’re quite light on it, every­one has up to 5 rooms where they man­age their agents and I im­me­di­ately felt a tight­ness in my stom­ach.

I had a vague sense of the ef­fect a few months into us­ing Claude. Later I re­duced my sub­scrip­tion to Pro in the be­lief a quota re­stric­tion would mit­i­gate ex­ces­sive use. Then Claude went through a bad ser­vice pe­riod and I moved to Codex. Codex’s CLI is much nicer than Claude’s and no­tice­ably faster. And us­age started creep­ing back up.

The tech­nol­ogy, when honed, is gen­uinely amaz­ing. Ask it to zero shot a parser for an es­o­teric gram­mar im­ple­mented in an es­o­teric lan­guage with full tests and it’s done. The tool­ing as it ex­ists to­day pro­motes ab­solutely noth­ing like the fo­cus re­quired to ap­ply it ju­di­ciously.

Almost every ven­dor and every tool in­tends to do ex­actly the op­po­site: more us­age, more to­kens, more out­put. Ask a sim­ple yes/​no ques­tion of ChatGPT and you can clearly see that it is hard-wired to in­clude a rel­e­vant fol­low-up ques­tion to pro­mote ex­ces­sive in­ter­ac­tion.

Slopping out a 10,000 LOC untested Python/JS mess in 5 min­utes helps no­body. The thought of this hap­pen­ing in every com­mer­cial en­vi­ron­ment si­mul­ta­ne­ously is hor­ri­fy­ing.

fric­tion = fo­cus, fo­cus = prod­uct

One of my early AI ex­per­i­ments, ex­plor­ing AI as a lens in Marshall McLuhan-like think­ing, was to con­nect speech recog­ni­tion to a pipeline that gen­er­ated blog posts on the other side, in the be­lief it would en­cour­age me to cap­ture my thoughts. All I needed was to press the voice note but­ton in a Telegram chan­nel, and out pops an Opus-formatted post.

The out­put was un­bri­dled garbage. Because the ef­fort was re­moved, so was the com­mit­ment, and with the com­mit­ment the fo­cus, and with the fo­cus any mean­ing­ful prod­uct at all. Quality writ­ing is not con­ver­sa­tional English sim­ply cast through a lens: con­ver­sa­tional English is low-bit rate noise, qual­ity writ­ing at­tempts to cap­ture high bit rate in­for­ma­tion with bet­ter formed con­cepts, and this should have been ob­vi­ous be­fore I be­gan.

I looked at re­pur­pos­ing the pipeline to cap­ture pri­vate notes, but I have no need for pri­vate notes. It sub­verts the nat­ural process of noise be­ing for­got­ten. It is just more ex­cess tool use.

Following from this, for as long as qual­ity mat­ters, I be­lieve hand­writ­ing can never be ob­so­lete.

It feels like we’re head­ing to­wards cri­sis, and I doubt the an­swer is better mod­els” or better tool­ing”. Cal Newport re­lates this to pseudo-pro­duc­tiv­ity:

The speaker ar­gues that dig­i­tal pro­duc­tiv­ity tools, in­clud­ing AI and email, of­ten cre­ate a digital pro­duc­tiv­ity para­dox”: they make in­di­vid­ual tasks faster or eas­ier, but they can leave knowl­edge work­ers busier, more dis­tracted, and less pro­duc­tive over­all. He cites re­search show­ing that AI users spent much more time in email, mes­sag­ing, chat, and busi­ness-man­age­ment tools, while spend­ing less time in fo­cused, un­in­ter­rupted work. His cen­tral claim is that tools de­signed to re­duce fric­tion of­ten in­crease the vol­ume of shal­low tasks and con­text switch­ing, which weak­ens deep work and high-value out­put.

He ex­plains that this hap­pens be­cause knowl­edge work of­ten re­lies on pseudo pro­duc­tiv­ity,” where vis­i­ble busy­ness is treated as a proxy for real value. Digital tools re­in­force this by mak­ing peo­ple look ac­tive: send­ing more mes­sages, pro­duc­ing more drafts, at­tend­ing more meet­ings, and gen­er­at­ing more work ar­ti­facts. To avoid the trap, he rec­om­mends mea­sur­ing real out­comes, iden­ti­fy­ing the true bot­tle­necks in one’s work, and sep­a­rat­ing deep work from shal­low work so that dig­i­tal tools sup­port mean­ing­ful progress in­stead of con­sum­ing at­ten­tion.

🤖

The speaker ar­gues that dig­i­tal pro­duc­tiv­ity tools, in­clud­ing AI and email, of­ten cre­ate a digital pro­duc­tiv­ity para­dox”: they make in­di­vid­ual tasks faster or eas­ier, but they can leave knowl­edge work­ers busier, more dis­tracted, and less pro­duc­tive over­all. He cites re­search show­ing that AI users spent much more time in email, mes­sag­ing, chat, and busi­ness-man­age­ment tools, while spend­ing less time in fo­cused, un­in­ter­rupted work. His cen­tral claim is that tools de­signed to re­duce fric­tion of­ten in­crease the vol­ume of shal­low tasks and con­text switch­ing, which weak­ens deep work and high-value out­put.

He ex­plains that this hap­pens be­cause knowl­edge work of­ten re­lies on pseudo pro­duc­tiv­ity,” where vis­i­ble busy­ness is treated as a proxy for real value. Digital tools re­in­force this by mak­ing peo­ple look ac­tive: send­ing more mes­sages, pro­duc­ing more drafts, at­tend­ing more meet­ings, and gen­er­at­ing more work ar­ti­facts. To avoid the trap, he rec­om­mends mea­sur­ing real out­comes, iden­ti­fy­ing the true bot­tle­necks in one’s work, and sep­a­rat­ing deep work from shal­low work so that dig­i­tal tools sup­port mean­ing­ful progress in­stead of con­sum­ing at­ten­tion.

🤖

These ex­pe­ri­ences have opened a new per­cep­tion of all tool use, be­cause be­neath it all this is not about faster de­vel­op­ment = more apps or faster email = more com­mu­ni­ca­tion be­ing a de­sir­able goal. Generically, it’s about a unit time of life and how it is spent mean­ing­fully.

I have no idea how to man­age AI at pre­sent ex­cept by cur­tail­ing use, be­cause a tool pro­duc­ing a cheap re­ward with min­i­mal in­put and no fric­tion can only be a li­a­bil­ity, and achiev­ing that re­al­i­sa­tion is prob­a­bly the only real con­tri­bu­tion of AI to date.

David, Sun 31 May 14:31:04 2026

Introducing 1-bit and Ternary Bonsai Image 4B: Image Generation for Local Devices

prismml.com

Today we’re re­leas­ing Bonsai Image 4B, a fam­ily of com­pact im­age-gen­er­a­tion mod­els de­signed to run high-qual­ity dif­fu­sion in­fer­ence on lo­cal hard­ware: from lap­tops to phones.

Bonsai Image 4B comes in two vari­ants:

1-bit Bonsai Image 4B uses bi­nary {−1, +1} trans­former weights with an FP16 group-wise scal­ing fac­tor, giv­ing 1.125 ef­fec­tive bits per weight. It tar­gets max­i­mum com­pres­sion and is the right fit when mem­ory pres­sure, band­width, and the de­ploy­ment foot­print are the pri­mary con­straints.

Ternary Bonsai Image 4B uses {−1, 0, +1} trans­former weights with an FP16 group-wise scal­ing fac­tor, giv­ing 1.71 ef­fec­tive bits per weight. The ad­di­tional zero state gives the model more rep­re­sen­ta­tional flex­i­bil­ity, im­prov­ing vi­sual qual­ity and prompt fi­delity while re­main­ing ex­tremely com­pact.

The re­sult is a new de­ploy­ment regime for im­age gen­er­a­tion: ca­pa­ble out­puts, open weights, and prac­ti­cal lo­cal in­fer­ence on de­vices that were pre­vi­ously out of reach for this class of model. To our knowl­edge, Bonsai Image 4B is the first im­age model in its pa­ra­me­ter class to run di­rectly on an iPhone.

Built for lo­cal gen­er­a­tion

Local im­age gen­er­a­tion starts with a hard con­straint: the model has to fit within the de­vice’s mem­ory bud­get.

For a 4B-class im­age model, the dif­fu­sion trans­former is the largest part of the model and the part that runs re­peat­edly dur­ing gen­er­a­tion. Each de­nois­ing step in­vokes the trans­former again, so trans­former size di­rectly shapes mem­ory pres­sure, band­width de­mand, and lo­cal in­fer­ence speed.

Bonsai Image 4B is built from the FLUX.2 Klein 4B. It keeps the ar­chi­tec­ture in­tact but changes how the trans­former weights are rep­re­sented. By mov­ing those weights into bi­nary and ternary form, Bonsai re­duces the part of the im­age pipeline that mat­ters most for lo­cal de­ploy­ment.

Table I: Diffusion trans­former foot­print for mod­els.

The bi­nary lay­ers pro­vide roughly a 14x re­duc­tion rel­a­tive to full-pre­ci­sion trans­former weights. A small set of pre­ci­sion-sen­si­tive sup­port­ing ten­sors (~5%), called the pro­jec­tion lay­ers, re­mains in FP16 so the fi­nal 1-bit Bonsai Image 4B trans­former is 0.93 GB: an 8.3x re­duc­tion from the 7.75 GB full-pre­ci­sion FLUX.2 Klein 4B.

The ternary vari­ant fol­lows the same struc­ture. Its ternary lay­ers pro­vide roughly a 10x re­duc­tion and the fi­nal Ternary Bonsai Image 4B trans­former is 1.21 GB, a 6.4x re­duc­tion from the full-pre­ci­sion trans­former. It is slightly larger than the 1-bit model, but the ad­di­tional zero state im­proves vi­sual qual­ity and prompt fi­delity.

Including the com­pressed text en­coder and FP16 VAE, the Apple Silicon de­ploy­ment pay­load is 3.42 GB for 1-bit Bonsai Image 4B and 3.88 GB for Ternary Bonsai Image 4B. For com­par­i­son, the full pre­ci­sion FLUX.2 Klein 4B re­quires a de­ploy­ment pay­load of 15.97 GB. Since, at run­time, the text en­coder is of­floaded af­ter prompt en­cod­ing, the mean mem­ory us­age is smaller than the to­tal pay­load. When gen­er­at­ing a 512x512 im­age, the mean-ac­tive mem­ory is 1.5 GB and 1.96 GB, for the bi­nary and ternary mod­els, com­pared to 11.74 GB for the orig­i­nal FLUX.2 Klein 4B (a re­duc­tion of 7.8x and 6.0x, re­spec­tively). For a 1024x1024 im­age, the mean-ac­tive mem­ory is 1.95 GB and 2.38 GB, for the bi­nary and ternary mod­els, com­pared to 14.39 GB for the orig­i­nal FLUX.2 Klein 4B (a re­duc­tion of 7.4x and 6.0x, re­spec­tively).

This re­duc­tion in mem­ory foot­print changes where the model can run. Our de­ploy­ment stack sup­ports Apple Silicon iPhones, iPads and Macs and CUDA GPUs, us­ing MLX low-bit paths on Apple hard­ware and Gemlite low-bit GEMM ker­nels on CUDA. On iPhone 17 Pro Max, the full-pre­ci­sion FLUX.2 Klein 4B pipeline does not fit within the de­vice mem­ory bud­get, while both Bonsai Image vari­ants run on-de­vice.

Video I: Image gen­er­a­tion on Bonsai Studio

In prac­tice, Bonsai Image 4B gen­er­ates a 512x512 im­age in 9.4 sec­onds on an iPhone 17 Pro Max and about 6 sec­onds on Mac M4 Pro. On Mac M4 Pro, Bonsai Image 4B is up to 5.6x faster than the stock full-pre­ci­sion MFLUX pipeline.

Benchmarking per­for­mance

Compression only mat­ters if the model re­mains use­ful. We eval­u­ated Bonsai Image 4B across three com­ple­men­tary bench­marks: GenEval for ob­ject com­po­si­tion and at­tribute bind­ing; HPSv3 hu­man pref­er­ence and aes­thetic qual­ity; DPG-Bench dense prompt fol­low­ing and se­man­tic faith­ful­ness.

Table II: Image qual­ity bench­mark com­par­i­son across Ternary Bonsai Image 4B and other mod­els.

Ternary Bonsai Image 4B is the qual­ity-ori­ented vari­ant. At 1.21 GB, it re­tains 95% of the FLUX.2 Klein 4B ac­cu­racy across GenEval, HPSv3, and DPG-Bench, while re­duc­ing the dif­fu­sion trans­former foot­print by 6.4x.

1-bit Bonsai Image 4B is the foot­print-ori­ented vari­ant. It brings the dif­fu­sion trans­former be­low 1 GB, an 8.3x re­duc­tion, while still de­liv­er­ing strong bench­mark scores across the same three eval­u­a­tions (it re­tains 88% of the ac­cu­racy of FLUX.2 Klein 4B).

Together, the two vari­ants move the qual­ity–foot­print fron­tier. Bonsai Image re­mains com­pet­i­tive with mod­ern 4B-class im­age mod­els while us­ing a frac­tion of their dif­fu­sion-trans­former foot­print. At the same time, it sub­stan­tially out­per­forms smaller mod­els with sim­i­lar mem­ory foot­prints. That is the same Pareto shift we have seen in our prior Bonsai lan­guage mod­els. Bonsai Image brings mod­ern dif­fu­sion-trans­former be­hav­ior into a mem­ory range that pre­vi­ously be­longed to much smaller, lower-ca­pa­bil­ity mod­els.

Why this is im­por­tant

Image gen­er­a­tion is not only a model-qual­ity prob­lem. It is also a de­ploy­ment prob­lem.

Cloud APIs will con­tinue to be the right choice for many prod­ucts. But cloud-only gen­er­a­tion im­poses cer­tain prod­uct con­straints: every prompt is a re­mote re­quest, every it­er­a­tion car­ries mar­ginal serv­ing cost, and every in­ter­ac­tion adds round-trip la­tency.

That mat­ters be­cause im­age gen­er­a­tion is nat­u­rally it­er­a­tive. Users rarely stop at one im­age. They re­vise prompts, com­pare out­puts, gen­er­ate vari­a­tions, dis­card fail­ures, and try again. When each at­tempt is a server-side job, the cre­ative loop be­comes some­thing users have to me­ter and wait for.

Local in­fer­ence changes that. Once the model fits on the de­vice, gen­er­a­tion can sit di­rectly in­side the prod­uct ex­pe­ri­ence. It be­comes cheaper to run, faster to it­er­ate on, and eas­ier to use in en­vi­ron­ments where prompts, and gen­er­ated as­sets should re­main pri­vate.

Bonsai Image 4B is a step to­ward that de­ploy­ment regime: ca­pa­ble im­age gen­er­a­tion run­ning closer to the user, on hard­ware they al­ready own.

Availability

Both 1-bit and Ternary Bonsai Image 4B will be re­leased with open weights and code un­der the Apache 2.0 li­cense.

With this launch, we are also launch­ing Bonsai Studio, its iOS app for try­ing Bonsai Image 4B di­rectly on iPhone.

Join Us

PrismML emerged from a team of Caltech re­searchers and was founded with sup­port from Khosla Ventures, Cerberus and Google. We’ve spent years tack­ling one of the field’s hard­est prob­lems: com­press­ing neural net­works with­out sac­ri­fic­ing their rea­son­ing abil­ity.

If you want to help build the next gen­er­a­tion of state-of-the-art AI, we’d love to hear from you. Check out our ca­reers page.

Resources

Whitepaper

Hugging Face

WebGPU demo

Bonsai Studio for iPhone

GitHub

I Put a Datacenter GPU in My Gaming PC for £200

blog.tymscar.com

I al­ready had an RTX 4080. 16GB of VRAM. Good enough for gam­ing, not good enough for the mod­els I wanted to run lo­cally. The next step up in GPU land is ei­ther spend a for­tune on a card with more VRAM, or find an­other way.

I found an­other way.

I bought a dat­a­cen­ter GPU that does­n’t even have a nor­mal PCIe con­nec­tor, stuck it in my gam­ing PC with an adapter, and now I have 32GB of VRAM across two GPUs run­ning a 27 bil­lion pa­ra­me­ter model at 32 to­kens per sec­ond. The whole thing cost me £200.

The GPU#

This is a Tesla V100 SXM2 16GB. It was de­signed for NVIDIAs DGX servers and hy­per­scaler racks. The SXM2 form fac­tor means it does not have a PCIe slot. It does not have dis­play out­puts. It does not have a nor­mal power con­nec­tor. It sits on a pro­pri­etary board in­side a server rack and com­mu­ni­cates over NVLink.

You can­not plug this into a moth­er­board. Not with­out help.

But here is the thing: this is a Volta GPU with 16GB of HBM2 mem­ory, 5120 CUDA cores, and I picked it up for about £150 on eBay. The com­pute is still real. The VRAM is still real. And the mem­ory band­width is where it gets gen­uinely sur­pris­ing.

HBM2 is a dif­fer­ent class of mem­ory. The V100 has a 4096-bit mem­ory bus de­liv­er­ing 900 GB/s of band­width. To put that in per­spec­tive, my RTX 4080 with its fancy GDDR6X man­ages 736 GB/s. The V100 from 2017 has 22% more mem­ory band­width than a GPU that launched in 2022.

And it is not just NVIDIAs con­sumer cards that lose. Apple’s M3 Max does 400 GB/s. The M4 Max does 546 GB/s. The brand new M5 Max, which will set you back over £3,000 for a lap­top, man­ages 614 GB/s. A GPU from 2017 beats every Mac on the mar­ket.

The clos­est AMD com­pe­ti­tion to my 4080 is the RX 7900 XTX, which does 960 GB/s on its 24GB of GDDR6. Technically that edges out the V100, but the 7900 XTX costs £700+ and ROCm sup­port for LLM in­fer­ence is still rough com­pared to CUDA. The V100 gives you 94% of that band­width for less than a quar­ter of the price, and it just works with llama.cpp.

The only con­sumer GPU that com­fort­ably beats it is the RTX 5090 at 1,792 GB/s, and that card costs over £2,000. For LLM in­fer­ence, where mem­ory band­width is the bot­tle­neck that de­ter­mines your to­kens per sec­ond, this mat­ters more than al­most any­thing else.

The only prob­lem is the con­nec­tor.

The adapter#

Turns out, some­one makes an SXM2-to-PCIe adapter. It is not made by NVIDIA. It is not of­fi­cially sup­ported by any­one. It is a bare PCB with the SXM2 socket on one side and a PCIe edge con­nec­tor on the other. I paid about £50 for it. Half of that might just be the cop­per.

So for about £200 to­tal, I had a 16GB VRAM GPU that could slot into my moth­er­board along­side my RTX 4080. That is 32GB of to­tal VRAM. A sin­gle RTX 5090 with 32GB costs over £2,000. I am not say­ing this is the same ex­pe­ri­ence. I am say­ing the VRAM is the same.

The fan from hell#

Before I could do any­thing use­ful with the V100, I had to deal with the fan.

The V100 SXM2 was de­signed to live in­side a 2U server with in­dus­trial cool­ing. The fan on the adapter is not sub­tle. It is not quiet. It is not some­thing you want in a room you also sleep in.

I mea­sured it with my Apple Watch:

82 deci­bels. That is some­where be­tween a garbage dis­posal and a lawn­mower, well past loud PC and into should I be wear­ing earplugs in my own house” ter­ri­tory.

And the worst part: you can­not con­trol it. I tried nvidia-smi, I tried scan­ning for it on Linux, I even tried Afterburner on Windows (more on that later, the whole setup barely works on Windows). Nothing. The fan on this adapter is not de­signed to be con­trolled. It is de­signed to run at 100%, for­ever, in­side a server rack where no­body has to hear it.

Here is me try­ing to fig­ure out the fan pinout. I guessed it might be a stan­dard case fan pinout on a weird con­nec­tor, so I jammed two jumper wires into VCC and ground and prod­ded a 9V bat­tery against them. It spun. And it was so much qui­eter than the 12V it nor­mally gets:

That con­firmed the pinout and gave me hope that the fan could ac­tu­ally be tamed.

Making the fan lis­ten to rea­son#

The 9V bat­tery test told me the pinout was stan­dard case fan ter­ri­tory, just with a weird con­nec­tor. The next ques­tion was whether the fan would ac­tu­ally re­spond to PWM con­trol if I wired the tachome­ter and PWM pins to my moth­er­board.

So I shoved some jumper wires into the con­nec­tor and jammed the other ends into a spare fan header (turn your vol­ume up):

It works. The moth­er­board can read the RPM and the fan re­sponds to PWM. I keep it at 10%. It never goes above 50C even at full load, and I can­not re­ally hear it.

Now I just needed a proper ca­ble in­stead of jumper wires held in by hope.

The fan con­nec­tor on the adapter is a small JST PH2.0 plug with four pins. Motherboard fan head­ers use a stan­dard 0.1 inch (2.54mm) pitch. The GPU fan uses a 2.0mm JST PH con­nec­tor. The pins are closer to­gether and the plug is smaller.

The so­lu­tion was a 2.54mm male to PH2.0 fe­male jumper ca­ble. The fe­male PH2.0 end plugs into the fan’s tachome­ter and PWM pins, and the male 2.54mm end goes into a spare fan header on the moth­er­board:

That went from 82dB ear dam­age to some­thing I can ac­tu­ally live with.

Doubling VRAM for cheap#

With the fan sit­u­a­tion han­dled, the V100 slot­ted right in along­side my 4080:

RTX 4080: 16GB VRAM, Ada ar­chi­tec­ture

Tesla V100: 16GB VRAM, Volta ar­chi­tec­ture

Total: 32GB VRAM across two GPUs

llama.cpp can split the model across both GPUs us­ing ten­sor split­ting. It pipelines the lay­ers across the PCIe bus so the 4080 han­dles some lay­ers and the V100 han­dles the rest. It is not as fast as hav­ing a sin­gle GPU with 32GB, but it works, and it cost me roughly 10% of what a 32GB GPU would cost. For what it is worth, the most I have ever seen the V100 pull is around 150W. That is not noth­ing, but it is not out of this world for a GPU run­ning lo­cal LLM in­fer­ence.

But wait, you can go big­ger#

The V100 also comes in a 32GB vari­ant. It costs more than dou­ble what I paid, but we are still talk­ing about a few hun­dred pounds for 32GB of HBM2 mem­ory on a sin­gle card. Two of those would give you 64GB of VRAM for roughly 20% of what an RTX 5090 costs in to­day’s mar­ket.

You can also clus­ter them. The SXM2 for­mat sup­ports NVLink na­tively, which means if you are build­ing a proper multi-GPU setup, these cards can talk to each other at very high band­width. Even through the PCIe adapter, the ten­sor split per­for­mance is solid.

The soft­ware side#

This part was sur­pris­ingly smooth thanks to NixOS. The V100 is a Volta chip. NVIDIA dropped Volta sup­port start­ing with dri­ver branch 560. The last dri­ver that sup­ports both my RTX 4080 (Ada) and the V100 (Volta) is branch 550.x, which maps to nvidi­a­Pack­ages.lega­cy_535 on NixOS.

That dri­ver only sup­ports CUDA up to 12.2. Current nix­p­kgs ships CUDA 12.6 min­i­mum. So I had to pull CUDA 12.2 from nix­p­kgs 24.05.

Also, the dri­ver re­quires ker­nel 6.6. Newer ker­nels are not sup­ported with the legacy dri­ver.

And here is a weird one: even though this is a head­less in­fer­ence server, ser­vices.xserver.en­able = true is re­quired. Without it, the NVIDIA ker­nel mod­ules do not load.

NixOS made most of this straight­for­ward. Here is the key con­fig­u­ra­tion for get­ting the dri­ver and ker­nel right:

boot.ker­nel­Pack­ages = pkgs.lin­ux­Pack­ages_6_6; hard­ware.nvidia.pack­age = con­fig.boot.ker­nel­Pack­ages.nvidi­a­Pack­ages.lega­cy_535; ser­vices.xserver.en­able = true; ser­vices.xserver.video­Drivers = [ nvidia” ];

And for load­ing CUDA 12.2 from an older nix­p­kgs since the cur­rent one only ships 12.6+:

nix­p­kgs.over­lays = [ (final: prev: { cu­d­a­Pack­ages_12_2 = nix­p­kgs-cuda.lega­cy­Pack­ages.${prev.sys­tem}.cu­d­a­Pack­ages_12_2; }) ];

The im­por­tant thing is: it works. Both GPUs show up, CUDA is func­tional, and NixOS han­dled the whole thing el­e­gantly. If you want to repli­cate this, the en­tire ma­chine de­f­i­n­i­tion is in this com­mit on my dot­files repo, in­clud­ing the llama.cpp ser­vice de­f­i­n­i­tion and the cus­tom build pinned to the right ver­sion.

Running the model#

I am run­ning Qwen3.6 – 27B-MTP quan­tized at Q5_K_M, which comes in at about 19GB. With both GPUs, the en­tire model fits in VRAM with room for con­text:

And the per­for­mance:

32 to­kens per sec­ond is fast enough for in­ter­ac­tive use. It is faster than most cloud API end­points when you fac­tor in net­work la­tency. And this is with ten­sor split­ting across two dif­fer­ent GPU ar­chi­tec­tures con­nected by PCIe.

This model is ac­tu­ally good#

I want to be clear about some­thing. This is not good for a lo­cal model.” This is not acceptable if you lower your ex­pec­ta­tions.” Qwen3.6 – 27B ties with Claude Sonnet 4.6 on Artificial Analysis’s Agentic Index. It beats Sonnet 4.6 on MMMU-Pro and Terminal-Bench 2.0. A 27 bil­lion pa­ra­me­ter model run­ning on sec­ond­hand hard­ware is gen­uinely com­pet­i­tive with the lat­est cloud mod­els from Anthropic.

Yes, Sonnet 4.6 edges it out on GPQA and SWE-Bench Verified. It should, it is a mas­sive pro­pri­etary model. And yes, if you want the ab­solute best, Opus 4.8 ex­ists. It also costs more per 20 min­utes of heavy use than I paid for this en­tire GPU and adapter setup com­bined. But the gap is shock­ingly small. We have reached the point where the model you run in your bed­room is in the same con­ver­sa­tion as the ones that charge you per to­ken.

Multi-Token Prediction#

The MTP in the model name stands for Multi-Token Prediction. Normal LLM in­fer­ence pre­dicts one to­ken at a time. Predict one to­ken, ac­cept it, pre­dict the next to­ken, re­peat. MTP changes this by hav­ing the model pre­dict sev­eral fu­ture to­kens at once, then ver­i­fy­ing which ones were cor­rect. Accepted to­kens are es­sen­tially free. Wrong pre­dic­tions fall back to the nor­mal path.

The re­sult is roughly 1.5 – 2x faster gen­er­a­tion with no ac­cu­racy loss. On my setup that means in­fer­ence goes from around 32 tok/​s to po­ten­tially 50 – 60 tok/​s when MTP hits its stride, es­pe­cially on pre­dictable out­put like code.

The catch is that MTP sup­port in llama.cpp is new. The ver­sion in nix­p­kgs does not sup­port the Qwen3.6 MTP ar­chi­tec­ture, so I had to build llama.cpp from source at a spe­cific com­mit that added sup­port. On NixOS this is pain­less. I have a cus­tom de­riva­tion pinned to the right com­mit, and the whole thing is re­pro­ducible. When I want to up­date the model or change the llama.cpp ver­sion, I change one line in my con­fig, run nixos-re­build switch, and I am done. No de­pen­dency hell, no re­in­stalling by hand, no won­der­ing whether I built against the right CUDA ver­sion.

Vision: how the model sees im­ages#

The Qwen3.6 – 27B model sup­ports im­age in­put through a sep­a­rate mul­ti­modal pro­jec­tor file (mmproj). This is about 928MB ex­tra, and it is fas­ci­nat­ing.

The way it works is that a vi­sion en­coder (similar to what ChatGPT and Claude use) takes im­age pix­els and trans­lates them into the LLMs to­ken em­bed­ding space. The model does not see” the im­age the way a hu­man does. Instead, the vi­sion en­coder com­presses the im­age into a se­quence of vec­tors that live in the same math­e­mat­i­cal space as text to­kens. The LLM then processes those vec­tors as if they were just an­other se­quence of to­kens.

What this means in prac­tice: you send the model an im­age URL along­side your text prompt, and it can de­scribe, an­a­lyze, and rea­son about what it sees. The en­tire vi­sion ca­pa­bil­ity adds about 1GB to the model size. That is it. One gi­ga­byte and your lo­cal LLM can read im­ages.

In llama.cpp, the flags are straight­for­ward:

–mmproj /mnt/nas/llamacpp/mmproj-F16.gguf –mmproj-offload

The –mmproj-offload flag loads the vi­sion en­coder onto GPU along­side the model, so you still get fast in­fer­ence even with im­ages.

Running it through OpenCode#

I use this setup with OpenCode, which is an AI cod­ing as­sis­tant that can run against lo­cal mod­els. The LLM server runs on my desk­top, but I do not use it from that ma­chine. I use it from any other ma­chine in my house over the net­work, or from out­side over Tailscale (but that is a blog post for an­other time). Pointing OpenCode at the llama.cpp server is as sim­ple as set­ting the API URL. The model runs lo­cally, the re­sponses are fast, and noth­ing leaves my net­work.

The NAS and the USB drive#

All the mod­els live on my TrueNAS server, mounted via NFS:

fileSys­tems.“/​mnt/​nas” = { de­vice = truenas-nfs.tymscar.com:/mnt/oasis/services”; fsType = nfs”; op­tions = [ nfsvers=4″ _netdev” auto” nofail” ]; };

The llama.cpp ser­vice de­pends on mnt-nas.mount, so it does not start un­til the NAS is avail­able. This means I can store ter­abytes of mod­els with­out wor­ry­ing about lo­cal disk space.

The en­tire OS runs from a Corsair MP600 MINI in a DockCase USB-C NVMe en­clo­sure. No in­ter­nal drive mod­i­fi­ca­tion needed. When I want to game, I un­plug the drive and re­boot into my main Windows in­stall, and game nor­mally on the 4080. When I want to do LLM stuff, I plug the drive back in, re­boot into NixOS, and both GPUs are avail­able.

This is not as el­e­gant as a dual-boot menu, but it is sim­ple and it works. No GRUB, no boot­loader con­flicts, no par­ti­tion man­age­ment. Just a phys­i­cal switch.

The one an­noy­ing thing#

The V100 oc­ca­sion­ally dis­ap­pears from lspci and nvidia-smi af­ter a warm re­boot (where the OS restarts but the moth­er­board stays pow­ered). This seems to be an ACPI enu­mer­a­tion is­sue with the PCIe slot. A cold re­boot (physically power off, wait a few sec­onds, power back on) al­ways re­stores it.

When the V100 is ab­sent, llama.cpp fails to start be­cause it can­not fit the model on a sin­gle 16GB GPU. The ser­vice crash-loops un­til the GPU comes back. This is not a big deal in prac­tice since I am usu­ally around when I re­boot, but it is worth know­ing about. It gives me the same vibes as the in­fa­mous AMD GPU re­set bug, where pass­ing through an AMD GPU to a VM and then shut­ting it down leaves the GPU in a state that only a full host power cy­cle can fix.

What I ended up with#

For £200, I got:

A 16GB dat­a­cen­ter GPU run­ning along­side my gam­ing GPU

32GB to­tal VRAM for lo­cal LLM in­fer­ence

32 to­kens per sec­ond on a 27B pa­ra­me­ter model

128k to­ken con­text win­dow

Vision sup­port for im­age in­put

A model that runs com­pletely lo­cally, no cloud, no per-to­ken costs

The only real cost was the noise, and I solved that with £2 worth of jumper ca­bles and a bit of con­nec­tor spelunk­ing. The V100 is not the fastest GPU for in­fer­ence, and the ten­sor split across two dif­fer­ent ar­chi­tec­tures is not as clean as a sin­gle GPU. But for the price, it is ab­surdly good value.

If you want to run proper mod­els lo­cally, look at the sec­ond­hand server GPU mar­ket. You do not even need an ex­ist­ing GPU. I hap­pen to have a 4080 in my gam­ing PC, but a sin­gle V100 in a cheap server box would give you 16GB of VRAM and a per­fectly us­able lo­cal LLM for very lit­tle money. The V100 SXM2 is not the only op­tion. The P40 gives you 24GB for sim­i­lar money, though it is slower and has no Tensor Cores. The V100 32GB vari­ant costs more but still un­der­cuts any con­sumer GPU with that much VRAM.

Just be ready for the fan.

"Four-Letter Word": United Airlines 767 Returns To Newark After Bluetooth Name Sparks Alert

simpleflying.com

Published May 31, 2026, 6:26 AM EDT

Luke has over a decade of ex­pe­ri­ence as a travel writer and avi­a­tion an­a­lyst. As a pas­sion­ate trav­eler based across the Middle East and Asia, Luke of­fers strong in­sights into the in­dus­try. Based in South East Asia.

A United Airlines Boeing 767 – 400ER bound for Palma de Mallorca, Spain, made a mid-At­lantic U-turn af­ter a pas­sen­ger’s threat­en­ing Bluetooth speaker name trig­gered a se­cu­rity alert. Early re­ports in­di­cate that a teenage pas­sen­ger on­board named their de­vice BOMB,’ and the dis­cov­er­able name es­ca­lated quickly into a bomb-threat re­sponse.

Unlock Personalized Content & Exclusive Features

Join the com­mu­nity to dis­cuss trend­ing top­ics with top au­thors, per­son­al­ize your feed, and get fewer ads.

Log in or Create an Account For Free

*Required: 8 chars, 1 cap­i­tal let­ter, 1 num­ber

or

By cre­at­ing an ac­count, you agree to our Terms of Use and Privacy Policy. You also agree to re­ceive our newslet­ters; you can un­sub­scribe any time.

United Airlines Bluetooth Threat Incident

According to flight track­ing data, United Flight 236 from Newark Liberty International Airport (EWR) to Palma De Mallorca Airport (PMI) de­parted Newark at 6:08 PM lo­cal time, and was ap­prox­i­mately 60 min­utes into its transat­lantic jour­ney be­fore the se­cu­rity sit­u­a­tion es­ca­lated. A pas­sen­ger on the flight pro­vided more de­tails on Reddit, stat­ing that a flight at­ten­dant told pas­sen­gers over the PA sys­tem that they must turn off Bluetooth im­me­di­ately,” or else the air­craft would have to turn around.

Date

May 30, 2026

Airline

United Airlines

Flight Code

UA236

Aircraft Type

Boeing 767 – 400ER (N67052)

Departure Airport

Newark Liberty International Airport (EWR)

Destination Airport

Palma de Mallorca Airport (PMI)

Fate

Returned to EWR; pas­sen­gers boarded a re­place­ment flight

This was re­peated mul­ti­ple times, with the crew even­tu­ally is­su­ing a fi­nal one-minute warn­ing. However, not all pas­sen­gers com­plied with the in­struc­tions, as there were still two ac­tive Bluetooth de­vices af­ter the ul­ti­ma­tum was is­sued. The air­craft sub­se­quently squawked 7700 (the code for a gen­eral emer­gency) and turned around, land­ing back in EWR at 8:50 PM af­ter spend­ing al­most three hours in the air. Simple Flying con­tacted United for com­ment on this in­ci­dent, but a rep­re­sen­ta­tive could not be reached be­fore pub­li­ca­tion.

Bluetooth Speaker Name Set To BOMB

As per record­ings from LiveATC.net, a mem­ber of United’s ground team said that the Bluetooth speaker name had been set to a four-letter word,” later re­ported by AirLive as BOMB.’ Passengers on the flight were re­port­edly told that up to ten agents” would be wait­ing for the air­craft in Newark to de­ter­mine the ori­gin of the threat.

Those on­board were also in­structed to leave all their be­long­ings on the air­craft be­fore de­plan­ing. Saturday’s in­ci­dent has par­al­lels with an­other se­cu­rity scare that oc­curred on a United flight ear­lier this month. During this in­ci­dent, a Wi-Fi hotspot named Free Palestine, F Zionists” prompted the pi­lot to is­sue a warn­ing to the cabin, telling the pas­sen­ger re­spon­si­ble that they had 30 sec­onds” to re­move the name or the FBI would meet the air­craft.

Additionally, in April, two United flights were evac­u­ated on back-to-back days due to bomb threats, demon­strat­ing how se­ri­ously these in­ci­dents are taken. Though some have ques­tioned why any­one in­tend­ing to blow up a plane would broad­cast the word bomb, many ter­ror­ist acts have re­lied on the threat of a bomb as lever­age dur­ing at­tempted hi­jack­ings or hostage sit­u­a­tions.

Related

Passengers Board Replacement Flight

Passengers on the flight ar­rived back in Newark just be­fore 9:00 PM on Saturday evening, and were met by a sig­nif­i­cant con­tin­gent of lo­cal and fed­eral law en­force­ment. They were asked to take only their pass­ports and phones with them, leav­ing their cabin bags on the air­craft. After spend­ing sev­eral hours on the ground as se­cu­rity teams com­pleted their sweep, trav­el­ers would even­tu­ally de­part Newark on a re­place­ment flight in the early hours.

The re­place­ment flight was op­er­ated by the same air­craft, a Boeing 767 – 400ER (registration N67052), but would not take off un­til around 02:30 AM the next day. At the time of pub­li­ca­tion, the flight is cur­rently over the Atlantic and is ex­pected to land in Palma de Mallorca in the af­ter­noon lo­cal time. Before pas­sen­gers could board this flight, they were re­quired to pass through TSA se­cu­rity for a sec­ond time.

To add this web app to your iOS home screen tap the share button and select "Add to the Home Screen".

10HN is also available as an iOS App

If you visit 10HN only rarely, check out the the best articles from the past week.

Visit pancik.com for more.