10 interesting stories served every morning and every evening.

Valve releases Steam Controller CAD files under Creative Commons license

www.digitalfoundry.net

Modders, start your engines.

by William Judd

Yesterday, 10:29am

With the rather ex­cel­lent Steam Controller now on its way to the lucky few that man­aged to or­der one, Valve has re­leased a full set of CAD files for their new hard­ware. The idea is to let en­ter­pris­ing mod­ders cre­ate their own Steam Controller add-ons, like skins, charg­ing stands, grip ex­ten­ders or smart­phone mounts.

The Valve release includes files for the external shell (“surface topology”) of the Controller and Puck: a .STP file, a .STL file and an engineering diagram for each device, the latter showing areas that must remain uncovered to let the device maintain its signal strength and otherwise function as designed.

Valve has pre­vi­ously re­leased CAD files for its Steam Deck hand­held, Valve Index VR suite and even the orig­i­nal Steam Controller a decade ago, so this re­lease is wel­comed but not un­ex­pected.

The re­lease is un­der a fairly re­stric­tive Creative Commons li­cense which al­lows for non-com­mer­cial use and re­quires at­tri­bu­tion and shar­ing of de­signs back to the com­mu­nity. However, the li­cense also sug­gests that com­mer­cial en­ti­ties in­ter­ested in mak­ing ac­ces­sories for the Steam Controller or its Puck can con­tact Valve di­rectly to dis­cuss terms.

What is your ul­ti­mate Steam Controller or Steam Controller Puck ac­ces­sory? Let us know in the com­ments be­low. For me, it would def­i­nitely be a smart­phone clip - play­ing through some­thing rel­a­tively low-stakes like Forza Horizon 6 via Moonlight game stream­ing on a phone would be slick.

[source steamcommunity.com]

Will is web­site ed­i­tor for Digital Foundry, spe­cial­is­ing in PC hard­ware, sim rac­ing and dis­play tech­nol­ogy.


Appearing Productive in The Workplace — No One's Happy

nooneshappy.com

Parkinson’s Law states that work expands to fill the time available. In the era of AI, workers now have a tool that expands to fill whatever a large language model can be persuaded to generate, which is to say, without limit.

What I have watched hap­pen in my pro­fes­sion in the last two years, I am still strug­gling to de­scribe. The first time I knew some­thing was wrong, roughly a year and a quar­ter ago, I no­ticed a col­league re­ply­ing to me us­ing AI. His re­sponse was ob­vi­ously gen­er­ated by Claude. The punc­tu­a­tion gave it away — em dashes where no one types em dashes, the rhyth­mic struc­ture, the con­fi­dent grasp of tech­nolo­gies I knew for a fact he did not un­der­stand. I sat with it for a while, weigh­ing whether to de­bate some­one who was vis­i­bly copy-past­ing ver­ba­tim from a model. The chan­nel was pub­lic, and I spent more time than I should have cor­rect­ing fun­da­men­tals. Eventually I stopped. He was not, in any mean­ing­ful sense, on the other side of the con­ver­sa­tion.

Generative AI can produce work that looks expert without being expert, and the failure arrives in two shapes. The first is when novices in a field produce work that resembles what their seniors produce, faster and more advanced than their own judgment can keep up with. The second is when people generate artifacts in disciplines they were never trained in. The two failures look similar from a distance but are not the same. Research has mostly measured the first. The second is what it is missing, and in my experience it is the riskier of the two.

Cross-domain generation

People who cannot write code are building software. People who have never designed a data system are designing data systems. Most of it is not shipped; it is built, often over many hours, shown internally with great vigor, used quietly, and occasionally surfaced to a client without much fanfare. Workers can obsess over an idea, putting in hours of overtime. There are a few practitioners who use the current agentic tools to do complex things properly, but they are scarce and, in my experience, typically working in code generation. AI, for all its capabilities at the level of the individual, has not scaled properly in my workplace.

I have a colleague, a careful and intelligent person in a role that is not engineering, who spent two months earlier this year building a system that should have been designed by someone with formal training in data architecture. He used the tools well, by the standards by which use of the tools is currently measured. He produced a great deal of code, a great deal of documentation, a great deal of what looked, to anyone who did not know what to look for, like progress. He could not, when asked, explain how any of it actually worked. The work was wrong from the first day. The schemas, and more importantly the objectives, were wrong in a way that would have been obvious to anyone with two years in the field. Several of us did know. When objections were raised, even by a V.P., he fought back. The room had been arranged in such a way that saying so was not a contribution; his managers were too invested in the appearance of momentum to want the appearance disturbed. The work will continue, in all probability, until it is shown to a stakeholder and they decline to invest.

This is the part of the phe­nom­e­non I find hard­est to write about. The tool did not make him a worse col­league. It made him able to im­per­son­ate, for months, a dis­ci­pline he had never trained in, and the im­per­son­ation was good enough that the in­sti­tu­tional in­cen­tives all bent to­ward let­ting him con­tinue. Perhaps it’s a fail­ure of man­age­ment, but I have been find­ing man­age­ment to be so ea­ger to em­brace AI that they’re will­ing to ac­cept the risk.

It would be tolerable, perhaps, if the tool offered an honest assessment of what it had produced. The Cheng et al. Stanford study published in Science this spring [1] confirmed what every regular user already knew: leading models are roughly fifty percent more agreeable than human respondents, affirming the user even where the affirmation is unwarranted. Berkeley CMR meta-analyses [4] found that AI-literate users often overestimate their performance, which is particularly interesting when workers stray outside their training. An NBER study of support agents [2] found generative AI boosted novice productivity by about a third while barely helping experts. Harvard Business School researchers found the same pattern in consulting work [3]. So you have overconfident novices able to improve their individual productivity in an area of expertise they are unable to review for correctness. What could go wrong?

The conduit problem

A grow­ing body of work calls this out­put-com­pe­tence de­cou­pling [5]. In any pre­vi­ous era, the qual­ity of a piece of work was a more or less re­li­able sig­nal of the com­pe­tence of the per­son who pro­duced it. A novice es­say read like a novice es­say; novice code crashed in novice ways. AI has sev­ered that re­la­tion­ship. A novice now pro­duces work that does not be­tray the novice, be­cause the com­pe­tence the work re­flects is not the novice’s com­pe­tence at all. It is the sys­tem’s. The per­son, in the trans­ac­tion, be­comes a kind of con­duit, ca­pa­ble of rout­ing the out­put to a re­cip­i­ent and in­ca­pable of eval­u­at­ing it on the way through.

The skills of producing work and judging it were always distinct, but accomplishing the work itself used to teach the judgment. The first skill now belongs, in large part, to the machines. The second still belongs to us, though fewer are bothering to acquire or utilize it.

The ar­chi­tec­tural cri­tique that used to come from some­one who was taught, or who had built and bro­ken three of these be­fore now comes from a model with no em­bod­ied mem­ory of build­ing or break­ing any­thing. The slow­ness was not a tax on the real work; the slow­ness was the real work. It was how the work got good, and how the peo­ple pro­duc­ing the work got good, and how the firm whose name was on the work could promise the client that what they were buy­ing was a par­tic­u­lar kind of thing rather than a generic one.

The cur­rent gen­er­a­tion of agen­tic sys­tems is built around the premise that the hu­man is the bot­tle­neck — that the loop runs faster and cleaner with­out the awk­ward de­lay of some­one read­ing what is about to hap­pen and de­cid­ing whether it should. This is, in a great many cases, ex­actly back­wards. The hu­man in the loop is not a ves­tige of an ear­lier era; the hu­man is the only part of the loop with skin in the game. Removing the H from HITL is not an ef­fi­ciency. It is the aban­don­ment of the only mech­a­nism the sys­tem has for catch­ing it­self.

Slop on the inside

Requirements documents that were once a page are now twelve. Status updates that were once three sentences are now bulleted summaries of bulleted summaries. Retrospective notes, post-incident reports, design memos, kickoff decks: every artifact that can be elongated is, by people who do not read what they produce, for readers who do not read what they receive. The cost of producing a document has fallen to nearly zero; the cost of reading one has not, and is in fact rising, because the reader must now sift the synthetic context for whatever the document was originally about. Each individual decision to elongate seems rational, and each is independently rewarded — readers are more confident in longer AI-generated explanations whether or not the explanations are correct [5]. The collective effect is that the signal in any given workplace is harder to find than it was before any of this began. The checkpoints have been hidden, drowned in their own paperwork, even when the people drowning them were genuinely trying to be brief.

This is a new form of slop, and it is more ex­pen­sive than the pub­lic kind, be­cause the peo­ple pro­duc­ing it are be­ing paid a salary to do so. The pipeline of fu­ture ex­perts is thin­ning from both ends. The work that used to teach judg­ment is now done by the tool, and the en­try-level roles where the teach­ing hap­pened are be­ing cut on the the­ory that the tool can do the work. What this is caus­ing, in many of­fices in­clud­ing mine, is a great deal of mo­tion and very lit­tle of what mo­tion used to cre­ate.

The down­stream costs are ac­cu­mu­lat­ing quickly. Most of the pub­lic dis­cus­sion of AI slop has fo­cused on the flood into pub­lic mar­kets — a University of Florida mar­ket­ing study [6] be­ing among the more di­rect treat­ments. What is less re­marked upon is the same dy­namic play­ing out in­side or­ga­ni­za­tions: time wasted us­ing AI on tasks that did not need it, on ar­ti­facts no one will read, on processes that ex­ist only be­cause the tool made it cheap to con­struct them. On decks that spell out things that pre­vi­ously did­n’t even need to be said or were as­sumed.

What to do about it

What discipline looks like, in this environment, is almost embarrassingly old-fashioned, and may seem obvious to most of you until you try to practice it. Use the tool where you can verify precisely what it produces. Never ask a model for confirmation; the tool agrees with everyone, and an agreement that costs the agreer nothing is worth nothing.

Generative AI does well on tasks where feedback is fast, where being approximately right is good enough, where the human remains the final arbiter. Drafting a memo, generating examples, summarizing material the reader could verify if they cared to. The University of Illinois Generative AI guidance [7] and the PLOS Computational Biology “Ten Simple Rules” paper on AI in research [8], among the more careful documents now circulating, list much of this explicitly: brainstorming, copyediting, reformulating one’s own ideas, pattern detection in data one already understands.

In every rec­om­mended use, the hu­man sup­plies the judg­ment and the tool sup­plies the through­put. This is a stronger po­si­tion than hu­man-in-the-loop. The tool sits out­side the work, con­tribut­ing where in­vited and silent oth­er­wise, which is the op­po­site of what most agen­tic sys­tems are now be­ing built to do.

The competitive advantage of a firm whose work can be trusted has not disappeared; it has, if anything, appreciated, because so many of the firm’s competitors are quietly converting themselves into content-generation pipelines and counting on the client not to notice.

This is already coming to a head. Deloitte has refunded part of a $440,000 fee over an AI-hallucinated government report. The next failure could be a production system built on a hallucinated specification, or a senior engineer who realizes they have spent the last year nominally reviewing work they could no longer competently review. The reckoning will not be subtle. The firms still doing the work properly will be in a position to charge for it. The firms that have hollowed themselves out will discover that what they hollowed out was the thing the client was paying for.

Misunderstanding and misuse of AI in the workplace is rampant. In many of the rooms I now find myself in, expertise has been asked to look the other way: to deliver faster, produce more, integrate the tools more deeply, get out of the way of the colleagues who are “getting things done”. The artifacts are accumulating; the work is not. And somewhere on the other side of all this output, a client is opening a deliverable, reading a summarized list, and they may just choose to review it manually.

References

1. Sycophantic AI de­creases proso­cial in­ten­tions and pro­motes de­pen­dence (Cheng, Lee, Khadpe, Yu, Han, & Jurafsky, 2026). Science.

2. Generative AI at Work (Brynjolfsson, Li, & Raymond, 2025). The Quarterly Journal of Economics, 140(2), 889–942. Also: NBER Working Paper No. 31161, April 2023.

3. Navigating the Jagged Technological Frontier (Dell’Acqua, McFowland, Mollick, et al., 2026). Organization Science. Originally HBS Working Paper No. 24-013, 2023.

4. Seven Myths About AI and Productivity: What the Evidence Really Says (Berkeley CMR, 2025). Meta-analysis con­firm­ing asym­met­ric AI pro­duc­tiv­ity gains and user over­con­fi­dence.

5. Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupling (Koch, 2025). Longer AI ex­pla­na­tions make users more con­fi­dent re­gard­less of cor­rect­ness.

6. Generative AI and the mar­ket for cre­ative con­tent (Zou, Shi, & Wu, 2026). Forthcoming, Journal of Marketing Research.

7. Generative AI Guidance (University of Illinois). Recommended uses and lim­i­ta­tions of gen­er­a­tive AI in aca­d­e­mic and pro­fes­sional work.

8. Ten sim­ple rules for op­ti­mal and care­ful use of gen­er­a­tive AI in sci­ence (Helmy, Jin, et al., 2025). PLOS Computational Biology, 21(10), e1013588.

Red Squares — the GitHub outage graph

red-squares.cian.lol

DENIC Status

status.denic.de

Components

DNS

Services

DNS Nameservice

May 6, 2026 01:34 CEST / May 5, 2026 23:34 UTC

RESOLVED

All Services are up and run­ning.

May 5, 2026 23:28 CEST / May 5, 2026 21:28 UTC

INVESTIGATING

Frankfurt am Main, 5 May 2026 — DENIC eG is currently experiencing a disruption in its DNS service for .de domains. As a result, all DNSSEC-signed .de domains are currently affected in their reachability. The root cause of the disruption has not yet been fully identified. DENIC's technical teams are working intensively on analysis and on restoring stable operations as quickly as possible. Based on current information, users and operators of .de domains may experience impairments in domain resolution. Further updates will be provided as soon as reliable findings on the cause and recovery are available. DENIC asks all affected parties for their understanding. For further enquiries, DENIC can be contacted via the usual channels.

Agents can now create Cloudflare accounts, buy domains, and deploy

blog.cloudflare.com

2026-04-30

6 min read


Coding agents are great at build­ing soft­ware. But to de­ploy to pro­duc­tion they need three things from the cloud they want to host their app — an ac­count, a way to pay, and an API to­ken. Until now these have been tasks that hu­mans han­dle di­rectly. Increasingly, agents han­dle them on the user’s be­half. The agent needs to per­form all the tasks a hu­man cus­tomer can. They’re given higher-or­der prob­lems to solve and choose to use Cloudflare and call Cloudflare APIs.

Starting to­day, agents can pro­vi­sion Cloudflare on be­half of their users. They can cre­ate a Cloudflare ac­count, start a paid sub­scrip­tion, reg­is­ter a do­main, and get back an API to­ken to de­ploy code right away. Humans can be in the loop to grant per­mis­sion and must ac­cept Cloudflare’s terms of ser­vice, but no hu­man steps are oth­er­wise re­quired from start to fin­ish. There’s no need to go to the dash­board, copy and paste API to­kens, or en­ter credit card de­tails. Without any ex­tra setup, agents have every­thing they need to de­ploy a new pro­duc­tion ap­pli­ca­tion in one shot. And with Cloudflare’s Code Mode MCP server and Agent Skills, they’re even bet­ter at it.

This all works via a new pro­to­col that we’ve co-de­signed with Stripe as part of the launch of Stripe Projects.

We’re ex­cited to launch this new part­ner­ship with Stripe, and also to of­fer $100,000 in Cloudflare cred­its to all new star­tups who in­cor­po­rate us­ing Stripe Atlas. But this new pro­to­col also makes it pos­si­ble for any plat­form with signed-in users to in­te­grate with Cloudflare in the same way Stripe does, with zero fric­tion for the end user.

How it works: zero to pro­duc­tion with­out any setup or man­ual steps

Install the Stripe CLI with the Stripe Projects plu­gin, lo­gin to Stripe, and then start a new pro­ject:

stripe projects init

Then prompt your agent to build some­thing new and de­ploy it to a new do­main. You can watch a con­densed two-minute video of this en­tire flow be­low:

If the email you’re logged into Stripe with al­ready has a Cloudflare ac­count, you’ll be prompted with a typ­i­cal OAuth flow to grant the agent ac­cess. If there is no ex­ist­ing Cloudflare ac­count for the email you’re logged in with, Cloudflare will pro­vi­sion an ac­count au­to­mat­i­cally for you and your agent:

You will see the agent build and de­ploy a site to a new Cloudflare ac­count, and then use the Stripe Projects CLI to reg­is­ter the do­main:

The agent will prompt for in­put and ap­proval when nec­es­sary. For ex­am­ple, if your Stripe ac­count does­n’t yet have a linked pay­ment method, the agent will prompt you to add one:

At the end, the agent has de­ployed to pro­duc­tion, and the app runs on the newly reg­is­tered do­main:

The agent has gone from lit­eral zero, no Cloudflare ac­count at all, with­out any pre­con­fig­ured Agent Skills or MCP server, to hav­ing:

Provisioned a new Cloudflare account

Obtained an API token

Purchased a domain

Deployed an app to production

But wait — how did the agent dis­cover that it could do all of this? How did it know what ser­vices it could pro­vi­sion, and how to pur­chase a do­main? How did it gain the con­text it needed to un­der­stand how to de­ploy to Cloudflare? Let’s dig in.

How the pro­to­col and in­te­gra­tion works

There are three com­po­nents to the in­ter­ac­tion be­tween the agent, Stripe, and Cloudflare shown above:

Discovery — the agent can call a command to query the catalog of available services.

Authorization — the platform attests to the identity of the user, allowing providers to provision accounts or link existing ones, and securely issue credentials back to the agent.

Payment — the platform provides a payment token that providers can use to bill the customer, allowing the agent to start subscriptions, make purchases and be billed on a usage basis.

These build on prior art and ex­ist­ing stan­dards like OAuth, OIDC and pay­ment to­k­eniza­tion — but are used to­gether to re­move many steps that might oth­er­wise re­quire a hu­man in the loop.

Discovery: how agents find ser­vices they can pro­vi­sion them­selves

In the agent session above, before the agent ran the CLI command stripe projects add cloudflare/registrar:domain, it first had to discover the Cloudflare Registrar service. It did this by calling the stripe projects catalog command, which returns available services:

The full set of Cloudflare prod­ucts and ser­vices from other providers is long and grow­ing — ar­guably over­whelm­ing to hu­mans. But for agents, this cat­a­log of ser­vices is ex­actly the con­text they need. The agent chooses ser­vices to use from this cat­a­log based on what the user has asked them to do and the user’s pref­er­ences — but the user needs no prior knowl­edge of what ser­vices are of­fered by which providers, and does not need to pro­vide any in­put. Providers like Cloudflare make this cat­a­log avail­able via a sim­ple REST API that re­turns JSON, and that gives agents every­thing they need.
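The post does not show the catalog's schema, but the shape of the interaction can be sketched: the agent fetches the catalog JSON and matches the user's request against service descriptions. In this toy sketch, the field names and all service ids except cloudflare/registrar:domain (which appears in the post) are assumptions, not Cloudflare's actual API:

```python
import json

# Hypothetical catalog payload; the real schema returned by
# `stripe projects catalog` is not documented in this post.
CATALOG_JSON = """
{
  "services": [
    {"id": "cloudflare/workers:deploy", "summary": "Deploy serverless code"},
    {"id": "cloudflare/registrar:domain", "summary": "Register a domain"},
    {"id": "cloudflare/r2:bucket", "summary": "Object storage bucket"}
  ]
}
"""

def pick_service(catalog: dict, keyword: str):
    """Return the id of the first service whose summary mentions the keyword,
    roughly how an agent might match a user request against the catalog."""
    for svc in catalog["services"]:
        if keyword.lower() in svc["summary"].lower():
            return svc["id"]
    return None

catalog = json.loads(CATALOG_JSON)
print(pick_service(catalog, "domain"))  # -> cloudflare/registrar:domain
```

A real agent would of course do this matching with the model itself rather than a keyword search; the point is that the catalog is plain JSON over REST, so no special tooling is needed to consume it.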

Authorization: in­stant ac­count cre­ation for new users

When the agent chooses a service and provisions it (ex: stripe projects add cloudflare/registrar:domain), it provisions the resource within a Cloudflare account. But how is it able to create one on demand, without sending a human to a signup page?

Remember how at the start, the user signed in to their Stripe account? Stripe acts as the identity provider, attesting to the user’s identity. If no account already exists, Cloudflare automatically provisions a new one for the user and returns credentials to the Stripe Projects CLI; the credentials are securely stored but available to the agent for making authenticated requests to Cloudflare. This means that if someone is brand new to Cloudflare or other services, they can start building right away with their agent, without extra steps.

If the user al­ready has a Cloudflare ac­count, they’re sent through a stan­dard OAuth flow to grant ac­cess to the Stripe Projects CLI, al­low­ing them to pro­vi­sion re­sources on their ex­ist­ing Cloudflare ac­count.
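The two authorization paths described above can be sketched as a provider-side branch. Everything here is illustrative: the attestation fields, account ids, and token format are assumptions, since the post does not specify them:

```python
from dataclasses import dataclass

@dataclass
class Attestation:
    # Hypothetical fields; the post only says the platform "attests to
    # the identity of the user" -- the real token format is unspecified.
    email: str
    platform: str

# Toy stand-in for the provider's account database.
EXISTING_ACCOUNTS = {"dev@example.com": "acct_123"}

def authorize(att: Attestation) -> dict:
    """Provider-side sketch: link an existing account via OAuth consent,
    or auto-provision a new account and issue a credential immediately."""
    if att.email in EXISTING_ACCOUNTS:
        # Existing user: route through a standard OAuth grant flow.
        return {"flow": "oauth", "account": EXISTING_ACCOUNTS[att.email]}
    # New user: provision an account and mint a scoped API token.
    account_id = f"acct_new_{abs(hash(att.email)) % 10000}"
    return {"flow": "provisioned", "account": account_id,
            "token": f"tok_{account_id}"}

print(authorize(Attestation("new@example.com", "stripe"))["flow"])  # provisioned
print(authorize(Attestation("dev@example.com", "stripe"))["flow"])  # oauth
```

The design point is that the no-account path needs no human step at all, while the existing-account path deliberately keeps the human in the loop for consent.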

Payment: give your agent a bud­get it can spend, with­out giv­ing it your credit card info

You might rightly worry, “What if my agent goes a bit overboard and starts buying dozens of domains? Will I end up on the hook for a massive bill? Can I really trust my agent with my credit card?”

The pro­to­col ac­counts for this in two ways. When an agent pro­vi­sions a paid ser­vice, Stripe in­cludes a pay­ment to­ken in the re­quest to the Provider (Cloudflare). Raw pay­ment de­tails like credit card num­bers aren’t ever shared with the agent. Stripe then sets a de­fault limit of $100.00 USD/month as the max­i­mum the agent can spend on any one provider. When you’re ready to raise this limit, you can then set Budget Alerts on your Cloudflare ac­count.
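The budget mechanics can be sketched as follows. The $100.00/month default comes from the post; the class and method names are illustrative only, not Stripe's API, and real payment tokens are opaque server-side objects rather than local counters:

```python
# Sketch of the guardrail described above: the agent holds an opaque
# payment token, never raw card details, and spending against any one
# provider is capped per month (a $100.00 default, per the post).
DEFAULT_MONTHLY_CAP_CENTS = 100_00

class PaymentToken:
    def __init__(self, cap_cents: int = DEFAULT_MONTHLY_CAP_CENTS):
        self.cap_cents = cap_cents
        self.spent_cents = 0

    def charge(self, amount_cents: int) -> bool:
        """Approve the charge only if it stays within this month's cap."""
        if self.spent_cents + amount_cents > self.cap_cents:
            return False  # declined: the agent has hit its budget
        self.spent_cents += amount_cents
        return True

token = PaymentToken()
print(token.charge(12_00))  # True  -- a $12 domain purchase goes through
print(token.charge(95_00))  # False -- this would exceed the $100 cap
```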

Any plat­form with signed-in users can in­te­grate with Cloudflare in the same way Stripe does

Any platform with signed-in users can act as the “Orchestrator”, playing the same role Stripe does with Stripe Projects, and integrate with Cloudflare.

Let’s say your prod­uct is a cod­ing agent. You’d love for peo­ple to be able to take what they’ve built and get it de­ployed to pro­duc­tion, us­ing Cloudflare and other ser­vices. But the last thing you want is to send peo­ple down a maze of au­tho­riza­tion flows and de­ci­sion trees of where and how to de­ploy it. You just want to let peo­ple ship.

Your plat­form acts as the Orchestrator, with the al­ready signed-in user. When your user needs a do­main, a stor­age bucket, a sand­box to give their agent, or any­thing else, you make one API call to Cloudflare to pro­vi­sion a new Cloudflare ac­count to them, and get back a to­ken to make au­then­ti­cated re­quests on their be­half.

Or let’s say you want Cloudflare customers to be able to easily provision your service, similar to how Cloudflare is partnering with PlanetScale to make it possible to create PlanetScale Postgres databases directly from Cloudflare. We started working with PlanetScale on this well before this new protocol got off the ground, but the flow here is quite similar. Cloudflare acts as the Orchestrator, letting you connect to your PlanetScale account, create databases, and use the user’s existing payment method for billing.

This new pro­to­col starts to stan­dard­ize the types of cross-prod­uct in­te­gra­tions that many plat­forms have been do­ing for years, of­ten in ways that were one off or be­spoke to a par­tic­u­lar plat­form. Without a stan­dard, each in­te­gra­tion re­quired en­gi­neer­ing work that of­ten could­n’t be lever­aged for fu­ture in­te­gra­tions. Similar to how the OAuth stan­dard made it pos­si­ble to del­e­gate ac­cess to your ac­count to other plat­forms, the pro­to­col uses OAuth and ex­tends fur­ther into pay­ments and ac­count cre­ation, do­ing so in a way that treats agents as a first-class con­cern.

We’re ex­cited to con­tinue evolv­ing the stan­dard, and to work with Stripe on shar­ing a more of­fi­cial spec­i­fi­ca­tion soon. We’re also ex­cited to in­te­grate with more plat­forms —  email us at [email protected], and tell us how you want your plat­form to in­te­grate with Cloudflare.

Give your agent the power to pro­vi­sion and pay

Stripe Projects is in open beta, and you can get started even if you don’t yet have a Cloudflare ac­count. Just in­stall the Stripe CLI, log in to Stripe, and then start a new pro­ject:

stripe projects init

Prompt your agent to build some­thing new on Cloudflare, and show us what you’ve built!


The bottleneck was never the code

www.thetypicalset.com

The other month I fi­nally ran an ex­per­i­ment we had been post­pon­ing for over a year at .txt.

The goal was to test our structured-generation algorithms and their open-source counterparts, replacing the naive “does it accept this string?” with something closer to the real problem: “does it produce the right token distribution?”
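The distinction can be illustrated with toy logits: the reference behavior for a structured-generation engine is to mask disallowed tokens and renormalize what remains, so a test can compare an implementation's output distribution against that reference instead of merely checking that valid strings are accepted. This is a simplified sketch under that assumption, not .txt's actual benchmark:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def constrained_reference(logits, allowed):
    """Reference distribution for constrained decoding: set disallowed
    tokens to -inf, then softmax, which renormalizes over allowed tokens.
    An implementation that merely 'accepts valid strings' can still get
    these probabilities wrong."""
    masked = [l if i in allowed else float("-inf") for i, l in enumerate(logits)]
    return softmax(masked)

# Toy vocabulary of 4 tokens; suppose the grammar only allows tokens 0 and 2.
logits = [1.0, 3.0, 2.0, 0.5]
ref = constrained_reference(logits, allowed={0, 2})
print([round(p, 3) for p in ref])  # mass only on tokens 0 and 2
```

A distribution-level test would then assert that the engine's per-token probabilities match this reference within tolerance, which is a much stronger check than string acceptance.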

The experiment kept coming up in conversation, then slipping back down the roadmap. Last month, I spent half an hour explaining the method to Codex. A few hours later, it had produced a working first version. That’s all it took.

Coding agents are trans­form­ing the way in­di­vid­u­als write code. In a way, that has al­ready hap­pened. And yet I re­main skep­ti­cal of the story peo­ple usu­ally tell about what this means for soft­ware as an in­dus­try: that in­di­vid­ual pro­duc­tiv­ity gains will trans­late into the in­dus­try mov­ing sub­stan­tially faster. I have been stuck on that ten­sion for months.

Time to re-read the clas­sics.

Impactful software tends to be written by many humans who need to collaborate.

Discussions on cod­ing agents al­most ex­clu­sively fo­cus on the in­di­vid­ual pro­duc­tiv­ity gains. But col­lab­o­ra­tion is the in­ter­est­ing unit of analy­sis.

This is def­i­nitely not a new idea. Fred Brooks wrote about it in 1975 in The Mythical Man Month, and it was not new then; Gerald Weinberg in­tro­duced the idea in 1971 in The Psychology of Computer Programming. Software is what’s left over af­ter a group of hu­mans fin­ishes ne­go­ti­at­ing with each other about what the sys­tem should do. The code mat­ters, but it is the residue of the harder work, not the work it­self.

For fifty years the residue was ex­pen­sive enough to keep our at­ten­tion on it. Typing speed, lan­guage de­sign, frame­work choice, IDE plu­g­ins, code re­view tool­ing were all about re­duc­ing the cost of the residue. With cod­ing agents the cost has fallen far enough that we can see what’s un­der­neath: peo­ple try­ing to agree.

Negotiating, agree­ing, com­mu­ni­cat­ing the shared pic­ture of what we are build­ing has be­come the work. And it’s just as hard as it was.

The Roadmap is the Limit

What slows down a team where agents do the implementation is the production of specifications precise enough for an agent to pick up and run. Roadmap, written down. Acceptance criteria, written down. The “what we actually want” forced into precision, be it via a test suite, a ticket, or a written design.

Your mileage may vary, but many man­agers I know are over­whelmed by this. Features get im­ple­mented at break­neck speed, and en­gi­neers are not wait­ing on other en­gi­neers any­more. They are wait­ing on the next well-formed spec. The bot­tle­neck moves from the peo­ple writ­ing the code to the peo­ple de­cid­ing what code should ex­ist. Which is to say: man­age­ment.

Focus is saying no

And the sur­face that has to be agreed on is grow­ing. Jevons Paradox: when some­thing gets cheaper, you tend to use more of it, not less. When code gets 10x cheaper to write, we don’t spend 10% of the ef­fort on the same out­comes. We spend the same ef­fort on out­comes that weren’t worth pur­su­ing be­fore. Prototypes that would have been not worth the time” three months ago get spun up in an af­ter­noon. Internal tools that solve prob­lems no­body quite had get built and for­got­ten.

Every vibe-coded prod­uct with 12 fea­tures is 11 fea­tures away from be­ing great.

People can only ab­sorb fea­tures at a cer­tain rate, and that rate stays roughly the same whether the team ships ten fea­tures or fifty. Steve Jobs in 1997: fo­cus is about say­ing no. Apple killed roughly sev­enty per­cent of its prod­uct line that year, and the com­pany that came back was the one that had learned to sub­tract. The dis­ci­pline got harder with agents; don’t we all like the feel­ing of ship­ping a new fea­ture?

Context is gold.

All of that ne­go­ti­a­tion runs on some­thing hu­mans rarely think about: shared con­text.

Context is the commodity an organization runs on. It is the shared understanding of what we are building, why it matters, what has been tried, who decided what, what is load-bearing and what is vestigial. Humans on a team accrete it by osmosis: by being in the room, by reading the same Slack channel, by debugging the same outage at two in the morning. Most of it is never written down. When a senior engineer reviews a PR and says “this’ll break the migration,” they are drawing on context that has no document.

Agents can­not do os­mo­sis. They do not get con­text by be­ing in the room, by half-hear­ing the plan­ning con­ver­sa­tion, or by car­ry­ing the mem­ory of the last in­ci­dent. Whatever you do not man­age to pack into the prompt, the file tree, the tools, or the ex­plicit in­struc­tions, they do not re­li­ably have. Without con­text, an agent will pro­duce a plau­si­ble an­swer to a slightly wrong ver­sion of the ques­tion.

So when I find my­self ex­cited about an agent that did some­thing use­ful at .txt, the hon­est ac­count­ing is that we did the con­text work. The next ten en­gi­neers will not have that pic­ture by de­fault. They will have it only to the ex­tent we get se­ri­ous about writ­ing it down.

Context, the un­writ­ten sub­strate or­ga­ni­za­tions have al­ways run on, is now the rate-lim­it­ing in­put. And the nat­ural thing for hu­mans to do is leave it im­plicit, be­cause there was never any­one to read the ex­plicit ver­sion.

Real programmers don’t document their programs.

Producing eas­ily con­sum­able con­text is pre­cisely the thing hu­mans don’t like to do.

What may save us is that agents are unreasonably good at reading exhaustively. An agent will read every PR comment, every closed issue, every commit message, every stale design doc, every Slack archive, and extract patterns nobody bothered to write down because nobody was going to read them.

We have started building this at .txt. Agents that crawl the codebase, the issues, the PRs, the threads, and produce a knowledge base of the implicit decisions, the conventions, the why-we-did-it-this-way that nobody had time to document. Not just “this module exists,” but “this module is weird because the migration had to preserve old behavior,” or “this benchmark matters because a previous optimization silently changed the distribution.” Other agents use that knowledge base when they need to act on the codebase. The osmosis humans do informally is being externalized into something agents can read, and humans can too.

Agents that con­sume con­text need agents that pro­duce it. Once that loop is run­ning, the or­ga­ni­za­tion has a writ­ten sub­strate it would never have pro­duced on its own. The con­straint that was rate-lim­it­ing in the pre­vi­ous sec­tion stops be­ing con­stant. Context be­comes a thing you can pro­duce more of.

Of course, this loop will only ever pro­duce a par­tial pic­ture. To quote Michael Polanyi: we know more than we can tell. Some load-bear­ing con­text ex­ists pre­cisely be­cause it was never put into words, and writ­ing it down would change what it is. The os­mo­sis layer hu­mans ac­crete in per­son is not fully re­cov­er­able from writ­ten ex­haust. What comes out the other side is closer to a use­ful start­ing point than to a full re­cov­ery, and whether that is enough to com­pound is an em­pir­i­cal ques­tion I’m still test­ing. I think it is. I’m not sure.

The new moat is organizational, not technical.

The companies that win the next decade will not necessarily be the ones with the best models or the best agent infrastructure. They will be the companies whose fifty people, then two hundred, then two thousand, can stay aligned on a shrinking set of decisions while shipping more output per head. They will be the ones that already knew, before agents arrived, that their hardest problem was coherence.

That is a cul­ture and man­age­ment prob­lem. Always has been.

Every pre­vi­ous gen­er­a­tion of tool­ing, whether IDEs, ver­sion con­trol, CI, mi­croser­vices, or de­vops, promised to solve co­or­di­na­tion through bet­ter tools. Every one of them turned out to be a mul­ti­plier on what­ever or­ga­ni­za­tional co­her­ence was al­ready there. Small teams have co­her­ence for free; the mul­ti­plier is uni­formly pos­i­tive there, which is why the loud­est agent boost­ers tend to be small teams, and why they are mostly right about their own con­text. Past a cer­tain size, co­her­ence has to be pro­duced and main­tained, and the mul­ti­plier sharp­ens in both di­rec­tions. Good orgs got bet­ter. Bad orgs got faster at ru­in­ing things.

Agents are a much big­ger mul­ti­plier than any of those. The sig­nal is go­ing to be louder in both di­rec­tions. They are over­es­ti­mated as a way to make in­di­vid­u­als write code faster, and un­der­es­ti­mated as a way to make or­ga­ni­za­tions ex­ter­nal­ize what they know.

Agents can feel like the ex­ten­sion of your own mind, and that feel­ing is ex­hil­a­rat­ing. The harder chal­lenge is mak­ing them an ex­ten­sion of your com­pany cul­ture. That is a dif­fer­ent prob­lem and a dif­fer­ent shape of work. It needs writ­ing cul­ture. It needs man­age­ment thought­ful enough to iden­tify where they re­main a con­text bot­tle­neck. It needs peo­ple who treat co­her­ence as a real ar­ti­fact to main­tain. What’s new is that some of those things are build­able now. The read-and-ex­tract loop is one shape, and there will be oth­ers.

I’ll re­port back on the ex­per­i­ment.

CARA 2.0 — Aaed Musa

www.aaedmusa.com

CARA 2.0 (05/01/2026)

Prologue

On May 31st, 2024, I uploaded a video titled High Precision Speed Reducer Using Rope, where I built a niche speed reducer called a capstan drive. Little did I know that this video would go viral and brand me as the “capstan guy”. Roughly 2 years later, this is still my highest-viewed video, and I still get lots of emails and DMs about capstan drives. About a year after making that video, I created CARA, a quadrupedal robot that used capstan drives. Now, a year after that, I’ve built CARA 2.0, an upgraded version of CARA. This project is particularly special because CARA 2.0 was my senior design project. Considering that I’ve been obsessed with making quads since high school, it only seemed fitting that I end my college experience by building my best quad yet. My team and I set out to make a low-cost (<$1000), low-weight (<20 lbs), and durable quad suited for hobbyists and researchers.

If you’re in­ter­ested in build­ing your own CARA 2.0, you can pur­chase the full build guide on my Patreon Shop or ac­cess it through my Patreon Builder Tier mem­ber­ship. Also, the BOM for CARA 2.0 is free to ac­cess here.

A Low-Cost Dynamic Actuator

Actuators are the most basic and essential electromechanical subassembly of a robot. They are also the main driver of a robot’s cost and performance. With the lofty goal of building a quad under $1000, it only made sense to start our design process by making a low-cost dynamic actuator. Luckily, the blueprint for building one has already been well documented online. Ben Katz specifically set the precedent for low-cost dynamic actuators during his development of the MIT Mini Cheetah; I would definitely recommend reading his historic paper, A Low Cost Modular Actuator for Dynamic Robots. Essentially, Katz popularized the idea of the Quasi Direct Drive (QDD) actuator: a high-torque brushless motor (generally with a large gap radius) combined with a low-ratio gearbox (generally under 10:1) and an FOC controller to achieve position, velocity, and torque control. The name Quasi Direct Drive comes from the fact that the low gear ratio retains a lot of the benefits of a direct drive actuator, namely efficiency, transparency, and backdrivability.

From left to right: high torque BLDC, 9:1 plan­e­tary gear­box, and an FOC con­troller
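The transparency and backdrivability mentioned above come directly from keeping the ratio low: inertia reflected to the output shaft scales with the square of the gear ratio. A minimal sketch of that scaling (the rotor inertia value below is illustrative, not a spec of any motor in this post):

```python
def reflected_inertia(rotor_inertia_kgm2, gear_ratio):
    """Rotor inertia as felt at the output shaft.

    Reflected inertia grows with the square of the gear ratio, which is
    why QDD designs keep the ratio low (generally under 10:1): the leg
    stays easy to backdrive and impacts remain transparent to the motor.
    """
    return rotor_inertia_kgm2 * gear_ratio ** 2

# With an illustrative rotor inertia of 1e-5 kg*m^2 (assumed, not measured):
# a 9:1 QDD reflects 81x the rotor inertia, while a 100:1 highly geared
# drive would reflect 10,000x -- effectively rigid to external forces.
low_ratio = reflected_inertia(1e-5, 9)
high_ratio = reflected_inertia(1e-5, 100)
```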

Actuation Hardware

Given our $1000 goal, we needed to build a QDD actuator for around $50 to $60. This is a hard goal to achieve. For reference, each actuator on CARA 1.0 cost approximately $250! The bulk of this cost came from the BLDC motor and FOC controller, making up about 32% and 60% of the total cost, respectively. After doing a ton of research, we eventually found a motor and controller within our price range. Here are the specs of the motor and controller used in CARA 1.0 alongside the ones we found for CARA 2.0.

CARA 1.0 Actuation Hardware

Eagle Power 8308 BLDC Motor

Cost: $80
Size (D × H): 92 × 29 mm
Weight: 340 g
Rated KV: 90
Rated voltage: 6–12S
Configuration: 36N40P
Measured stall torque: 1.67 Nm

ODrive S1 FOC Controller

Cost: $150
Size (L × W × H): 66 × 54 × 25 mm
Weight: 55 g
Input voltage: 12–48 V
Continuous current: 40 A
Max current: 80 A
Onboard encoder: yes

CARA 2.0 Actuation Hardware

TYI 5008 BLDC Motor

Cost: $18
Size (D × H): 58.3 × 38.2 mm
Weight: 160 g
Rated KV: 335
Rated voltage: 2–6S
Configuration: 12N14P
Measured stall torque: 0.421 Nm

MKS XDrive Mini FOC Controller

Cost: $41
Size (L × W × H): 63 × 58 × 27.8 mm
Weight: 66 g
Input voltage: 12–56 V
Continuous current: 60 A
Peak current: 120 A
Onboard encoder: yes

The TYI 5008 is a dirt-cheap Chinese BLDC motor, only about a quarter of the cost of the Eagle Power motor used in CARA 1.0. I don’t think you can find motors cheaper than this that are still powerful enough for robotics. They’re also surprisingly high quality, with arced magnets and balancing glue. The XDrive controllers are perhaps an even better bang for your buck. They’re also about a quarter of the price of the ODrive S1 controllers used in CARA 1.0, yet they’re actually rated for higher voltage and current! This almost seemed too good to be true, and as I found out, it was. While this pair of actuation hardware was extremely cheap, it came with some drawbacks.

Rewinding Motors, an Arthritis Speedrun

The one flaw of the TYI motors is that they have a really high KV, i.e., a really low Kt, or torque-per-amp rating. The simplest way to fix this would be a high gear reduction in the actuator, but as mentioned before, a QDD actuator needs a low reduction. So, we decided to modify the motors themselves by rewinding them to reduce their KV rating and thus increase their torque per amp. The idea of rewinding the motors came to me as I inspected the motors’ windings: they seemed to have so much space for more magnet wire. I’m guessing that since these motors were designed for drones, a low winding density was favorable, as it leads to high KV and thus high speed.

Before rewinding the motors, I took one apart to understand how the manufacturer wound them. What I found was that the motors were wired in a delta configuration and wound with 22 turns/slot of a single strand of 22 AWG magnet wire. My goal was to wind the motors down from 335 KV to 100 KV, which is a good KV rating for a high-torque actuator. Firstly, I knew that I would have to wire the motor in a star configuration, since it provides more torque at low speeds than delta; specifically, delta wiring has a √3x higher KV than an identically wound motor wired in star. Secondly, I knew I was going to have to use a single strand of 24 AWG magnet wire, for a couple of reasons. Any lower gauge would be too thick to pack a substantial amount of wire on the stator. Any higher gauge would require multiple strands in order to have a high current-carrying capacity, which would also limit the number of turns that could be wound on each slot. I came to this conclusion after trying a bunch of different winding gauges and strand counts on a mock stator. In the end, a single strand of 24 AWG wire was the ideal thickness to pack as much copper on the stator with as many turns per slot as possible.
The last and most important question was “how many turns per slot are needed to achieve 100 KV?” Well, a KV rating is inversely proportional to the number of turns/slot on a motor. So, using the manufacturer’s KV rating, the manufacturer’s turns/slot value, the target KV, and the KV reduction factor from delta to star, you can calculate the target turns per slot as shown below.

Rewinding Calculations

From the calculations, it takes approximately 39 turns per slot to achieve 100 KV on the TYI motors. I decided to round this up to 40 turns/slot to work with an even number. After rewinding a motor, I conducted KV and torque tests and found the parameters below.
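The proportionality behind these calculations can be sketched in a few lines. Note this is the shape of the calculation, not a reproduction of the author’s exact numbers: plugging in the rated 335 KV gives roughly 42.5 turns/slot for a 100 KV star target, so the post’s ~39-turn figure presumably starts from a measured KV somewhat below the rating.

```python
import math

def target_turns_per_slot(kv_orig_delta, turns_orig, kv_target_star):
    """Turns/slot needed to hit a target KV when also switching the
    termination from delta to star.

    KV is inversely proportional to turns/slot, and an identically
    wound motor has a sqrt(3)-times lower KV in star than in delta.
    """
    kv_star_equivalent = kv_orig_delta / math.sqrt(3)  # same winding, star-wired
    return turns_orig * kv_star_equivalent / kv_target_star

# Rated 335 KV, 22 turns/slot in delta, targeting 100 KV in star:
turns = target_turns_per_slot(335, 22, 100)  # ~42.5 with the rated KV
```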

Manufacturer-Wound TYI Motor

Rated KV: 335
Weight: 160 g
Measured stall torque: 0.421 Nm

Rewound TYI Motor

Measured KV: 90
Weight: 160 g
Measured stall torque: 1.274 Nm

I was able to get the KV down to 90, which is even better than what I targeted. This produced a much higher Kt, or torque-per-amp rating. One interesting thing I noticed was that the weight of the rewound motor remained the same as the manufacturer-wound motor. This just goes to show that you can radically change the characteristics of a motor without changing the amount of copper on it.
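The KV-to-Kt relationship is a standard conversion: with KV in rpm/V, the torque constant is Kt = 60/(2π·KV) Nm/A. A quick sketch of what the rewind buys in torque per amp:

```python
import math

def kt_from_kv(kv_rpm_per_volt):
    """Torque constant (Nm/A) from KV (rpm/V): Kt = 60 / (2*pi*KV)."""
    return 60.0 / (2.0 * math.pi * kv_rpm_per_volt)

# Dropping from 335 KV to 90 KV raises torque-per-amp by ~3.7x,
# in the same ballpark as the ~3x jump in measured stall torque.
kt_stock = kt_from_kv(335)   # ~0.0285 Nm/A
kt_rewound = kt_from_kv(90)  # ~0.106 Nm/A
```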

When Cheap Motor Controllers Don’t Work

The XDrive Mini controllers are some of the cheapest and most powerful FOC controllers you can buy off the shelf, but that cheapness comes with a ton of issues. I knew this going in, but it didn’t make the troubleshooting any less torturous. When you pay $150 for an ODrive S1, you aren’t just paying for the physical hardware; you’re paying for the UI, the documentation, and the continued support that come with it. With cheaper alternatives, you don’t get any of that!

The XDrives are single-axis ODrive 3.6 clones that look like the ODrive S1. The main issue with the XDrives is communication. They work perfectly fine with UART but not with CAN bus. UART sucks for making high-DoF robots because MCUs only have so many UART ports. The Teensy 4.1 MCU that I used for this project, and most of my other robotics projects, only has 8 UART ports, so I would only be able to control 8 motors instead of 12. CAN bus is a much-preferred comms protocol for robotics.

The boards come with custom manufacturer firmware called 0.5.1. After running some test CAN bus code on the boards, the firmware seemed able to send commands to the motor, but it couldn’t provide encoder or current feedback, which are critical for a highly robust and dynamic robot. I then turned to ODrive’s open-source firmware. This firmware did provide encoder and current feedback, but it wasn’t able to maintain stable communication with the motors. This issue was particularly annoying because it varied every time: sometimes the motor would run for a minute and then disconnect, sometimes for an hour, and sometimes for 30 seconds. I tried a bunch of different things to isolate any variables that might be at play. I tried changing the loop frequency, decongesting the bus by slowing down heartbeats and message rates, and using pretty much every open-source firmware version that ODrive provided. Nothing worked, and it became clear that I was missing something.

So, I did some digging online to see if anyone else had experienced this issue, and I came across Mohammad Marshid, who created a custom version of the XDrive’s 0.5.1 firmware that could send encoder feedback (Mohammad’s firmware). I reached out to Mohammad to see if he could also add current feedback to the firmware, and luckily, he was able to do so! Big shout-out to Mohammad! With the new firmware, I was able to maintain stable communication with the motor and also get motor feedback. So, it seems like the boards will only work with their native firmware.

Alas, a Capstan Drive Joint Test Stand!

With the motor rewound for more torque and the motor controller comms issues sorted out, it only made sense to prototype a single capstan drive joint test stand. The design of the drive is pretty similar to my previous capstan drive designs: a small drum rotates a big drum through a tensioned rope that’s wrapped around both drums. The drive weighs 470 g (1 lb), features a 9.6:1 reduction, produces 12 Nm of peak torque, and has a range of motion (ROM) of 120°.

One thing that I’ve come to realize over the course of making capstan drives (especially from YouTube comments) is that trying to achieve an exact reduction is a fruitless effort. I’ve spent an incredible amount of time trying to derive equations to properly determine the exact drum diameters needed to achieve certain ratios, but they never work across the board. The best way to make a capstan drive is to have a target ratio in mind, estimate the target effective drum diameters using a bit of math, and then measure and calculate the actual gear reduction (Δmotor shaft position/Δoutput shaft position).

As always, the drive uses Dyneema DM20 rope, which is the lowest-creep rope that money can buy. Specifically, I used 1 mm Mastrant-M rope, which is double braided, abrasion resistant, and has a breaking strength of 100 kg (220 lb) with a working load of 30 kg (66 lb). One thing I’ve added to the capstan drive assembly process with this project is prestretching the rope. I found that this helps get rid of slack, thus limiting the amount of tensioning needed once the rope is installed on the drive. In total, the single joint costs around $80 to make, which is a bit higher than the target range, but still quite affordable.
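The measure-then-calculate approach is easy to automate from encoder logs. A minimal sketch, where the sample deltas are made up for illustration rather than taken from the actual drive:

```python
def measured_reduction(motor_deltas_rad, output_deltas_rad):
    """Empirical reduction ratio from paired encoder deltas.

    For an N:1 reduction, the motor shaft moves N radians for every
    radian of output motion, so ratio = d_motor / d_output. Averaging
    over several sweeps smooths out rope stretch and encoder noise.
    """
    ratios = [m / o for m, o in zip(motor_deltas_rad, output_deltas_rad) if o]
    return sum(ratios) / len(ratios)

# Hypothetical sweeps of a drive near the 9.6:1 target:
ratio = measured_reduction([9.61, 19.18, 28.83], [1.0, 2.0, 3.0])  # ~9.6
```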

Single Leg Design - The Easy Part

Naturally, after building a single joint, the next step was to design a single leg. From CARA 1.0, I knew that a coaxial 5-bar linkage would be the ideal design, since it allows for less loading on each link compared to a standard quadruped leg. I also love the fact that most quads don’t use this design, making CARA super unique. A coaxial 5-bar linkage also pairs super well with a capstan drive; in fact, I believe it’s the most compact way to create a 3-DOF leg with a capstan drive. Since we already knew what the leg would look like, the design process became an optimization problem: how can we improve cost, weight, and assembly?

The first thing we decided to improve was the number of mirrored parts in the leg. CARA 1.0 used a ton of mirrored parts, which made assembly confusing at times. This time, we decided to make each leg on the bot identical. Ultimately, this came back to bite us, but more on that later. We also decided to reduce the number of screws used in the leg. Why use 4 screws to fasten something that just needs 2? I’ve started asking myself more questions like this, since screws add cost, weight, and assembly time to builds. Along the same lines, we also got rid of redundant bracing on some parts. Overconstraining a part doesn’t immediately seem like an issue, but it definitely can become one, since things in the real world are never perfect. A good example is a four-legged table: you only need 3 legs to fully constrain a table, so the 4th leg overconstrains it. If one leg happens to be a little shorter or taller than the others, the whole table becomes unbalanced. To help further reduce weight, we also used much thinner bearings. There are a lot of radially constrained parts in the leg, so lighter bearings go a long way toward making the robot itself significantly lighter. The total weight of the single leg is 1.47 kg (3.24 lbs).

One thing I wanted to investigate with CARA 2.0’s leg design was the upper-to-lower link ratio of the 5-bar linkage. For CARA 1.0, I used a 1:2 ratio on a whim, without doing any analysis. This time, I had one of my teammates investigate the ideal ratio using a MATLAB simulation. What we found was that a 1:1 ratio gives you the highest ROM. Unfortunately, a 1:1 ratio looks kinda goofy, so we ended up using a 2:3 ratio to get a bit more ROM while preserving aesthetics. Lastly, the biggest performance change we made to the leg design was switching from a TPU foot to a squash ball foot. TPU feet sucked on CARA 1.0, as TPU is simply a flexible plastic with no compliance or traction. Squash balls, on the other hand, excel in compliance and traction. They’re also super cheap and were used on the MIT Mini Cheetah as well as Stanley, both high-performing quads. There are actually 4 main types of squash balls: blue dot (beginner), red dot (progressing), single yellow dot (competition), and double yellow dot (pro). Double yellow dot is the firmest and least bouncy, so we chose those.
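The ROM analysis boils down to forward kinematics of the coaxial 5-bar: two upper links driven from a shared axis, two lower links meeting at the foot, which makes the foot position a circle-intersection problem. A minimal sketch of that geometry (link lengths and angles below are illustrative, not CARA’s actual dimensions):

```python
import math

def five_bar_foot(theta1, theta2, upper, lower):
    """Foot position of a coaxial 5-bar leg.

    Both motors share an axis at the origin; theta1/theta2 are the two
    upper-link angles (rad). The foot is where the circles of radius
    `lower` around the two knee joints intersect; we take the lower of
    the two intersections. Returns None if the linkage cannot close.
    """
    ax, ay = upper * math.cos(theta1), upper * math.sin(theta1)
    bx, by = upper * math.cos(theta2), upper * math.sin(theta2)
    d = math.hypot(bx - ax, by - ay)          # knee-to-knee distance
    if d == 0 or d > 2 * lower:
        return None                           # knees coincident or too far apart
    mx, my = (ax + bx) / 2, (ay + by) / 2     # midpoint between knees
    h = math.sqrt(lower * lower - (d / 2) ** 2)  # offset along the perpendicular
    ux, uy = (by - ay) / d, -(bx - ax) / d       # unit vector perpendicular to AB
    p1 = (mx + h * ux, my + h * uy)
    p2 = (mx - h * ux, my - h * uy)
    return min(p1, p2, key=lambda p: p[1])    # the downward (foot) solution

# Symmetric pose with a 2:3 upper:lower ratio -> foot directly below the hip.
foot = five_bar_foot(-math.pi / 2 + 0.4, -math.pi / 2 - 0.4, upper=0.10, lower=0.15)
```

Sweeping theta1 and theta2 over the motor limits and collecting the reachable foot positions is one simple way to compare ROM across link ratios, which is the kind of study the MATLAB simulation performed.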

Unraveling AI's 'Knitting Bullshit'

katedaviesdesigns.com

My theme today is Knitting Bullshit, and before I begin, I had better explain to you what I understand bullshit to be. In what follows, “bullshit” is used very much in the sense that Princeton philosopher Harry Frankfurt describes in his seminal essay, On Bullshit (1986; 2005). For Frankfurt, bullshit is an utterance with a “lack of connection to concern with truth” and an “indifference to how things really are.” From the off, Frankfurt tells us, it is important to understand that bullshit is, in its peculiarly execrable nature, materially different from a lie. While a liar displays an underlying respect for the truth in the very act of intentionally distorting it, “the essence of bullshit”, Frankfurt writes, “is not that it is false but that it is phony.” For Frankfurt, then, bullshit is discourse from which incidental matters like truth and reality have been completely hollowed out and replaced by performance and simulation. Unfortunately, as none of us can fail to be aware, we live in an age of bullshit; a moment when the bullshitter-in-chief sits in the White House daily purveying what Frankfurt, before his death in 2023, memorably referred to as “farcically unalloyed bullshit”. You’ll no doubt be pleased to hear, though, that the bullshit I am going to talk about today is of a very specific rather than a general kind: yes, what concerns me here is knitting bullshit.

I have been thinking about knitting bullshit now for quite some time, but I was alerted to a particular type of it while listening to Jamie Bartlett’s excellent series Everything is Fake and Nobody Cares (available wherever you get your podcasts). The first episode includes an interview with Anne McHealy, head of product at Inception Point AI, a podcasting company founded by Jeanine Wright, formerly COO at Wondery. Until its dissolution (by Amazon in 2025, at the cost of 110 jobs), Wondery was known for producing high quality, human-authored, narrative content. Inception Point AI, on the other hand, is a slop factory employing just 8 people which, according to Anne, publishes about 3000 podcast episodes per week, “hosted by AI personalities.” Anne tells Jamie that, to date, Inception Point AI’s podcasts have accumulated “12 million lifetime downloads. And we’re averaging about 750,000 downloads a month.” Stunned by these extraordinary figures, Jamie asks Anne about the editorial oversight of the content which she produces. Does she, or any of her colleagues, actually listen to any of these 3000 weekly episodes? With only 8 employees, who on earth has time to check the accuracy or quality of these podcasts? The answer is, of course, that no one checks or edits the podcast content, but, Anne tells Jamie blithely, this really doesn’t matter because the topics under discussion are so low stakes:

“most of our content sits squarely in topics that aren’t life or death necessarily. So gardening, for example, knitting, cooking, these things we can afford to be wrong. And it’s not necessarily the end of the world.”

Listening to this apol­o­gist for au­to­mated ar­bi­trage with a kind of fas­ci­nated hor­ror, I found my­self pulled up short. Knitting, you say? Not life or death, you say? Who are you kid­ding, Anne?

So, of course I went to listen to Inception Point AI’s “knitting” podcast. I heartily encourage you not to do the same, not least because this joyless experience would be contributing to the slop factory’s jaw-dropping (and depressing) number of downloads while simultaneously serving you ads for accounting software and small business insurance (your tailored marketing will, of course, be personal to you). No, I have now done that work for you; those few sad hours are forever lost to me, and I am here to tell you that this AI-generated “knitting content” is just as bad as you imagine. Worse than you imagine. Much, much worse.

Let’s take the first episode, on Knitting Through the Ages, for example. The podcast opens by promising to “examine the cultural significance of knitting . . . the way this simple act of looping yarn has brought people together across generations and continents. We’ll be delving into the juicy details and quirky anecdotes that make the story of knitting truly captivating,” your husky-voiced AI host promises, “. . . from ancient Egyptian socks to the rise of knitting as a global phenomenon, we’ll uncover the hidden stories and colourful characters that have shaped this beloved craft.” Indeed, the host does go on to talk about a pair of ancient Egyptian socks, before leaping forward to a discussion of the contemporary global knitting community . . . but there is nothing in-between. Nothing. Nada. Zilch. Yes, that’s right: the entire history of knitting is encompassed by a pair of Egyptian socks and Ravelry. But if these two huge historical milestones are apparently the only available topics, then of what, pray, is the rest of the episode composed? I sat through 15 minutes which sounded as if the AI had been trained on a decade’s worth of poorly-composed yarn marketing material and was spewing it back out at me as a syrupy word salad. As I listened, I could feel my grey matter dissolving into a kind of marshmallow soup as each sentence made its own kind of inane, sweet sense, while saying precisely nothing.

So far, so slop. Thanks so much, Inception AI, for such an in­sight­ful episode cov­er­ing, as promised, the whole of knit­ting’s long, dif­fi­cult, con­tested his­tory: a story in­volv­ing  the in­vis­i­ble labour and cre­ativ­ity of women, the ex­ploita­tion of that cre­ativ­ity and labour, in­dus­tri­al­i­sa­tion, in­ge­nu­ity, re­sis­tance, sol­i­dar­ity . . . oh, you’re not telling that story, I’m so sorry. Let’s swiftly move on to the episode about knit­ting de­sign. . . .

The Art of Knitting Pattern Design begins with another hollow marshmallow précis that seems to promise so very much:

“Join us as we unravel the creative process from the initial spark of an idea to the final stitches of a beautifully designed garment. We’ll explore the diverse realm of knitting pattern types, including the delicate intricacies of lace, the mesmerizing textures of cables, the playful interplay of colorwork, and more. But that’s not all.”

Oh no?

“We’ve gathered wisdom from renowned knitting experts and designers who will share their unique perspectives, design philosophies, and favorite techniques. Their insights will provide you with a deeper understanding of the art and science behind creating patterns that not only look stunning, but also feel enjoyable to knit.”

Tell me more! I’m so ready to learn from these renowned knitting experts who are, the AI host informs me, so “receptive to the beauty and inspiration that surrounds us every day.” So imagine my disappointment when I discover that, although explicitly named and extensively quoted, none of these expert designers actually exists! That’s right: rather than the real knitting experts who, through their patterns, webinars, magazine articles, books, digital forums, substacks, podcasts and instructional videos, generously share their accumulated wisdom with the global crafting community every single day, Michael Lee, Elizabeth Brown, Daniel Nakamura, Olivia Patel and Emily Davis are mere AI confections, whose bland utterances remind you to “embrace the process” and feel “confident and empowered” even as you leave the episode having learnt precisely nothing about knitting in general or design in particular. The creative labour of knitwear design (which today employs thousands of talented people around the world) is here substituted with the saccharine simulacrum of “joy” and “possibility”, a hollow promise held out, in each episode, to keep you listening, “engaged,” enthralled.

I don’t think we need any further examples of this content to understand just how badly and how baldly it has addressed itself to the extraordinary creative practice and the vibrant global community of which I am proud to be a part, hollowed it out, and transformed it into Bullshit of the purest, most unalloyed kind. But, honestly, the thing that I found most weird (in the way that AI bullshit can so often feel weird or uncanny) is the sleek manner in which these podcast episodes substituted what one might refer to as the “truth” or “reality” of knitting with a register of emotional validation familiar to anyone who has ever asked a question of Claude or ChatGPT.

In the same way that ChatGPT applauds your simply being there and asking it such a genuinely insightful question, the podcast continually congratulates you for your excellent crafting choices. That is, having listened to several episodes of this podcast, you will come away having learned absolutely nothing about knitting itself, but you might well feel good about knitting, and indeed about being a knitter, because the podcast is repeatedly telling you just how good it feels to be one.

There is one episode which purportedly covers advanced knitting techniques but which, having precisely nothing to say about such matters, instead continually asks you to imagine the joy you are going to feel as the stitches emerge from your needles, or to picture the satisfaction of finally wrapping yourself up in the “cosy” or “mesmerising” (words to which the AI returns repeatedly) work of your own hands.

Ye gods! The emo­tively per­sua­sive syn­thetic hor­ror! What a time to be alive.

Just as I was mulling over these post-post-modern contradictions of an AI substituting its lack of connection to real-world, human-embodied, material practices with imaginary encomiums about what such practices feel like to the practitioner, I was assailed by yet another example of knitting bullshit. Now, I’d like to point out that this is a different kind of bullshit—one which involves more human intervention than the unmediated digital arbitrage we have so far been discussing—but it is bullshit nonetheless.

This AI-generated animated film, which ostensibly takes “knitting” as its subject, has had more than 100,000 views and elicited more than 500 enthusiastic comments, the majority from knitters remarking on how good it makes them feel. Now, if you were among the commenters, or indeed have watched and enjoyed this film, in what follows I mean no criticism of you at all. This animation is specifically intended to make you feel good in general, and to feel good about knitting in particular—so of course you are left with a warm, fuzzy, happy feeling having sat through it. But while the feeling of the animation might be persuasive and familiar, its actual narrative content seems not just of secondary, but of negligible concern, both to the AI and whoever has prompted it (we could spend a long time discussing how “creative” AI prompts can be, and I’m definitely not here to mull over that).

But what I am here to talk about is the fact that this animation continually tells you that it is concerned with the long history of knitting, while having nothing to say about its subject at all. And I’d like, at this point, to bring back Harry Frankfurt, whose essay draws a useful distinction between different kinds of bullshit. On the one hand there is the type of bullshit which is merely “emitted or dumped,” with which we might associate the automatically-generated podcast slop we discussed earlier. But on the other hand, Frankfurt says, there is “carefully wrought bullshit”: that is, bullshit which appears to really have something to say, and which disguises the empty void at its black heart with a persuasive façade of emotional sincerity. Even if we set to one side the explicit intention of an AI-generated animation, which has been posted on YouTube for monetised likes, clicks and views, this short film would still fall squarely in Frankfurt’s latter category: it is carefully wrought knitting bullshit par excellence.

You can get a reasonable taste of its particular flavour of bullshit even without watching the AI-generated video, simply by reading its description, which deploys exactly the same syrupy, quasi-mythological, meaningless emotional register as the accompanying imagery and audio. “Before writing. Before anyone thought to write anything down at all — there were hands, and thread, and the slow click of needles in the dark . . .”

Setting aside the obvious fact that none of our knitting ancestors, however primitive, were ever likely to have been knitting in the dark, this is definitely pure bullshit. The description continues: “. . . the oldest thing people still do. Not a craft. Not a hobby. A language passed from hand to hand.” The oldest thing people still do? I and Sigmund Freud call Bullshit.

But Kate, you say, why are you be­ing such a ter­ri­ble killjoy? Why should it mat­ter that this AI an­i­ma­tion is­n’t grounded in ac­tual knit­ting his­tory when it cel­e­brates knit­ting, and makes every­one feel so good about knit­ting? Isn’t that enough?

Well, sorry, no it isn’t, and in this instance I’m perfectly happy to play the straw-woman role of po-faced, factoid-obsessed textile historian (if you’d like to regard me in that way) simply in order to point out that one of the most pernicious things about this particular kind of bullshit is the way it casts any form of critical scrutiny as a terrible failure of sensibility. On these grounds you might argue that my problem with this lovely video simply comes down to the fact that I’m so clearly unsentimental, so unfeeling, so terribly bound up with tedious points of detail, such as the film’s weird historical inaccuracies and false claims, its bizarre lack of concern with actual knitting practices (or even embodied gestures), its complete failure to engage with the contested and complicated narratives that have made the craft what it is today, its manifest lack of connection to knitting’s basic reality . . . and countless other similar matters of small consequence.

But all of those in­ac­cu­ra­cies, all of that weird, syn­thetic emo­tional grasp­ing is not why I ob­ject so much to this kind of knit­ting bull­shit. No — knit­ting bull­shit both­ers me most of all be­cause of the way it par­a­sitises and de­grades our in­dus­try and our com­mu­nity.

Remember Anne McHealy’s blithe lack of concern for the potential inaccuracies of AI-generated content, because things like knitting were “not the end of the world”? But for us, they really are our world, and the increasing prevalence of Knitting Bullshit really does make, on occasion, the apocalyptic end of that world seem nigh.

Our community has spent so many years building something of genuine human value: a shared body of knowledge, cultural meaning and careful critique, all of which lend considerable discursive depth and richness to what we do. But in the brave new world of Knitting Bullshit, all of that accumulated wisdom, all of the real history of knitting as labour, as resistance, as solidarity, as design intelligence, as craft, is now there simply to provide the powerful emotional currency that AI-generated podcasts and videos cynically mine for profit.

Again, I’d like to reiterate that, if you enjoyed the AI-generated video (or, in a less likely scenario, the AI-generated podcast), I’m not criticising you for feeling good about it, nor for enjoying anything which truly celebrates our craft. But as you wipe away a tear or two, and the warm, fuzzy marshmallow sensation starts to subside, I might gently point out that what you are feeling is perhaps less about the content you are consuming in itself than it is about all of those knotty, messy, real-world, materially-based legacies of knitting that have been created by human communities and practitioners over decades and centuries . . . legacies which AI Knitting Bullshit now slurps up and spews out.

And per­haps, rather than con­sum­ing this AI gen­er­ated Knitting Bullshit, we might like to sup­port some ac­tual hu­man knit­ting con­tent: the crofters and the crafters, the in­die yarnies and de­sign­ers, the pod­cast­ers, the show or­gan­is­ers, the spin­ners, the mak­ers of ce­ramic but­tons, the colour-lover work­ing with his­toric plant dyes, the carver of wooden hap frames, swifts and yarn bowls, all of the cre­ative crafts­peo­ple that make our global com­mu­nity such a beau­ti­ful, vi­brant, thriv­ing thing of which to be a part. That hu­man legacy, those hu­man cre­ative prac­tices, that long con­tested his­tory, that joy­ful, di­verse, con­tem­po­rary hu­man com­mu­nity: all of those things will re­main wor­thy of our cel­e­bra­tion, our love, and our sup­port, what­ever the AI-bullshit fu­ture brings.

All of the images in this post were generated by an AI in response to the simple two-word prompt “lovely knitting”.

Vibe coding and agentic engineering are getting closer than I’d like

simonwillison.net

6th May 2026

I re­cently talked with Joseph Ruscio about AI cod­ing tools for Heavybit’s High Leverage pod­cast: Ep. #9, The AI Coding Paradigm Shift with Simon Willison. Here are some of my high­lights, in­clud­ing my dis­turb­ing re­al­iza­tion that vibe cod­ing and agen­tic en­gi­neer­ing have started to con­verge in my own work.

One thing I re­ally en­joy about pod­casts is that they some­times push me to think out loud in a way that ex­poses an idea I’ve not pre­vi­ously been able to put into words.

Vibe cod­ing and agen­tic en­gi­neer­ing are start­ing to over­lap

A few weeks after vibe coding was first coined I published Not all AI-assisted programming is vibe coding (but vibe coding rocks), where I firmly staked out my belief that “vibe coding” is a very different beast from responsible use of AI to write code, which I’ve since started to call agentic engineering.

When Joseph brought up the dis­tinc­tion be­tween the two I had a sud­den re­al­iza­tion that they’re not nearly as dis­tinct for me as they used to be:

Weirdly though, those things have started to blur for me al­ready, which is quite up­set­ting.

I thought we had a very clear de­lin­eation where vibe cod­ing is the thing where you’re not look­ing at the code at all. You might not even know how to pro­gram. You might be a non-pro­gram­mer who asks for a thing, and gets a thing, and if the thing works, then great! And if it does­n’t, you tell it that it does­n’t work and cross your fin­gers.

But at no point are you re­ally car­ing about the code qual­ity or any of those ad­di­tional con­straints. And my take on vibe cod­ing was that it’s fan­tas­tic, pro­vided you un­der­stand when it can be used and when it can’t.

A per­sonal tool for you, where if there’s a bug it hurts only you, go ahead!

If you’re build­ing soft­ware for other peo­ple, vibe cod­ing is grossly ir­re­spon­si­ble be­cause it’s other peo­ple’s in­for­ma­tion. Other peo­ple get hurt by your stu­pid bugs. You need to have a higher level than that.

This con­trasts with agen­tic en­gi­neer­ing where you are a pro­fes­sional soft­ware en­gi­neer. You un­der­stand se­cu­rity and main­tain­abil­ity and op­er­a­tions and per­for­mance and so forth. You’re us­ing these tools to the high­est of your own abil­ity. I’m find­ing the scope of chal­lenges I can take on has gone up by a sig­nif­i­cant amount be­cause I’ve got the sup­port of these tools.

But I’m still lean­ing on my 25 years of ex­pe­ri­ence as a soft­ware en­gi­neer.

The goal is to build high qual­ity pro­duc­tion sys­tems: if you’re build­ing lower qual­ity stuff faster, I think that’s bad. I want to build higher qual­ity stuff faster. I want every­thing I’m build­ing to be bet­ter in every way than it was be­fore.

The prob­lem is that as the cod­ing agents get more re­li­able, I’m not re­view­ing every line of code that they write any­more, even for my pro­duc­tion level stuff.

I know full well that if you ask Claude Code to build a JSON API end­point that runs a SQL query and out­puts the re­sults as JSON, it’s just go­ing to do it right. It’s not go­ing to mess that up. You have it add au­to­mated tests, you have it add doc­u­men­ta­tion, you know it’s go­ing to be good.

But I’m not re­view­ing that code. And now I’ve got that feel­ing of guilt: if I haven’t re­viewed the code, is it re­ally re­spon­si­ble for me to use this in pro­duc­tion?

The thing that re­ally helps me is think­ing back to when I’ve worked at larger or­ga­ni­za­tions where I’ve been an en­gi­neer­ing man­ager. Other teams are build­ing soft­ware that my team de­pends on.

If another team hands over something and says, “hey, this is the image resize service, here’s how to use it to resize your images”… I’m not going to go and read every line of code that they wrote.

I’m go­ing to look at their doc­u­men­ta­tion and I’m go­ing to use it to re­size some im­ages. And then I’m go­ing to start ship­ping my own fea­tures. And if I start run­ning into prob­lems where the im­age re­sizer thing ap­pears to have bugs or the per­for­mance is­n’t good, that’s when I might dig into their Git repos­i­to­ries and see what’s go­ing on. But for the most part I treat that as a semi-black box that I don’t look at un­til I need to.

I’m starting to treat the agents in the same way. And it still feels uncomfortable, because human beings are accountable for what they do. A team can build a reputation. I can say “I trust that team over there. They built good software in the past. They’re not going to build something rubbish because that affects their professional reputations.”

Claude Code does not have a pro­fes­sional rep­u­ta­tion! It can’t take ac­count­abil­ity for what it’s done. But it’s been prov­ing it­self any­way—time and time again it’s churn­ing out straight­for­ward things and do­ing them right in the style that I like.
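The “SQL query to JSON” endpoint described above is indeed small enough to sketch in a few lines of standard-library Python. This is a generic illustration of the shape of that task, not Simon’s code or any particular framework; the table name, columns, and sample data are hypothetical:

```python
# Minimal sketch: run a SQL query and serialise the rows as JSON,
# the kind of straightforward task described above. Stdlib only.
import json
import sqlite3

def query_to_json(conn: sqlite3.Connection, sql: str) -> str:
    """Run a read-only SQL query and return the rows as a JSON array."""
    conn.row_factory = sqlite3.Row  # rows become name-addressable
    rows = conn.execute(sql).fetchall()
    return json.dumps([dict(row) for row in rows])

# Demo against an in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (name TEXT, commits INTEGER)")
conn.executemany("INSERT INTO projects VALUES (?, ?)",
                 [("image-resizer", 120), ("json-api", 45)])
print(query_to_json(conn, "SELECT name, commits FROM projects ORDER BY commits"))
```

In a real service this function would sit behind a route handler, with the automated tests and documentation mentioned above layered on top.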

There’s an el­e­ment of the nor­mal­iza­tion of de­viance here—every time a model turns out to have writ­ten the right code with­out me mon­i­tor­ing it closely there’s a risk that I’ll trust it at the wrong mo­ment in the fu­ture and get burned.

The new chal­lenge of eval­u­at­ing soft­ware

It used to be if you found a GitHub repos­i­tory with a hun­dred com­mits and a good readme and au­to­mated tests and stuff, you could be pretty sure that the per­son writ­ing that had put a lot of care and at­ten­tion into that pro­ject.

And now I can knock out a git repos­i­tory with a hun­dred com­mits and a beau­ti­ful readme and com­pre­hen­sive tests of every line of code in half an hour! It looks iden­ti­cal to those pro­jects that have had a great deal of care and at­ten­tion. Maybe it is as good as them. I don’t know. I can’t tell from look­ing at it. Even for my own pro­jects, I can’t tell.

So I re­al­ized what I value more than the qual­ity of the tests and doc­u­men­ta­tion is that I want some­body to have used the thing. If you’ve got a vibe coded thing which you have used every day for the past two weeks, that’s much more valu­able to me than some­thing that you’ve just spat out and hardly even ex­er­cised.

The bot­tle­necks have shifted

If you can go from pro­duc­ing 200 lines of code a day to 2,000 lines of code a day, what else breaks? The en­tire soft­ware de­vel­op­ment life­cy­cle was, it turns out, de­signed around the idea that it takes a day to pro­duce a few hun­dred lines of code. And now it does­n’t.

It’s not just the down­stream stuff, it’s the up­stream stuff as well. I saw a great talk by Jenny Wen, who’s the de­sign leader at Anthropic, where she said we have all of these de­sign processes that are based around the idea that you need to get the de­sign right—be­cause if you hand it off to the en­gi­neers and they spend three months build­ing the wrong thing, that’s cat­a­strophic.

There’s this whole very ex­ten­sive de­sign process that you put in place be­cause that de­sign re­sults in ex­pen­sive work. But if it does­n’t take three months to build, maybe the de­sign process can be a whole lot riskier be­cause cost, if you get some­thing wrong, has been re­duced so much.

Why I’m still not afraid for my ca­reer

When I look at my con­ver­sa­tions with the agents, it’s very clear to me that this is moon lan­guage for the vast ma­jor­ity of hu­man be­ings.

There are a whole bunch of rea­sons I’m not scared that my ca­reer as a soft­ware en­gi­neer is over now that com­put­ers can write their own code, partly be­cause these things are am­pli­fiers of ex­ist­ing ex­pe­ri­ence. If you know what you’re do­ing, you can run so much faster with them. […]

I’m con­stantly re­minded as I work with these tools how hard the thing that we do is. Producing soft­ware is a fe­ro­ciously dif­fi­cult thing to do. And you could give me all of the AI tools in the world and what we’re try­ing to achieve here is still re­ally dif­fi­cult. […]

Matthew Yglesias, who’s a political commentator, yesterday tweeted, “Five months in, I think I’ve decided that I don’t want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money.” And that feels about right to me. I can plumb my house if I watch enough YouTube videos on plumbing. I would rather hire a plumber.

On the threat to SaaS providers of com­pa­nies rolling their own so­lu­tions in­stead:

I just re­al­ized it’s the thing I said ear­lier about how I only want to use your side pro­ject if you’ve used it for a few weeks. The en­ter­prise ver­sion of that is I don’t want a CRM un­less at least two other gi­ant en­ter­prises have suc­cess­fully used that CRM for six months. […] You want so­lu­tions that are proven to work be­fore you take a risk on them.


Higher usage limits for Claude and a compute deal with SpaceX

www.anthropic.com

We’ve agreed to a part­ner­ship with SpaceX that will sub­stan­tially in­crease our com­pute ca­pac­ity. This, along with our other re­cent com­pute deals, means that we’ve been able to in­crease our us­age lim­its for Claude Code and the Claude API.

Below, we de­scribe these changes and the progress we’re mak­ing on com­pute.

The fol­low­ing three changes—all ef­fec­tive to­day—are aimed at im­prov­ing the ex­pe­ri­ence of us­ing Claude for our most ded­i­cated cus­tomers.

First, we’re dou­bling Claude Code’s five-hour rate lim­its for Pro, Max, Team, and seat-based Enterprise plans.

Second, we’re re­mov­ing the peak hours limit re­duc­tion on Claude Code for Pro and Max ac­counts.

Third, we’re rais­ing our API rate lim­its con­sid­er­ably for Claude Opus mod­els, as shown in the table be­low:

New com­pute part­ner­ship with SpaceX

We’ve signed an agree­ment with SpaceX to use all of the com­pute ca­pac­ity at their Colossus 1 data cen­ter. This gives us ac­cess to more than 300 megawatts of new ca­pac­ity (over 220,000 NVIDIA GPUs) within the month. This ad­di­tional ca­pac­ity will di­rectly im­prove ca­pac­ity for Claude Pro and Claude Max sub­scribers.

This joins our other sig­nif­i­cant com­pute an­nounce­ments:

An up to 5 gi­gawatt (GW) agree­ment with Amazon, which in­cludes nearly 1 GW of new ca­pac­ity by the end of 2026;

A 5 GW agree­ment with Google and Broadcom, which will be­gin com­ing on­line in 2027;

A strate­gic part­ner­ship with Microsoft and NVIDIA that in­cludes $30 bil­lion of Azure ca­pac­ity;

Our $50 bil­lion in­vest­ment in American AI in­fra­struc­ture with Fluidstack.

We train and run Claude on a range of AI hard­ware—AWS Trainium, Google TPUs, and NVIDIA GPUs—and con­tinue to ex­plore op­por­tu­ni­ties to bring ad­di­tional ca­pac­ity on­line.

As part of this agree­ment, we have also ex­pressed in­ter­est in part­ner­ing with SpaceX to de­velop mul­ti­ple gi­gawatts of or­bital AI com­pute ca­pac­ity.

Expanding in­ter­na­tion­ally

Our en­ter­prise cus­tomers—par­tic­u­larly those in reg­u­lated in­dus­tries like fi­nan­cial ser­vices, health­care, and gov­ern­ment—in­creas­ingly need in-re­gion in­fra­struc­ture to meet com­pli­ance and data res­i­dency re­quire­ments. Accordingly, some of our ca­pac­ity ex­pan­sion will be in­ter­na­tional: our re­cently an­nounced col­lab­o­ra­tion with Amazon in­cludes ad­di­tional in­fer­ence in Asia and Europe.

We’re very in­ten­tional about where we’ll add ca­pac­ity—part­ner­ing with de­mo­c­ra­tic coun­tries whose le­gal and reg­u­la­tory frame­works sup­port in­vest­ments of this scale, and where the sup­ply chain on which our com­pute de­pends—hard­ware, net­work­ing, and fa­cil­i­ties—will be se­cure.

Finally, we re­cently made a com­mit­ment to cover any con­sumer elec­tric­ity price in­creases caused by our data cen­ters in the US. As part of our in­ter­na­tional ex­pan­sion, we’re ex­plor­ing ways to ex­tend that com­mit­ment to new ju­ris­dic­tions, as well as part­ner­ing with lo­cal lead­ers to in­vest back into the com­mu­ni­ties that host our fa­cil­i­ties.

