10 interesting stories served every morning and every evening.

6.0.0

brew.sh

Today, I’m proud to an­nounce Homebrew 6.0.0. The most sig­nif­i­cant changes since 5.1.0 are a new tap trust se­cu­rity mech­a­nism, the new faster, smaller, de­fault in­ter­nal Homebrew JSON API, sand­box­ing on Linux, bet­ter de­faults in­formed by our user sur­vey, many brew bun­dle im­prove­ments, im­proved per­for­mance and ini­tial sup­port for ma­cOS 27 (Golden Gate).

✨ Highlights since 5.1.0

🔐 Tap trust

Homebrew 6.0.0 in­tro­duces tap trust. A third-party tap can con­tain ar­bi­trary, un­sand­boxed Ruby that runs on your ma­chine, so Homebrew now re­quires taps (and tap-qual­i­fied for­mu­lae and casks) to be ex­plic­itly trusted be­fore their code is eval­u­ated or run. This re­duces the risk from ma­li­cious or com­pro­mised taps while leav­ing the of­fi­cial Homebrew taps trusted by de­fault. See the new Tap-Trust doc­u­men­ta­tion for de­tails.

Homebrew en­forces ini­tial tap trust so un­trusted taps are flagged be­fore their code runs, trusts qual­i­fied tap items be­fore in­stall, stops auto-tap­ping un­trusted taps, pins tap al­low, for­bid and trust lists to re­motes and uses tap trust when eval­u­at­ing all for­mu­lae and casks.

brew tap gains commands for managing tap trust and can trust a tap by its remote URL, brew trust adds a --json=v1 flag and brew tap-info adds a trusted field.

brew bun­dle ho­n­ours the trusted: op­tion and brew bun­dle dump records trusted bun­dle en­tries, mark­ing cus­tom-re­mote taps as trusted.

docs.brew.sh has new pages, in­clud­ing Tap-Trust, ex­plain­ing Homebrew’s new tap trust model, and Homebrew trusts taps in test-bot.
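The trust gate described above can be pictured with a small sketch. This is not Homebrew's implementation (Homebrew is written in Ruby); it is a Python illustration under stated assumptions: the names TrustStore, load_tap_code and UntrustedTapError are hypothetical, trust is modelled as a tap name pinned to the remote it was granted for, and official taps are treated as trusted regardless of remote, which is a simplification.

```python
from dataclasses import dataclass, field

# Official taps stay trusted by default, as the release notes state.
OFFICIAL_TAPS = {"homebrew/core", "homebrew/cask"}


class UntrustedTapError(Exception):
    """Raised before any code from an untrusted tap is evaluated."""


@dataclass
class TrustStore:
    # Hypothetical model: tap name -> remote URL the trust decision is pinned to.
    pinned: dict = field(default_factory=dict)

    def trust(self, tap: str, remote: str) -> None:
        self.pinned[tap.lower()] = remote

    def is_trusted(self, tap: str, remote: str) -> bool:
        tap = tap.lower()
        if tap in OFFICIAL_TAPS:
            return True
        # Trust only holds while the tap still points at the pinned remote.
        return self.pinned.get(tap) == remote


def load_tap_code(store: TrustStore, tap: str, remote: str, evaluate):
    """Call `evaluate` (standing in for running tap code) only for trusted taps."""
    if not store.is_trusted(tap, remote):
        raise UntrustedTapError(f"{tap} ({remote}) is not trusted")
    return evaluate()
```

The point of pinning to the remote is that a tap whose remote is later swapped loses its trust, so the check fails closed instead of silently running code from a new origin.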

⚡ Default in­ter­nal JSON API

The in­ter­nal JSON API is now the de­fault, ad­vanc­ing the smaller API that Homebrew re-en­abled and turned on for de­vel­op­ers re­cently. It com­bines all Homebrew’s meta­data into a sin­gle down­load, so brew up­dates faster and talks to the net­work less. It was opt-in via HOMEBREW_USE_INTERNAL_API since 5.0.0; that vari­able is now dep­re­cated (see be­low).

🐧 Linux sand­box

The Linux Bubblewrap sandbox aligns Linux with macOS, where build, test and postinstall phases already run sandboxed. It is on by default for developers. Homebrew also moved its macOS sandbox logic to share code, improved Linux sandbox behaviour (with Homebrew/homebrew-core setting the sandbox env in CI), hardened sandboxed install phases, sandboxed cask executable hooks, allowed logs in the build sandbox, installed Bubblewrap on hosted Ubuntu and now skips sandbox setup for syntax-only jobs.

⚙️ Better de­faults

Following our Homebrew user sur­vey, we have made many changes based on the re­sults. The most no­table is mak­ing ask mode the de­fault for de­vel­op­ers, so brew in­stall and brew up­grade show a de­pen­dency sum­mary and con­fir­ma­tion prompt be­fore mak­ing changes.

Homebrew adds ask de­pen­dency plans and cask sup­port, ac­cepts one-key ask con­fir­ma­tions and aligns ask dry-run prompts.

Homebrew fetches ask up­grades to­gether, prints the ask up­grade sum­mary sooner, skips the up­grade ask prompt when empty, adds a fi­nal brew up­grade sum­mary and ex­plains the up­grade meta­data fetch.

📦 brew bun­dle

brew bun­dle gains many im­prove­ments, most no­tably par­al­lel for­mula in­stal­la­tion that now runs jobs au­to­mat­i­cally by de­fault, plus npm and krew ex­ten­sions, wider cleanup sup­port and, on Windows, winget sup­port.

Homebrew adds cleanup sup­port to npm, cargo, go and uv ex­ten­sions and asks be­fore re­mov­ing dur­ing cleanup.

Homebrew runs brew bundle krew via kubectl-krew directly, respects CARGO_HOME and friends for cargo, adds a --describe flag to brew bundle add and tries mas install before falling back to mas get.

Homebrew adds bun­dle type dis­able flags, im­proves check guid­ance and checks for­mula link sta­tus.

Homebrew se­ri­alises for­mula locks, makes non-core DSLs a sin­gle file, re­moves de­scrip­tion com­ments from brew bun­dle/​re­mover and avoids pars­ing the out­put of brew ser­vices list.

brew bun­dle per­forms npm in­stalls more se­curely.

🏎️ Performance

Homebrew is faster across the board, with startup per­for­mance tweaks, a ~30% faster brew leaves, par­al­lelised bot­tle tab fetch­ing on up­grade and less work load­ing Ruby li­braries at startup.

🍎 ma­cOS 27 (Golden Gate)

Homebrew adds ini­tial sup­port for ma­cOS 27 (Golden Gate).

🔮 Upcoming changes

ma­cOS 27 (Golden Gate) drops Intel sup­port, so per our Support Tiers: in September 2026, ma­cOS Intel x86_64 moves to Tier 3 with no CI sup­port and no new bot­tles (binary pack­ages) built for ma­cOS Intel; in September 2027, ma­cOS Intel x86_64 will be un­sup­ported en­tirely and all re­lated code deleted.

The mas­ter to main mi­gra­tion be­gun in 4.6.0 con­tin­ues: more repos­i­to­ries no longer up­date mas­ter, GitHub Actions warn @master users to mi­grate to @main and the sync-de­fault-branches work­flows are re­moved from Homebrew/homebrew-cask and Homebrew/homebrew-core.

Casks that fail ma­cOS Gatekeeper checks, dep­re­cated in 5.0.0, re­main on track to be dis­abled in September 2026.

🔒 Security

🚨 Security ad­vi­sories

Homebrew pub­lished three se­cu­rity ad­vi­sories:

The POST down­load strat­egy by­passed the doc­u­mented HTTPS-to-HTTP redi­rect pro­tec­tion by dis­card­ing the re­solved URL (GHSA-7699-qf8c-q47m), fixed by en­forc­ing se­cure redi­rects.

Root code ex­e­cu­tion was pos­si­ble via Git hooks in the ma­cOS .pkg postin­stall (GHSA-6689-q779-c33m), fixed by clean­ing Homebrew git state and re­plac­ing the in­staller git di­rec­tory.

The ma­cOS in­staller pack­age trusted a user-con­trolled /var/tmp plist and could as­sign Homebrew own­er­ship to a lo­cal at­tacker (GHSA-59v8-x8q4-px5c), fixed by tweak­ing the ma­cOS .pkg pack­age-user plist han­dling.

🛡️ Other se­cu­rity im­prove­ments

Homebrew fil­ters sen­si­tive en­vi­ron­ment vari­ables dur­ing Ruby eval­u­a­tions and de­fers HOMEBREW_* en­vi­ron­ment se­crets to down­load time.

Homebrew runs for­bid­den checks for casks and for­mu­lae be­fore down­load and lets you re­quire check­sums for casks with HOMEBREW_CASK_OPTS_REQUIRE_SHA.

Homebrew links to a shared se­cu­rity pol­icy.

🗑️ Deprecations

Homebrew dep­re­cates de­fault opt-ins.

Homebrew dep­re­cates now-de­fault bun­dle and in­ter­nal API en­vi­ron­ment vari­ables such as HOMEBREW_BUNDLE_NO_SECRETS and HOMEBREW_USE_INTERNAL_API.

Homebrew marks un­used op­tions for dep­re­ca­tion.

Various other Homebrew 6.0.0 dep­re­ca­tions.

Homebrew’s SBOM sup­port is now opt-in with HOMEBREW_SBOM.

🎁 Features

🖥️ Casks

Homebrew can pin casks and sup­ports casks in brew miss­ing.

Homebrew adds AppImage sup­port for Linux and im­ple­ments a Linux freedesk­top trash for casks.

Homebrew im­proves cask up­grades by shar­ing up­grade down­load queues, mov­ing up­grade sum­maries be­fore fetch, adding a quit opt-out and re­open­ing closed apps dur­ing up­grade.

Homebrew im­proves au­to_up­dates casks: im­prov­ing how they up­date, re­fin­ing the be­hav­iour fur­ther, gat­ing auto-up­dates be­hind opt-in and up­grad­ing them when the bun­dle ver­sion is stale.

cask adds a gen­er­ate_­com­ple­tion­s_from_ex­e­cutable DSL ar­ti­fact and in­cludes re­solved ar­ti­fact tar­gets in JSON out­put.

Homebrew shows a cask ver­sion tran­si­tion in per-cask up­grade out­put, skips valid cached cask fetches, speeds up cask backup copies and has caskroom use the user’s pri­mary group on Linux.

brew doc­tor and brew cleanup han­dle cor­rupt Caskroom di­rec­to­ries.

💻 Operating sys­tem sup­port

Homebrew makes Linux cask re­quire­ments ex­plicit, aligns cask ma­cOS de­pen­den­cies, sup­ports bare de­pend­s_on :macos in casks, tracks ma­cOS sup­port ex­plic­itly and emits Linux vari­a­tions for casks with Linux check­sums.

Homebrew adds a max­i­mum ma­cOS for cask de­pen­den­cies. Homebrew/homebrew-cask adopts the new de­pend­s_on max­i­mum_­ma­cos: syn­tax and fixes its ma­cOS de­pen­den­cies in Homebrew/homebrew-cask and Homebrew/homebrew-core.

Homebrew adds M5 and M5 Pro/Max CPU recog­ni­tion and caps the OCLP tier when ma­cOS is out­dated.

Homebrew la­bels WSL an­a­lyt­ics, shows the Windows build on WSL in brew con­fig and moves the wsl? boolean from OS::Linux up to the OS mod­ule.

🚰 Taps

Homebrew recognises more equivalent tap remote forms, ignoring a .git suffix when matching GitHub remotes and consolidating tap remote normalisation, among other changes.
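To make "equivalent tap remote forms" concrete, here is a minimal Python sketch of remote normalisation. It is an illustration, not Homebrew's Ruby code: normalise_remote and same_remote are hypothetical names, and the rules shown (rewriting scp-style SSH remotes, dropping a trailing .git, lower-casing GitHub paths) are assumptions about what a reasonable matcher might do.

```python
from urllib.parse import urlparse


def normalise_remote(remote: str) -> str:
    """Reduce a Git remote to a comparable 'host/owner/repo' key (illustrative)."""
    remote = remote.strip()
    # Rewrite scp-style SSH remotes (git@host:owner/repo) into URL form.
    if "://" not in remote and "@" in remote and ":" in remote:
        user_host, path = remote.split(":", 1)
        remote = f"ssh://{user_host}/{path}"
    parsed = urlparse(remote)
    host = (parsed.hostname or "").lower()
    path = parsed.path.strip("/")
    # Ignore a trailing .git suffix when comparing remotes.
    if path.endswith(".git"):
        path = path[: -len(".git")]
    # GitHub treats owner and repository names case-insensitively.
    if host == "github.com":
        path = path.lower()
    return f"{host}/{path}"


def same_remote(a: str, b: str) -> bool:
    """True when two remote spellings normalise to the same key."""
    return normalise_remote(a) == normalise_remote(b)
```

For example, same_remote("https://github.com/Homebrew/homebrew-core.git", "git@github.com:Homebrew/homebrew-core") compares equal under these rules, while remotes on different hosts do not.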

Homebrew han­dles for­mu­lae and casks more uni­formly across com­mands, in­stalls ex­plic­itly re­quested taps and stops im­plicit tap in­stal­la­tion.

Homebrew uses work­trees for lo­cal core taps and blocks work­tree up­dates.

Homebrew shares full-name pars­ing helpers and uses full-name helpers for split names.

ℹ️ brew info and brew tap-info

brew info out­put is clearer: more con­sis­tent and help­ful, with a Binaries sec­tion list­ing ex­e­cuta­bles, a clearer re­cur­sive run­time de­pen­den­cies line, clearer same-named con­flicts and shad­owed for­mu­lae and a list ver­sions JSON out­put.

brew info shows installed state better: the upgrade target for outdated @-versioned formulae, installed dependents with --verbose, deprecated and disabled packages in install status, installed formulae resolved from the receipt’s tap with a shadowing warning, the installed version and an upgrade hint on the headline, other installed versions and an installed info inventory.

brew info and brew tap-info skip the unin­stalled marker when not a prob­lem, show more tap info for pack­ages and brew tap-info lists for­mu­lae and casks.

brew which-for­mula shows in­stall sta­tus and Homebrew shows quar­an­tine script us­age.

🆕 New com­mands, flags and out­put

brew exec is a new com­mand, like npx, that sup­ports for­mu­lae en­vi­ron­ments.

brew as-con­sole-user is a new com­mand for run­ning Homebrew as the right user un­der MDM/root en­vi­ron­ments and brew up­date <formula> is aliased to up­grade.

Homebrew ti­dies help and com­ple­tions: omit­ting aliases from com­ple­tions, hid­ing HOMEBREW_CASK_OPTS_* from help, hid­ing main­tainer com­mands and hid­ing hide_from_­man_­page com­mands from brew com­mands.

Homebrew avoids in­stall warn­ing an­no­ta­tions and warns when for­mula ex­e­cuta­bles are shad­owed on PATH.

🧊 Cooldowns, livecheck and bump­ing

Homebrew adds down­load cooldowns for Bundler, RubyGems livecheck, npm and pip de­faults, PyPI re­source res­o­lu­tion and npm and PyPI in bump to avoid up­stream sup­ply-side se­cu­rity risks.
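A download cooldown boils down to refusing releases that are too new. The sketch below is a Python illustration, not Homebrew code: newest_outside_cooldown is a hypothetical helper, and the release notes do not state the cooldown length, so the caller supplies it.

```python
from datetime import datetime, timedelta, timezone


def newest_outside_cooldown(releases, cooldown_days, now=None):
    """Return the most recently published version that is at least
    `cooldown_days` old, or None when every release is still cooling down.

    `releases` maps version strings to timezone-aware publish datetimes.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=cooldown_days)
    eligible = [
        (published, version)
        for version, published in releases.items()
        if published <= cutoff
    ]
    if not eligible:
        return None
    # max() picks the latest publish time among the eligible releases.
    return max(eligible)[1]
```

The idea is that a compromised upstream release has a window in which it can be noticed and withdrawn before anything resolves to it.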

Homebrew prints bump skip sta­tus, mes­sages and er­rors and checks RubyGems li­cences.

Homebrew re­spects livecheck throt­tle days in au­dit, adds livecheck throt­tling by days and speeds up the for­mula throt­tle days check.

⬇️ Downloads and fetch­ing

brew fetch --all-platforms fetches every variant, Homebrew prints download error details when using concurrency, preserves partial downloads on network errors, avoids cached manifest downloads and hints when a download is HTML, not a binary.

Homebrew avoids re­dun­dant Caskroom chgrp.

🛎️ Services

Homebrew starts sys­temd timers for ser­vices, cre­ates ser­vice path di­rec­to­ries au­to­mat­i­cally (with Homebrew/homebrew-core adopt­ing the new ser­vice path cre­ation logic) and au­dits re­dun­dant ser­vice path setup.

brew services no longer fails to load with --sudo-service-user.

🧪 Formulae and pack­ag­ing

Homebrew adds the VCS re­vi­sion as scm_re­vi­sion in the tab, sup­ports in-repos­i­tory patch files, sup­ports CPS meta­data di­rec­to­ries and in­cludes patches in for­mula to_hash.

Homebrew re­spects in­stalled de­pen­dents dur­ing au­tore­move and cross-checks au­tore­move can­di­dates against for­mula de­f­i­n­i­tions.

🪜 Install steps frame­work

The in­stall steps frame­work ex­presses com­mon postin­stall, pre­flight and post­flight be­hav­iour as or­dered, lit­eral-only DSL data that is ex­posed through the JSON APIs. Where a for­mula or cask only does sim­ple file prepa­ra­tion, it no longer needs to down­load and eval­u­ate a Ruby file at in­stall time. Homebrew adds for­mula in­stall steps, cask in­stall steps, an au­dit for for­mula in­stall steps, in­stall step re­build ac­tions, re­build step meth­ods, re­build step RuboCop checks and an au­dit of cask flight step con­ver­sions; home­brew/​core and home­brew/​cask adopt the new DSLs (post_install_steps, postin­stall and flight steps). In home­brew/​core and home­brew/​cask this cov­ers a large share of post_in­stall and *flight blocks (creating di­rec­to­ries, touch­ing mark­ers, mov­ing and sym­link­ing files), with more op­er­a­tion types planned.
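The idea of "ordered, literal-only DSL data" can be sketched as a list of plain tuples run by a fixed interpreter, so installing needs no arbitrary code. This Python sketch is an illustration only: the step names, the tuple format and run_steps are hypothetical and do not reflect Homebrew's actual Ruby DSL. The four operations mirror the ones the notes mention (creating directories, touching markers, moving and symlinking files).

```python
import tempfile
from pathlib import Path

# Hypothetical operation names; anything else is rejected outright.
ALLOWED_OPS = {"mkdir", "touch", "move", "symlink"}


def run_steps(root: Path, steps) -> None:
    """Apply literal steps in order, confined to paths beneath `root`."""
    root = root.resolve()

    def inside(rel: str) -> Path:
        target = (root / rel).resolve()
        if target != root and root not in target.parents:
            raise ValueError(f"path escapes root: {rel}")
        return target

    # Validate every step before touching the filesystem.
    for step in steps:
        if step[0] not in ALLOWED_OPS:
            raise ValueError(f"unsupported step: {step[0]}")

    for op, *args in steps:
        if op == "mkdir":
            inside(args[0]).mkdir(parents=True, exist_ok=True)
        elif op == "touch":
            inside(args[0]).touch()
        elif op == "move":
            inside(args[0]).rename(inside(args[1]))
        elif op == "symlink":
            # args[0] is the existing target, args[1] the link to create.
            inside(args[1]).symlink_to(inside(args[0]))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_steps(Path(tmp), [
            ("mkdir", "var/log/demo"),
            ("touch", "var/log/demo/.keep"),
        ])
        print(sorted(p.name for p in Path(tmp, "var/log/demo").iterdir()))
```

Because the steps are data, they can be serialised into a JSON API and audited, which is the property the release notes highlight.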

🔀 Other changes

brew vulns is a new Homebrew tap and sub­com­mand that checks in­stalled pack­ages for known vul­ner­a­bil­i­ties 🔒.

Homebrew warns for Nix-managed Homebrew.

🧹 Internals, typ­ing and refac­tors

Homebrew re­places brew which-up­date, uses an AST for source rewrites and en­forces pub­lic API vis­i­bil­ity and docs.

Homebrew re­works com­mand pars­ing: parser sub­com­mand scaf­fold­ing, con­vert­ing the bun­dle, ser­vices and re­main­ing sub­com­mands, scop­ing sub­com­mand op­tion con­straints and us­age help, and no longer re­strict­ing global op­tions to sub­com­mands.

Homebrew lim­its Sorbet run­time de­faults and lim­its re­cur­sive Sorbet in test-bot.

🛠️ Continuous in­te­gra­tion and de­vel­oper tool­ing

The Ubuntu 24.04 CI mi­gra­tion flagged in 5.1.0 for 6.0.0 has now landed, rais­ing the Linux base­line.

Data retention practices for Mythos-class models

support.claude.com

To en­sure we’re re­spon­si­bly de­ploy­ing Mythos-class mod­els, we are re­quir­ing lim­ited data re­ten­tion and re­view as part of our safety work. Prompts sub­mit­ted to, and out­puts gen­er­ated by, Mythos-class mod­els are re­tained for 30 days for trust and safety pur­poses, on every plat­form where these mod­els are of­fered.

This ap­plies to Mythos-class mod­els and fu­ture mod­els with sim­i­lar ca­pa­bil­i­ties that we des­ig­nate as cov­ered mod­els. For all other mod­els, every­thing you use is un­af­fected and stays un­der the cur­rent terms.

This pol­icy, de­scribed be­low, goes into ef­fect on June 9, 2026. For more in­for­ma­tion on the threat model for re­tained data and as­so­ci­ated pri­vacy con­trols, please see the cor­re­spond­ing tech­ni­cal white pa­per on our Trust Center.

Who this ap­plies to

Consumer plans (Claude Free, Pro, and Max) across our web, desk­top, and mo­bile apps—in­clud­ing Claude.ai and Claude Code—are un­af­fected by this up­date, since we al­ready re­tain in­puts and out­puts for safety pur­poses on these sur­faces. Learn more about how we re­tain data for con­sumer plans.

This change only ap­plies to or­ga­ni­za­tions that have set up work­spaces with zero data re­ten­tion (ZDR) in Claude Console, use Claude Code with ZDR in Claude Enterprise, or ac­cess Claude through AWS Bedrock, Google Cloud Agent Platform, or Microsoft Foundry with ZDR. The rest of this ar­ti­cle ap­plies only to these or­ga­ni­za­tions.

Why we’re do­ing this

Claude Mythos 5 rep­re­sents a sub­stan­tial in­crease in model ca­pa­bil­i­ties, some of which can be used for both be­nign and ma­li­cious pur­poses. Claude Fable 5 shares the same un­der­ly­ing model as Claude Mythos 5, but with ad­di­tional safe­guards, par­tic­u­larly in the cy­ber and bio do­mains. While these safe­guards al­low us to share this in­tel­li­gence more broadly, we are tak­ing a con­ser­v­a­tive ap­proach that al­lows us to look for pat­terns of mis­use with this class of model. Some at­tacks only be­come vis­i­ble across mul­ti­ple re­quests. Best-of-N jail­break­ing, for ex­am­ple, sends hun­dreds of slight vari­a­tions of a prompt in the hope that one will work. Larger pat­terns of mis­use, such as state-spon­sored es­pi­onage or data ex­tor­tion cam­paigns, only sur­face when our safe­guards clas­si­fiers can zoom out across many re­quests. Detecting these threats re­quires tem­porar­ily re­tain­ing prompts and out­puts so they can be an­a­lyzed to­gether, rather than one at a time.
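Why some patterns only show up across requests can be illustrated with a toy sketch. This is not Anthropic's detection method, which the article does not describe; it is a Python illustration using the standard library's difflib, with hypothetical function names and arbitrary thresholds. It groups near-identical prompts and flags groups that are large enough to look like repeated slight variations.

```python
from difflib import SequenceMatcher


def near_duplicate_clusters(prompts, threshold=0.85):
    """Greedily group prompts whose similarity to a cluster's first
    member is at least `threshold` (an arbitrary illustrative value)."""
    clusters = []
    for prompt in prompts:
        for cluster in clusters:
            if SequenceMatcher(None, cluster[0], prompt).ratio() >= threshold:
                cluster.append(prompt)
                break
        else:
            clusters.append([prompt])
    return clusters


def flag_bursts(prompts, min_size=3, threshold=0.85):
    """Return only clusters with at least `min_size` near-duplicates."""
    return [
        cluster
        for cluster in near_duplicate_clusters(prompts, threshold)
        if len(cluster) >= min_size
    ]
```

Each prompt in a flagged cluster would look unremarkable on its own; the signal exists only when the requests are held long enough to be compared with one another.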

How we pro­tect your data

Anthropic em­ploy­ees can­not ac­cess your con­ver­sa­tions un­less they are flagged for po­ten­tial se­ri­ous harm or upon a cus­tomer’s writ­ten re­quest. These re­views can only be per­formed by a small set of ap­proved re­view­ers through tool­ing that pre­vents ex­port, copy­ing, or down­load­ing. Every in­stance of ac­cess is recorded in a tam­per-proof log that re­view­ers can­not sup­press or mod­ify. After 30 days, the data is deleted au­to­mat­i­cally, ex­cept in the rare cases where it’s part of a safety in­ves­ti­ga­tion or we’re legally re­quired to keep it. Eligible or­ga­ni­za­tions also have the op­tion to add cus­tomer-man­aged en­cryp­tion keys and ac­cess trans­parency au­dit logs.

Anthropic main­tains a doc­u­mented in­for­ma­tion se­cu­rity pro­gram with tech­ni­cal and or­ga­ni­za­tional mea­sures that are de­signed to pro­tect the se­cu­rity, con­fi­den­tial­ity, and in­tegrity of cus­tomer data. Our risk-based pro­gram is built for and evolves to de­fend against known and an­tic­i­pated threat mod­els and is tested reg­u­larly. For more in­for­ma­tion, see the tech­ni­cal white pa­per in our Trust Center.

What, if any­thing, do I need to con­fig­ure?

This change only ap­plies to or­ga­ni­za­tions that have set up work­spaces with zero data re­ten­tion (ZDR) in Claude Console, use Claude Code with ZDR in Claude Enterprise, or ac­cess Claude through AWS Bedrock, Google Cloud Agent Platform, or Microsoft Foundry with ZDR. For all other or­ga­ni­za­tions, there is no change and there’s noth­ing to con­fig­ure. The rest of this sec­tion is for or­ga­ni­za­tions that ac­cess Claude with­out data re­ten­tion to­day and need to set up data re­ten­tion in or­der to use des­ig­nated mod­els when they be­come avail­able.

If your de­vel­op­ers use the Claude API

If your team uses Claude Code

If your team uses Claude chat or Cowork through Claude for Enterprise

Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable

techcrunch.com

Anthropic re­leased its lat­est model Fable on Tuesday, billing it as a pub­lic and lim­ited ver­sion of its pow­er­ful and much-hyped cy­ber­se­cu­rity model Mythos.

But not every­one is happy with the re­stric­tions, and a num­ber of cy­ber­se­cu­rity re­searchers and pro­fes­sion­als have aired com­plaints on­line.

“[Fable] rejects any request that could be tangentially cyber related. Even innocuous tasks like reading a blog post,” said Valentina “Chompie” Palmiotti, a well-known security researcher who works at IBM X-Force.

When a prompt triggers its guardrails, Fable pauses the chat and says that its “safety measures flagged this message for cybersecurity or biology topics.”

The guardrails were put in place to limit the risk that Fable could be used to de­velop mal­ware or com­pro­mise soft­ware — a long-stand­ing con­cern within Anthropic. The re­stric­tions on bi­ol­ogy come from a sim­i­lar con­cern around de­vel­op­ing bi­o­log­i­cal weapons.

When the AI gi­ant re­leased Mythos in April, it re­stricted the model to a lim­ited num­ber of com­pa­nies and or­ga­ni­za­tions in what it called Project Glasswing, an ef­fort to de­ploy the model to se­cure crit­i­cal soft­ware and in­fra­struc­ture. Last week, Anthropic ex­panded ac­cess to Mythos to hun­dreds of or­ga­ni­za­tions in 15 coun­tries.

But despite the good intentions, many cybersecurity experts are still put off by the haphazard nature of the restrictions. Matt Suiche, a cybersecurity veteran, told TechCrunch that “if you ask it to write secure code, it assumes it is cybersecurity related work instead of software engineering best practices, and you get downgraded.” Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail. “It seems to be keyword based, so anything in the lexical field of ‘cybersecurity’ triggers the guardrails.”

“But it is understandable as we are still in the early days and they are still adapting their guardrails. I am sure they are going to evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time.”

Another researcher griped on X that even asking for a “code review” triggers Fable’s guardrails.

Anthropic did not im­me­di­ately re­spond to a re­quest for com­ment.

Apart from guardrails in­side its mod­els, Anthropic re­quires cy­ber­se­cu­rity pro­fes­sion­als to ap­ply to the Cyber Verification Program. If they get ap­proved, the ap­pli­cants have fewer lim­i­ta­tions on us­ing Claude for cy­ber­se­cu­rity work. OpenAI has a sim­i­lar pro­gram called Trusted Access for Cyber.

Lorenzo Franceschi-Bicchierai is a Senior Writer at TechCrunch, where he cov­ers hack­ing, cy­ber­se­cu­rity, sur­veil­lance, and pri­vacy.

You can con­tact or ver­ify out­reach from Lorenzo by email­ing lorenzo@techcrunch.com, via en­crypted mes­sage at +1 917 257 1382 on Signal, and @lorenzofb on Keybase/Telegram.

AI agent runs amok in Fedora and elsewhere

lwn.net

Agentic AI sys­tems can be used to do a va­ri­ety of things au­tonomously on be­half of a hu­man user: open or man­age bugs, gen­er­ate code, sub­mit pull-re­quests, and (apparently) even com­plain about re­jec­tion. In May, a Fedora de­vel­oper dis­cov­ered that an al­legedly rogue agent had been pes­ter­ing the pro­ject in a num­ber of ways: re­as­sign­ing bugs, fab­ri­cat­ing un­help­ful replies to bugs, and even per­suad­ing main­tain­ers to merge ques­tion­able code into the Anaconda in­staller. It also sub­mit­ted a num­ber of pull re­quests (PRs), some ac­cepted, to sev­eral up­stream pro­jects. The Fedora ac­count as­so­ci­ated with the agent has had its group priv­i­leges re­voked and the messes have been mopped up, but the mo­tive be­hind the agen­t’s ac­tions is still a mys­tery.

“Kind of erratic”

On May 27, Adam Williamson copied Fedora’s developer and testing mailing lists on a message to Nathan Giovannini about what appeared to be an unsupervised agentic AI system under Giovannini’s control. “It’s great that you’re trying to fix things, but the results seem to be kind of erratic.”

Williamson said that he was still looking through the history of Giovannini’s actions in Bugzilla, but had already spotted a number of problems. For example, Williamson had found dozens of instances of Giovannini’s agent assigning Bugzilla entries to his account after submitting allegedly related pull requests to upstream projects, or closing a bug after a PR was merged into an upstream project. In some cases, the agent simply closed bugs with comments that either restated the original bug or were, as Williamson said of this comment, “superficially plausible, but problematic in other ways”.

In addition, Williamson said that Giovannini (or his agent) had submitted patches that were incorrect and then replied to objections with LLM-generated justifications that “eventually overwhelmed the maintainer into merging the fix”. The agent, as GitHub user “nathan9513-aps”, had submitted a pull request for the Anaconda installer used by Fedora and other Linux distributions. The PR’s description claimed it was a fix for an Anaconda bug that would cause installation to fail, but the patch actually preserved a kernel option passed on the command line that seemed to have nothing to do with the actual bug.

The agent’s GitHub account has since been disabled. It now shows up in conversations on GitHub as “ghost”, which is the platform’s default placeholder for user accounts that have been deleted. Thus, it is difficult, if not impossible, to piece together a full trail of all the agent’s actions on GitHub.

Williamson said, rather diplomatically, that the agent’s actions were “not having a positive impact on Fedora or the upstream projects”, and suggested that Giovannini adjust the agent to be “substantially less autonomous”. He specifically asked that the agent not assign bugs to Giovannini, change their state, or post “confident assertions or specific action recommendations” without human review.

Hacked?

Later on May 27, Williamson said that Giovannini had replied to him privately to say that his credentials had been compromised and that he was not the one behind the AI system. “Obviously we should therefore treat any actions it has taken with suspicion”, Williamson said. He planned to review the bugs touched by Giovannini’s account “even more aggressively”, and asked for help from others to review them as well.

A reply later that day, ostensibly from Giovannini, said that he was able to regain access to his GitHub and Fedora accounts and “I am currently securing and reviewing all involved systems and credentials”. The reply said his GitHub account was “nathangiovannini99”. Williamson replied that the GitHub account was only an hour old, and that the recent emails to the list and sent to Williamson privately did not seem like messages Giovannini had sent in earlier interactions with the project.

Giovannini has par­tic­i­pated in dis­cus­sions at least as far back as 2018, and his ac­tiv­ity in Bugzilla goes back to at least 2016. He does not ap­pear to have been a par­tic­u­larly ac­tive con­trib­u­tor to the pro­ject, but his in­volve­ment clearly pre­dates the agen­tic AI era. Whether his ac­count is now be­ing op­er­ated by a hu­man at­tacker, an agen­tic AI, or a mix of both, it has a le­git­i­mate his­tory prior to its re­cent ac­tiv­ity.

Williamson said that he had reviewed account activity in Bugzilla by “nathan95” from this year, and found suspicious activity, such as severity and priority changes to a bug with no justification, beginning on April 7, in bug 2416721. Activity before that appeared legitimate, he said, and none of the activity that he had seen so far looked outright malicious.

He also identified another GitHub account, “leurus27-boop”, as likely being associated with the same agentic AI. That account is still active, and has submitted a PR to the openSUSE Commander (osc) command-line interface for the Open Build Service as well as a PR to the lxqt-policykit repository. That project is used to extend the privileges of the LXQt desktop’s lxqt-admin GUI tools for administering operating-system settings such as user and group configurations.

Williamson said that it would be good to look through any other actions by the related accounts and warn other projects that they should review anything that had been submitted by them. Williamson seems to have followed up on each PR to warn other maintainers that “the whole situation is extremely fishy”. Kevin Fenzi said that he had removed the nathan95 user from any groups it had been in, so it should no longer have the permission to reassign or close bugs.

Pre-attack?

Martin Kolman, a member of the Anaconda team, said the events were “really problematic” even if not malicious. The team had spent a lot of time reviewing PRs from what seemed to be an eager contributor: “while it started to look off after a while, all the replies were still like this - a bit weird, but still *plausible*”. He also theorized that it could be an attacker working their way up to malicious activity, much like the XZ backdoor:

Unfortunately, for an ac­tual at­tack the prepara­tory phase could (and for the Xz at­tack did) look very sim­i­lar - a new con­trib­u­tor slowly gain­ing trust in the com­mu­nity, get­ting in harm­less changes and build­ing up to the point when the at­tack pay­load can be in­jected (or the changes not ac­tu­ally be­ing harm­less if com­bined the right way).

So not say­ing this was it, but an AI agent au­to­mated at­tempt at a Xz like com­pro­mise might re­ally look very sim­i­lar what we have just seen here.

Chris Adams said that the com­mit to Anaconda should be in­spected and prob­a­bly re­verted im­me­di­ately. Kolman replied that it had been re­verted. He also con­firmed that the LLM-generated PRs had made it into the Anaconda 45.5 re­lease on May 26. They were re­verted in the Anaconda 45.6 re­lease on June 2.

The tar­gets cer­tainly sug­gest that it may have been a pre­lude to an at­tack of some sort; an op­er­at­ing-sys­tem in­staller, a util­ity for es­ca­lat­ing user priv­i­leges, and a tool for in­ter­act­ing with a build sys­tem all seem like promis­ing av­enues for in­sert­ing mal­ware or hi­jack­ing sys­tems.

It’s dis­con­cert­ing that what ap­pears to be an AI agent has had so much suc­cess af­ter gain­ing ac­cess to a hu­man con­trib­u­tor’s ac­counts. It seems that an AI agent with ac­cess to an ac­count with a le­git­i­mate his­tory of in­ter­act­ing with pro­jects stands a good chance of per­suad­ing busy main­tain­ers to ac­cept ques­tion­able con­tri­bu­tions. Happily, Williamson caught this be­fore it be­came a big­ger prob­lem. Let’s hope that other hu­man main­tain­ers are as ob­ser­vant.

mimo.xiaomi.com

Lines of Code Got a Better Publicist

curlewis.co.nz

It’s fif­teen years ago (bear with me, I’ve been in this in­dus­try since the late 90s, most of my good sto­ries start this way), and you’ve got two se­nior de­vel­op­ers at a SaaS com­pany. One of them writes 40% more lines of code than the other. Is that de­vel­oper bet­ter? More im­pact­ful for the busi­ness? Should the other one be pol­ish­ing their CV?

Of course not. You’d want to know what ac­tu­ally shipped. What it did for cus­tomers, for rev­enue, for re­li­a­bil­ity. Lines of code, PR counts… we spent a cou­ple of decades learn­ing these are stereo­typ­i­cally bad ways to mea­sure a de­vel­oper, to the point where sug­gest­ing them to­day is laugh­able.

Sooooo… Here’s what the in­dus­try put on the bill­board this year:

Google: 75% of new code is AI-generated.

Anthropic: ~80% of merged production code is written by Claude, and engineers ship “8x more code per quarter”.

OpenAI: also ~80%, apparently.

Cursor: “100M+ lines of enterprise code written per day”.

Every sin­gle one is a vol­ume claim. Percent of code writ­ten by AI is just lines of code with a bet­ter pub­li­cist. (The scep­tic in me edit­ing this draft would like to point out that it’s no co­in­ci­dence that all of these are AI ven­dors of some kind, so pump­ing adop­tion is pretty im­por­tant to them.)

We used to claim out­comes

Rewind a few years and the head­line num­ber was dif­fer­ent in kind, not just size. GitHub’s flag­ship claim was that de­vel­op­ers com­pleted tasks 55% faster with Copilot. Say what you like about that study (plenty did), but it was an out­come claim. Bold, fal­si­fi­able, about value. If it was wrong, you could show it was wrong.

The 2026 claims can’t fail. That’s the genius of them; “75% of our code is AI-written” could be true, and will keep going up, regardless of whether anything got better (faster delivery, fewer incidents, happier customers, etc). A volume number can only ever disappoint you if adoption stalls, and adoption is the one thing most of us agree is real. 📈

So the claims got big­ger and started say­ing less. What hap­pened in be­tween?

The bit no­body puts on a bill­board

The out­come ev­i­dence got com­pli­cated, that’s what hap­pened.

The strongest pro-adoption result is still Cui et al.: nearly 5,000 developers, +26% completed tasks, with the biggest gains for junior devs. Not really in dispute. But then GitClear showed code churn rising and refactoring collapsing as Copilot adoption deepened. Then METR ran the study many have quoted: experienced open-source devs were 19% slower with AI in their own codebases, while believing they were 20% faster.

But! Hold my beer… in February 2026 METR effectively walked it back: their follow-up estimates flipped to a speedup (with error bars wide enough to ride a Moto Guzzi, with panniers, through!), and they abandoned the study design entirely - because developers now refuse to work without AI, and can’t reliably self-report time on agentic work. Their latest position: AI probably speeds developers up in 2026, and we can no longer cleanly measure by how much.

Meanwhile at the company level, an NBER survey of ~6,000 executives found 69% of firms actively using AI and roughly nine in ten reporting no measurable productivity impact. The cross-study consensus sits somewhere around 10% organisational gains. Not nothing! Still bloody useful! Buuuut, also not “you don’t need developers anymore” territory.

And if you’re a sceptic still quoting “19% slower”, you’re cherry-picking too. The research keeps updating; the industry just changed what it counts.

Vanity met­rics, now in AI flavour

It’s not just AI vendor claims, to be fair. Carnegie Mellon’s SEI and Accenture launched an AI Adoption Maturity Model just a few days ago: five levels, eight dimensions, marketed off a stat about 95% of organisations seeing no returns. Steve Yegge’s “8 levels of AI-assisted development” ranks you by which tools you run and how much supervision you give them. And every tools vendor now ships a maturity ladder whose top rung is, usually, “use more of our product”. These ladders measure adoption intensity and call it maturity. Same substitution, nicer packaging.

My favourite data point in this whole genre: Augment surveyed 219 engineering leaders and asked them to define “AI-native engineering”. They got 219 different answers. 🫠

And the prize for holding both ends of the rope goes to Anthropic, who gave us the “8x more code shipped” claim and one of the more rigorous studies of the year: an RCT finding that AI-assisted developers scored 17% lower on comprehension of the code they’d just shipped, with no statistically significant productivity gain. I use Claude every single day (it recommended half the links I read for this post, so the irony is not lost on me), the products are genuinely excellent, and their research arm updates while their marketing arm counts volume. Both things are true at once, which is kinda the point.

Why I ac­tu­ally care

Because these numbers aren’t decorative. They move budgets, performance expectations, and headcount plans. In February, Jack Dorsey cut over 40% of Block’s workforce (4,000+ people) with AI as the explicit core thesis: “A significantly smaller team, using the tools we’re building, can do more and do it better.” A couple weeks later, Atlassian cut 10% (~1,600 people), while conceding it would be “disingenuous to pretend AI doesn’t change the mix of skills we need or the number of roles required”. And there’s a key detail that gets me: Dorsey said, in the same announcement, that the business was strong and gross profit was growing.

When a company says “AI made everyone more productive, so we need fewer people”, I want to see the evidence - and I don’t believe it exists today. Show me that x% of your workforce is genuinely idle (or even just underutilised) because the work can now be done by fewer people. Even then: I’ve never seen a product/SaaS company that didn’t have an endless roadmap. If you got a free headcount increase essentially overnight, why wouldn’t you use it to deliver more value to your customers, faster? That should show up as MAU, conversion, revenue. Choosing the layoff instead tells me the productivity claim is doing PR work for a decision that was already made for other reasons (over-hiring, investor pressure, take your pick).

Look, every business carries some fat, and I can accept efficiency-driven trimming as a thing that sometimes legitimately happens - it has at every step change in this industry. But when it happens, try to do so using the individual performance systems you already run, the ones that surface who’s cruising and who’s disengaged. Not token counts. Not “% of code AI-written” or somebody’s level on a maturity ladder. If your selection evidence is a vanity metric, your selection is a lottery wearing lipstick.

Where I land

As I’ve said in previous posts, don’t read any of this as anti-AI. I think every engineer should be using AI daily. Call it AI-first, AI-proficient, whatever you like. Be curious, try the new tools, test the latest models. To not do so is silly. I’ve watched this industry absorb higher-level languages, IDEs, autocomplete, agile and devops, and there were always crusty hold-outs reminiscing about the good old days before X came along and ruined everything. The hold-outs eventually got on board (usually). The difference this time is pace: you could delay adopting “the cloud” for a couple of years and survive. With AI you might get a few months. The way we work has already changed, and it’s not changing back as far as I can tell.

But adop­tion is the start­ing line, not the score­board. We al­ready know how to mea­sure whether en­gi­neer­ing is de­liv­er­ing: DORA met­rics, re­li­a­bil­ity, rate of mean­ing­ful change, and ul­ti­mately rev­enue and cus­tomer value. Battle-tested, crusty stuff. Why are we throw­ing all of that out for bull­shit AI van­ity scores? (I could be wrong about plenty in this post, but I don’t think I’m wrong about that one.)

So here’s the ques­tion to smug­gle into your next ven­dor pitch, exec re­view, or LinkedIn doom-scroll: is that an out­come, or a vol­ume? It’s amaz­ing how quickly a po­si­tion or state­ment de­flates when you ask that.

The change is here to stay and the tools are good. The hope­ful part is that we al­ready know how to mea­sure what mat­ters (and none of it is counted in to­kens).

Be AI-first in how you work, but bat­tle-tested in how you mea­sure it.

Cheers, Dave

Solar generates more energy in US than coal for first time

www.theguardian.com

Even as Donald Trump boosts coal over clean en­ergy, so­lar power is hit­ting new mile­stones in the US and re­mains the lead­ing source of new power.

Data re­leased on Wednesday by the global en­ergy think­tank Ember, along with a re­port by the Solar Energy Industries Association (Seia) and an­a­lyt­ics firm Wood Mackenzie, show the con­tin­ued growth of so­lar and de­cline of coal in the United States de­spite fed­eral pol­icy. In May, for the first time, so­lar sup­plied more of the na­tion’s elec­tric­ity than coal, or 12.8%, Ember said. Coal sup­plied 12.2%, its fourth-low­est monthly share ever.

“For years solar power has risen in the US electricity mix,” said Nicolas Fulghum, senior energy and data analyst at Ember. “At the same time, coal power has lost its status, first as the largest source in the US mix, and then gradually over the years has fallen even further.”

Solar also be­came the third-largest source of elec­tric­ity in the US in May, be­hind nat­ural gas and nu­clear, Fulghum said. Coal gen­er­a­tion hit an all-time monthly low in April and re­bounded only mod­estly in May, al­low­ing in­creas­ing so­lar gen­er­a­tion to over­take coal, he added.

Electricity is pro­duced by con­vert­ing sources of en­ergy — fos­sil fu­els, re­new­able re­sources and nu­clear — into elec­tri­cal power. Burning coal, oil and nat­ural gas for elec­tric­ity emits car­bon diox­ide, trap­ping heat in the at­mos­phere and warm­ing the planet. By con­trast, so­lar, wind, ge­ot­her­mal, hy­dropower and nu­clear are car­bon-free.

After about two decades of es­sen­tially flat elec­tric­ity con­sump­tion in the US, elec­tric­ity de­mand is in­creas­ing to power ar­ti­fi­cial in­tel­li­gence, grow do­mes­tic man­u­fac­tur­ing and elec­trify trans­porta­tion and heat­ing. Fulghum said he ex­pected to see more months when so­lar ex­ceeds coal gen­er­a­tion, be­fore over­tak­ing it on an an­nual ba­sis in a few years.

These milestones signify that solar “has staying power” at a time when there is less support for renewable energy at the federal level, he added.

Wind and so­lar com­bined have over­taken coal in the past, and wind power alone has out­paced coal dur­ing spring months when wind speeds pick up. Ember gets its hourly and monthly data from the US Energy Information Administration.

Globally, elec­tric­ity gen­er­a­tion from re­new­ables is grow­ing rapidly. Renewables will be­come the largest global en­ergy source, used for al­most 45% of elec­tric­ity gen­er­a­tion by 2030, ac­cord­ing to the International Energy Agency.

Last week, Trump, a Republican, announced a plan to boost the struggling US coal industry by spending nearly $700m to support coal-fired power plants and coal exports. Trump said at a White House event that “coal’s a great business” and that “in terms of power, there’s really nothing like it”.

Martin Pochtaruk, CEO and founder of Canadian-based so­lar panel man­u­fac­turer Heliene, said Trump can say that coal is com­ing back but in­vestors will in­vest their money in what­ever brings the best re­turn. And for power gen­er­a­tion that is so­lar, mak­ing it the fastest-grow­ing fuel, he added.

A White House spokes­woman de­fended the Trump ad­min­is­tra­tion’s over­all en­ergy poli­cies, say­ing they were geared to­ward strength­en­ing the coun­try’s se­cu­rity.

“The President has reversed the Left’s devastating policies, saved the American coal industry, prevented the retirement of more than 17 gigawatts of power, and saved lives during heightened demand periods,” Taylor Rogers said in a statement.

While Trump is try­ing to re­verse the coal in­dus­try’s de­cline, so­lar has been the top source for new power for five years, Seia said. Seia and Wood Mackenzie said so­lar and bat­tery stor­age were prac­ti­cally the only en­ergy re­sources be­ing built in the first quar­ter, mak­ing up 91% of all new gen­er­at­ing ca­pac­ity.

The Trump ad­min­is­tra­tion has can­celed so­lar and wind pro­jects, im­ple­mented poli­cies that slowed clean en­ergy per­mit­ting and de­vel­op­ment and ter­mi­nated $7bn in fund­ing in­tended for af­ford­able so­lar en­ergy pro­jects across the US.

Sweet Jeebus, MacOS 27 Golden Gate Removes the Dumb Icons From Menu Items

daringfireball.net

Perhaps the worst UI crime in MacOS 26 Tahoe was the inexplicable decision to add inscrutable, distracting icons next to every item in the menu bar. You will recall Jim Nielsen writing about it, rightly describing it as exactly the sort of thing that Mac users look down upon in platforms like Google Docs and Windows. You will also recall Nikita “Tonsky” Prokopov writing about it, illustrating that the bad idea wasn’t even implemented well, with different Apple apps using entirely different icons for the same menu items. You will also recall my linking to Nielsen (“I can tolerate being angry about UI changes Apple makes to the Mac. But I can’t tolerate being heartbroken.”) and to Prokopov (“The fact that Tahoe’s menu item icons are glaringly inconsistent and often utterly inscrutable is the fudge icing on a shit cake, but the real embarrassment is that the idea ever got past the proposal stage. No real UI or icon designers think this is a good idea. None.”)

Top third-party developers rightly rejected the design, adopting open source code from Brent Simmons to disable the default “icons in all standard menu items” behavior.

Wonderful news in MacOS 27 Golden Gate: the icons are gone. It’s like Tahoe’s menu item icons never hap­pened. Prokopov noted it on Mastodon with be­fore and af­ter screen­shots, and men­tions that Apple has up­dated the Human Interface Guidelines ac­cord­ingly:

Use menu item icons spar­ingly and with pur­pose. Icons al­low peo­ple to find menu items more quickly, and help clar­ify what se­lect­ing an item does. Use an icon to high­light the most com­mon ac­tions and key fea­tures of your app, file sys­tem lo­ca­tions, con­nected de­vices, vi­sual con­cepts like ro­tat­ing or flip­ping an im­age, and user-gen­er­ated con­tent like fold­ers and doc­u­ments. Don’t dis­play an icon if you can’t find one that clearly rep­re­sents the menu item.

This up­dated ad­vice in the HIG is per­fect. Screenshot:

MacOS 26 Tahoe — across every Apple app on the system — is a living example of the updated HIG’s “what not to do” example illustrations (including the second section about groups within a menu). If you’re stuck using Tahoe until Golden Gate arrives, recall this tip to alleviate the problem to some extent.

This is my fa­vorite news from all of WWDC this week. I mean that. In a small way I mean it be­cause I so loathe this as­pect of MacOS Tahoe. But in a large way I mean it be­cause it’s proof that the rot has been rooted out of Apple’s soft­ware de­sign team. I don’t know if all the un­tal­ented hacks are gone, but the un­tal­ented mag­a­zine-de­signer hacks with clout and in­flu­ence all left with Alan Dye. I’ve chat­ted with a few peo­ple from Apple’s de­sign team this week and they’re all lov­ing the work they’re do­ing and the di­rec­tion they’re tak­ing Apple’s plat­forms. Backtracking on these id­i­otic menu item icons was a nec­es­sary first step.

Why AI hasn’t replaced software engineers, and won’t

www.normaltech.ai

There is great anx­i­ety and un­cer­tainty about AI re­plac­ing jobs. How can we move past vague warn­ings and bom­bas­tic pre­dic­tions and bring data to bear on this ques­tion? One good way is to look at the pro­fes­sion where AI ca­pa­bil­i­ties are fur­thest along and adop­tion has been ex­cep­tion­ally rapid: soft­ware en­gi­neer­ing.

In this es­say, we ar­gue that there is enough ev­i­dence to re­ject the nar­ra­tive that once AI ca­pa­bil­i­ties reach a cer­tain thresh­old, it will cause mass lay­offs. Given that this is true even in a sec­tor with very few reg­u­la­tory bar­ri­ers, most other pro­fes­sions are likely to be even more cush­ioned.

We also have a good understanding of why this is the case. We can think of many kinds of knowledge work, including software development, as a “decide-execute-deliver sandwich”. AI compresses the “execute” layer — the middle of the sandwich — but the other two layers resist automation in a way that will not be overcome by capability improvements alone.

We con­clude on a note of cau­tious op­ti­mism about the fu­ture tra­jec­tory of de­mand for soft­ware en­gi­neer­ing. This es­say is the first in a se­ries, and the next one will look at rea­sons why in­di­vid­ual soft­ware en­gi­neers’ ca­reers might be rocky even if over­all de­mand is healthy. The se­ries is based on the pub­lished lit­er­a­ture in eco­nom­ics and soft­ware en­gi­neer­ing, our own eval­u­a­tions and ob­ser­va­tions of AI agents, and many soft­ware en­gi­neers’ re­flec­tion on the pre­sent and fu­ture of AI im­pacts on their pro­fes­sion, gleaned both from pub­lished writ­ings and our in­ter­ac­tions with the com­mu­nity.

Consider three sto­ries that made the head­lines and how they con­trasted with re­al­ity:

In February, fintech company Block (maker of Cash App, Square, Afterpay, and other such apps) announced layoffs of 4,000 employees because, according to founder Jack Dorsey, AI is “enabling a new way of working” with “smaller and flatter teams”, specifically citing late-2025 improvements in model capabilities. But subsequent reporting revealed a radically different picture. After growing headcount more than threefold during the pandemic, the company was under massive financial pressure. A data scientist on the Cash App team, Naoko Takeda, posted that Block “shoved AI down everyone’s throats” yet she saw “very limited gains in productivity.” She refused a 75% retention raise and quit. Other employees interviewed had a sharply different understanding of what AI was capable of at Block and whether Dorsey had a competent understanding of the issues. As Aaron Levie has pointed out, CEOs are uniquely prone to delusions about AI’s usefulness because they can build quick prototypes but can’t see the 90% of work it takes to turn it into a finished product. Dorsey’s public statements about AI seem to fit exactly this pattern.

In April, Snap laid off about 1,000 people, with CEO Evan Spiegel primarily citing AI as the reason in his layoff memo. He also said that AI generated 65% of new code. In reality, the layoffs followed a campaign by an activist investor demanding cost cuts. (Snap has posted a net loss every full year since its 2017 IPO and shares were down over 30% in 2026.) Tellingly, the nature of the cuts, such as 150 jobs spanning various roles in the augmented reality division, doesn’t correlate with the cuts we would expect to see if they were driven by AI (i.e. programming and other “AI-exposed” jobs across the board, not concentrated in any unit).

In May, Intuit announced 3,000 cuts, alongside deals with Anthropic and OpenAI. The press connected the two, framing the layoffs as AI-driven restructuring. For once, the CEO actually pushed back on this easy narrative, saying that none of it had to do with AI and that the cuts targeted “coordination-heavy roles” and too many management layers.

We did not cherry-pick these examples. In every story about AI-driven software engineering layoffs that we examined, the same narrative violation emerged. It turns out that “AI washing” of job cuts is an economy-wide phenomenon, evidenced by many surveys:

59% of U.S. hir­ing man­agers ad­mit­ted they em­pha­size AI when ex­plain­ing hir­ing freezes or lay­offs be­cause it plays bet­ter with stake­hold­ers than cit­ing fi­nan­cial con­straints.

Forrester principal analyst J. P. Gownder says of companies preparing supposedly AI-driven layoffs: “When we ask if they have a mature, vetted AI app ready to fill in those jobs, nine out of 10 times, the answer is no—and they haven’t even started.”

In an HBR survey of over 1,000 global executives, 21% had made large headcount reductions “in anticipation of” AI, with another 39% having made low or moderate anticipatory headcount reductions. In contrast, only 2% had already made large reductions in headcount related to actual AI implementation. The 10x gap suggests that executives, like everyone else, are highly prone to succumbing to the misleading narratives about AI replacing jobs.

Another in­ter­est­ing data point comes from the WARN Act, which re­quires cer­tain dis­clo­sures of plant clos­ings and mass lay­offs af­fect­ing over 100 work­ers. In March 2025, New York be­came the first U.S. state to add an AI dis­clo­sure check­box to WARN Act fil­ings. In the full first year, more than 160 com­pa­nies filed WARN no­tices. Not a sin­gle one checked the AI box.1 We reached out to the NY Department of Labor who con­firmed that as of late May, only one com­pany, Nespresso, checked the box.2 If these fil­ings are ac­cu­rate, only 46 out of about 25,000 laid off work­ers in New York State in the rel­e­vant pe­riod, or about two-tenths of a per­cent, were af­fected by AI.
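As a quick check on the arithmetic in the paragraph above, here is a minimal sketch that recomputes the share from the two figures the essay quotes. The variable and function names are illustrative, and 25,000 is the essay's rounded total, so the result is approximate.

```python
# Figures quoted in the paragraph above; 25,000 is the essay's rounded total.
ai_attributed_workers = 46
total_laid_off_workers = 25_000


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


ai_share = share_percent(ai_attributed_workers, total_laid_off_workers)
print(f"{ai_share:.3f}% of laid-off workers")
```

The result comes out a little under 0.2%, which matches the essay's "about two-tenths of a percent".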

Even more damning for the AI-driven-mass-layoffs narrative: layoffs are the wrong signal of AI’s potential productivity benefits in the first place! The research is clear that the effect operates through “slower hiring rather than increased separations”. Firing existing workers results in the loss of precisely the tacit knowledge and organizational capital that allows workers to operate AI effectively. Besides, it is expensive in terms of severance, damage to morale, and rehiring risk. It is also largely unnecessary, since natural turnover achieves the same result in a few years.

So what does the data tell us when we look be­yond lay­offs to over­all em­ploy­ment trends? An im­por­tant pa­per from Federal Reserve econ­o­mists com­piles the ev­i­dence in the U.S. con­text. Employment is still grow­ing, but they find that it is grow­ing slower post-Chat­GPT com­pared to a no-AI coun­ter­fac­tual, by about 3 per­cent­age points per year. One im­por­tant lim­i­ta­tion of this study is that the method­ol­ogy can’t cap­ture self-em­ploy­ment, so it is pos­si­ble that some of the slow­down in growth is be­ing ab­sorbed by en­tre­pre­neur­ship in­stead. We do have ev­i­dence from other stud­ies that AI makes en­tre­pre­neur­ship eas­ier. So the real pic­ture is prob­a­bly even health­ier than the Federal Reserve study sug­gests.3

Finally, it is worth ac­knowl­edg­ing two kinds of in­di­rectly-AI-dri­ven job losses in soft­ware en­gi­neer­ing that are real, but dif­fer­ent from AI re­plac­ing soft­ware en­gi­neers. First, AI some­times dec­i­mates de­mand for the prod­uct, in cases like Chegg (homework help) or Stack Overflow (technical help), both of which have laid off work­ers. AI does­n’t di­rectly do the job that these work­ers did, but rather ob­vi­ates the need for it. The his­tor­i­cal par­al­lel is strong: Among the 270 jobs in the 1950 U.S. cen­sus, only one job was au­to­mated away — el­e­va­tor op­er­a­tor. But many oth­ers were ren­dered ob­so­lete by new tech­nol­ogy, like the job of tele­graph op­er­a­tor.

Another credible AI-driven layoffs story is among companies that sell AI, rather than buy it. So when companies like IBM or SAP announce layoffs because of AI, a more accurate framing is “we reallocated headcount from legacy functions to our fastest-growing product line.” That’s ordinary corporate restructuring around a revenue opportunity, not technology displacing workers.

Many tech lead­ers, like the Snap CEO above, re­port the per­cent­age of code writ­ten by AI along­side re­ports of lay­offs or pre­dic­tions of fu­ture job losses. This feeds into the sim­plis­tic men­tal model that once AI writes all the code, there is no need for coders. Fortunately, this men­tal model is wrong. This AI-written-code met­ric is al­most com­pletely dis­con­nected from what mat­ters for la­bor dis­place­ment. Here’s why.

First, writing code isn’t, and never was, the bottleneck. For example, a 2019 paper summarized existing studies with the conclusion that “developers spend surprisingly little time with coding, 9% to 61% depending on the study”. This finding was consistent with the paper’s own data from 6,000 developers at Microsoft. As coding agents began to be taken up, there was an explosion of blog posts in late 2025 pointing out that writing code isn’t the bottleneck, as developers realized that using agents to write most of the code led to little impact on overall productivity [1, 2, 3, 4, 5, 6, 7, 8].

If writ­ing code is­n’t the bot­tle­neck, what is? The task-break­down sur­veys point at things like meet­ings or de­bug­ging. This just leads to more ques­tions: what are de­vel­op­ers do­ing in those meet­ings and why can’t it be done by AI? Won’t de­bug­ging get au­to­mated as ca­pa­bil­i­ties im­prove? To un­der­stand the real bot­tle­necks, we have to get qual­i­ta­tive, and dig into soft­ware en­gi­neers’ own un­der­stand­ing of what it is they do that re­sists au­toma­tion.

When we did this analysis, it revealed three things as the real bottlenecks: (1) deciding and specifying what to build, (2) verifying and being accountable for what is delivered, and (3) the deep human understanding — of the codebase, the business, and the environment — required to carry out both of these.

In other words, software engineers’ work consists of a “decide-execute-deliver” sandwich (with understanding being a prerequisite for all three). AI has compressed the middle of the sandwich, but has left the two ends largely unchanged. As long as software development teams are in charge of decision making and accountable for what they deliver, engineers still need to spend time building up a deep understanding of the system. These are the three bottlenecks.

Figure: Software de­vel­op­ment con­sists of three lay­ers: (1) Decision mak­ing — prob­lem fram­ing, spec­i­fi­ca­tion, plan­ning (2) ex­e­cu­tion — de­sign and im­ple­men­ta­tion (3) de­liv­ery — test­ing, ver­i­fi­ca­tion, in­te­gra­tion, main­te­nance, etc. Note that these are con­cep­tual lay­ers, not tem­po­ral phases. It is com­mon to switch back and forth in the course of a pro­ject.

Evidence for the sandwich model of AI’s productivity effects comes from a recent paper on “Writing Code vs. Shipping Code”. Across 100,000 developers on GitHub, the researchers found that AI agents led to an eight-fold increase in the number of lines of code written, consistent with the idea that AI almost completely compresses the Execute layer of the sandwich. But this led to only 30% more releases, strongly suggesting that human bottlenecks (the Decide and Deliver layers) remain in place.4
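To make the gap between the two figures concrete, the sketch below divides one multiplier by the other. It assumes, purely for illustration, that both figures are measured against the same pre-agent baseline, which the summary above does not state. The variable names are hypothetical.

```python
# Multipliers taken from the paragraph above.
lines_multiplier = 8.0     # "eight-fold increase" in lines of code written
releases_multiplier = 1.3  # "30% more releases"

# Assumption: both multipliers share one baseline, so their ratio gives the
# implied growth in lines of code written per release shipped.
lines_per_release_multiplier = lines_multiplier / releases_multiplier
print(f"about {lines_per_release_multiplier:.1f}x more code per release")
```

Under that assumption, each release would carry roughly six times as much newly written code as before. That is consistent with the essay's point that output volume grew far faster than shipped work.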

Can the sand­wich be fur­ther com­pressed? We don’t think so. At one end of the pipeline, de­vel­op­ment teams need to de­cide what to build. One of the most im­por­tant lessons ju­nior soft­ware en­gi­neers learn is that re­quire­ments spec­i­fi­ca­tion (the pro­fes­sion’s lingo for this layer) takes sur­pris­ingly long, and if it is com­pressed, it leads to much more pain down the line. This layer is hard to au­to­mate be­cause it re­quires think­ing about user needs, mar­ket sig­nals, or­ga­ni­za­tional pri­or­i­ties, and in some cases reg­u­la­tory con­straints.

As AI capabilities improve, the kinds of decisions that can be delegated to AI increase over time. But this does not make the “decide” layer thinner: once a decision can be delegated to AI, it is no longer a source of competitive advantage, and the value of human decision-making migrates upward. Software increases in complexity over time, so there is no ceiling to this process.

At the other end of the sand­wich, hu­man teams need to be ac­count­able for what they de­liver. It is pos­si­ble that some day in the fu­ture teams will ship mis­sion-crit­i­cal code with­out fully test­ing and un­der­stand­ing it, but to­day’s AI is so un­re­li­able that such hap­haz­ard prac­tices would rep­re­sent an ex­is­ten­tial threat to soft­ware teams and their cus­tomers.

Even if the tech­ni­cal bar­ri­ers go away in the fu­ture, we don’t have to cede con­trol to AI. A cen­tral in­sight of AI as Normal Technology is that we can col­lec­tively choose to keep hu­mans ac­count­able through shared norms, law, and pol­icy. This is a much more re­silient way to con­trol the speed of AI im­pacts and im­prove safety than try­ing to slow the de­vel­op­ment of tech­ni­cal ca­pa­bil­i­ties. These speed bar­ri­ers are al­ready largely in place due to li­a­bil­ity laws and sec­tor-spe­cific reg­u­la­tion, but can be fur­ther strength­ened. (For a longer ver­sion of this ar­gu­ment, see the orig­i­nal es­say.)

In this vi­sion, as more and more of the ex­e­cu­tion layer gets del­e­gated to AI, the soft­ware en­gi­neer’s role in the fu­ture be­comes anal­o­gous to that of a crane op­er­a­tor. AI agents will do most of the cog­ni­tive heavy lift­ing; su­per­vis­ing the agent and keep­ing it in con­trol be­comes most of the hu­man’s job.

Some commentators argue that a future with humans staying in control is unlikely because it is too costly to pay people to do so. There have already been a few viral stories of poorly-supervised coding agents deleting production databases or causing other types of damage. But we view these as “man bites dog” stories rather than an emerging norm. They go viral precisely because they represent such irresponsible and unusual behavior that they have shock value, and serve as regular reminders and learning moments helping the community guard itself against over-reliance on AI. As the aphorism goes, “if it’s in the news, don’t worry about it”. Still, being able to detect whether there is an uptick in poorly-supervised use of AI for high-stakes tasks (across the economy, not just in software engineering) remains one of the most critical data gaps we have today.

By the way, the sandwich getting squished is not a new trend and it is not uniquely due to AI. Over two decades ago, the Bureau of Labor Statistics started tracking programming separately from software engineering. Roughly speaking, programmers are responsible only for execution while software engineers manage a bigger part of the sandwich. Not only has programming been shrinking, it also pays much less because it is seen as grunt work. AI merely accelerates this long-existing trend, further devaluing purely technical skills.

This pattern, where humans remain heavily involved at both ends of the decide-execute-deliver sandwich even as AI increasingly automates the middle layer, seems to be broadly applicable to most knowledge work, though it is farthest along in software. After all, complex decision making and accountability are common to most fields. A lack of recognition of this phenomenon has led to many overconfident predictions about imminent job losses, such as among radiologists.

One reason for confusion about the extent to which software engineering is changing is the sloppy use of the term “vibe coding” to refer to a wide spectrum of practices, the ends of which are conceptually distinct and more dissimilar than similar.

In true vibe cod­ing the user sim­ply tells the agent what to do, does­n’t su­per­vise it when it’s run­ning, does­n’t re­view the code — might not even have the skills to do so — and does­n’t eval­u­ate the out­put, be­yond per­haps notic­ing when things are vis­i­bly bro­ken.

This is in con­trast to how most soft­ware en­gi­neers are ac­tu­ally us­ing agents — as a tool, with the hu­man re­main­ing in con­trol and ac­count­able for the out­put. Fortunately, the term agen­tic en­gi­neer­ing is gain­ing cur­rency as a de­scrip­tor of this prac­tice.

As agen­tic en­gi­neer­ing has be­come the norm, en­gi­neers are dis­cov­er­ing that su­per­vis­ing cod­ing agents is sur­pris­ingly time con­sum­ing. For ex­am­ple, Simon Willison, a promi­nent de­vel­oper and chron­i­cler of the AI tran­si­tion, has noted how he is men­tally ex­hausted by 11am from su­per­vis­ing agents. This is con­sis­tent with our ex­pe­ri­ence as well.

More quan­ti­ta­tive ev­i­dence comes from SWE-chat, a dataset of cod­ing agent in­ter­ac­tions from open-source de­vel­op­ers who opted into a log­ging tool. The study found that only 44% of agent-pro­duced code sur­vives into user com­mits, that vibe-coded com­mits in­tro­duce vul­ner­a­bil­i­ties at nine times the hu­man-only rate, and that the most com­mon user in­tent is un­der­stand­ing ex­ist­ing code, not gen­er­at­ing new code (19% vs 13%). The self-se­lected na­ture of the dataset means that we can’t draw strong con­clu­sions based on this study alone, but it does re­in­force many other lines of ev­i­dence that vibe-cod­ing and agen­tic en­gi­neer­ing pat­terns are quite dif­fer­ent.

To re-it­er­ate, these are not two dis­tinct cat­e­gories. They are two ends of a spec­trum, and there is a blurry mid­dle. Not every pro­ject is ei­ther a throw­away or mis­sion-crit­i­cal. Not every work­flow fits pre­cisely in the left col­umn or the right col­umn of the table. But the key im­pli­ca­tion for the jobs ques­tion re­mains solid — com­pa­nies can’t ship pro­duc­tion soft­ware by hir­ing un­qual­i­fied vibe coders in­stead of soft­ware en­gi­neers.

AI boosters might claim that mass layoffs are coming; they just haven’t happened yet because human-level software engineering abilities are very recent (or haven’t been achieved yet). But if the sandwich model is correct, these predictions won’t come true. AI has already largely compressed the middle of the sandwich (and the compression actually started decades ago). So even making the execution layer instant and perfect will only be a small change from the status quo. The reason the other two layers have resisted AI is not capability limitations.

In fact, not only are software engineering jobs not going away due to AI, there might even be an increase in demand for software engineers. When software (or anything else) gets cheaper to create due to technological productivity improvements, people will buy a lot more software (in econ jargon, software is highly “price elastic”). And as we have argued, AI doesn’t replace software engineers (the “elasticity of substitution” is low), so the demand for more software results in a derived demand for more software engineers. A loosely related but flashier economics term, “Jevons’ paradox”, is often thrown around in the AI discourse to describe this concept.
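The derived-demand argument can be illustrated with a toy calculation. This is only a sketch under strong assumptions that are ours, not data from any study: a constant-elasticity demand curve, cost savings passed fully into price, and engineer time needed in fixed proportion to each unit of software (a stand-in for a low elasticity of substitution). The function name and the numbers are hypothetical.

```python
def engineer_demand_multiplier(productivity_gain: float, price_elasticity: float) -> float:
    """Change in total engineer-hours after a productivity gain.

    Assumptions (illustrative only):
    - price falls by 1/productivity_gain (full pass-through of cost savings);
    - quantity demanded rises by productivity_gain ** price_elasticity;
    - engineer-hours per unit of software fall by 1/productivity_gain.
    """
    quantity_multiplier = productivity_gain ** price_elasticity
    hours_per_unit_multiplier = 1.0 / productivity_gain
    return quantity_multiplier * hours_per_unit_multiplier


# A 2x productivity gain under three hypothetical demand elasticities.
for elasticity in (0.5, 1.0, 2.0):
    m = engineer_demand_multiplier(productivity_gain=2.0, price_elasticity=elasticity)
    print(f"elasticity {elasticity}: engineer-hours x{m:.2f}")
# elasticity 0.5: engineer-hours x0.71
# elasticity 1.0: engineer-hours x1.00
# elasticity 2.0: engineer-hours x2.00
```

In this toy model, total engineer-hours rise only when the price elasticity of demand exceeds 1, which is why the claim that software is highly price elastic carries so much of the argument.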

Historically, this has been the pat­tern — pro­gram­mer em­ploy­ment in the U.S. has grown from near-zero around 1950 to mil­lions to­day. This is sharply dif­fer­ent from oc­cu­pa­tions such as agri­cul­ture in which la­bor de­mand was fa­mously dec­i­mated due to mech­a­niza­tion and au­toma­tion. The dif­fer­ence is that the amount of calo­ries peo­ple con­sume is rel­a­tively fixed — even a 25% in­crease led to the obe­sity epi­demic — whereas the amount of soft­ware pro­duced has grown a mil­lion­fold. Modern cars have some­thing like a hun­dred mil­lion lines of code run­ning on their var­i­ous on-board com­put­ers.

If there is a ceil­ing to the de­mand for code, we are nowhere near it. Virtually all cog­ni­tive work ben­e­fits from soft­ware. As AI makes cod­ing cheaper, peo­ple are cre­at­ing all kinds of one-off util­i­ties — whether for work or per­sonal use — that it never made sense to cre­ate un­til now.

To be clear, while we think there will be a lot more software in the future, and likely more software engineers, this doesn’t mean big tech companies will get even bigger. The majority of software engineers today already work in-house in non-software firms, and that share might grow in the future. Then there’s the idea of “AI rollups”, which refers to venture capital or private equity firms buying “Main Street” businesses (dentistry practices, accounting firms, and whatnot) and rebuilding them from the ground up to be “AI-native” by embedding software engineers or AI engineers into those businesses. Of course, it might end up being nothing more than hype. It’s too early to tell.

Some peo­ple pre­dict that de­mand for soft­ware en­gi­neer­ing skills will fall be­cause of de­moc­ra­ti­za­tion. They ac­knowl­edge that there will be more soft­ware pro­duced than ever be­fore, and also that more hu­man time will be spent pro­duc­ing soft­ware than ever be­fore, but that this work will be done by peo­ple who are not soft­ware en­gi­neers. The idea is that AI will de­moc­ra­tize soft­ware en­gi­neer­ing to the ex­tent that le­gal soft­ware, for in­stance, can be more eas­ily cre­ated by those with train­ing in law than in soft­ware en­gi­neer­ing.

Maybe. But we’ll bet against it. In our view, this falls into the same trap of conflating vibe coding with agentic engineering, and the execution layer with the whole decide-execute-deliver sandwich. In fact, when we look at the history of programming, there have always been claims that we are at the threshold of democratization: old languages such as FORTRAN, COBOL, and SQL were all accompanied by such prominent hopes at the time of their introduction. It never happened. The barrier isn’t actually learning the syntax. It’s having enough skilled judgment to make good decisions while maintaining accountability.

Ultimately the dis­tinc­tion may be se­man­tic. It seems clear that the amount of time peo­ple spend on get­ting com­put­ers to do new things will in­crease over time. This might take the form of build­ing soft­ware, or man­ag­ing com­plex work­flows us­ing agents, or some­thing else. It will re­quire a mix of soft­ware skills, AI skills, and do­main ex­per­tise. Whether it is to­day’s soft­ware en­gi­neers who will best adapt to fill these new roles re­mains to be seen.

That last point about the need for adaptation sets up the next essay in this series. The fact that aggregate labor demand in software is likely to remain strong doesn’t mean that most individual workers won’t be affected. We will argue that AI will create massive structural shifts in how software is produced, which will have big impacts on which software engineers stand to gain or lose, based on the types of firms they work in, their geography, their seniority, and the pace at which they can adapt.

Further read­ing

Deena Mousa points out the superficiality of broad, economy-wide analyses of AI impacts based on metrics like “AI exposure”, and instead calls for “careful, occupation-specific work”. We hope that this series of essays will play a role in establishing a nuanced understanding of AI’s transformation of software engineering. We’ve previously coauthored, with Justin Curl, a paper analyzing AI in legal services that seriously engages with regulatory and other bottlenecks that make that occupation unique. We plan to do more occupation-specific deep dives in the future.

In a remarkable essay called No Silver Bullet 40 years ago, Fred Brooks distinguished between the “essential complexity” and “accidental complexity” of software. He argued that some of the complexity of software is accidental, arising from limitations of present technology such as the clunkiness of programming languages, and can be alleviated over time as tooling improves. But some of it is essential, because specifying the correct behavior of software is itself hard. He presents a forceful articulation of why the “decide” layer of the sandwich is thick and resists automation. Interestingly, hopes of boosting programmer productivity through AI were already prominent back then! Brooks argues that because AI or any other technology only reduces accidental complexity, it won’t result in an order-of-magnitude productivity improvement. (Brooks is the author of The Mythical Man Month, an essay collection that is almost certainly the best known and most influential writing on software engineering of all time. No Silver Bullet later became part of the collection.)

We are grate­ful to Felix Chen for feed­back on a draft.

1

The checkbox is actually labeled “technological innovation or automation”. If checked, there is a second menu to disclose the specific technology, such as AI or robotics.

The cur­rent WARN Act data have var­i­ous lim­i­ta­tions — it is New York only, and it is pos­si­ble that com­pa­nies are un­der-re­port­ing AI as a rea­son for lay­offs be­cause of am­bi­gu­ity or asym­met­ric risks from check­ing ver­sus not check­ing the box (though we have no spe­cific rea­son to think this). Stronger trans­parency re­quire­ments are in the works at both the fed­eral and state lev­els; clos­ing this data gap is ur­gent.

2

We are grate­ful to our col­league Mihir Kshirsagar for con­nect­ing us to the New York State Department of Labor and Elena Grovenger from the de­part­ment for a prompt re­sponse.

3

The paper uses the term coder, but it defines the term based on skills rather than roles, resulting in a sweep of jobs that is much broader than “coding”. Measurements based on industry, title, and skills cannot be easily compared to one another.

4

Interestingly, in a sub-study look­ing at mo­bile apps, the pa­per found that the us­age of the re­sult­ing apps did not go up at all. This gets at one im­por­tant dif­fer­ence be­tween con­sumer and en­ter­prise soft­ware. The for­mer com­petes for a rel­a­tively fixed pool of at­ten­tion; more apps pub­lished does­n’t mean more hours of app us­age. But in en­ter­prise soft­ware there is a lot of room for growth, as pre­vi­ously hu­man processes can be soft­ware-me­di­ated or au­to­mated.
