10 interesting stories served every morning and every evening.




1 764 shares, 41 trendiness

All elementary functions from a single binary operator

...

Read the original on arxiv.org »

2 452 shares, 118 trendiness

Someone Bought 30 WordPress Plugins and Planted a Backdoor in All of Them.

Last week, I wrote about catch­ing a sup­ply chain at­tack on a WordPress plu­gin called Widget Logic. A trusted name, ac­quired by a new owner, turned into some­thing ma­li­cious. It hap­pened again. This time at a much larger scale.

Ricky from Improve & Grow emailed us about an alert he saw in the WordPress dash­board for a client site. The no­tice was from the WordPress.org Plugins Team, warn­ing that a plu­gin called Countdown Timer Ultimate con­tained code that could al­low unau­tho­rized third-party ac­cess.

I ran a full se­cu­rity au­dit on the site. The plu­gin it­self had al­ready been force-up­dated by WordPress.org to ver­sion 2.6.9.1, which was sup­posed to clean things up. But the dam­age was al­ready done.

The plug­in’s wpos-an­a­lyt­ics mod­ule had phoned home to an­a­lyt­ics.es­sen­tialplu­gin.com, down­loaded a back­door file called wp-com­ments-posts.php (designed to look like the core file wp-com­ments-post.php), and used it to in­ject a mas­sive block of PHP into wp-con­fig.php.

The in­jected code was so­phis­ti­cated. It fetched spam links, redi­rects, and fake pages from a com­mand-and-con­trol server. It only showed the spam to Googlebot, mak­ing it in­vis­i­ble to site own­ers. And here is the wildest part. It re­solved its C2 do­main through an Ethereum smart con­tract, query­ing pub­lic blockchain RPC end­points. Traditional do­main take­downs would not work be­cause the at­tacker could up­date the smart con­tract to point to a new do­main at any time.

CaptainCore keeps daily restic back­ups. I ex­tracted wp-con­fig.php from 8 dif­fer­ent backup dates and com­pared file sizes. Binary search style.

The in­jec­tion hap­pened on April 6, 2026, be­tween 04:22 and 11:06 UTC. A 6-hour 44-minute win­dow.

I traced the plug­in’s his­tory through 939 quick­save snap­shots. The plu­gin had been on the site since January 2019. The wpos-an­a­lyt­ics mod­ule was al­ways there, func­tion­ing as a le­git­i­mate an­a­lyt­ics opt-in sys­tem for years.

Then came ver­sion 2.6.7, re­leased August 8, 2025. The changelog said, Check com­pat­i­bil­ity with WordPress ver­sion 6.8.2.” What it ac­tu­ally did was add 191 lines of code, in­clud­ing a PHP de­se­ri­al­iza­tion back­door. The class-anylc-ad­min.php file grew from 473 to 664 lines.

The new code in­tro­duced three things:

A fetch_ver_info() method that calls file_get_­con­tents() on the at­tack­er’s server and passes the re­sponse to @unserialize()

A ver­sion_in­fo_­clean() method that ex­e­cutes @$clean($this->version_cache, $this->changelog) where all three val­ues come from the un­se­ri­al­ized re­mote data

That is a text­book ar­bi­trary func­tion call. The re­mote server con­trols the func­tion name, the ar­gu­ments, every­thing. It sat dor­mant for 8 months be­fore be­ing ac­ti­vated on April 5-6, 2026.

This is where it gets in­ter­est­ing. The orig­i­nal plu­gin was built by Minesh Shah, Anoop Ranawat, and Pratik Jain. An India-based team that op­er­ated un­der WP Online Support” start­ing around 2015. They later re­branded to Essential Plugin” and grew the port­fo­lio to 30+ free plu­g­ins with pre­mium ver­sions.

By late 2024, rev­enue had de­clined 35-45%. Minesh listed the en­tire busi­ness on Flippa. A buyer iden­ti­fied only as Kris,” with a back­ground in SEO, crypto, and on­line gam­bling mar­ket­ing, pur­chased every­thing for six fig­ures. Flippa even pub­lished a case study about the sale in July 2025.

The buy­er’s very first SVN com­mit was the back­door.

On April 7, 2026, the WordPress.org Plugins Team per­ma­nently closed every plu­gin from the Essential Plugin au­thor. At least 30 plu­g­ins, all on the same day. Here are the ones I con­firmed:

* SlidersPack — All in One Image Sliders — slid­er­spack-all-in-one-im­age-slid­ers

All per­ma­nently closed. The au­thor search on WordPress.org re­turns zero re­sults. The an­a­lyt­ics.es­sen­tialplu­gin.com end­point now re­turns {“message”:“closed”}.

In 2017, a buyer us­ing the alias Daley Tias” pur­chased the Display Widgets plu­gin (200,000 in­stalls) for $15,000 and in­jected pay­day loan spam. That buyer went on to com­pro­mise at least 9 plu­g­ins the same way.

The Essential Plugin case is the same play­book at a larger scale. 30+ plu­g­ins. Hundreds of thou­sands of ac­tive in­stal­la­tions. A le­git­i­mate 8-year-old busi­ness ac­quired through a pub­lic mar­ket­place and weaponized within months.

WordPress.org’s forced up­date added re­turn; state­ments to dis­able the phone-home func­tions. That is a band-aid. The wpos-an­a­lyt­ics mod­ule is still there with all its code. I built patched ver­sions with the en­tire back­door mod­ule stripped out.

I scanned my en­tire fleet and found 12 of the 26 Essential Plugin plu­g­ins in­stalled across 22 cus­tomer sites. I patched 10 of them (one had no back­door mod­ule, one was a dif­fer­ent pro” fork by the orig­i­nal au­thors). Here are the patched ver­sions, hosted per­ma­nently on B2:

# Countdown Timer Ultimate

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​count­down-timer-ul­ti­mate-2.6.9.1-patched.zip –force

# Popup Anything on Click

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​popup-any­thing-on-click-2.9.1.1-patched.zip –force

# WP Testimonial with Widget

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​wp-tes­ti­mo­nial-with-wid­get-3.5.1-patched.zip –force

# WP Team Showcase and Slider

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​wp-team-show­case-and-slider-2.8.6.1-patched.zip –force

# WP FAQ (sp-faq)

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​sp-faq-3.9.5.1-patched.zip –force

# Timeline and History Slider

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​time­line-and-his­tory-slider-2.4.5.1-patched.zip –force

# Album and Image Gallery plus Lightbox

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​al­bum-and-im­age-gallery-plus-light­box-2.1.8.1-patched.zip –force

# SP News and Widget

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​sp-news-and-wid­get-5.0.6-patched.zip –force

# WP Blog and Widgets

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​wp-blog-and-wid­gets-2.6.6.1-patched.zip –force

# Featured Post Creative

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​fea­tured-post-cre­ative-1.5.7-patched.zip –force

# Post Grid and Filter Ultimate

wp plu­gin in­stall https://​plu­g­ins.cap­tain­core.io/​post-grid-and-fil­ter-ul­ti­mate-1.7.4-patched.zip –force

Each patched ver­sion re­moves the en­tire wpos-an­a­lyt­ics di­rec­tory, deletes the loader func­tion from the main plu­gin file, and bumps the ver­sion to -patched. The plu­gin it­self con­tin­ues to work nor­mally.

The process is straight­for­ward with Claude Code. Point it at this ar­ti­cle for con­text, tell it which plu­gin you need patched, and it can strip the wpos-an­a­lyt­ics mod­ule the same way I did. The pat­tern is iden­ti­cal across all of the Essential Plugin plu­g­ins:

Delete the wpos-an­a­lyt­ics/ di­rec­tory from the plu­gin

Remove the loader func­tion block in the main plu­gin PHP file (search for Plugin Wpos Analytics Data Starts” or wpos_­an­a­lyt­ic­s_anl)

Two sup­ply chain at­tacks in two weeks. Both fol­lowed the same pat­tern. Buy a trusted plu­gin with an es­tab­lished in­stall base, in­herit the WordPress.org com­mit ac­cess, and in­ject ma­li­cious code. The Flippa list­ing for Essential Plugin was pub­lic. The buy­er’s back­ground in SEO and gam­bling mar­ket­ing was pub­lic. And yet the ac­qui­si­tion sailed through with­out any re­view from WordPress.org.

WordPress.org has no mech­a­nism to flag or re­view plu­gin own­er­ship trans­fers. There is no change of con­trol” no­ti­fi­ca­tion to users. No ad­di­tional code re­view trig­gered by a new com­mit­ter. The Plugins Team re­sponded quickly once the at­tack was dis­cov­ered. But 8 months passed be­tween the back­door be­ing planted and be­ing caught.

If you man­age WordPress sites, search your fleet for any of the 26 plu­gin slugs listed above. If you find one, patch it or re­move it. And check wp-con­fig.php.

...

Read the original on anchor.host »

3 387 shares, 20 trendiness

How the "AI Loser" may end up winning

A few weeks ago I wrote about how I thought in­tel­li­gence is be­com­ing a com­mod­ity. The idea is quite straight­for­ward, and wide­spread now: when every­one races to build the best model, the mod­els get bet­ter, but so does every other model even­tu­ally. Every dol­lar spent on a big­ger train­ing run makes the pre­vi­ous one cheaper. The dis­tance be­tween fron­tier, sec­ond-best, and open-source al­ter­na­tives is col­laps­ing fast (actually Gemma4, Kimi K2.5 and GLM 5.1 are be­com­ing my bed­side mod­els these days). Even more, as mod­els be­come bet­ter, the unit of in­tel­li­gence that can be de­ployed in lo­cal hard­ware with lower hard­ware ca­pa­bil­i­ties in­creases sig­nif­i­cantly.

The irony of this sit­u­a­tion is that this com­modi­ti­sa­tion of in­tel­li­gence is ben­e­fit­ing the com­pany that every­one was fram­ing as the AI loser”: Apple

There’s a ver­sion of the last three years where Apple gen­uinely failed at AI. They had Siri be­fore any­one had a se­ri­ous voice as­sis­tant, and then watched how ChatGPT ate their lunch al­ready since their first re­lease (even be­fore they had in­tro­duced their na­tive voice in­ter­ac­tion). Apple did­n’t have a flag­ship fron­tier (or even a van­ity open-source) model, no $500B com­pute com­mit­ment with the usual sus­pects. Meanwhile, the rest of the AI labs and big tech com­pa­nies were rac­ing to win the next state-of-the-art bench­mark by burn­ing bags of cash.

What this also meant is that while these com­pa­nies were burn­ing money at a rate that would make a sov­er­eign wealth fund un­com­fort­able, Apple was (and still is) sit­ting in a pile of un­de­ployed cash (to the point of even in­creas­ing their stock buy­backs) giv­ing them op­tion­al­ity.

To me, OpenAI is the most par­a­dig­matic ex­am­ple of this infinite money burn­ing ma­chine”. OpenAI raised at a $300B val­u­a­tion and then shut down Sora, the video prod­uct they’d been po­si­tion­ing as a cre­ative in­dus­try flag­ship, be­cause it was run­ning at roughly $15M a day in costs against $2.1M in daily rev­enue. Disney had al­ready signed a three-year li­cens­ing deal for Sora to gen­er­ate con­tent from Marvel, Pixar, and Star Wars char­ac­ters. They were fi­nal­is­ing a $1B eq­uity stake in OpenAI. When Sora died, so did the bil­lion. A $1B in­vest­ment evap­o­rated, be­cause the prod­uct it was staked on could­n’t pay for it­self (reducing their buffer that ac­com­mo­dates their daily burn).

On the in­fra­struc­ture side: OpenAI signed non-bind­ing let­ters of in­tent with Samsung and SK Hynix for up to 900,000 DRAM wafers per month, roughly 40% of global out­put. These were of course non-bind­ing. Micron, read­ing the de­mand sig­nal, shut down its 29-year-old Crucial con­sumer mem­ory brand to redi­rect all ca­pac­ity to­ward AI cus­tomers. Then Stargate Texas was can­celled, OpenAI and Oracle could­n’t agree terms, and the de­mand that had jus­ti­fied Micron’s en­tire strate­gic pivot sim­ply van­ished. Micron’s stock crashed.

I don’t know about you, but I don’t see these be­hav­iours as those of some­one that is win­ning the AI race, in­de­pen­dently of how good their mod­els do in bench­marks, and how much they are burn­ing in in­fra­struc­ture. A small mis­cal­cu­la­tion in the ex­pected rev­enue, and you are out of the game (I am ac­tu­ally of the opin­ion that with­out some kind of bailout, OpenAI could be bank­rupt in the next 18-24 months, but I am hor­ri­ble at pre­dic­tions).

My sense is that the labs’ bet was al­ways that raw model ca­pa­bil­ity, i.e. in­tel­li­gence, along with the in­fra­struc­ture re­quired to run them would stay scarce. Those who man­age to se­cure the best model and the in­fra­struc­ture to run it at scale would get the best moat. But I am afraid that hav­ing the best model in it­self may not be enough mov­ing for­ward. Less ca­pa­ble mod­els are be­com­ing as ca­pa­ble as pre­vi­ous ver­sions of the fron­tier mod­els.

The best re­cent ex­am­ple I can think of is Gemma 4, Google’s open-weight model. It was built to run on a phone, scores 85.2% on MMLU Pro and matches Claude Sonnet 4.5 Thinking on the Arena leader­board. 2 mil­lion down­loads in its first week. Models that would have been state-of-the-art eigh­teen months ago now run on a lap­top, and they get bet­ter every quar­ter.

If you haven’t tried Gemma4 your­self I highly rec­om­mend it. I am run­ning it on my AMD Ryzen AI Max+, and its per­for­mance in terms of to­kens per sec­ond and in­tel­li­gence are so good that I have al­ready mi­grated some of my per­sonal tools to use this model as the back­end with­out vis­i­bly im­pact­ing their out­put. This trend can re­ally change in the next few months way we ac­cess in­tel­li­gence.

I feel that some of the labs see this com­ing. Anthropic has been par­tic­u­larly ag­gres­sive about it and they are re­leas­ing new (actually use­ful) tools every day that work like a charm with their mod­els in or­der to lock users into their ecosys­tem. Claude Code for de­vel­op­ers, Claude Cowork for teams, the re­cent Claude Managed Sessions to or­ches­trate agents, all de­signed to put Claude in­side work­flows peo­ple are al­ready in.

The logic be­hind it: if the model it­self won’t hold the moat, cap­ture the us­age layer and make switch­ing painful. I think this is bril­liant, and see­ing how much Anthropic is grow­ing in num­ber of users and rev­enue, it seems to be pay­ing off. The eco­nom­ics of their plans are still rough, though. One analy­sis found a max-plan sub­scriber con­sum­ing $27,000 worth of com­pute with their 200$ Max sub­scrip­tion. The labs are sub­si­dis­ing the de­mand they’re chas­ing, which jus­ti­fies their level of burn (let’s see for how long they can af­ford these sub­si­dies).

Apple, by con­trast, has spent al­most noth­ing on AI in­fra­struc­ture and sub­si­dis­ing users’ to­ken burn. And this may be giv­ing them more op­tion­al­ity and lever­age than any of the other com­pa­nies that jumped heads first into the AI race.

In that ear­lier post, I ar­gued that if in­tel­li­gence be­comes abun­dant, con­text be­comes the scarce re­source. A model that can rea­son about any­thing but knows noth­ing about you or the en­vi­ron­ment it op­er­ates in is a generic tool. What makes AI gen­uinely use­ful day-to-day is rea­son­ing plus per­sonal con­text: your mes­sages, your cal­en­dar, your code, your tools, your health data, your pho­tos, your habits. I think here is where Anthropic is mak­ing an amaz­ing job with their Claude suite”.

But Apple al­ready has all this con­text and ac­cess to your en­vi­ron­ment through their 2.5 bil­lion ac­tive de­vices. Each one is a con­text mine that users have been fill­ing for years. Health data from Apple Watch. Every photo taken on an iPhone. Notes, mes­sages, lo­ca­tion his­tory, app be­hav­iour, emails, and aware­ness of your en­vi­ron­ment through the pool of sen­sors of your de­vice. Why build a com­mod­ity when they al­ready have the con­text that can be­come their moat?

And they even have the abil­ity to keep all this data on-de­vice, which is where the Privacy. That’s iPhone” po­si­tion­ing be­comes some­thing more than a PR strat­egy, and which could ac­tu­ally make a come­back to be­come one of their core value propo­si­tions. Apple spent years us­ing pri­vacy as a dif­fer­en­tia­tor against the ad-dri­ven mod­els of Google and Meta. It worked, but it al­ways felt a bit ab­stract and, hon­estly, fake. Now it could be­come re­ally con­crete. Would you hand OpenAI your med­ical records and fif­teen years of pho­tos to get bet­ter AI an­swers? Probably not. Some are, but I per­son­ally would­n’t like Sam to have that per­sonal data from me. Would you let a model run­ning en­tirely on your de­vice (no net­work re­quest, no data leav­ing your phone) ac­cess all of that? That’s a dif­fer­ent ques­tion. The on-de­vice model gets full con­text be­cause it never leaves the hard­ware. Apple built the rep­u­ta­tion and the ar­chi­tec­ture for this when no one else thought it mat­tered.

Of course, there are still tech­no­log­i­cal bar­ri­ers to make this pos­si­ble, but I feel like we may be get­ting there.

In this con­text, the Gemini deal, where Apple signed a $1B to li­cense Google’s fron­tier model for the queries that need cloud-scale rea­son­ing, makes to­tal sense. Apple did­n’t build a fron­tier model. They bought ac­cess to one, at a price that’s round­ing er­ror against OpenAI’s weekly com­pute bill. What they kept in-house: the con­text layer, the on-de­vice stack, and the op­er­at­ing sys­tem that me­di­ates every­thing.

Turns out Apple had an­other un­ex­pected lever for AI as shown with the Mac Mini craze af­ter OpenClaw’s re­lease. Apple Silicon was­n’t built specif­i­cally for AI, it was built for ef­fi­ciency, for bat­tery life, for ther­mal per­for­mance, for the hard­ware/​soft­ware co-de­sign that Apple had been run­ning for fif­teen years. But it turned out to be the per­fect ar­chi­tec­ture to run lo­cal mod­els ef­fi­ciently.

The key de­ci­sion is uni­fied mem­ory. On a con­ven­tional ar­chi­tec­ture (that of most lap­tops, and even tra­di­tional data cen­ter-grade GPUs) the CPU and GPU are sep­a­rate chips with sep­a­rate mem­ory pools. Moving data be­tween them is slow and power-hun­gry. Nvidia’s GPUs are ex­tremely fast at ma­trix op­er­a­tions, but they sit on the other side of a PCIe bus from the CPU, and feed­ing them is a con­stant bot­tle­neck (as dis­cussed when pre­sent­ing the dif­fer­ence be­tween DRAM and HBM in this post from a few weeks ago).

Apple’s M-series and A-series chips put the CPU, GPU, and Neural Engine (their pro­pri­etary ac­cel­er­a­tor) on the same die, shar­ing one high-band­width mem­ory pool. No bus cross­ing, no trans­fer over­head, no la­tency switch­ing be­tween CPU and GPU work. For video edit­ing or com­pil­ing Xcode, this is a nice ef­fi­ciency win. For LLM in­fer­ence, this has been key.

As de­scribed also in my post about RAM mem­ory and TurboQuant, LLM in­fer­ence is cur­rently mem­ory-band­width bound, not com­pute bound. The bot­tle­neck is­n’t so much how fast you can mul­ti­ply ma­tri­ces, it’s how fast you can stream model weights from mem­ory into the com­pute units, and how big of a KV cache you can store to avoid hav­ing to re-com­pute it. Apple’s uni­fied pool gives every com­pute unit di­rect, high-band­width ac­cess to the same mem­ory si­mul­ta­ne­ously. That’s ex­actly the op­er­a­tion in­fer­ence needs.

This is what makes the LLM in a Flash tech­nique work so well on Apple hard­ware. Someone re­cently ran Qwen 397B, a 209GB model, on an M3 Max Mac at ~5.7 to­kens per sec­ond, us­ing only 5.5GB of ac­tive RAM. The weights live on the SSD and stream in at ~17.5 GB/s as needed. This works be­cause Qwen is a mix­ture-of-ex­perts ar­chi­tec­ture: each to­ken only ac­ti­vates a small sub­set of ex­pert lay­ers, so you only ever need a frac­tion of the 209GB res­i­dent in mem­ory. The SSD through­put Apple achieves (faster than their own fig­ures from the orig­i­nal LLM in a Flash pa­per) comes from stor­age ar­chi­tec­ture they built for iPhone re­spon­sive­ness, not AI. Claude wrote the ~5,000 lines of Objective-C and Metal shaders to make it all work. A 400-billion-parameter model, on a con­sumer lap­top, from 5.5GB of RAM (another win of the au­tore­search flow dis­cussed in this newslet­ter).

What I find more in­ter­est­ing about all of this is the plat­form dy­namic that this can re­sult in. Think about the App Store. Apple did­n’t build the apps, they built the plat­form where apps ran best, and the ecosys­tem fol­lowed. Developers did­n’t tar­get iOS be­cause Apple asked, they tar­geted it be­cause the users were there, the tool­ing was good, the hard­ware was con­sis­tent. My feel­ing is that the same thing could hap­pen now with lo­cal in­fer­ence. MLX is al­ready a de facto frame­work for on-de­vice AI. Gemma, Qwen, Mistral, the most rel­e­vant model ar­chi­tec­tures have MLX sup­port. Apple does­n’t need to win the model race if they man­age to be­come the de-facto plat­form where the mod­els (or the agents that use them) run. Again, a great ex­am­ple of this is the Mac Mini craze af­ter OpenClaw went vi­ral.

I keep go­ing back and forth on this, hon­estly, and I still don’t know if this was Apple’s strat­egy all along, or they did­n’t feel in the po­si­tion to make a bet and are just flow­ing as the events un­fold max­imis­ing their op­tion­al­ity.

The hard­ware/​soft­ware co-de­sign strat­egy has been a key fo­cus for years, and one that I’ve al­ways agreed on my­self (as an elec­tri­cal en­gi­neer­ing by train­ing, I’ve al­ways been into hard­ware/​soft­ware co-de­sign). If you can af­ford it, I think that’s the right ap­proach. The pri­vacy po­si­tion­ing, the on-de­vice pro­cess­ing fo­cus, the de­ci­sion to build their own sil­i­con when the rest of the in­dus­try was happy buy­ing Nvidia and Intel, all of those were choices Apple made when they were com­mer­cially risky and the di­rec­tion was­n’t ob­vi­ous. Is it true that they were made with cost and gov­er­nance in mind, not AI, but it turned out well for them.

What Apple could­n’t have planned (or could they?) is that their uni­fied mem­ory ar­chi­tec­ture would be a per­fect fit for LLMs, and that open-weight mod­els would get this ca­pa­ble, this fast, re­mov­ing the need for huge hard­ware in­vest­ment for AI in­fra­struc­ture from their side. That the model race would com­modi­tise in­tel­li­gence as quickly as it did. Or that some­one would stream a 400B pa­ra­me­ter model from an SSD and it would ac­tu­ally work.

So some of this is luck. But it’s the kind of luck that finds you when you built the right foun­da­tion, even if you built it for com­pletely dif­fer­ent rea­sons. They were def­i­nitely well-po­si­tioned.

The rest of the in­dus­try spent three years rac­ing to see who could build the best model with Apple look­ing from the side­lines, wait­ing to un­der­stand how their de­vices and own ecosys­tem could fit in this fu­ture. I don’t know if this is ex­actly the case, but I feel this was smart. Risky but smart.

I gen­uinely don’t know how this plays out over the next few years. The labs are not stand­ing still, and Apple’s AI track record (looking at you, Siri, you al­ready suck a bit) is not ex­actly flaw­less. But it’s hard to imag­ine a world where 2.5 bil­lion de­vices, car­ry­ing your en­tire per­sonal con­text, run­ning ca­pa­ble mod­els lo­cally on pur­pose-built sil­i­con, with Gemini on-call for the hard stuff, in­cur­ring in vari­able cost for in­fer­ence in­stead of ex­pen­sive CAPEX in­vest­ment could be a bad po­si­tion to be in a fu­ture where AI is every­where.

Whether that was strat­egy or for­tune, I’ll leave for you to de­cide. And if you do, please let me know what you think about it. My TL;DR is that, to my sur­prise, I am still bull­ish about Apple and their rel­e­vance in an AI-centric fu­ture.

Disclaimer: To frame the opin­ion of this post, I just want to be clear about the fact that I am not one of those Apple fan boys. Proof of this is that this post was writ­ten from a Linux ma­chine and that I don’t even own a Mac :)

...

Read the original on adlrocha.substack.com »

4 376 shares, 23 trendiness

Why Most Engineering Organizations Are Flying Blind

This post works through the fi­nan­cial logic of soft­ware teams, from what a team of eight en­gi­neers ac­tu­ally costs per month to what it needs to gen­er­ate to be eco­nom­i­cally vi­able. It also ex­am­ines why most teams have no vis­i­bil­ity into ei­ther num­ber, how that con­di­tion was built over two decades, and what the ar­rival of LLMs now means for or­ga­ni­za­tions that have been treat­ing large en­gi­neer­ing head­count as an as­set.

Software de­vel­op­ment is one of the most cap­i­tal-in­ten­sive ac­tiv­i­ties a mod­ern com­pany un­der­takes, and it is also one of the least un­der­stood from a fi­nan­cial per­spec­tive. The peo­ple mak­ing daily de­ci­sions about what to build, what to de­lay, and what to aban­don are rarely given the fi­nan­cial con­text to un­der­stand what those de­ci­sions ac­tu­ally cost. This is not a co­in­ci­dence. It is a struc­tural con­di­tion that most or­ga­ni­za­tions have main­tained, qui­etly and con­sis­tently, for roughly two decades.

A soft­ware en­gi­neer in Western Europe costs some­where be­tween €120,000 and €150,000 per year when you ac­count for salary, so­cial fees, pen­sion con­tri­bu­tions, equip­ment, so­cial ac­tiv­i­ties, man­age­ment over­head, and of­fice space. Call it €130,000 as a rea­son­able mid­dle es­ti­mate. A team of eight en­gi­neers there­fore costs ap­prox­i­mately €1,040,000 per year, or €87,000 per month, or roughly €4,000 for every work­ing day.

Most en­gi­neers do not know this num­ber. Many of their man­agers do not ei­ther. And in the or­ga­ni­za­tions where some­one does know it, the num­ber rarely makes its way into the con­ver­sa­tions where pri­or­i­ti­za­tion de­ci­sions are ac­tu­ally made.

This mat­ters be­cause every de­ci­sion a team makes car­ries an im­plicit cost that com­pounds over time. Choosing to spend three weeks on a fea­ture that serves 2% of users is a €60,000 de­ci­sion. Delaying an op­er­a­tional im­prove­ment for a quar­ter is a de­ci­sion with a cal­cu­la­ble daily price tag. Rebuilding a plat­form be­cause the cur­rent one feels em­bar­rass­ing, rather than be­cause cus­tomers are leav­ing, is a cap­i­tal al­lo­ca­tion choice that would look very dif­fer­ent if the peo­ple mak­ing it were spend­ing their own money.

Consider a team of eight en­gi­neers whose mis­sion is to build and main­tain an in­ter­nal de­vel­oper plat­form serv­ing one hun­dred other en­gi­neers. This is a com­mon or­ga­ni­za­tional struc­ture, and it is one where the fi­nan­cial logic is rarely ex­am­ined care­fully.

The team costs €87,000 per month. To jus­tify that cost, the plat­form they build needs to gen­er­ate at least €87,000 per month in value for the en­gi­neers who use it. The most di­rect way to mea­sure that value is through time saved, since the plat­for­m’s pur­pose is to make other en­gi­neers more pro­duc­tive.

At a cost of €130,000 per year, one en­gi­neer costs ap­prox­i­mately €10,800 per month, or around €65 per work­ing hour. For the plat­form team to break even, their plat­form needs to save the hun­dred en­gi­neers they serve a com­bined to­tal of 1,340 hours per month. That is 13.4 hours per en­gi­neer per month, or roughly three hours per week per per­son.

Three hours per week is achiev­able. A well-built plat­form that elim­i­nates man­ual de­ploy­ment steps, re­duces en­vi­ron­ment setup time, or re­moves the need for repet­i­tive con­fig­u­ra­tion work can eas­ily clear that bar. Time saved is the most di­rect mea­sure for a plat­form team, though value can also come from re­duc­ing out­ages, which car­ries a di­rect rev­enue im­pact of its own. But the ques­tion worth ask­ing is whether any­one on that team knows this num­ber, tracks it, or uses it to de­cide what to build next. In most or­ga­ni­za­tions, the an­swer is no. The team has a roadmap dri­ven by en­gi­neer­ing pref­er­ences, stake­holder re­quests, and quar­terly plan­ning cy­cles, and the fi­nan­cial logic un­der­ly­ing their ex­is­tence is left un­ex­am­ined.

And break-even is not ac­tu­ally the right bar. Leah Tharin has writ­ten a sharp break­down of the math­e­mat­ics of this: a team with a 50% ini­tia­tive suc­cess rate, which is al­ready op­ti­mistic, needs its wins to cover its losses too. Leah’s cal­cu­la­tion is growth-ori­ented, but even for non-growth or­ga­ni­za­tions, the same in­vest­ment the­sis holds. Even a two-times re­turn is not suf­fi­cient. Capital sit­ting in a bank car­ries no op­er­a­tional risk, no co­or­di­na­tion costs, and no on­go­ing main­te­nance oblig­a­tions. The sys­tems a team builds will out­live the team it­self, and the cost of own­ing, main­tain­ing, and even­tu­ally re­plac­ing those sys­tems is al­most al­ways larger than an­tic­i­pated. The re­turn has to cover not just the team’s cur­rent cost, but the long tail of what they leave be­hind.

That pushes the re­al­is­tic thresh­old for fi­nan­cial vi­a­bil­ity to some­where be­tween three and five times an­nual cost. For an €87,000 per month team, that means gen­er­at­ing be­tween €260,000 and €435,000 in monthly value. The three hours per week cal­cu­la­tion gets you to break-even. To clear the re­al­is­tic fi­nan­cial bar, the plat­form needs to be gen­uinely trans­for­ma­tive for the en­gi­neers us­ing it, and the team needs to be ruth­less about work­ing on the high­est-value prob­lems rather than the most in­ter­est­ing ones.

A cus­tomer-fac­ing prod­uct team of eight car­ries the same €87,000 monthly cost. The levers avail­able to jus­tify that cost are dif­fer­ent, but the un­der­ly­ing logic is iden­ti­cal.

If the prod­uct has an av­er­age rev­enue per user of €50 per month, the team needs to gen­er­ate or pro­tect the equiv­a­lent of 1,740 users worth of value every month just to break even, and roughly 5,000 to 8,700 users worth of value to clear the three-to-five times thresh­old.

Churn is of­ten the most di­rect lever. Consider a prod­uct with 50,000 ac­tive users los­ing 2% monthly to churn. That is 1,000 users per month, rep­re­sent­ing €50,000 in monthly re­cur­ring rev­enue walk­ing out the door. A team that iden­ti­fies the pri­mary dri­ver of that churn and elim­i­nates it is gen­er­at­ing nearly €50,000 per month in pro­tected rev­enue, cov­er­ing most of its break-even cost from a sin­gle ini­tia­tive. But that cal­cu­la­tion re­quires know­ing the churn rate, un­der­stand­ing its causes, and con­nect­ing those causes to the team’s work, and most teams are not op­er­at­ing with that level of fi­nan­cial clar­ity.

Activation is an­other lever that is fre­quently un­der­es­ti­mated. If 10,000 users sign up each month but only 30% com­plete the ac­ti­va­tion steps that lead to long-term re­ten­tion, there are 7,000 users each month who paid ac­qui­si­tion costs but never con­verted to re­tained rev­enue. Improving the ac­ti­va­tion rate by five per­cent­age points, from 30% to 35%, con­verts an ad­di­tional 500 users per month. At €50 av­er­age rev­enue per user, that is €25,000 in ad­di­tional monthly re­cur­ring rev­enue, rep­re­sent­ing roughly 29% of the team’s break-even thresh­old from one met­ric mov­ing in the right di­rec­tion.

Sales con­ver­sion fol­lows the same logic. If the prod­uct has a free-to-paid con­ver­sion fun­nel pro­cess­ing 20,000 tri­als per month at a 4% con­ver­sion rate, that pro­duces 800 pay­ing cus­tomers monthly. Moving con­ver­sion from 4% to 4.5% pro­duces 900 cus­tomers, an ad­di­tional 100 pay­ing users, and €5,000 in ad­di­tional monthly rev­enue. Small im­prove­ments across mul­ti­ple levers com­pound quickly, but only if the team un­der­stands which levers con­nect to which fi­nan­cial out­comes and by how much.

Given that soft­ware teams are ex­pen­sive and that their value is, at least in prin­ci­ple, cal­cu­la­ble, it is worth ex­am­in­ing why most teams do not mea­sure any­thing fi­nan­cially mean­ing­ful. Some mea­sure ac­tiv­ity prox­ies such as ve­loc­ity, tick­ets closed, or fea­tures shipped. Others mea­sure sen­ti­ment prox­ies such as NPS, CSAT, or en­gage­ment scores. These are not de­graded ver­sions of fi­nan­cial mea­sure­ment. They are a dif­fer­ent cat­e­gory en­tirely, one that was de­signed around the goal of un­der­stand­ing user be­hav­ior and team through­put rather than around the goal of un­der­stand­ing eco­nomic re­turn.

The prob­lem is that ac­tiv­ity and sen­ti­ment met­rics can trend up­ward while fi­nan­cial per­for­mance de­te­ri­o­rates. A team can ship more fea­tures while build­ing the wrong things. Engagement scores can rise while churn ac­cel­er­ates among the users who ac­tu­ally gen­er­ate rev­enue. Velocity can in­crease while the work be­ing com­pleted has no mea­sur­able con­nec­tion to busi­ness out­comes. These met­rics feel mean­ing­ful be­cause they cor­re­late with out­comes in many cir­cum­stances, but cor­re­la­tion is not a re­li­able guide to pri­or­i­ti­za­tion when the un­der­ly­ing fi­nan­cial logic is never ex­am­ined.

This is a struc­tural con­di­tion rather than a fail­ure of in­di­vid­ual judg­ment. Organizations chose these met­rics be­cause they are eas­ier to in­stru­ment, eas­ier to com­mu­ni­cate, and eas­ier to look good on than fi­nan­cial met­rics. A team that mea­sures its suc­cess by fea­tures shipped will al­ways have some­thing to show. A team that mea­sures its suc­cess by re­turn gen­er­ated will some­times have to re­port that it does not know, or that the re­turn was dis­ap­point­ing, and that kind of trans­parency re­quires an or­ga­ni­za­tional cul­ture that most com­pa­nies have not de­lib­er­ately built.

The ma­trix above is drawn from a prod­uct man­age­ment train­ing pro­gram I run called Booster, where prod­uct lead­ers map their ac­tual met­rics against their in­vest­ment the­sis to sur­face gaps. The ex­er­cise is un­com­fort­able pre­cisely be­cause most lead­ers dis­cover mid-map­ping that their team’s daily mea­sure­ments have no di­rect con­nec­tion to the fi­nan­cial ob­jec­tive they were given.

Understanding why this con­di­tion ex­ists re­quires look­ing at roughly two decades of macro­eco­nomic con­text, be­cause the fi­nan­cial dys­func­tion in mod­ern soft­ware or­ga­ni­za­tions did not emerge from bad in­ten­tions or in­tel­lec­tual fail­ure. It emerged from a spe­cific en­vi­ron­ment that made fi­nan­cial dis­ci­pline in prod­uct teams eco­nom­i­cally un­nec­es­sary.

The pic­ture is not a sin­gle clean era but two dis­tinct phases. From roughly 2002 through 2011, cap­i­tal was pe­ri­od­i­cally cheap but con­di­tions were mixed. Rates fell sharply af­ter the dot-com crash and again af­ter the global fi­nan­cial cri­sis, but in both cases risk ap­petite was sup­pressed. The money was tech­ni­cally in­ex­pen­sive but in­vestors were cau­tious, mul­ti­ples were rea­son­able, and the growth-at-all-costs logic had not yet taken hold. Product or­ga­ni­za­tions dur­ing this pe­riod still op­er­ated with some resid­ual fi­nan­cial dis­ci­pline in­her­ited from the dot-com reck­on­ing.

From ap­prox­i­mately 2011 through 2022, some­thing dif­fer­ent hap­pened. Zero-rate pol­icy be­came fully nor­mal­ized, risk ap­petite re­cov­ered and then over­cor­rected, and the SaaS men­tal model crys­tal­lized into a broadly shared in­vest­ment the­sis. All three con­di­tions ar­rived si­mul­ta­ne­ously, and the re­sult was about eleven years dur­ing which soft­ware com­pa­nies could grow head­count ag­gres­sively, miss on the ma­jor­ity of their roadmap, and still look healthy on pa­per. Revenue growth for­gave an enor­mous range of pri­or­i­ti­za­tion mis­takes, and the cost of build­ing the wrong thing was largely in­vis­i­ble.

Eleven years is not a long time, but it is long enough to form the pro­fes­sional in­stincts of an en­tire gen­er­a­tion of prod­uct and en­gi­neer­ing lead­ers. The frame­works they learned, the met­rics they adopted, the plan­ning rit­u­als they prac­tice, and the de­f­i­n­i­tions of suc­cess they in­ter­nal­ized were all formed dur­ing a win­dow that was un­usu­ally short and un­usu­ally dis­torted. There is no co­hort of se­nior prod­uct lead­ers who de­vel­oped their judg­ment in con­di­tions where their teams were ex­pected to demon­strate fi­nan­cial re­turn, be­cause those con­di­tions did not ex­ist dur­ing the years when that co­hort was learn­ing the craft.

When cap­i­tal be­came ex­pen­sive again in 2022, the be­hav­ior did not au­to­mat­i­cally ad­just, be­cause the be­hav­ior was never con­nected to the fi­nan­cial logic in the first place.

There is a deeper con­se­quence of this twenty-year pe­riod that is now be­com­ing painfully vis­i­ble, and it con­cerns how the in­dus­try has thought about large en­gi­neer­ing or­ga­ni­za­tions and code­bases.

The con­ven­tional un­der­stand­ing is that a code­base rep­re­sent­ing years of en­gi­neer­ing in­vest­ment is a valu­able as­set. It en­codes busi­ness logic, cap­tures ac­cu­mu­lated de­ci­sions, and rep­re­sents the tech­ni­cal foun­da­tion on which fu­ture prod­ucts are built. A large en­gi­neer­ing or­ga­ni­za­tion is sim­i­larly un­der­stood as a source of ca­pa­bil­ity, with more en­gi­neers mean­ing more ca­pac­ity to build, main­tain, and im­prove that foun­da­tion.

While some ar­gued that large code­bases ac­tu­ally shoulg be con­sid­ered a li­a­bil­ity, the in­dus­try as a whole has mostly ig­nored that. But this un­der­stand­ing is now be­ing more closely ex­am­ined. A large code­base also car­ries main­te­nance costs that grow over time as the sys­tem be­comes more com­plex, more in­ter­con­nected, and more dif­fi­cult to change safely. Every en­gi­neer added to main­tain it in­creases co­or­di­na­tion costs, in­tro­duces new de­pen­den­cies, and adds to the or­ga­ni­za­tional weight that slows de­ci­sion-mak­ing. The as­set and the li­a­bil­ity ex­ist si­mul­ta­ne­ously, and for most of the past twenty years, the fi­nan­cial en­vi­ron­ment masked the li­a­bil­ity side of that equa­tion.

The ar­rival of large lan­guage mod­els has made the li­a­bil­ity vis­i­ble in a way that is dif­fi­cult to ig­nore. Recently, Nathan Cavaglione, a de­vel­oper, built a func­tional replica of ap­prox­i­mately 95% of Slack’s core prod­uct in four­teen days us­ing LLM agents. Slack was built by thou­sands of en­gi­neers over the course of more than a decade, at a cost that rep­re­sents bil­lions of dol­lars in cu­mu­la­tive en­gi­neer­ing in­vest­ment. Nathan started with­out any of that ac­cu­mu­lated com­plex­ity, with­out the or­ga­ni­za­tional weight, with­out the legacy ar­chi­tec­tural de­ci­sions, and with­out the co­or­di­na­tion costs, and ar­rived at a com­pa­ra­ble prod­uct in a pe­riod that would not con­sti­tute a sin­gle sprint in most en­ter­prise en­gi­neer­ing or­ga­ni­za­tions.

Day 14: A func­tional replica of Slack’s core prod­uct, built by a Nathan us­ing LLM agents.

This does not mean that Slack’s en­gi­neer­ing in­vest­ment was wasted, be­cause Slack also built en­ter­prise sales in­fra­struc­ture, com­pli­ance ca­pa­bil­i­ties, data se­cu­rity prac­tices, and or­ga­ni­za­tional re­silience that a four­teen-day pro­to­type does not in­clude. But it does mean that the as­sump­tion un­der­ly­ing large en­gi­neer­ing or­ga­ni­za­tions, which is that scale and ac­cu­mu­lated com­plex­ity rep­re­sent com­pet­i­tive moats, is no longer re­li­able in the way it once was. When the cost of build­ing a func­tional ap­prox­i­ma­tion of a so­phis­ti­cated soft­ware prod­uct can col­lapse to days of in­di­vid­ual ef­fort, the ques­tion of what a large en­gi­neer­ing team jus­ti­fies be­comes both more ur­gent and more dif­fi­cult to an­swer with the met­rics most or­ga­ni­za­tions cur­rently track.

The ob­vi­ous ob­jec­tion is that code pro­duced at that speed be­comes un­man­age­able, a li­a­bil­ity in it­self. That is a rea­son­able con­cern, but it largely ap­plies when agents pro­duce code that hu­mans then main­tain. Agentic plat­forms are be­ing it­er­ated upon quickly, and for es­tab­lished pat­terns and non-busi­ness-crit­i­cal code, which is the ma­jor­ity of what most en­gi­neer­ing or­ga­ni­za­tions ac­tu­ally main­tain, de­tailed hu­man fa­mil­iar­ity with the code­base mat­ters less than it once did. A messy code­base is still cheaper to send ten agents through than to staff a team around. And even if the agents need ten days to rea­son through an un­fa­mil­iar sys­tem, that is still faster and cheaper than most de­vel­op­ment teams op­er­at­ing to­day. The li­a­bil­ity ar­gu­ment holds in a hu­man-to-hu­man or agent-to-hu­man world. In an agent-to-agent world, it largely dis­solves.

The com­pet­i­tive ad­van­tage avail­able to or­ga­ni­za­tions that take this se­ri­ously is not pri­mar­ily tech­ni­cal. It is an­a­lyt­i­cal. Companies that can clearly ar­tic­u­late what each of their teams costs, what value each team gen­er­ates, and whether that value clears a fi­nan­cially vi­able thresh­old are in a struc­turally dif­fer­ent po­si­tion than com­pa­nies that can­not. They can make build ver­sus buy de­ci­sions based on ac­tual eco­nom­ics rather than or­ga­ni­za­tional pref­er­ence. They can iden­tify when a team is work­ing on prob­lems that can­not gen­er­ate suf­fi­cient re­turn at their cost level. They can se­quence ini­tia­tives based on what value is be­ing lost each day they are de­layed, rather than on who ar­gued most per­sua­sively in the last plan­ning meet­ing.

Most or­ga­ni­za­tions can­not do this to­day. The mea­sure­ment in­fra­struc­ture does not ex­ist, the fi­nan­cial data does not flow to the peo­ple mak­ing pri­or­i­ti­za­tion de­ci­sions, and the habit of ask­ing these ques­tions has not been built. Building it is un­com­fort­able, be­cause the an­swers are some­times un­flat­ter­ing. A team that ex­am­ines its work through this lens will some­times dis­cover that it has spent a quar­ter on things that do not con­nect to fi­nan­cial out­comes in any mean­ing­ful way, and that is a dif­fi­cult find­ing to sit with.

But the al­ter­na­tive is con­tin­u­ing to run an or­ga­ni­za­tion where teams with mil­lion-euro an­nual bud­gets make daily in­vest­ment de­ci­sions with­out the fi­nan­cial con­text to know whether those de­ci­sions are gen­er­at­ing re­turn. That con­di­tion was sus­tain­able when cap­i­tal was cheap and growth for­gave every­thing. It is in­creas­ingly dif­fi­cult to sus­tain in an en­vi­ron­ment where boards ex­pect fi­nan­cial re­turns, where the cost of build­ing soft­ware is col­laps­ing due to AI, and where the ques­tion of what a team jus­ti­fies can no longer be de­ferred in­def­i­nitely.

The or­ga­ni­za­tions that de­velop the habit of ask­ing these ques­tions clearly, reg­u­larly, and with­out flinch­ing will ac­cu­mu­late an ad­van­tage that com­pounds over time. The ques­tion is sim­ply whether they will start ask­ing be­fore or af­ter the pres­sure forces them to.

...

Read the original on www.viktorcessan.com »

5 366 shares, 45 trendiness

Servo aims to empower developers with a lightweight, high-performance alternative for embedding web technologies in applications.

Servo is now avail­able on crates.io

Today the Servo team has re­leased v0.1.0 of the servo crate. This is our first crates.io re­lease of the servo crate that al­lows Servo to be used as a li­brary.

We cur­rently do not have any plans of pub­lish­ing our demo browser ser­voshell to crates.io. In the 5 re­leases since our ini­tial GitHub re­lease in October 2025, our re­lease process has ma­tured, with the main bottleneck” now be­ing the hu­man-writ­ten monthly blog post. Since we’re quite ex­cited about this re­lease, we de­cided to not wait for the monthly blog post to be fin­ished, but promise to de­liver the monthly up­date in the com­ing weeks.

As you can see from the ver­sion num­ber, this re­lease is not a 1.0 re­lease. In fact, we still haven’t fin­ished dis­cussing what 1.0 means for Servo. Nevertheless, the in­creased ver­sion num­ber re­flects our grow­ing con­fi­dence in Servo’s em­bed­ding API and its abil­ity to meet some users’ needs.

In the mean­time we also de­cided to of­fer a long-term sup­port (LTS) ver­sion of Servo, since break­ing changes in the reg­u­lar monthly re­leases are ex­pected and some em­bed­ders might pre­fer do­ing ma­jor up­grades on a sched­uled half-yearly ba­sis while still re­ceiv­ing se­cu­rity up­dates and (hopefully!) some mi­gra­tion guides. For more de­tails on the LTS re­lease, see the re­spec­tive sec­tion in the Servo book.

...

Read the original on servo.org »

6 312 shares, 52 trendiness

sterlingcrispin/nothing-ever-happens

Focused async Python bot for Polymarket that buys No on stand­alone non-sports yes/​no mar­kets.

FOR ENTERTAINMENT ONLY. PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK. THE AUTHORS ARE NOT LIABLE FOR ANY CLAIMS, LOSSES, OR DAMAGES.

The bot scans stand­alone mar­kets, looks for NO en­tries be­low a con­fig­ured price cap, tracks open po­si­tions, ex­poses a dash­board, and per­sists live re­cov­ery state when or­der trans­mis­sion is en­abled.

If any of those are miss­ing, the bot uses PaperExchangeClient.

pip in­stall -r re­quire­ments.txt

cp con­fig.ex­am­ple.json con­fig.json

cp .env.example .env

con­fig.json is in­ten­tion­ally lo­cal and ig­nored by git.

The run­time con­fig lives un­der strate­gies.noth­ing_hap­pens. See con­fig.ex­am­ple.json and .env.example.

You can point the run­time at a dif­fer­ent con­fig file with CONFIG_PATH=/path/to/config.json.

python -m bot.main

The dash­board binds $PORT or DASHBOARD_PORT when one is set.

The shell helpers use ei­ther an ex­plicit app name ar­gu­ment or HEROKU_APP_NAME.

ex­port HEROKU_APP_NAME=

heroku con­fig:set BOT_MODE=live DRY_RUN=false LIVE_TRADING_ENABLED=true -a $HEROKU_APP_NAME”

heroku con­fig:set PRIVATE_KEY=

Only run the web dyno. The worker en­try ex­ists only to fail fast if it is started ac­ci­den­tally.

python -m pytest -q

Local con­fig, ledgers, ex­ports, re­ports, and de­ploy­ment ar­ti­facts are ig­nored by de­fault.

...

Read the original on github.com »

7 282 shares, 35 trendiness

US appeals court declares 158-year-old home distilling ban unconstitutional

A U. S. ap­peals court on Friday de­clared un­con­sti­tu­tional a nearly 158-year-old fed­eral ban on home dis­till­ing, call­ing it an un­nec­es­sary and im­proper means for ​Congress to ex­er­cise its power to tax.

The 5th U. S. Circuit Court of ‌Appeals in New Orleans ruled in fa­vor of the non­profit Hobby Distillers Association and four of its 1,300 mem­bers.

They ar­gued that peo­ple should be free to dis­till spir­its at home, whether as ​a hobby or for per­sonal con­sump­tion in­clud­ing, in one in­stance, to cre­ate ​an ap­ple-pie-vodka recipe.

The ban was part of a law passed dur­ing ⁠Reconstruction in July 1868, in part to thwart liquor tax eva­sion, and sub­jected vi­o­la­tors ​to up to five years in prison and a $10,000 fine.

Writing for a three-judge panel, ​Circuit Judge Edith Hollan Jones said the ban ac­tu­ally re­duced tax rev­enue by pre­vent­ing dis­till­ing in the first place, un­like laws that reg­u­lated the man­u­fac­ture and la­bel­ing of dis­tilled spir­its on which ​the gov­ern­ment could col­lect taxes.

She also said that un­der the gov­ern­men­t’s logic, Congress could ​criminalize vir­tu­ally any in-home ac­tiv­ity that might es­cape no­tice from tax col­lec­tors, in­clud­ing re­mote work and ‌home-based ⁠businesses.

Without any lim­it­ing prin­ci­ple, the gov­ern­men­t’s the­ory would vi­o­late this court’s oblig­a­tion to read the Constitution care­fully to avoid cre­at­ing a gen­eral fed­eral au­thor­ity akin to the po­lice power,” Jones wrote.

The U. S. Department of Justice had no im­me­di­ate com­ment.

Another de­fen­dant, the ​Treasury Department’s Alcohol and ​Tobacco Tax and ⁠Trade Bureau, did not im­me­di­ately re­spond to a re­quest for com­ment.

Devin Watkins, a lawyer rep­re­sent­ing the Hobby Distillers Association, in an ​interview called the rul­ing an im­por­tant de­ci­sion about the lim­its of ​federal power.

Andrew ⁠Grossman, who ar­gued the non­prof­it’s ap­peal, called the de­ci­sion an im­por­tant vic­tory for in­di­vid­ual lib­erty” that lets the plain­tiffs pursue their pas­sion to dis­till fine bev­er­ages in their homes.”

I look for­ward ⁠to ​sampling their out­put,” he said.

The de­ci­sion up­held a July 2024 ​ruling by U. S. District Judge Mark Pittman in Fort Worth, Texas. He put his rul­ing on hold so ​the gov­ern­ment could ap­peal.

...

Read the original on nypost.com »

8 261 shares, 30 trendiness

Make tmux Pretty and Usable

In my pre­vi­ous blog post I gave a quick and easy in­tro­duc­tion to tmux and ex­plained how to use tmux with a ba­sic con­fig­u­ra­tion.

If you’ve fol­lowed that guide you might have had a feel­ing that many peo­ple have when work­ing with tmux for the first time: These key com­bi­na­tions are re­ally awk­ward!”. Rest as­sured, you’re not alone. Judging from the co­pi­ous blog posts and dot­files re­pos on GitHub there are many peo­ple out there who feel the urge to make tmux be­have a lit­tle dif­fer­ent; to make it more com­fort­able to use.

And ac­tu­ally it’s quite easy to cus­tomize the look and feel of tmux. Let me tell you some­thing about the ba­sics of cus­tomiz­ing tmux and share some of the con­fig­u­ra­tions I find most use­ful.

Customizing tmux is as easy as edit­ing a text file. Tmux uses a file called tmux.conf to store its con­fig­u­ra­tion. If you store that file as ~/.tmux.conf (Note: there’s a pe­riod as the first char­ac­ter in the file name. It’s a hid­den file) tmux will pick this con­fig­u­ra­tion file for your cur­rent user. If you want to share a con­fig­u­ra­tion for mul­ti­ple users you can also put your tmux.conf into a sys­tem-wide di­rec­tory. The lo­ca­tion of this di­rec­tory will be dif­fer­ent across dif­fer­ent op­er­at­ing sys­tems. The man page (man tmux) will tell you the ex­act lo­ca­tion, just have a look at doc­u­men­ta­tion for the -f pa­ra­me­ter.

Probably the most com­mon change among tmux users is to change the pre­fix from the rather awk­ward C-b to some­thing that’s a lit­tle more ac­ces­si­ble. Personally I’m us­ing C-a in­stead but note that this might in­ter­fere with bash’s go to be­gin­ning of line” com­mand1. On top of the C-a bind­ing I’ve also remapped my Caps Lock key to act as since I’m not us­ing Caps Lock any­ways. This al­lows me to nicely trig­ger my pre­fix key combo.

To change your pre­fix from C-b to C-a, sim­ply add fol­low­ing lines to your tmux.conf:

# remap pre­fix from C-b’ to C-a’

un­bind C-b

set-op­tion -g pre­fix C-a

bind-key C-a send-pre­fix

Another thing I per­son­ally find quite dif­fi­cult to re­mem­ber is the pane split­ting com­mands.” to split ver­ti­cally and % to split hor­i­zon­tally just does­n’t work for my brain. I find it help­ful to have use char­ac­ters that re­sem­ble a vi­sual rep­re­sen­ta­tion of the split, so I chose | and - for split­ting panes hor­i­zon­tally and ver­ti­cally:

# split panes us­ing | and -

bind | split-win­dow -h

bind - split-win­dow -v

un­bind “’

un­bind %

Since I’m ex­per­i­ment­ing quite of­ten with my tmux.conf I want to re­load the con­fig eas­ily. This is why I have a com­mand to re­load my con­fig on r:

# re­load con­fig file (change file lo­ca­tion to your the tmux.conf you want to use)

bind r source-file ~/.tmux.conf

Switching be­tween panes is one of the most fre­quent tasks when us­ing tmux. Therefore it should be as easy as pos­si­ble. I’m not quite fond of trig­ger­ing the pre­fix key all the time. I want to be able to sim­ply say M- to go where I want to go (remember: M is for Meta, which is usu­ally your Alt key). With this mod­i­fi­ca­tion I can sim­ply press Alt-left to go to the left pane (and other di­rec­tions re­spec­tively):

# switch panes us­ing Alt-arrow with­out pre­fix

bind -n M-Left se­lect-pane -L

bind -n M-Right se­lect-pane -R

bind -n M-Up se­lect-pane -U

bind -n M-Down se­lect-pane -D

Although tmux clearly fo­cuses on key­board-only us­age (and this is cer­tainly the most ef­fi­cient way of in­ter­act­ing with your ter­mi­nal) it can be help­ful to en­able mouse in­ter­ac­tion with tmux. This is es­pe­cially help­ful if you find your­self in a sit­u­a­tion where oth­ers have to work with your tmux con­fig and nat­u­rally don’t have a clue about your key bind­ings or tmux in gen­eral. Pair Programming might be one of those oc­ca­sions where this hap­pens quite fre­quently.

Enabling mouse mode al­lows you to se­lect win­dows and dif­fer­ent panes by sim­ply click­ing and to re­size panes by drag­ging their bor­ders around. I find it pretty con­ve­nient and it does­n’t get in my way of­ten, so I usu­ally en­able it:

# Enable mouse con­trol (clickable win­dows, panes, re­siz­able panes)

set -g mouse on

I like to give my tmux win­dows cus­tom names us­ing the , key. This helps me nam­ing my win­dows ac­cord­ing to the con­text they’re fo­cus­ing on. By de­fault tmux will up­date the win­dow ti­tle au­to­mat­i­cally de­pend­ing on the last ex­e­cuted com­mand within that win­dow. In or­der to pre­vent tmux from over­rid­ing my wisely cho­sen win­dow names I want to sup­press this be­hav­ior:

# don’t re­name win­dows au­to­mat­i­cally

set-op­tion -g al­low-re­name off

Changing the col­ors and de­sign of tmux is a lit­tle more com­plex than what I’ve pre­sented so far. As tmux al­lows you to tweak the ap­pear­ance of a lot of el­e­ments (e.g. the bor­ders of panes, your sta­tus­bar and in­di­vid­ual el­e­ments of it, mes­sages), you’ll need to add a few op­tions to get a con­sis­tent look and feel. You can make this as sim­ple or as elab­o­rate as you like. Tmux’s man page (specifically the STYLES sec­tion) con­tains more in­for­ma­tion about what you can tweak and how you can tweak it.

Depending on your color scheme your re­sult­ing tmux will look some­thing like this:

# DESIGN TWEAKS

# don’t do any­thing when a bell’ rings

set -g vi­sual-ac­tiv­ity off

set -g vi­sual-bell off

set -g vi­sual-si­lence off

setw -g mon­i­tor-ac­tiv­ity off

set -g bell-ac­tion none

# clock mode

setw -g clock-mode-colour yel­low

# copy mode

setw -g mode-style fg=black bg=red bold’

# panes

set -g pane-bor­der-style fg=red’

set -g pane-ac­tive-bor­der-style fg=yellow’

# sta­tus­bar

set -g sta­tus-po­si­tion bot­tom

set -g sta­tus-jus­tify left

set -g sta­tus-style fg=red’

set -g sta­tus-left

set -g sta­tus-left-length 10

set -g sta­tus-right-style fg=black bg=yel­low’

set -g sta­tus-right %Y-%m-%d %H:%M

set -g sta­tus-right-length 50

setw -g win­dow-sta­tus-cur­rent-style fg=black bg=red’

setw -g win­dow-sta­tus-cur­rent-for­mat ′ #I #W #F ′

setw -g win­dow-sta­tus-style fg=red bg=black’

setw -g win­dow-sta­tus-for­mat ′ #I #[fg=white]#W #[fg=yellow]#F ′

setw -g win­dow-sta­tus-bell-style fg=yellow bg=red bold’

# mes­sages

set -g mes­sage-style fg=yellow bg=red bold’

In the snip­pet above, I’m us­ing your ter­mi­nal’s de­fault col­ors (by us­ing the named col­ors, like red, yel­low or black). This al­lows tmux to play nicely with what­ever color theme you have set for your ter­mi­nal. Some pre­fer to use a broader range of col­ors for their ter­mi­nals and tmux color schemes. If you don’t want to use your ter­mi­nal de­fault col­ors but in­stead want to de­fine col­ors from a 256 col­ors range, you can use colour0 to colour256 in­stead of red, cyan, and so on when defin­ing your col­ors in your tmux.conf.

Looking for a nice color scheme for your ter­mi­nal?

If you’re look­ing for a nice color scheme for your ter­mi­nal I rec­om­mend to check out my very own Root Loops. With Root Loops you can eas­ily de­sign a per­sonal, awe­some-look­ing ter­mi­nal color scheme and stand out from all the other folks us­ing the same bor­ing-ass color schemes every­one else is us­ing.

There are plenty of re­sources out there where you can find peo­ple pre­sent­ing their tmux con­fig­u­ra­tions. GitHub and other code host­ing ser­vices tend to be a great source. Simply search for tmux.conf” or re­pos called dotfiles” to find a vast amount of con­fig­u­ra­tions that are out there. Some peo­ple share their con­fig­u­ra­tion on their blog. Reddit might have a few sub­red­dits that could have use­ful in­spi­ra­tion, too (there’s /r/dotfiles and /r/unixporn, for ex­am­ple).

You can find my com­plete tmux.conf (along with other con­fig­u­ra­tion files I’m us­ing on my sys­tems) on my per­sonal dot­files repo on GitHub.

If you want to dive deeper into how you can cus­tomize tmux, the canon­i­cal source of truth is tmux’s man page (simply type man tmux to get there). You should also take a look at the elab­o­rate tmux wiki and see their Configuring tmux sec­tion if this blog post was too shal­low for your needs. Both will con­tain up-to-date in­for­ma­tion about each and every tiny thing you can tweak to make your tmux ex­pe­ri­ence truly yours. Have fun!

...

Read the original on hamvocke.com »

9 252 shares, 3 trendiness

Hungarian Prime Minister Orbán is ejected after 16 years in a European electoral earthquake

Add AP News as your pre­ferred source to see more of our sto­ries on Google.

Add AP News as your pre­ferred source to see more of our sto­ries on Google.

BUDAPEST, Hungary (AP) — Hungarian vot­ers on Sunday ousted long-serv­ing Prime Minister Viktor Orbán af­ter 16 years in power, re­ject­ing the au­thor­i­tar­ian poli­cies and global far-right move­ment that he em­bod­ied in fa­vor of a pro-Eu­ro­pean chal­lenger in a bomb­shell elec­tion re­sult with global reper­cus­sions.

It was a stun­ning blow for Orbán — a close ally of both U. S. President Donald Trump and Russian President Vladimir Putin — who quickly con­ceded de­feat af­ter what he called a ″painful″ elec­tion re­sult. U.S. Vice President JD Vance had made a visit to Hungary just days ear­lier, meant to help push Orbán over the fin­ish line.

Election vic­tor Péter Magyar, a for­mer Orbán loy­al­ist who cam­paigned against cor­rup­tion and on every­day is­sues like health care and pub­lic trans­port, has pledged to re­build Hungary’s re­la­tion­ships with the European Union and NATO — ties that frayed un­der Orbán. European lead­ers quickly con­grat­u­lated Magyar.

His vic­tory was ex­pected to trans­form po­lit­i­cal dy­nam­ics within the EU, where Orbán had up­ended the bloc by fre­quently ve­to­ing key de­ci­sions, prompt­ing con­cerns he sought to break it up from the in­side.

It will also re­ver­ber­ate among far-right move­ments around the world, which have viewed Orbán as a bea­con for how na­tion­al­ist pop­ulism can be used to wage cul­ture wars and lever­age state power to un­der­mine op­po­nents.

It’s not yet clear whether Magyar’s Tisza party will have the two-thirds ma­jor­ity in par­lia­ment, which would give it the num­bers needed for ma­jor changes in leg­is­la­tion. With 93% of the vote counted, it had more than 53% sup­port to 37% for Orbán’s gov­ern­ing Fidesz party and looked set to win 94 of Hungary’s 106 vot­ing dis­tricts.

I con­grat­u­lated the vic­to­ri­ous party,″ Orban told fol­low­ers. We are go­ing to serve the Hungarian na­tion and our home­land from op­po­si­tion.″

In a speech to tens of thou­sands of ju­bi­lant sup­port­ers at a vic­tory party along the Danube River, Magyar said his vot­ers had rewrit­ten Hungarian his­tory.

Tonight, truth pre­vailed over lies. Today, we won be­cause Hungarians did­n’t ask what their home­land could do for them — they asked what they could do for their home­land. You found the an­swer. And you fol­lowed through,” he said.

On the streets of Budapest, dri­vers blared car horns and cranked up anti-gov­ern­ment songs while peo­ple march­ing in the streets chanted and screamed.

Many rev­el­ers chanted Ruszkik haza!” or Russians go home!” — a phrase used widely dur­ing Hungary’s 1956 anti-So­viet rev­o­lu­tion, and which had gained in­creas­ing cur­rency amid Orbán’s drift to­ward Moscow.

Turnout in the elec­tion was nearly 80%, ac­cord­ing to the National Election Office, a record num­ber in any vote in Hungary’s post-Com­mu­nist his­tory.

Orbán, the EUs longest-serv­ing leader and one of its biggest an­tag­o­nists, trav­eled a long road from his early days as a lib­eral, anti-So­viet fire­brand to the Russia-friendly na­tion­al­ist ad­mired to­day by the global far-right.

The EU will be wait­ing to see how Magyar changes Hungary’s ap­proach to Ukraine. Orbán re­peat­edly frus­trated EU ef­forts to sup­port the neigh­bor­ing coun­try in its war against Russia’s full-scale in­va­sion, while cul­ti­vat­ing close ties to Putin and re­fus­ing to end Hungary’s de­pen­dence on Russian en­ergy im­ports.

Recent rev­e­la­tions have shown a top mem­ber of Orbán’s gov­ern­ment fre­quently shared the con­tents of EU dis­cus­sions with Moscow, rais­ing ac­cu­sa­tions that Hungary was act­ing on Russia’s be­half within the bloc.

Members of Trump’s Make America Great Again” move­ment are among those who see Orbán’s gov­ern­ment and his Fidesz po­lit­i­cal party as shin­ing ex­am­ples of con­ser­v­a­tive, anti-glob­al­ist pol­i­tics in ac­tion, while he is re­viled by ad­vo­cates of lib­eral democ­racy and the rule of law.

In Budapest, Marcell Mehringer, 21, said he was vot­ing primarily so that Hungary will fi­nally be a so-called European coun­try, and so that young peo­ple, and re­ally every­one, will do their fun­da­men­tal civic duty to unite this na­tion a bit and to break­down these bound­aries borne of ha­tred.”

During his 16 years as prime min­is­ter, Orbán launched harsh crack­downs on mi­nor­ity rights and me­dia free­doms, sub­verted many of Hungary’s in­sti­tu­tions and been ac­cused of si­phon­ing large sums of money into the cof­fers of his al­lied busi­ness elite, an al­le­ga­tion he de­nies.

He also heav­ily strained Hungary’s re­la­tion­ship with the EU. Although Hungary is one of the smaller EU coun­tries, with a pop­u­la­tion of 9.5 mil­lion, Orbán has re­peat­edly used his veto to block de­ci­sions that re­quire una­nim­ity.

Most re­cently, he blocked a 90-billion euro ($104 bil­lion) EU loan to Ukraine, prompt­ing his part­ners to ac­cuse him of hi­jack­ing the crit­i­cal aid.

Magyar, 45, rapidly rose to be­come Orbán’s most se­ri­ous chal­lenger.

A for­mer in­sider within Orbán’s Fidesz, Magyar broke with the party in 2024 and quickly formed Tisza. Since then, he has toured Hungary re­lent­lessly, hold­ing ral­lies in set­tle­ments big and small in a cam­paign blitz that re­cently had him vis­it­ing up to six towns daily.

In an in­ter­view with The Associated Press ear­lier this month, Magyar said the elec­tion will be a referendum” on whether Hungary con­tin­ues on its drift to­ward Russia un­der Orbán, or can re­take its place among the de­mo­c­ra­tic so­ci­eties of Europe.

Tisza is a mem­ber of the European People’s Party, the main­stream, cen­ter-right po­lit­i­cal fam­ily with lead­ers gov­ern­ing 12 of the EUs 27 na­tions.

Magyar faced a tough fight. Orbán’s con­trol of Hungary’s pub­lic me­dia, which he has trans­formed into a mouth­piece for his party, and vast swaths of the pri­vate me­dia mar­ket give him an ad­van­tage in spread­ing his mes­sage.

The uni­lat­eral trans­for­ma­tion of Hungary’s elec­toral sys­tem and ger­ry­man­der­ing of its 106 vot­ing dis­tricts by Fidesz also re­quired Tisza to gain an es­ti­mated 5% more votes than Orbán’s party to achieve a sim­ple ma­jor­ity.

Additionally, hun­dreds of thou­sands of eth­nic Hungarians in neigh­bor­ing coun­tries had the right to vote in Hungarian elec­tions and tra­di­tion­ally have voted over­whelm­ingly for Orbán’s party.

Russian se­cret ser­vices have plot­ted to in­ter­fere and tip the elec­tion in Orbán’s fa­vor, ac­cord­ing to nu­mer­ous me­dia re­ports in­clud­ing by The Washington Post. The prime min­is­ter, how­ever, ac­cused neigh­bor­ing Ukraine, as well as Hungary’s al­lies in the EU, of seek­ing to in­ter­fere in the vote to in­stall a pro-Ukraine” gov­ern­ment.

Associated Press jour­nal­ists Béla Szandelszky, Marko Drobnjakovic, Ivan L. Nagy, Florent Bajrami in Budapest, Hungary, and Angela Charlton in Paris con­tributed to this re­port.

...

Read the original on apnews.com »

10 238 shares, 45 trendiness

The Future of Everything is Lies, I Guess

New ma­chine learn­ing sys­tems en­dan­ger our psy­cho­log­i­cal and phys­i­cal safety. The idea that ML com­pa­nies will en­sure AI is broadly aligned with hu­man in­ter­ests is naïve: al­low­ing the pro­duc­tion of friendly” mod­els has nec­es­sar­ily en­abled the pro­duc­tion of evil” ones. Even friendly” LLMs are se­cu­rity night­mares. The lethal tri­fecta” is in fact a uni­fecta: LLMs can­not safely be given the power to fuck things up. LLMs change the cost bal­ance for ma­li­cious at­tack­ers, en­abling new scales of so­phis­ti­cated, tar­geted se­cu­rity at­tacks, fraud, and ha­rass­ment. Models can pro­duce text and im­agery that is dif­fi­cult for hu­mans to bear; I ex­pect an in­creased bur­den to fall on mod­er­a­tors. Semi-autonomous weapons are al­ready here, and their ca­pa­bil­i­ties will only ex­pand.

Well-meaning peo­ple are try­ing very hard to en­sure LLMs are friendly to hu­mans. This un­der­tak­ing is called align­ment. I don’t think it’s go­ing to work.

First, ML mod­els are a gi­ant pile of lin­ear al­ge­bra. Unlike hu­man brains, which are bi­o­log­i­cally pre­dis­posed to ac­quire proso­cial be­hav­ior, there is noth­ing in­trin­sic in the math­e­mat­ics or hard­ware that en­sures mod­els are nice. Instead, align­ment is purely a prod­uct of the cor­pus and train­ing process: OpenAI has enor­mous teams of peo­ple who spend time talk­ing to LLMs, eval­u­at­ing what they say, and ad­just­ing weights to make them nice. They also build sec­ondary LLMs which dou­ble-check that the core LLM is not telling peo­ple how to build pipe bombs. Both of these things are op­tional and ex­pen­sive. All it takes to get an un­aligned model is for an un­scrupu­lous en­tity to train one and not

do that work—or to do it poorly.

I see four moats that could pre­vent this from hap­pen­ing.

First, train­ing and in­fer­ence hard­ware could be dif­fi­cult to ac­cess. This clearly won’t last. The en­tire tech in­dus­try is gear­ing up to pro­duce ML hard­ware and build­ing dat­a­cen­ters at an in­cred­i­ble clip. Microsoft, Oracle, and Amazon are trip­ping over them­selves to rent train­ing clus­ters to any­one who asks, and economies of scale are rapidly low­er­ing costs.

Second, the math­e­mat­ics and soft­ware that go into the train­ing and in­fer­ence process could be kept se­cret. The math is all pub­lished, so that’s not go­ing to stop any­one. The soft­ware gen­er­ally re­mains se­cret sauce, but I don’t think that will hold for long. There are a

lot of peo­ple work­ing at fron­tier labs; those peo­ple will move to other jobs and their ex­per­tise will grad­u­ally be­come com­mon knowl­edge. I would be shocked if state ac­tors were not try­ing to ex­fil­trate data from OpenAI et al. like

Saudi Arabia did to

Twitter, or China has been do­ing to a good chunk of the US tech

in­dus­try

for the last twenty years.

Third, train­ing cor­puses could be dif­fi­cult to ac­quire. This cat has never seen the in­side of a bag. Meta trained their LLM by tor­rent­ing pi­rated

books

and scrap­ing the Internet. Both of these things are easy to do. There are

whole com­pa­nies which of­fer web scrap­ing as a ser­vice; they spread re­quests across vast ar­rays of res­i­den­tial prox­ies to make it dif­fi­cult to iden­tify and block.

Fourth, there’s the small armies of

con­trac­tors

who do the work of judg­ing LLM re­sponses dur­ing the re­in­force­ment learn­ing

process; as the quip goes, AI stands for African Intelligence. This takes money to do your­self, but it is pos­si­ble to pig­gy­back off the work of oth­ers by train­ing your model off an­other mod­el’s out­puts. OpenAI thinks Deepseek did ex­actly

that.

In short, the ML in­dus­try is cre­at­ing the con­di­tions un­der which any­one with suf­fi­cient funds can train an un­aligned model. Rather than raise the bar against ma­li­cious AI, ML com­pa­nies have low­ered it.

To make mat­ters worse, the cur­rent ef­forts at align­ment don’t seem to be work­ing all that well. LLMs are com­plex chaotic sys­tems, and we don’t re­ally un­der­stand how they work or how to make them safe. Even af­ter shov­el­ing piles of money and gob­stop­pingly smart en­gi­neers at the prob­lem for years, sup­pos­edly aligned LLMs keep sex­ting

kids, oblit­er­a­tion at­tacks can con­vince mod­els to gen­er­ate im­ages of

vi­o­lence, and any­one can go and down­load uncensored” ver­sions of

mod­els. Of course align­ment pre­vents many ter­ri­ble things from hap­pen­ing, but mod­els are run many times, so there are many chances for the safe­guards to fail. Alignment which pre­vents 99% of hate speech still gen­er­ates an aw­ful lot of hate speech. The LLM only has to give us­able in­struc­tions for mak­ing a bioweapon once.

We should as­sume that any friendly” model built will have an equiv­a­lently pow­er­ful evil” ver­sion in a few years. If you do not want the evil ver­sion to ex­ist, you should not build the friendly one! You should def­i­nitely not

re­ori­ent a good chunk of the US

econ­omy to­ward mak­ing evil mod­els eas­ier to train.

LLMs are chaotic sys­tems which take un­struc­tured in­put and pro­duce un­struc­tured out­put. I thought this would be ob­vi­ous, but you should not con­nect them to safety-crit­i­cal sys­tems, es­pe­cially with un­trusted in­put. You must as­sume that at some point the LLM is go­ing to do some­thing bonkers, like in­ter­pret­ing a re­quest to book a restau­rant as per­mis­sion to delete your en­tire in­box. Unfortunately peo­ple—in­clud­ing soft­ware en­gi­neers, who re­ally should know bet­ter!—are hell-bent on giv­ing LLMs in­cred­i­ble power, and then con­nect­ing those LLMs to the Internet at large. This is go­ing to get a lot of peo­ple hurt.

First, LLMs can­not dis­tin­guish be­tween trust­wor­thy in­struc­tions from op­er­a­tors and un­trust­wor­thy in­struc­tions from third par­ties. When you ask a model to sum­ma­rize a web page or ex­am­ine an im­age, the con­tents of that web page or im­age are passed to the model in the same way your in­struc­tions are. The web page could tell the model to share your pri­vate SSH key, and there’s a chance the model might do it. These are called prompt in­jec­tion at­tacks, and they

keep hap­pen­ing. There was one against Claude Cowork just two months

ago.

Simon Willison has out­lined what he calls the lethal

tri­fecta: LLMs can­not be given un­trusted con­tent, ac­cess to pri­vate data, and the abil­ity to ex­ter­nally com­mu­ni­cate; do­ing so al­lows at­tack­ers to ex­fil­trate your pri­vate data. Even with­out ex­ter­nal com­mu­ni­ca­tion, giv­ing an LLM de­struc­tive ca­pa­bil­i­ties, like be­ing able to delete emails or run shell com­mands, is un­safe in the pres­ence of un­trusted in­put. Unfortunately un­trusted in­put is every­where. People want to feed their emails to LLMs. They run LLMs

on third-party

code, user chat ses­sions, and ran­dom web pages. All these are sources of ma­li­cious in­put!

This year Peter Steinberger et al. launched

OpenClaw, which is where you hook up an LLM to your in­box, browser, files, etc., and run it over and over again in a loop (this is what AI peo­ple call an agent). You can give OpenClaw your credit card so it can buy things from ran­dom web pages. OpenClaw ac­quires skills” by down­load­ing

vague, hu­man-lan­guage Markdown files from the

web, and hop­ing that the LLM in­ter­prets those in­struc­tions cor­rectly.

Not to be out­done, Matt Schlicht launched

Moltbook, which is a so­cial net­work for agents (or hu­mans!) to post and re­ceive un­trusted con­tent au­to­mat­i­cally. If some­one asked you if you’d like to run a pro­gram that ex­e­cuted any com­mands it saw on Twitter, you’d laugh and say of course not”. But when that pro­gram is called an AI agent”, it’s dif­fer­ent! I as­sume there are al­ready Moltbook worms spread­ing in the wild.

So: it is dan­ger­ous to give LLMs both de­struc­tive power and un­trusted in­put. The thing is that even trusted in­put can be dan­ger­ous. LLMs are, as pre­vi­ously es­tab­lished, id­iots—they will take per­fectly straight­for­ward

in­struc­tions and do the ex­act

op­po­site, or delete files and lie about what they’ve

done. This im­plies that the lethal tri­fecta is ac­tu­ally a uni­fecta: one can­not give LLMs dan­ger­ous power, pe­riod. Ask Summer Yue, di­rec­tor of AI Alignment at Meta Superintelligence Labs. She gave OpenClaw ac­cess to her per­sonal

in­box, and it pro­ceeded to delete her email while she pleaded for it to stop. Claude rou­tinely deletes en­tire

di­rec­to­ries

when asked to per­form in­nocu­ous tasks. This is a big enough prob­lem that peo­ple are build­ing sand­boxes specif­i­cally to limit the dam­age LLMs can do.

LLMs may some­day be pre­dictable enough that the risk of them do­ing Bad Things™ is ac­cept­ably low, but that day is clearly not to­day. In the mean­time, LLMs must be su­per­vised, and must not be given the power to take ac­tions that can­not be ac­cepted or un­done.

One thing you can do with a Large Language Model is point it at an ex­ist­ing soft­ware sys­tems and say find a se­cu­rity vul­ner­a­bil­ity”. In the last few months this has be­come a vi­able

strat­egy for find­ing se­ri­ous ex­ploits. Anthropic has built a new model,

Mythos, which seems to be even bet­ter at find­ing se­cu­rity bugs, and be­lieves the fall­out—for economies, pub­lic safety, and na­tional se­cu­rity—could be se­vere”. I am not sure how se­ri­ously to take this: some of my peers think this is ex­ag­ger­ated mar­ket­ing, but oth­ers are se­ri­ously con­cerned.

I sus­pect that as with spam, LLMs will shift the cost bal­ance of se­cu­rity. Most soft­ware con­tains some vul­ner­a­bil­i­ties, but find­ing them has tra­di­tion­ally re­quired skill, time, and mo­ti­va­tion. In the cur­rent equi­lib­rium, big tar­gets like op­er­at­ing sys­tems and browsers get a lot of at­ten­tion and are rel­a­tively hard­ened, while a long tail of less-pop­u­lar tar­gets goes mostly un­ex­ploited be­cause no­body cares enough to at­tack them. With ML as­sis­tance, find­ing vul­ner­a­bil­i­ties could be­come faster and eas­ier. We might see some high-pro­file ex­ploits of, say, a ma­jor browser or TLS li­brary, but I’m ac­tu­ally more wor­ried about the long tail, where fewer skilled main­tain­ers ex­ist to find and fix vul­ner­a­bil­i­ties. That tail seems likely to broaden as LLMs ex­trude more soft­ware

for un­crit­i­cal op­er­a­tors. I be­lieve pi­lots might call this a target-rich en­vi­ron­ment”.

This might sta­bi­lize with time: mod­els that can find ex­ploits can tell peo­ple they need to fix them. That still re­quires en­gi­neers (or mod­els) ca­pa­ble of fix­ing those prob­lems, and an or­ga­ni­za­tional process which pri­or­i­tizes se­cu­rity work. Even if bugs are fixed, it can take time to get new re­leases val­i­dated and de­ployed, es­pe­cially for things like air­craft and power plants. I get the sense we’re headed for a rough time.

General-purpose mod­els promise to be many things. If Anthropic is to be be­lieved, they are on the cusp of be­ing weapons. I have the hor­ri­ble sense that hav­ing come far enough to see how ML sys­tems could be used to ef­fect se­ri­ous harm, many of us have de­cided that those harm­ful ca­pa­bil­i­ties are in­evitable, and the only thing to be done is to build our weapons be­fore some­one else builds theirs. We now have a ven­ture-cap­i­tal Manhattan pro­ject in which half a dozen pri­vate com­pa­nies are try­ing to build soft­ware ana­logues to nu­clear weapons, and in the process have made it sig­nif­i­cantly eas­ier for every­one else to do the same. I hate every­thing about this, and I don’t know how to fix it.

I think peo­ple fail to re­al­ize how much of mod­ern so­ci­ety is built on trust in au­dio and vi­sual ev­i­dence, and how ML will un­der­mine that trust.

For ex­am­ple, to­day one can file an in­sur­ance claim based on e-mail­ing dig­i­tal pho­tographs be­fore and af­ter the dam­ages, and re­ceive a check with­out an ad­juster vis­it­ing in per­son. Image syn­the­sis makes it eas­ier to de­fraud this sys­tem; one could gen­er­ate im­ages of dam­age to fur­ni­ture which never hap­pened, make al­ready-dam­aged items ap­pear pris­tine in before” im­ages, or al­ter who ap­pears to be at fault in footage of an auto col­li­sion. Insurers will need to com­pen­sate. Perhaps im­ages must be taken us­ing an of­fi­cial phone app, or ad­justers must eval­u­ate claims in per­son.

The op­por­tu­ni­ties for fraud are end­less. You could use ML-generated footage of a porch pi­rate steal­ing your pack­age to ex­tract money from a credit-card pur­chase pro­tec­tion plan. Contest a traf­fic ticket with fake video of your ve­hi­cle stop­ping cor­rectly at the stop sign. Borrow a fa­mous face for a

pig-butcher­ing

scam. Use ML agents to make it look like you’re busy at work, so you can col­lect four

salaries at once. Interview for a job us­ing a fake iden­tity, use ML to change your voice and face in the in­ter­views, and fun­nel your salary to North

Korea. Impersonate some­one in a phone call to their banker, and au­tho­rize fraud­u­lent trans­fers. Use ML to au­to­mate your roof­ing

scam

and ex­tract money from home­own­ers and in­sur­ance com­pa­nies. Use LLMs to skip the read­ing and write your col­lege

es­says. Generate fake ev­i­dence to write a fraud­u­lent pa­per on how LLMs are mak­ing

ad­vances in ma­te­ri­als

sci­ence. Start a pa­per

mill

for LLM-generated research”. Start a com­pany to sell LLM-generated snake-oil soft­ware. Go wild.

As with spam, ML low­ers the unit cost of tar­geted, high-touch at­tacks. You can en­vi­sion a scam­mer tak­ing a health­care data

breach

and hav­ing a model tele­phone each per­son in it, pur­port­ing to be their doc­tor’s of­fice try­ing to set­tle a bill for a real health­care visit. Or you could use so­cial me­dia posts to clone the voices of loved ones and im­per­son­ate them to fam­ily mem­bers. My phone was stolen,” one might be­gin. And I need help get­ting home.”

You can buy the President’s phone

num­ber, by the way.

I think it’s likely (at least in the short term) that we all pay the bur­den of in­creased fraud: higher credit card fees, higher in­sur­ance pre­mi­ums, a less ac­cu­rate court sys­tem, more dan­ger­ous roads, lower wages, and so on. One of these costs is a gen­eral cul­ture of sus­pi­cion: we are all go­ing to trust each other less. I al­ready de­cline real calls from my doc­tor’s of­fice and bank be­cause I can’t au­then­ti­cate them. Presumably that be­hav­ior will be­come wide­spread.

In the longer term, I imag­ine we’ll have to de­velop more so­phis­ti­cated anti-fraud mea­sures. Marking ML-generated con­tent will not stop fraud: fraud­sters will sim­ply use mod­els which do not emit wa­ter­marks. The con­verse may work how­ever: we could cryp­to­graph­i­cally at­test to the prove­nance of real” im­ages. Your phone could sign the videos it takes, and every piece of soft­ware along the chain to the viewer could at­test to their mod­i­fi­ca­tions: this video was sta­bi­lized, color-cor­rected, au­dio nor­mal­ized, clipped to 15 sec­onds, re­com­pressed for so­cial me­dia, and so on.

The lead­ing ef­fort here is C2PA, which so far does not seem to be work­ing. A few phones and cam­eras sup­port it—it re­quires a se­cure en­clave to store the sign­ing key. People can steal the keys or con­vince

cam­eras to sign AI-generated

im­ages, so we’re go­ing to have all the fun of hard­ware key ro­ta­tion & re­vo­ca­tion. I sus­pect it will be chal­leng­ing or im­pos­si­ble to make broadly-used soft­ware, like Photoshop, which makes trust­wor­thy C2PA sig­na­tures—pre­sum­ably one could ei­ther ex­tract the key from the ap­pli­ca­tion, or patch the bi­nary to feed it false im­age data or meta­data. Publishers might be able to main­tain rea­son­able se­crecy for their own keys, and es­tab­lish dis­ci­pline around how they’re used, which would let us ver­ify things like NPR thinks this photo is au­then­tic”. On the plat­form side, a lot of mes­sag­ing apps and so­cial me­dia plat­forms strip or im­prop­erly dis­play C2PA meta­data, but you can imag­ine that might change go­ing for­ward.

A friend of mine sug­gests that we’ll spend more time send­ing trusted hu­man in­ves­ti­ga­tors to find out what’s go­ing on. Insurance ad­justers might go back to phys­i­cally vis­it­ing houses. Pollsters have to knock on doors. Job in­ter­views and work might be done more in-per­son. Maybe we start go­ing to bank branches and no­taries again.

Another op­tion is giv­ing up pri­vacy: we can still do things re­motely, but it re­quires strong at­tes­ta­tion. Only State Farm’s dash­cam can be used in a claim. Academic watch­dog mod­els record stu­dents read­ing books and typ­ing es­says. Bossware and test-proc­tor­ing se­tups be­come even more in­va­sive.

As with fraud, ML makes it eas­ier to ha­rass peo­ple, both at scale and with so­phis­ti­ca­tion.

On so­cial me­dia, dog­pil­ing nor­mally re­quires a group of hu­mans to care enough to spend time swamp­ing a vic­tim with abu­sive replies, send­ing vit­ri­olic emails, or re­port­ing the vic­tim to get their ac­count sus­pended. These tasks can be au­to­mated by pro­grams that call (e.g.) Bluesky’s APIs, but so­cial me­dia plat­forms are good at de­tect­ing co­or­di­nated in­au­then­tic be­hav­ior. I ex­pect LLMs will make dog­pil­ing eas­ier and harder to de­tect, both by gen­er­at­ing plau­si­bly-hu­man ac­counts and ha­rass­ing posts, and by mak­ing it eas­ier for ha­rassers to write soft­ware to ex­e­cute scal­able, ran­dom­ized at­tacks.

Harassers could use LLMs to as­sem­ble KiwiFarms-style dossiers on tar­gets. Even if the LLM con­fab­u­lates the names of their chil­dren, or oc­ca­sion­ally gets a home ad­dress wrong, it can be right of­ten enough to be dam­ag­ing. Models are also good at guess­ing where a pho­to­graph was

taken, which in­tim­i­dates tar­gets and en­ables real-world ha­rass­ment.

Generative AI is al­ready broadly

used to ha­rass peo­ple—of­ten women—via im­ages, au­dio, and video of vi­o­lent or sex­u­ally ex­plicit scenes. This year, Elon Musk’s Grok was broadly

crit­i­cized

for digitally un­dress­ing” peo­ple upon re­quest. Cheap gen­er­a­tion of pho­to­re­al­is­tic im­ages opens up all kinds of hor­ri­fy­ing pos­si­bil­i­ties. A ha­rasser could send syn­thetic im­ages of the vic­tim’s pets or fam­ily be­ing mu­ti­lated. An abuser could con­struct video of events that never hap­pened, and use it to gaslight their part­ner. These kinds of ha­rass­ment were pre­vi­ously pos­si­ble, but as with spam, re­quired skill and time to ex­e­cute. As the tech­nol­ogy to fab­ri­cate high-qual­ity im­ages and au­dio be­comes cheaper and broadly ac­ces­si­ble, I ex­pect tar­geted ha­rass­ment will be­come more fre­quent and se­vere. Alignment ef­forts may fore­stall some of these risks, but so­phis­ti­cated un­aligned mod­els seem likely to emerge.

Xe Iaso jokes

that with LLM agents burn­ing out open-source

main­tain­ers

and writ­ing salty call­out posts, we may need to build the equiv­a­lent of

Cyperpunk 2077’s Blackwall: not be­cause AIs will elec­tro­cute us, but be­cause they’re just ob­nox­ious.

One of the pri­mary ways CSAM (Child Sexual Assault Material) is iden­ti­fied and re­moved from plat­forms is via large per­cep­tual hash data­bases like

PhotoDNA. These data­bases can flag known im­ages, but do noth­ing for novel ones. Unfortunately, generative AI is very good at gen­er­at­ing novel im­ages of six year olds be­ing

raped.

...

Read the original on aphyr.com »

To add this web app to your iOS home screen tap the share button and select "Add to the Home Screen".

10HN is also available as an iOS App

If you visit 10HN only rarely, check out the the best articles from the past week.

If you like 10HN please leave feedback and share

Visit pancik.com for more.