10 interesting stories served every morning and every evening.

DNSSEC Debugger - nic.de

dnssec-analyzer.verisignlabs.com

Back to Verisign Labs Tools

Analyzing DNSSEC prob­lems for nic.de


Want a sec­ond opin­ion? Test nic.de at dnsviz.net.


DENIC Status

status.denic.de

Components

DNS

Services

DNS Nameservice

May 6, 2026 01:34 CEST (May 5, 2026 23:34 UTC)

RESOLVED

All Services are up and run­ning.

May 5, 2026 23:28 CEST (May 5, 2026 21:28 UTC)

INVESTIGATING

Frankfurt am Main, 5 May 2026 — DENIC eG is currently experiencing a disruption in its DNS service for .de domains. As a result, the reachability of all DNSSEC-signed .de domains is currently affected. The root cause of the disruption has not yet been fully identified. DENIC's technical teams are working intensively on analysis and on restoring stable operations as quickly as possible. Based on current information, users and operators of .de domains may experience impairments in domain resolution. Further updates will be provided as soon as reliable findings on the cause and recovery are available. DENIC asks all affected parties for their understanding. For further enquiries, DENIC can be contacted via the usual channels.

Accelerating Gemma 4: faster inference with multi-token prediction drafters

blog.google

May 05, 2026

By us­ing Multi-Token Prediction (MTP) drafters, Gemma 4 mod­els re­duce la­tency bot­tle­necks and achieve im­proved re­spon­sive­ness for de­vel­op­ers.

Olivier Lacombe

Director, Product Management

Maarten Grootendorst

Developer Relations Engineer


Just a few weeks ago, we in­tro­duced Gemma 4, our most ca­pa­ble open mod­els to date. With over 60 mil­lion down­loads in just the first few weeks, Gemma 4 is de­liv­er­ing un­prece­dented in­tel­li­gence-per-pa­ra­me­ter to de­vel­oper work­sta­tions, mo­bile de­vices and the cloud. Today, we are push­ing ef­fi­ciency even fur­ther.

We’re re­leas­ing Multi-Token Prediction (MTP) drafters for the Gemma 4 fam­ily. By us­ing a spe­cial­ized spec­u­la­tive de­cod­ing ar­chi­tec­ture, these drafters de­liver up to a 3x speedup with­out any degra­da­tion in out­put qual­ity or rea­son­ing logic.

Tokens-per-second speed in­creases, tested on hard­ware us­ing LiteRT-LM, MLX, Hugging Face Transformers, and vLLM.

Why spec­u­la­tive de­cod­ing?

The tech­ni­cal re­al­ity is that stan­dard LLM in­fer­ence is mem­ory-band­width bound, cre­at­ing a sig­nif­i­cant la­tency bot­tle­neck. The proces­sor spends the ma­jor­ity of its time mov­ing bil­lions of pa­ra­me­ters from VRAM to the com­pute units just to gen­er­ate a sin­gle to­ken. This leads to un­der-uti­lized com­pute and high la­tency, es­pe­cially on con­sumer-grade hard­ware.

Speculative decoding decouples token generation from verification. By pairing a heavy target model (e.g., Gemma 4 31B) with a lightweight drafter (the MTP model), we can utilize idle compute to “predict” several future tokens at once with the drafter in less time than it takes for the target model to process just one token. The target model then verifies all of these suggested tokens in parallel.

How spec­u­la­tive de­cod­ing works

Standard large language models generate text autoregressively, producing exactly one token at a time. While effective, this process dedicates the same amount of computation to predicting an obvious continuation (like predicting “words” after “Actions speak louder than…”) as it does to solving a complex logic puzzle.

MTP mitigates this inefficiency through speculative decoding, a technique introduced by Google researchers in Fast Inference from Transformers via Speculative Decoding. If the target model agrees with the draft, it accepts the entire sequence in a single forward pass — and even generates an additional token of its own in the process. This means your application can output the full drafted sequence plus one token in the time it usually takes to generate a single one.
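The draft-then-verify loop can be sketched in a few lines of Python. This toy uses trivial arithmetic stand-ins for both models (real drafters and targets are neural networks, and real verification scores all drafted positions in one batched forward pass), but the control flow matches the description above: draft several tokens, verify them against the target, keep the accepted prefix, and append one token from the target.

```python
def draft_next_tokens(prefix, k):
    # Toy drafter: cheaply proposes k candidate tokens.
    # A real MTP drafter predicts several tokens in one forward pass.
    return [(prefix[-1] + i + 1) % 50 for i in range(k)]

def target_next_token(prefix):
    # Toy target model: the "ground truth" next token.
    return (prefix[-1] + 1) % 50

def speculative_step(prefix, k=4):
    """One draft-then-verify step.

    Accepted draft tokens are kept; at the first mismatch the target's
    own token is substituted. Either way the target contributes one
    token, so every step yields at least one new token, and the output
    is identical to what the target alone would have generated.
    """
    draft = draft_next_tokens(prefix, k)
    accepted = []
    for tok in draft:
        expected = target_next_token(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)   # target's correction; stop here
            return accepted
    # All drafts accepted: target adds one bonus token.
    accepted.append(target_next_token(prefix + accepted))
    return accepted

seq = [0]
while len(seq) < 20:
    seq += speculative_step(seq)
print(seq[:20])   # same sequence the target alone would produce
```

Because the toy drafter happens to agree with the toy target at every position, each step emits five tokens (four accepted plus the bonus token), which is exactly the best-case speedup the technique is after.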

Unlocking faster AI from the edge to the work­sta­tion

For de­vel­op­ers, in­fer­ence speed is of­ten the pri­mary bot­tle­neck for pro­duc­tion de­ploy­ment. Whether you are build­ing cod­ing as­sis­tants, au­tonomous agents that re­quire rapid multi-step plan­ning, or re­spon­sive mo­bile ap­pli­ca­tions run­ning en­tirely on-de­vice, every mil­lisec­ond mat­ters.

By pair­ing a Gemma 4 model with its cor­re­spond­ing drafter, de­vel­op­ers can achieve:

Improved re­spon­sive­ness: Drastically re­duce la­tency for near real-time chat, im­mer­sive voice ap­pli­ca­tions and agen­tic work­flows.

Supercharged lo­cal de­vel­op­ment: Run our 26B MoE and 31B Dense mod­els on per­sonal com­put­ers and con­sumer GPUs with un­prece­dented speed, pow­er­ing seam­less, com­plex of­fline cod­ing and agen­tic work­flows.

Enhanced on-de­vice per­for­mance: Maximize the util­ity of our E2B and E4B mod­els on edge de­vices by gen­er­at­ing out­puts faster, which in turn pre­serves valu­able bat­tery life.

Zero qual­ity degra­da­tion: Because the pri­mary Gemma 4 model re­tains the fi­nal ver­i­fi­ca­tion, you get iden­ti­cal fron­tier-class rea­son­ing and ac­cu­racy, just de­liv­ered sig­nif­i­cantly faster.

Gemma 4 26B on an NVIDIA RTX PRO 6000. Standard Inference (left) vs. MTP Drafter (right) in tokens per second. Same output quality, half the wait time.

Where you can dive deeper into MTP drafters

To make these MTP drafters ex­cep­tion­ally fast and ac­cu­rate, we in­tro­duced sev­eral ar­chi­tec­tural en­hance­ments un­der the hood. The draft mod­els seam­lessly uti­lize the tar­get mod­el’s ac­ti­va­tions and share its KV cache, mean­ing they don’t have to waste time re­cal­cu­lat­ing con­text the larger model has al­ready fig­ured out. For our E2B and E4B edge mod­els, where the fi­nal logit cal­cu­la­tion be­comes a big bot­tle­neck, we even im­ple­mented an ef­fi­cient clus­ter­ing tech­nique in the em­bed­der to fur­ther ac­cel­er­ate gen­er­a­tion.

We’ve also been closely analyzing hardware-specific optimizations. For example, while the 26B mixture-of-experts model presents unique routing challenges at a batch size of 1 on Apple Silicon, processing multiple requests simultaneously (e.g., batch sizes of 4 to 8) unlocks up to a ~2.2x speedup locally. We see similar gains with the NVIDIA A100 when increasing batch size.

Want to see the ex­act me­chan­ics of how this works? We’ve pub­lished an in-depth tech­ni­cal ex­plainer that un­packs the vi­sual ar­chi­tec­ture, KV cache shar­ing and ef­fi­cient em­bed­ders pow­er­ing these drafters.

How to get started

The MTP drafters for the Gemma 4 fam­ily are avail­able to­day un­der the same open-source Apache 2.0 li­cense as Gemma 4. Read the doc­u­men­ta­tion to learn how to use MTP with Gemma 4. You can down­load the model weights right now on Hugging Face, Kaggle, and start ex­per­i­ment­ing with faster in­fer­ence with trans­form­ers, MLX, VLLM, SGLang, and Ollama or try them di­rectly on Google AI Edge Gallery for Android or iOS.

We can’t wait to see how this new­found speed ac­cel­er­ates what you build next in the Gemmaverse.

Red Squares — the GitHub outage graph

red-squares.cian.lol

StarFighter 16-inch

us.starlabs.systems

StarFighter

A full-size Linux per­for­mance lap­top with pre­mium ma­te­ri­als, a hap­tic track­pad, open firmware op­tions, and room for heav­ier work­loads.

Intel® Core™ Ultra processor lineup

Ryzen™ 9 processor

Up to 64 GB 7500MT/s LPDDR5X memory

16-inch 120 Hz IPS Display¹

Up to 18 hrs battery life

Micro-Arc PEO Oxidised Finish

All matte. No glare.

A true matte dis­play with a pro­tec­tive coat­ing al­lows colours to shine while dif­fus­ing am­bi­ent light.

3840x2400 4K Resolution

Enjoy highly ac­cu­rate colours that are crisp and clear.

16:10 Aspect Ratio

Work pro­fi­ciently with a screen ra­tio de­signed for pro­duc­tiv­ity.

625cd/m² of bright­ness

Visible in nearly any light, you can work indoors and out, wherever you’re most productive.

120Hz Refresh Rate

Experience silky smooth im­ages with a re­fresh rate dou­ble that of a stan­dard dis­play.

178° view­ing an­gles and 180° hinge

Get comfy on the couch, share your screen, or work how­ever you’re most com­fort­able.

Do you ever get a feeling like you’re being watched?

01.

Removable Webcam

With its easy-to-dis­con­nect mag­netic con­nec­tor, you can sim­ply un­plug the we­b­cam when­ever you want to en­sure that no one can ac­cess it.

02.

Built-in Storage

The built-in stor­age lets you store the we­b­cam in­side the chas­sis when you’re not us­ing it.

03.

Maximised Viewing Area

The re­mov­able we­b­cam also al­lows for min­i­mal bezels on all four sides, giv­ing you a max­i­mized view­ing area for your screen.

04.

Future Proof

With its future-proof USB connection, you can easily upgrade or replace the webcam with future accessories.

Kill Switch

The per­fect so­lu­tion for main­tain­ing con­trol over your wire­less con­nec­tiv­ity. With the sim­ple flick of a switch, you can dis­able your wire­less, en­sur­ing that it is never on un­less you want it to be. This elim­i­nates the risk of ac­ci­den­tally leav­ing your wire­less on and vul­ner­a­ble to hack­ers or other se­cu­rity threats.

A keyboard that’s pure function.

01.

Backlit Keyboard

Comfortable back­lit keys with snappy scis­sor mech­a­nisms.

02.

Media Keys

Media keys for play­back, vol­ume, bright­ness, screen­shots and more.

03.

Function Lock

Switch be­tween me­dia and tra­di­tional func­tion keys with one tap.

04.

International Layouts

Available in US English, UK English, French, German, Nordic and Spanish lay­outs.

05.

LED in­di­ca­tors

Subtle LED in­di­ca­tors built into the keys let you know when Caps Lock or Function Lock are en­abled.

The Haptic Trackpad

An over­sized solid-state track­pad; in­stead of phys­i­cal but­tons, it de­tects pres­sure and vi­brates to sim­u­late a click. It al­lows for 100% of the sur­face area to be click­able, with un­par­al­leled con­sis­tency.

Its glass surface is legendary: dyed, toughened and treated with an oleophobic coating.

Plasma Electrolytic Oxidation

Plasma elec­trolytic ox­i­da­tion (PEO) is a coat­ing tech­nol­ogy that re­sults in ce­ramic lay­ers on the tar­get ma­te­r­ial. The re­sult is a tex­tured fin­ish four times harder than steel and nat­u­rally re­sis­tant to fin­ger­prints; this durable coat­ing will with­stand the test of time.

Versatile Connectivity

01.

WiFi 6E and Bluetooth 5.3

For all things wire­less.

02.

USB-C

With Thunderbolt™ 4/USB 4 for charg­ing and ex­pan­sion.

03.

HDMI

For easy out­put vir­tu­ally every­where.

04.

USB-A

Full-size USB 3.0.

05.

USB-C

With Thunderbolt™ 4/USB 4 for charg­ing and ex­pan­sion.

06.

Combo Jack

For au­dio in­put and out­put.

07.

USB-A

Agents can now create Cloudflare accounts, buy domains, and deploy

blog.cloudflare.com

2026-04-30

6 min read


Coding agents are great at building software. But to deploy to production they need three things from the cloud where they want to host their app: an account, a way to pay, and an API token. Until now, these have been tasks that humans handle directly. Increasingly, agents handle them on the user’s behalf. The agent needs to perform all the tasks a human customer can. Agents are given higher-order problems to solve and choose to use Cloudflare and call Cloudflare APIs.

Starting to­day, agents can pro­vi­sion Cloudflare on be­half of their users. They can cre­ate a Cloudflare ac­count, start a paid sub­scrip­tion, reg­is­ter a do­main, and get back an API to­ken to de­ploy code right away. Humans can be in the loop to grant per­mis­sion and must ac­cept Cloudflare’s terms of ser­vice, but no hu­man steps are oth­er­wise re­quired from start to fin­ish. There’s no need to go to the dash­board, copy and paste API to­kens, or en­ter credit card de­tails. Without any ex­tra setup, agents have every­thing they need to de­ploy a new pro­duc­tion ap­pli­ca­tion in one shot. And with Cloudflare’s Code Mode MCP server and Agent Skills, they’re even bet­ter at it.

This all works via a new pro­to­col that we’ve co-de­signed with Stripe as part of the launch of Stripe Projects.

We’re ex­cited to launch this new part­ner­ship with Stripe, and also to of­fer $100,000 in Cloudflare cred­its to all new star­tups who in­cor­po­rate us­ing Stripe Atlas. But this new pro­to­col also makes it pos­si­ble for any plat­form with signed-in users to in­te­grate with Cloudflare in the same way Stripe does, with zero fric­tion for the end user.

How it works: zero to pro­duc­tion with­out any setup or man­ual steps

Install the Stripe CLI with the Stripe Projects plugin, log in to Stripe, and then start a new project:

stripe projects init

Then prompt your agent to build some­thing new and de­ploy it to a new do­main. You can watch a con­densed two-minute video of this en­tire flow be­low:

If the email you’re logged into Stripe with al­ready has a Cloudflare ac­count, you’ll be prompted with a typ­i­cal OAuth flow to grant the agent ac­cess. If there is no ex­ist­ing Cloudflare ac­count for the email you’re logged in with, Cloudflare will pro­vi­sion an ac­count au­to­mat­i­cally for you and your agent:

You will see the agent build and de­ploy a site to a new Cloudflare ac­count, and then use the Stripe Projects CLI to reg­is­ter the do­main:

The agent will prompt for in­put and ap­proval when nec­es­sary. For ex­am­ple, if your Stripe ac­count does­n’t yet have a linked pay­ment method, the agent will prompt you to add one:

At the end, the agent has de­ployed to pro­duc­tion, and the app runs on the newly reg­is­tered do­main:

The agent has gone from lit­eral zero, no Cloudflare ac­count at all, with­out any pre­con­fig­ured Agent Skills or MCP server, to hav­ing:

Provisioned a new Cloudflare account

Obtained an API token

Purchased a domain

Deployed an app to production

But wait — how did the agent dis­cover that it could do all of this? How did it know what ser­vices it could pro­vi­sion, and how to pur­chase a do­main? How did it gain the con­text it needed to un­der­stand how to de­ploy to Cloudflare? Let’s dig in.

How the pro­to­col and in­te­gra­tion works

There are three com­po­nents to the in­ter­ac­tion be­tween the agent, Stripe, and Cloudflare shown above:

Discovery — the agent can call a command to query the catalog of available services.

Authorization — the platform attests to the identity of the user, allowing providers to provision accounts or link existing ones, and securely issue credentials back to the agent.

Payment — the platform provides a payment token that providers can use to bill the customer, allowing the agent to start subscriptions, make purchases and be billed on a usage basis.

These build on prior art and ex­ist­ing stan­dards like OAuth, OIDC and pay­ment to­k­eniza­tion — but are used to­gether to re­move many steps that might oth­er­wise re­quire a hu­man in the loop.

Discovery: how agents find ser­vices they can pro­vi­sion them­selves

In the agent session above, before the agent ran the CLI command stripe projects add cloudflare/registrar:domain, it first had to discover the Cloudflare Registrar service. It did this by calling the stripe projects catalog command, which returns available services:

The full set of Cloudflare prod­ucts and ser­vices from other providers is long and grow­ing — ar­guably over­whelm­ing to hu­mans. But for agents, this cat­a­log of ser­vices is ex­actly the con­text they need. The agent chooses ser­vices to use from this cat­a­log based on what the user has asked them to do and the user’s pref­er­ences — but the user needs no prior knowl­edge of what ser­vices are of­fered by which providers, and does not need to pro­vide any in­put. Providers like Cloudflare make this cat­a­log avail­able via a sim­ple REST API that re­turns JSON, and that gives agents every­thing they need.

Authorization: in­stant ac­count cre­ation for new users

When the agent chooses a service and provisions it (e.g., stripe projects add cloudflare/registrar:domain), it provisions the resource within a Cloudflare account. But how is it able to create one on demand, without sending a human to a signup page?

Remember how at the start, the user signed in to their Stripe account? Stripe acts as the identity provider, attesting to the user’s identity. Cloudflare automatically provisions a new account for the user if no account already exists, and returns credentials to the Stripe Projects CLI; these are securely stored but remain available to the agent for making authenticated requests to Cloudflare. This means someone who is brand new to Cloudflare or other services can start building right away with their agent, without extra steps.

If the user al­ready has a Cloudflare ac­count, they’re sent through a stan­dard OAuth flow to grant ac­cess to the Stripe Projects CLI, al­low­ing them to pro­vi­sion re­sources on their ex­ist­ing Cloudflare ac­count.

Payment: give your agent a bud­get it can spend, with­out giv­ing it your credit card info

You might rightly worry, “What if my agent goes a bit overboard and starts buying dozens of domains? Will I end up on the hook for a massive bill? Can I really trust my agent with my credit card?”

The pro­to­col ac­counts for this in two ways. When an agent pro­vi­sions a paid ser­vice, Stripe in­cludes a pay­ment to­ken in the re­quest to the Provider (Cloudflare). Raw pay­ment de­tails like credit card num­bers aren’t ever shared with the agent. Stripe then sets a de­fault limit of $100.00 USD/month as the max­i­mum the agent can spend on any one provider. When you’re ready to raise this limit, you can then set Budget Alerts on your Cloudflare ac­count.
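The budget mechanics can be illustrated with a toy model of a scoped payment token. The class and field names below are hypothetical, but the behavior mirrors what the post describes: the provider bills through the token, the agent never sees raw card details, and charges beyond the monthly cap are refused.

```python
from dataclasses import dataclass

@dataclass
class PaymentToken:
    """Illustrative stand-in for an orchestrator-issued payment token."""
    token_id: str
    monthly_limit_usd: float = 100.00   # default cap described in the post
    spent_usd: float = 0.0

    def charge(self, amount_usd: float) -> bool:
        """Bill against the token; refuse charges that would exceed the cap."""
        if self.spent_usd + amount_usd > self.monthly_limit_usd:
            return False
        self.spent_usd += amount_usd
        return True

tok = PaymentToken("ptok_demo")
print(tok.charge(10.44))   # True: e.g., a domain registration
print(tok.charge(95.00))   # False: would exceed the $100/month cap
```

The key design property is that the cap lives with the token, not with the agent: even a misbehaving agent can only spend what the orchestrator authorized per provider.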

Any plat­form with signed-in users can in­te­grate with Cloudflare in the same way Stripe does

Any platform with signed-in users can act as the “Orchestrator”, playing the same role Stripe does with Stripe Projects, and integrate with Cloudflare.

Let’s say your prod­uct is a cod­ing agent. You’d love for peo­ple to be able to take what they’ve built and get it de­ployed to pro­duc­tion, us­ing Cloudflare and other ser­vices. But the last thing you want is to send peo­ple down a maze of au­tho­riza­tion flows and de­ci­sion trees of where and how to de­ploy it. You just want to let peo­ple ship.

Your plat­form acts as the Orchestrator, with the al­ready signed-in user. When your user needs a do­main, a stor­age bucket, a sand­box to give their agent, or any­thing else, you make one API call to Cloudflare to pro­vi­sion a new Cloudflare ac­count to them, and get back a to­ken to make au­then­ti­cated re­quests on their be­half.

Or let’s say you want Cloudflare customers to be able to easily provision your service, similar to how Cloudflare is partnering with PlanetScale to make it possible to create PlanetScale Postgres databases directly from Cloudflare. We started working with PlanetScale on this well before this new protocol got off the ground, but the flow here is quite similar. Cloudflare acts as the Orchestrator, letting you connect to your PlanetScale account, create databases, and use the user’s existing payment method for billing.

This new pro­to­col starts to stan­dard­ize the types of cross-prod­uct in­te­gra­tions that many plat­forms have been do­ing for years, of­ten in ways that were one off or be­spoke to a par­tic­u­lar plat­form. Without a stan­dard, each in­te­gra­tion re­quired en­gi­neer­ing work that of­ten could­n’t be lever­aged for fu­ture in­te­gra­tions. Similar to how the OAuth stan­dard made it pos­si­ble to del­e­gate ac­cess to your ac­count to other plat­forms, the pro­to­col uses OAuth and ex­tends fur­ther into pay­ments and ac­count cre­ation, do­ing so in a way that treats agents as a first-class con­cern.

We’re ex­cited to con­tinue evolv­ing the stan­dard, and to work with Stripe on shar­ing a more of­fi­cial spec­i­fi­ca­tion soon. We’re also ex­cited to in­te­grate with more plat­forms —  email us at [email protected], and tell us how you want your plat­form to in­te­grate with Cloudflare.

Give your agent the power to pro­vi­sion and pay

Stripe Projects is in open beta, and you can get started even if you don’t yet have a Cloudflare ac­count. Just in­stall the Stripe CLI, log in to Stripe, and then start a new pro­ject:

stripe projects init

Prompt your agent to build some­thing new on Cloudflare, and show us what you’ve built!


Computer use is 45x More Expensive Than Structured APIs

reflex.dev

We ran a bench­mark com­par­ing two ways of let­ting an AI agent op­er­ate the same ad­min panel, with the goal of putting a price tag on vi­sion agents (browser-use, com­puter-use).

Here is what we mea­sured, what we had to change to make the vi­sion agent work at all, and what changes when gen­er­at­ing an API sur­face stops be­ing a sep­a­rate en­gi­neer­ing pro­ject.

Why vi­sion agents?

Vision agents are the de­fault for let­ting AI agents op­er­ate web apps that don’t ex­pose APIs. The al­ter­na­tive, writ­ing an MCP or REST sur­face per app, is its own en­gi­neer­ing pro­ject across the 20+ in­ter­nal tools most teams have. Most teams de­fault to vi­sion agents not be­cause they are bet­ter, but be­cause the al­ter­na­tive is too ex­pen­sive to build. The cost of the vi­sion ap­proach is treated as a fixed price.

We wanted to mea­sure the price.

The setup

The test app is an ad­min panel for man­ag­ing cus­tomers, or­ders, and re­views, mod­eled on the re­act-ad­min Posters Galore demo. Two agents tar­get the same run­ning app: one dri­ves the UI via screen­shots and clicks, the other calls the ap­p’s HTTP end­points di­rectly. Same Claude Sonnet, same pinned dataset, same task. The in­ter­face is the only vari­able.

The task: find the customer named “Smith” with the most orders, locate their most recent pending order, accept all of their pending reviews, and mark the order as delivered. This touches three resources, requires filtering, pagination, cross-entity lookups, and both reads and writes. It is the shape of work a typical internal tool sees daily.

Path A: Vision agent. Claude Sonnet dri­ving the UI via browser-use 0.12. Vision mode, tak­ing screen­shots and ex­e­cut­ing clicks.

Path B: API agent. Claude Sonnet with tool-use, call­ing the han­dlers the UI calls. Each tool maps to one or more event han­dlers on the ap­p’s State, the same func­tions a but­ton click would trig­ger. The agent gets the struc­tured re­sponse back in­stead of a ren­dered page.

The vi­sion agent could­n’t com­plete the task

We started by giv­ing both agents the same six-sen­tence task above and see­ing what hap­pened.

The API agent com­pleted it in 8 calls. It listed the cus­tomer’s re­views fil­tered by pend­ing sta­tus, ac­cepted each one, and marked the or­der as de­liv­ered. Both agents are call­ing into the same ap­pli­ca­tion logic; the API agent just reads the struc­tured re­sponse di­rectly in­stead of look­ing at a ren­dered page.

The vi­sion agent, on the same prompt, found one of four pend­ing re­views, ac­cepted it, and moved on. It never pag­i­nated. The re­main­ing three re­views were be­low the vis­i­ble fold of the re­views page and the agent had no sig­nal to scroll for them.

This is not a model problem. The vision agent was reasoning about a rendered page and had no signal that the page wasn’t showing everything. The API agent calls the same handler the UI calls, but the response includes the full result set the handler returned, not just the rows currently rendered. The agent reads “page 1 of 4 with 50 results per page” directly instead of having to interpret pagination controls from pixels.

With a 14-step walk­through, it suc­ceeded

To make the comparison apples-to-apples, we rewrote the vision prompt as an explicit UI walkthrough, naming the sidebar items, tabs, and form fields the agent should interact with at each step: fourteen numbered instructions covering the navigation the agent had failed to figure out on its own.

With the walk­through, the vi­sion agent com­pleted the task. It also ran for four­teen min­utes and con­sumed about half a mil­lion in­put to­kens.

The walk­through is it­self a find­ing. Each num­bered in­struc­tion is en­gi­neer­ing work that does­n’t show up in to­ken counts but rep­re­sents real cost. Anyone de­ploy­ing a vi­sion agent against an in­ter­nal tool is ei­ther writ­ing prompts at this level of speci­ficity or ac­cept­ing that the agent will silently miss work.

How we ran it

We ran the API path five times and the vision path three times. The vision path was capped at three trials because each run takes 14–22 minutes and consumes 400–750k tokens.

Variance was the most sur­pris­ing part of the vi­sion re­sults. Across three tri­als the wall-clock time spanned 749s to 1257s, and in­put to­kens spanned 407k to 751k. The agent took 43 cy­cles in the short­est run and 68 in the longest. The screen­shot-rea­son-click loop has enough non-de­ter­min­ism that a sin­gle run is not a rep­re­sen­ta­tive cost es­ti­mate.

The API path had no such vari­ance. Sonnet hit iden­ti­cal 8 tool calls on every trial, with in­put to­ken counts vary­ing by ±27 across all five runs. The agent calls the same han­dlers in the same or­der be­cause the struc­tured re­sponses give it no rea­son to de­vi­ate.

The full re­sults

Numbers are mean ± sam­ple stan­dard de­vi­a­tion (n−1), with n=5 per API path and n=3 for the vi­sion path. Full run de­tails are avail­able in the repo.

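The reporting convention is plain sample statistics. With hypothetical trial durations (the benchmark’s actual per-run numbers are in the repo), mean ± sample standard deviation is computed as:

```python
import statistics

# Hypothetical trial durations in seconds, for illustration only;
# the benchmark's real per-run numbers live in the repo.
trials = [749.0, 1003.0, 1257.0]

mean = statistics.mean(trials)
sd = statistics.stdev(trials)   # sample std dev: n-1 in the denominator
print(f"{mean:.0f}s ± {sd:.0f}s")   # 1003s ± 254s
```

`statistics.stdev` divides by n−1 (the sample convention stated above); `statistics.pstdev` would divide by n.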

Haiku could not complete the vision path. The failure was specific to browser-use 0.12’s structured-output schema, which Haiku could not reliably produce in either vision or text-only mode. On the API path, Haiku finished in under 8 seconds for under 10k input tokens, which is the cheapest configuration we tested.

The struc­tural gap

The cost difference follows directly from the architecture. An agent that must see in order to act will always pay for the seeing, regardless of how good the model gets. Better vision models reduce error rates per screenshot, but they do not reduce the number of screenshots required to reach the relevant data. Each render is a screenshot, and each screenshot is thousands of input tokens.

Both agents in this bench­mark walk through the same ap­pli­ca­tion logic. They both fil­ter, pag­i­nate, and up­date the same way the UI does. The dif­fer­ence is what they read at each step. The vi­sion agent reads pix­els and has to ren­der every in­ter­me­di­ate state to in­ter­pret it. The API agent reads the struc­tured re­sponse from the same han­dlers, which al­ready con­tains the data the UI was go­ing to dis­play.

Better mod­els will nar­row the cost per step. They will not nar­row the step count, be­cause the step count is set by the in­ter­face.

How we jus­tify the API en­gi­neer­ing cost

The bench­mark was made cheap to run by Reflex 0.9, which in­cludes a plu­gin that auto-gen­er­ates HTTP end­points from a Reflex ap­pli­ca­tion’s event han­dlers. None of the struc­tural ar­gu­ment de­pends on Reflex specif­i­cally, but it is what made run­ning the API path pos­si­ble with­out writ­ing a sec­ond code­base.

The in­ter­est­ing ques­tion is what be­comes pos­si­ble when the en­gi­neer­ing cost of an API sur­face drops to zero. Vision agents re­main the right tool for ap­pli­ca­tions you do not con­trol: third-party SaaS prod­ucts, legacy sys­tems, any­thing you can­not mod­ify. For in­ter­nal tools you build your­self, the math now points the other way.

Notes

Vision results are specific to browser-use 0.12 in vision mode, and other vision agents may behave differently. The Path B runner shapes the auto-generated endpoints into a small REST tool surface of about thirty lines, which the agent sees as list_customers, update_order, and similar. The dataset is pinned and small (900 customers, 600 orders, 324 reviews), so behavior on production-scale data is not measured here. The vision agent runs through LangChain’s ChatAnthropic, and the API agent runs through the Anthropic SDK directly. Reported token counts are uncached input tokens.

Reproduce it

The repo in­cludes seed data gen­er­a­tion, the patched re­act-ad­min demo, both agent scripts, and raw re­sults.

Meta, Zuckerberg Sued Over Alleged Copyright Infringement by Book Publishers and Scott Turow

variety.com

In a new le­gal bat­tle in the AI space, Meta and CEO Mark Zuckerberg have been sued by five pub­lish­ers and au­thor Scott Turow, who al­lege the tech com­pany il­le­gally copied mil­lions of books, ar­ti­cles and other works to train Meta’s ar­ti­fi­cial-in­tel­li­gence sys­tems.

“In their effort to ‘win the AI arms race’ and build a functional generative AI model, Defendants Meta and Zuckerberg followed their well-known motto: ‘move fast and break things,’” the plaintiffs say in their lawsuit. “They first illegally torrented millions of copyrighted books and journal articles from notorious pirate sites and downloaded unauthorized web scrapes of virtually the entire internet. They then copied those stolen fruits many times over to train Meta’s multibillion-dollar generative AI system called Llama. In doing so, Defendants engaged in one of the most massive infringements of copyrighted materials in history.”

The suit was filed Tuesday (May 5) in the U.S. District Court for the Southern District of New York by five publishers (Hachette, Macmillan, McGraw Hill, Elsevier and Cengage) and Turow individually. The proposed class-action suit seeks unspecified monetary damages for the alleged copyright infringement. A copy of the lawsuit is available at this link.

Asked for comment, a Meta spokesperson said, “AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use. We will fight this lawsuit aggressively.”

Authors have sued AI com­pa­nies for copy­right in­fringe­ment be­fore — and lost.

For example, in June 2025, a federal judge rejected a claim brought by 13 authors, including Sarah Silverman and Junot Díaz, that Meta violated their copyrights by training its AI model on their books. Judge Vincent Chhabria ruled that Meta had engaged in “fair use” when it used a data set of nearly 200,000 books to train its Llama language model for generative AI.

But the latest lawsuit alleges that Meta and Zuckerberg deliberately circumvented copyright-protection mechanisms — and had considered paying to license the works before abandoning that strategy at Zuckerberg’s “personal instruction.” The suit essentially argues that the conduct described falls outside protections afforded by fair-use provisions of the U.S. copyright code.

“Meta — at Zuckerberg’s direction — copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit says. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”

According to the lawsuit, after the release of Llama 1, Meta briefly considered entering into licensing deals with major publishers. Meta discussed increasing the company’s “dataset licensing” budget to as much as $200 million from January to April 2023, per the complaint.

But then in early April 2023, Meta “abruptly stopped its licensing strategy,” according to the lawsuit. “The question of whether to license or pirate [copyrighted material] moving forward was ‘escalated’ to Zuckerberg. After this escalation to Zuckerberg, Meta’s business development team received verbal instructions to stop licensing efforts. One Meta employee presciently described the rationale: ‘if we license once [sic] single book, we won’t be able to lean into the fair use strategy.’”

According to the lawsuit, Meta and Zuckerberg are “well aware of the market for licensing AI training materials.” Meta signed four licenses in 2022 with African-language book publishers for a limited training set, and it “subsequently reached licensing agreements with major news publishers including Fox News, CNN and USA Today,” the suit says.

On Dec. 13, 2023, Meta employees internally circulated a memo concerning the legal risks of using LibGen, a repository of copyrighted material that the Meta memo described as “a dataset we know to be pirated” and added that “we would not disclose use of Libgen datasets used to train,” per the suit. Ultimately, however, those concerns went unheeded. Zuckerberg and other Meta executives “authorized and directed the torrenting of over 267 TB of pirated material — equivalent to hundreds of millions of publications and many times the size of the entire print collection of the Library of Congress,” according to the lawsuit.

As a result of the alleged infringement, Meta’s AI system “readily generates, at speed and scale, substitutes for Plaintiffs’ and the Class’s works on which it was trained,” the lawsuit states. “Those substitutes take multiple forms, including verbatim and near-verbatim copies, replacement chapters of academic textbooks, summaries and alternative versions of famous novels and journal articles, inferior knockoffs that copy creative elements of original works, and derivative works exclusively reserved to rights holders. Llama even tailors outputs to mimic the expressive elements and creative choices of specific authors.”

A dispute over the TAB key highlights a mismatch between Microsoft and IBM organizational structures

devblogs.microsoft.com

I’ve writ­ten in the past about the cul­tural mis­match be­tween Microsoft and IBM dur­ing the col­lab­o­ra­tion on OS/2, with the Microsofties view­ing their IBM col­leagues as mired in point­less bu­reau­cracy and the IBM folks view­ing Microsofties as undis­ci­plined hack­ers.¹

One of many points of mis­match was the or­ga­ni­za­tional struc­ture.

A col­league re­calls that while he was as­signed to the IBM of­fices in Boca Raton, Florida, there was a dis­pute over what key should be used to move from one field to an­other in di­a­log boxes. The folks at IBM were not happy with my col­league’s de­ci­sion to use the TAB key, so they asked him to es­ca­late the is­sue to his man­ager back in Redmond.

My colleague’s manager replied, “The reason you are in Boca is to make these decisions so I don’t have to be in Boca.”

My colleague rephrased this reply in a more corporate manner before passing it on to IBM: “Microsoft supports the use of the TAB key for this purpose.”

Unsatisfied, the IBM folks escalated the issue several levels up their organizational chain and replied that their VP (who was around seven levels of management above the programmers) was absolutely opposed to the use of the TAB key for this purpose, and they wanted confirmation from the equivalent-level manager at Microsoft that Microsoft stood by the choice of the TAB key.

My colleague replied, “Bill Gates’s mother is not interested in the TAB key.”

This ap­par­ently ended the dis­cus­sion, and the TAB key stayed.

Note: This up­com­ing Sunday is Mother’s Day in the United States. You prob­a­bly should­n’t ask her for her opin­ion on the TAB key.

¹ There was prob­a­bly merit to both ar­gu­ments.

Author

Raymond has been in­volved in the evo­lu­tion of Windows for more than 30 years. In 2003, he be­gan a Web site known as The Old New Thing which has grown in pop­u­lar­ity far be­yond his wildest imag­i­na­tion, a de­vel­op­ment which still gives him the hee­bie-jee­bies. The Web site spawned a book, co­in­ci­den­tally also ti­tled The Old New Thing (Addison Wesley 2007). He oc­ca­sion­ally ap­pears on the Windows Dev Docs Twitter ac­count to tell sto­ries which con­vey no use­ful in­for­ma­tion.
