10 interesting stories served every morning and every evening.




1 607 shares, 45 trendiness

Open Source Zero Trust Networking

...

Read the original on netbird.io »

2 356 shares, 106 trendiness

How I Taught My Neighbor to Keep the Volume Down

When I moved to a new apart­ment with my fam­ily, the ca­ble com­pany we were used to was­n’t avail­able. We had to set­tle for Dish Network. I was­n’t too happy about mak­ing that switch, but some­thing on their web­site caught my at­ten­tion. For an ad­di­tional $5 a month, I could have ac­cess to DVR. I switched im­me­di­ately.

This was 2007. DVR was not new, but it was­n’t com­monly bun­dled with set-top boxes. TiVo was still the pop­u­lar way to record, pause, and rewind live TV. We re­ceived two set-top boxes, one for each room with a TV, and three re­motes. Two re­motes had IR (infrared) blasters and, sur­pris­ingly, one RF (radio fre­quency) re­mote.

After us­ing the RF re­mote, I won­dered: Why would any­one ever use an IR re­mote again? You did­n’t need a di­rect line of sight with the de­vice you were con­trol­ling. I could ac­tu­ally stand in the kitchen and con­trol the TV. It was amaz­ing. But with the con­ve­nience of RF came other prob­lems that IR users never had to worry about. Interference.

After sev­eral months of en­joy­ing my ser­vice, one of my neigh­bors, the loud­est in the build­ing, also switched to Dish Network. And he also got the RF re­mote. This was the type of neigh­bor who would leave the house with the TV on, vol­ume blast­ing.

One day, I was in the liv­ing room watch­ing TV when the chan­nel just flipped. I must have ac­ci­den­tally hit a but­ton, so I changed it back. But not a few sec­onds later, the chan­nel changed again. Then the vol­ume went up. I fig­ured my sis­ter must have had the RF re­mote and was mess­ing with me. But no, the re­mote was in my hand. I as­sumed some­thing was wrong with it.

The whole time I was watch­ing TV, the chan­nels kept ran­domly switch­ing. I banged the re­mote on the table a cou­ple of times, but it still switched. I re­moved the bat­ter­ies from the re­mote, it still switched. I un­plugged the de­vice for a few min­utes, plugged it back in, and… it still switched. Frustrated, I went through the de­vice set­tings and dis­abled the RF re­mote. That’s when it fi­nally stopped. I was­n’t happy with this so­lu­tion, but it al­lowed me to watch TV un­til I fig­ured some­thing out.

One evening, when everyone was asleep and the neighbor was watching a loud TV show, I decided to diagnose the issue. The moment I pressed the power button on the RF remote, my TV and set-top box turned on, and the neighbor’s TV went silent. “Fuck!” I heard someone say. I was confused. Did I just do that? The TV turned back on, the volume went up. I walked to the window armed with the remote. I counted to three, then pressed the power button. My neighbor’s TV went silent. He growled.

I am the cap­tain now.

Every time he turned the TV on, I pressed the power but­ton again and his de­vice went off. Well, what do you know? We had in­ter­fer­ence some­how. Our re­motes were set up to op­er­ate at the same fre­quency. Each re­mote con­trolled both de­vices.

But I’m not that kind of neigh­bor. I was­n’t go­ing to con­tinue to mess with him. Instead, I de­cided I would pay him a visit in the morn­ing and ex­plain that our re­motes are tuned to the same fre­quency. I would bring the RF re­mote with me just to show him a demo. I was go­ing to be a good neigh­bor.

In the morn­ing, I went down­stairs, re­mote in hand. I knocked on the door, and a gen­tle­man in his for­ties an­swered the door. I had re­hearsed my speech and pre­sen­ta­tion. This would be a good op­por­tu­nity to build a good rap­port, and have a shared story. Maybe he would tell me how he felt when the TV went off. How he thought there was a ghost in the house or some­thing. But that’s not what hap­pened.

“Hi, I’m Ibrahim. Your upstairs neighbor…” I started and was interrupted almost immediately. “Whatever you are selling,” he yelled. “I’m not buying.” And he closed the door in my face. I knocked a second time, because obviously there was a misunderstanding. He never answered. Instead, the TV turned on and a movie played at high volume. So much for my prepared speech.

The RF settings on my set-top box remained turned off. My family never discovered its benefit anyway; they always pointed at the box when pressing the buttons. It wasn’t much of an inconvenience. In fact, I later found in the manual that you could reprogram the device and remote to use a different frequency. I did not reprogram my remote. Instead, my family used the two IR remotes and brought the RF remote into my bedroom, where it permanently remained on my nightstand.

Why in the bedroom? Because I decided to teach my neighbor some good manners. Whenever he turned up his volume, I would simply turn off his device. I would hear his frustration, and his attempts at solving the problem. Like a circus animal trainer, I remained consistent. If the volume of his TV went above what I imagined to be 15 to 20, I would press the power button. It became a routine for me for weeks. Some nights were difficult; I would keep the remote under my pillow, battling my stubborn neighbor all night.

One day, I no­ticed that I had­n’t pressed the but­ton in days. I opened the win­dow and I could still hear the faint sound of his TV. Through trial and er­ror, he learned the les­son. If the vol­ume re­mained un­der my ar­bi­trary thresh­old, the TV would re­main on. But as soon as he passed that thresh­old, the de­vice would turn off.

Sometimes, he would have company and there would be noise coming out of his apartment. I used the one tool in my toolbox to send him a message: turn off the TV. All of a sudden, my neighbor and his guests would be reminded of the unspoken rules and become mindful of their neighbors.

Maybe somewhere on the web, in some obscure forum, someone asked the question: “Why does my set-top box turn off when I increase the volume?” Well, it might be 18 years too late, but there’s your answer. There is a man out there who religiously sets his volume to 18. He doesn’t quite know why. That’s Pavlovian conditioning at its best.


...

Read the original on idiallo.com »

3 320 shares, 26 trendiness

What I learned building an opinionated and minimal coding agent

In the past three years, I’ve been us­ing LLMs for as­sisted cod­ing. If you read this, you prob­a­bly went through the same evo­lu­tion: from copy­ing and past­ing code into ChatGPT, to Copilot auto-com­ple­tions (which never worked for me), to Cursor, and fi­nally the new breed of cod­ing agent har­nesses like Claude Code, Codex, Amp, Droid, and open­code that be­came our daily dri­vers in 2025.

I pre­ferred Claude Code for most of my work. It was the first thing I tried back in April af­ter us­ing Cursor for a year and a half. Back then, it was much more ba­sic. That fit my work­flow per­fectly, be­cause I’m a sim­ple boy who likes sim­ple, pre­dictable tools. Over the past few months, Claude Code has turned into a space­ship with 80% of func­tion­al­ity I have no use for. The sys­tem prompt and tools also change on every re­lease, which breaks my work­flows and changes model be­hav­ior. I hate that. Also, it flick­ers.

I’ve also built a bunch of agents over the years, of var­i­ous com­plex­ity. For ex­am­ple, Sitegeist, my lit­tle browser-use agent, is es­sen­tially a cod­ing agent that lives in­side the browser. In all that work, I learned that con­text en­gi­neer­ing is para­mount. Exactly con­trol­ling what goes into the mod­el’s con­text yields bet­ter out­puts, es­pe­cially when it’s writ­ing code. Existing har­nesses make this ex­tremely hard or im­pos­si­ble by in­ject­ing stuff be­hind your back that is­n’t even sur­faced in the UI.

Speaking of sur­fac­ing things, I want to in­spect every as­pect of my in­ter­ac­tions with the model. Basically no har­ness al­lows that. I also want a cleanly doc­u­mented ses­sion for­mat I can post-process au­to­mat­i­cally, and a sim­ple way to build al­ter­na­tive UIs on top of the agent core. While some of this is pos­si­ble with ex­ist­ing har­nesses, the APIs smell like or­ganic evo­lu­tion. These so­lu­tions ac­cu­mu­lated bag­gage along the way, which shows in the de­vel­oper ex­pe­ri­ence. I’m not blam­ing any­one for this. If tons of peo­ple use your shit and you need some sort of back­wards com­pat­i­bil­ity, that’s the price you pay.

I’ve also dabbled in self-hosting, both locally and on DataCrunch. While some harnesses like opencode support self-hosted models, that support usually doesn’t work well. Mostly because they rely on libraries like the Vercel AI SDK, which doesn’t play nice with self-hosted models for some reason, specifically when it comes to tool calling.

So what’s an old guy yelling at Claudes go­ing to do? He’s go­ing to write his own cod­ing agent har­ness and give it a name that’s en­tirely un-Google-able, so there will never be any users. Which means there will also never be any is­sues on the GitHub is­sue tracker. How hard can it be?

To make this work, I needed to build:

* pi-ai: A uni­fied LLM API with multi-provider sup­port (Anthropic, OpenAI, Google, xAI, Groq, Cerebras, OpenRouter, and any OpenAI-compatible end­point), stream­ing, tool call­ing with TypeBox schemas, think­ing/​rea­son­ing sup­port, seam­less cross-provider con­text hand­offs, and to­ken and cost track­ing.

* pi-agent-core: An agent loop that han­dles tool ex­e­cu­tion, val­i­da­tion, and event stream­ing.

* pi-tui: A min­i­mal ter­mi­nal UI frame­work with dif­fer­en­tial ren­der­ing, syn­chro­nized out­put for (almost) flicker-free up­dates, and com­po­nents like ed­i­tors with au­to­com­plete and mark­down ren­der­ing.

* pi-cod­ing-agent: The ac­tual CLI that wires it all to­gether with ses­sion man­age­ment, cus­tom tools, themes, and pro­ject con­text files.

My phi­los­o­phy in all of this was: if I don’t need it, it won’t be built. And I don’t need a lot of things.

I’m not go­ing to bore you with the API specifics of this pack­age. You can read it all in the README.md. Instead, I want to doc­u­ment the prob­lems I ran into while cre­at­ing a uni­fied LLM API and how I re­solved them. I’m not claim­ing my so­lu­tions are the best, but they’ve been work­ing pretty well through­out var­i­ous agen­tic and non-agen­tic LLM pro­jects.

There are really only four APIs you need to speak in order to talk to pretty much any LLM provider: OpenAI’s Completions API, their newer Responses API, Anthropic’s Messages API, and Google’s Generative AI API.

They’re all pretty sim­i­lar in fea­tures, so build­ing an ab­strac­tion on top of them is­n’t rocket sci­ence. There are, of course, provider-spe­cific pe­cu­liar­i­ties you have to care for. That’s es­pe­cially true for the Completions API, which is spo­ken by pretty much all providers, but each of them has a dif­fer­ent un­der­stand­ing of what this API should do. For ex­am­ple, while OpenAI does­n’t sup­port rea­son­ing traces in their Completions API, other providers do in their ver­sion of the Completions API. This is also true for in­fer­ence en­gines like llama.cpp, Ollama, vLLM, and LM Studio.

For ex­am­ple, in ope­nai-com­ple­tions.ts:

* Cerebras, xAI, Mistral, and Chutes don’t like the store field

* Mistral and Chutes use max_tokens instead of max_completion_tokens

* Cerebras, xAI, Mistral, and Chutes don’t support the developer role for system prompts

* Different providers return reasoning content in different fields (reasoning_content vs reasoning; see the sketch below)
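
To give a flavor of what that normalization looks like in practice, here is a minimal sketch (the names and structure are illustrative, not pi-ai’s actual code):

interface CompletionsRequest {
  model: string;
  messages: { role: string; content: string }[];
  store?: boolean;
  max_completion_tokens?: number;
  max_tokens?: number;
}

// Illustrative per-provider tweaks of the kind listed above.
function normalizeRequest(provider: string, req: CompletionsRequest): CompletionsRequest {
  const out = { ...req, messages: [...req.messages] };
  const quirky = ["cerebras", "xai", "mistral", "chutes"];
  if (quirky.includes(provider)) {
    delete out.store; // these providers reject the store field
    // no developer role: downgrade it to a plain system message
    out.messages = out.messages.map(m => (m.role === "developer" ? { ...m, role: "system" } : m));
  }
  if (["mistral", "chutes"].includes(provider) && out.max_completion_tokens !== undefined) {
    out.max_tokens = out.max_completion_tokens; // legacy field name
    delete out.max_completion_tokens;
  }
  return out;
}

// Reasoning text may arrive under different keys depending on the provider.
function extractReasoning(delta: { reasoning_content?: string; reasoning?: string }): string | undefined {
  return delta.reasoning_content ?? delta.reasoning;
}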

To en­sure all fea­tures ac­tu­ally work across the gazil­lion of providers, pi-ai has a pretty ex­ten­sive test suite cov­er­ing im­age in­puts, rea­son­ing traces, tool call­ing, and other fea­tures you’d ex­pect from an LLM API. Tests run across all sup­ported providers and pop­u­lar mod­els. While this is a good ef­fort, it still won’t guar­an­tee that new mod­els and providers will just work out of the box.

Another big dif­fer­ence is how providers re­port to­kens and cache reads/​writes. Anthropic has the san­est ap­proach, but gen­er­ally it’s the Wild West. Some re­port to­ken counts at the start of the SSE stream, oth­ers only at the end, mak­ing ac­cu­rate cost track­ing im­pos­si­ble if a re­quest is aborted. To add in­sult to in­jury, you can’t pro­vide a unique ID to later cor­re­late with their billing APIs and fig­ure out which of your users con­sumed how many to­kens. So pi-ai does to­ken and cache track­ing on a best-ef­fort ba­sis. Good enough for per­sonal use, but not for ac­cu­rate billing if you have end users con­sum­ing to­kens through your ser­vice.

Special shout-out to Google, who to this day seem to not support tool call streaming, which is extremely Google.

pi-ai also works in the browser, which is use­ful for build­ing web-based in­ter­faces. Some providers make this es­pe­cially easy by sup­port­ing CORS, specif­i­cally Anthropic and xAI.

Context hand­off be­tween providers was a fea­ture pi-ai was de­signed for from the start. Since each provider has their own way of track­ing tool calls and think­ing traces, this can only be a best-ef­fort thing. For ex­am­ple, if you switch from Anthropic to OpenAI mid-ses­sion, Anthropic think­ing traces are con­verted to con­tent blocks in­side as­sis­tant mes­sages, de­lim­ited by tags. This may or may not be sen­si­ble, be­cause the think­ing traces re­turned by Anthropic and OpenAI don’t ac­tu­ally rep­re­sent what’s hap­pen­ing be­hind the scenes.

These providers also in­sert signed blobs into the event stream that you have to re­play on sub­se­quent re­quests con­tain­ing the same mes­sages. This also ap­plies when switch­ing mod­els within a provider. It makes for a cum­ber­some ab­strac­tion and trans­for­ma­tion pipeline in the back­ground.

I’m happy to report that cross-provider context handoff and context serialization/deserialization work pretty well in pi-ai.

Speaking of mod­els, I wanted a type­safe way of spec­i­fy­ing them in the get­Model call. For that I needed a model reg­istry that I could turn into TypeScript types. I’m pars­ing data from both OpenRouter and mod­els.dev (created by the open­code folks, thanks for that, it’s su­per use­ful) into mod­els.gen­er­ated.ts. This in­cludes to­ken costs and ca­pa­bil­i­ties like im­age in­puts and think­ing sup­port.

And if I ever need to add a model that’s not in the reg­istry, I wanted a type sys­tem that makes it easy to cre­ate new ones. This is es­pe­cially use­ful when work­ing with self-hosted mod­els, new re­leases that aren’t yet on mod­els.dev or OpenRouter, or try­ing out one of the more ob­scure LLM providers:

import { Model, stream } from '@mariozechner/pi-ai';

const ollamaModel: Model

Many uni­fied LLM APIs com­pletely ig­nore pro­vid­ing a way to abort re­quests. This is en­tirely un­ac­cept­able if you want to in­te­grate your LLM into any kind of pro­duc­tion sys­tem. Many uni­fied LLM APIs also don’t re­turn par­tial re­sults to you, which is kind of ridicu­lous. pi-ai was de­signed from the be­gin­ning to sup­port aborts through­out the en­tire pipeline, in­clud­ing tool calls. Here’s how it works:

import { getModel, stream } from '@mariozechner/pi-ai';

const model = getModel('openai', 'gpt-5.1-codex');
const controller = new AbortController();

// Abort after 2 seconds
setTimeout(() => controller.abort(), 2000);

const s = stream(model, {
  messages: [{ role: 'user', content: 'Write a long story' }],
  signal: controller.signal
});

for await (const event of s) {
  if (event.type === 'text_delta') {
    process.stdout.write(event.delta);
  } else if (event.type === 'error') {
    console.log(`${event.reason === 'aborted' ? 'Aborted' : 'Error'}:`, event.error.errorMessage);
  }
}

// Get results (may be partial if aborted)
const response = await s.result();
if (response.stopReason === 'aborted') {
  console.log('Partial content:', response.content);
}

Another ab­strac­tion I haven’t seen in any uni­fied LLM API is split­ting tool re­sults into a por­tion handed to the LLM and a por­tion for UI dis­play. The LLM por­tion is gen­er­ally just text or JSON, which does­n’t nec­es­sar­ily con­tain all the in­for­ma­tion you’d want to show in a UI. It also sucks hard to parse tex­tual tool out­puts and re­struc­ture them for dis­play in a UI. pi-ai’s tool im­ple­men­ta­tion al­lows re­turn­ing both con­tent blocks for the LLM and sep­a­rate con­tent blocks for UI ren­der­ing. Tools can also re­turn at­tach­ments like im­ages that get at­tached in the na­tive for­mat of the re­spec­tive provider. Tool ar­gu­ments are au­to­mat­i­cally val­i­dated us­ing TypeBox schemas and AJV, with de­tailed er­ror mes­sages when val­i­da­tion fails:
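
As a rough illustration of that split (the field names here are made up for the example, not pi-ai’s actual tool types):

interface TextBlock { type: "text"; text: string }

interface ToolResult {
  llmContent: TextBlock[]; // compact output handed back to the model
  uiContent: TextBlock[];  // richer blocks the UI can render directly
}

// A bash-like tool might give the model the raw exit status and output,
// while the UI gets a prettier, truncated transcript.
function bashResult(command: string, exitCode: number, output: string): ToolResult {
  return {
    llmContent: [{ type: "text", text: `exit code ${exitCode}\n${output}` }],
    uiContent: [{ type: "text", text: `$ ${command}\n${output.slice(0, 2000)}` }],
  };
}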

What’s still lack­ing is tool re­sult stream­ing. Imagine a bash tool where you want to dis­play ANSI se­quences as they come in. That’s cur­rently not pos­si­ble, but it’s a sim­ple fix that will even­tu­ally make it into the pack­age.

Partial JSON pars­ing dur­ing tool call stream­ing is es­sen­tial for good UX. As the LLM streams tool call ar­gu­ments, pi-ai pro­gres­sively parses them so you can show par­tial re­sults in the UI be­fore the call com­pletes. For ex­am­ple, you can dis­play a diff stream­ing in as the agent rewrites a file.

Finally, pi-ai pro­vides an agent loop that han­dles the full or­ches­tra­tion: pro­cess­ing user mes­sages, ex­e­cut­ing tool calls, feed­ing re­sults back to the LLM, and re­peat­ing un­til the model pro­duces a re­sponse with­out tool calls. The loop also sup­ports mes­sage queu­ing via a call­back: af­ter each turn, it asks for queued mes­sages and in­jects them be­fore the next as­sis­tant re­sponse. The loop emits events for every­thing, mak­ing it easy to build re­ac­tive UIs.
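
Stripped of events and error handling, the core of such a loop is roughly this (a sketch of the orchestration described above, not the actual pi-agent-core code):

interface ToolCall { name: string; args: unknown }
interface Message { role: "user" | "assistant" | "tool"; content: string }
interface AssistantMessage extends Message { role: "assistant"; toolCalls: ToolCall[] }

async function agentLoop(
  complete: (history: Message[]) => Promise<AssistantMessage>,
  runTool: (call: ToolCall) => Promise<Message>,
  dequeueUserMessages: () => Message[],
  history: Message[]
): Promise<Message[]> {
  while (true) {
    const reply = await complete(history);
    history.push(reply);
    if (reply.toolCalls.length === 0) return history; // no tool calls: the model is done
    for (const call of reply.toolCalls) {
      history.push(await runTool(call)); // feed tool results back to the model
    }
    history.push(...dequeueUserMessages()); // inject any queued messages before the next turn
  }
}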

The agent loop does­n’t let you spec­ify max steps or sim­i­lar knobs you’d find in other uni­fied LLM APIs. I never found a use case for that, so why add it? The loop just loops un­til the agent says it’s done. On top of the loop, how­ever, pi-agent-core pro­vides an Agent class with ac­tu­ally use­ful stuff: state man­age­ment, sim­pli­fied event sub­scrip­tions, mes­sage queu­ing with two modes (one-at-a-time or all-at-once), at­tach­ment han­dling (images, doc­u­ments), and a trans­port ab­strac­tion that lets you run the agent ei­ther di­rectly or through a proxy.

Am I happy with pi-ai? For the most part, yes. Like any uni­fy­ing API, it can never be per­fect due to leaky ab­strac­tions. But it’s been used in seven dif­fer­ent pro­duc­tion pro­jects and has served me ex­tremely well.

Why build this in­stead of us­ing the Vercel AI SDK? Armin’s blog post mir­rors my ex­pe­ri­ence. Building on top of the provider SDKs di­rectly gives me full con­trol and lets me de­sign the APIs ex­actly as I want, with a much smaller sur­face area. Armin’s blog gives you a more in-depth trea­tise on the rea­sons for build­ing your own. Go read that.

I grew up in the DOS era, so terminal user interfaces are what I grew up with. From the fancy setup programs for Doom to Borland products, TUIs were with me until the end of the 90s. And boy was I fucking happy when I eventually switched to a GUI operating system. While TUIs are mostly portable and easily streamable, they also suck at information density. Having said all that, I thought starting with a terminal user interface for pi made the most sense. I could strap on a GUI later whenever I felt like I needed to.

So why build my own TUI frame­work? I’ve looked into the al­ter­na­tives like Ink, Blessed, OpenTUI, and so on. I’m sure they’re all fine in their own way, but I def­i­nitely don’t want to write my TUI like a React app. Blessed seems to be mostly un­main­tained, and OpenTUI is ex­plic­itly not pro­duc­tion ready. Also, writ­ing my own TUI frame­work on top of Node.js seemed like a fun lit­tle chal­lenge.

Writing a terminal user interface is not rocket science per se. You just have to pick your poison. There are basically two ways to do it. One is to take ownership of the terminal viewport (the portion of the terminal contents you can actually see) and treat it like a pixel buffer. Instead of pixels you have cells that contain characters with background color, foreground color, and styling like italic and bold. I call these full screen TUIs. Amp and opencode use this approach.

The draw­back is that you lose the scroll­back buffer, which means you have to im­ple­ment cus­tom search. You also lose scrolling, which means you have to sim­u­late scrolling within the view­port your­self. While this is not hard to im­ple­ment, it means you have to re-im­ple­ment all the func­tion­al­ity your ter­mi­nal em­u­la­tor al­ready pro­vides. Mouse scrolling specif­i­cally al­ways feels kind of off in such TUIs.

The second approach is to just write to the terminal like any CLI program, appending content to the scrollback buffer, only occasionally moving the “rendering cursor” back up a little within the visible viewport to redraw things like animated spinners or a text edit field. It’s not exactly that simple, but you get the idea. This is what Claude Code, Codex, and Droid do.

Coding agents have this nice property that they’re basically a chat interface. The user writes a prompt, followed by replies from the agent and tool calls and their results. Everything is nicely linear, which lends itself well to working with the “native” terminal emulator. You get to use all the built-in functionality like natural scrolling and search within the scrollback buffer. It also limits what your TUI can do to some degree, which I find charming because constraints make for minimal programs that just do what they’re supposed to do without superfluous fluff. This is the direction I picked for pi-tui.

If you’ve done any GUI pro­gram­ming, you’ve prob­a­bly heard of re­tained mode vs im­me­di­ate mode. In a re­tained mode UI, you build up a tree of com­po­nents that per­sist across frames. Each com­po­nent knows how to ren­der it­self and can cache its out­put if noth­ing changed. In an im­me­di­ate mode UI, you re­draw every­thing from scratch each frame (though in prac­tice, im­me­di­ate mode UIs also do caching, oth­er­wise they’d fall apart).

pi-tui uses a sim­ple re­tained mode ap­proach. A Component is just an ob­ject with a ren­der(width) method that re­turns an ar­ray of strings (lines that fit the view­port hor­i­zon­tally, with ANSI es­cape codes for col­ors and styling) and an op­tional han­dleIn­put(data) method for key­board in­put. A Container holds a list of com­po­nents arranged ver­ti­cally and col­lects all their ren­dered lines. The TUI class is it­self a con­tainer that or­ches­trates every­thing.
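
In code, the contract is roughly this (simplified; the exact pi-tui signatures may differ slightly):

interface Component {
  // Lines that fit the given width, including ANSI escape codes for styling.
  render(width: number): string[];
  // Optional keyboard input handler.
  handleInput?(data: string): void;
}

class Container implements Component {
  private children: Component[] = [];

  add(child: Component): void {
    this.children.push(child);
  }

  render(width: number): string[] {
    // Stack children vertically by concatenating their rendered lines.
    return this.children.flatMap(child => child.render(width));
  }
}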

When the TUI needs to up­date the screen, it asks each com­po­nent to ren­der. Components can cache their out­put: an as­sis­tant mes­sage that’s fully streamed does­n’t need to re-parse mark­down and re-ren­der ANSI se­quences every time. It just re­turns the cached lines. Containers col­lect lines from all chil­dren. The TUI gath­ers all these lines and com­pares them to the lines it pre­vi­ously ren­dered for the pre­vi­ous com­po­nent tree. It keeps a back­buffer of sorts, re­mem­ber­ing what was writ­ten to the scroll­back buffer.

Then it only re­draws what changed, us­ing a method I call dif­fer­en­tial ren­der­ing. I’m very bad with names, and this likely has an of­fi­cial name.

Here’s a sim­pli­fied demo that il­lus­trates what ex­actly gets re­drawn.

* First render: Just output all lines to the terminal

* Width changed: Clear screen completely and re-render everything (soft wrapping changes)

* Normal update: Find the first line that differs from what’s on screen, move the cursor to that line, and re-render from there to the end

There’s one catch: if the first changed line is above the vis­i­ble view­port (the user scrolled up), we have to do a full clear and re-ren­der. The ter­mi­nal does­n’t let you write to the scroll­back buffer above the view­port.
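
A stripped-down sketch of that diffing step might look like this (assuming the cursor sits just below the previously rendered lines; the real implementation also handles width changes and the scrolled-above-viewport case):

function diffRender(prev: string[], next: string[], write: (s: string) => void): void {
  // Find the first line that differs from what is already on screen.
  let first = 0;
  while (first < prev.length && first < next.length && prev[first] === next[first]) {
    first++;
  }
  if (first === prev.length && first === next.length) return; // nothing changed

  const moveUp = prev.length - first;
  if (moveUp > 0) write(`\x1b[${moveUp}A\r`); // cursor up to the first changed line
  write("\x1b[0J");                            // clear from cursor to end of screen
  for (const line of next.slice(first)) {
    write(line + "\n");                        // re-render from there to the end
  }
}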

To pre­vent flicker dur­ing up­dates, pi-tui wraps all ren­der­ing in syn­chro­nized out­put es­cape se­quences (CSI ?2026h and CSI ?2026l). This tells the ter­mi­nal to buffer all the out­put and dis­play it atom­i­cally. Most mod­ern ter­mi­nals sup­port this.
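
Wrapping a redraw looks like this (using the escape sequences mentioned above):

function withSynchronizedOutput(write: (s: string) => void, redraw: () => void): void {
  write("\x1b[?2026h"); // begin synchronized update: the terminal buffers everything
  try {
    redraw();
  } finally {
    write("\x1b[?2026l"); // end synchronized update: display the buffered output atomically
  }
}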

How well does it work and how much does it flicker? In any ca­pa­ble ter­mi­nal like Ghostty or iTerm2, this works bril­liantly and you never see any flicker. In less for­tu­nate ter­mi­nal im­ple­men­ta­tions like VS Code’s built-in ter­mi­nal, you will get some flicker de­pend­ing on the time of day, your dis­play size, your win­dow size, and so on. Given that I’m very ac­cus­tomed to Claude Code, I haven’t spent any more time op­ti­miz­ing this. I’m happy with the lit­tle flicker I get in VS Code. I would­n’t feel at home oth­er­wise. And it still flick­ers less than Claude Code.

How waste­ful is this ap­proach? We store an en­tire scroll­back buffer worth of pre­vi­ously ren­dered lines, and we re-ren­der lines every time the TUI is asked to ren­der it­self. That’s al­le­vi­ated with the caching I de­scribed above, so the re-ren­der­ing is­n’t a big deal. We still have to com­pare a lot of lines with each other. Realistically, on com­put­ers younger than 25 years, this is not a big deal, both in terms of per­for­mance and mem­ory use (a few hun­dred kilo­bytes for very large ses­sions). Thanks V8. What I get in re­turn is a dead sim­ple pro­gram­ming model that lets me it­er­ate quickly.

I don’t need to ex­plain what fea­tures you should ex­pect from a cod­ing agent har­ness. pi comes with most crea­ture com­forts you’re used to from other tools:

* Runs on Windows, Linux, and ma­cOS (or any­thing with a Node.js run­time and a ter­mi­nal)

* Message queu­ing while the agent is work­ing

If you want the full run­down, read the README. What’s more in­ter­est­ing is where pi de­vi­ates from other har­nesses in phi­los­o­phy and im­ple­men­ta­tion.

That’s it. The only thing that gets injected at the bottom is your AGENTS.md file. Both the global one that applies to all your sessions and the project-specific one stored in your project directory. This is where you can customize pi to your liking. You can even replace the full system prompt if you want to. Compare that to, for example, Claude Code’s system prompt, Codex’s system prompt, or opencode’s model-specific prompts (the Claude one is a cut-down version of the original Claude Code prompt they copied).

You might think this is crazy. In all likelihood, the models have some training on their native coding harness. So using the native system prompt, or something close to it like opencode does, would seem ideal. But it turns out that all the frontier models have been RL-trained up the wazoo, so they inherently understand what a coding agent is. There does not appear to be a need for 10,000 tokens of system prompt, as we’ll find out later in the benchmark section, and as I’ve anecdotally found out by exclusively using pi for the past few weeks. Amp, while copying some parts of the native system prompts, seems to also do just fine with its own prompt.

Here are the tool de­f­i­n­i­tions:

read

Read the con­tents of a file. Supports text files and im­ages (jpg, png,

gif, webp). Images are sent as at­tach­ments. For text files, de­faults to

first 2000 lines. Use off­set/​limit for large files.

- path: Path to the file to read (relative or ab­solute)

- off­set: Line num­ber to start read­ing from (1-indexed)

- limit: Maximum num­ber of lines to read

write

Write con­tent to a file. Creates the file if it does­n’t ex­ist, over­writes

if it does. Automatically cre­ates par­ent di­rec­to­ries.

- path: Path to the file to write (relative or ab­solute)

- con­tent: Content to write to the file

edit

Edit a file by re­plac­ing ex­act text. The old­Text must match ex­actly

(including white­space). Use this for pre­cise, sur­gi­cal ed­its.

- path: Path to the file to edit (relative or ab­solute)

- old­Text: Exact text to find and re­place (must match ex­actly)

- new­Text: New text to re­place the old text with

bash

...

Read the original on mariozechner.at »

4 313 shares, 9 trendiness

Swift is a more convenient Rust

Rust is one of the most loved languages out there, is fast, and has an amazing community. Rust invented the concept of ownership as a solution to memory management issues without resorting to something slower like Garbage Collection or Reference Counting. But, when you don’t need to be quite as low level, it gives you utilities such as Rc, Arc and Cow to do reference counting and “clone-on-write” in your code. And, when you need to go lower-level still, you can use the unsafe system and access raw C pointers.

Rust also has a bunch of awe­some fea­tures from func­tional lan­guages like tagged enums, match ex­pres­sions, first class func­tions and a pow­er­ful type sys­tem with gener­ics.

Rust has an LLVM-based com­piler which lets it com­pile to na­tive code and WASM.

I’ve also been do­ing a bit of Swift pro­gram­ming for a cou­ple of years now. And the more I learn Rust, the more I see a re­flec­tion of Swift. (I know that Swift stole a lot of ideas from Rust, I’m talk­ing about my own per­spec­tive here).

Swift, too, has awe­some fea­tures from func­tional lan­guages like tagged enums, match ex­pres­sions and first-class func­tions. It too has a very pow­er­ful type sys­tem with gener­ics.

Swift too gives you complete type-safety without a garbage collector. By default, everything is a value type with “copy-on-write” semantics. But when you need extra speed you can opt into an ownership system and “move” values to avoid copying. And if you need to go even lower level, you can use the unsafe system and access raw C pointers.

Swift has an LLVM-based com­piler which lets it com­pile to na­tive code and WASM.

You’re probably feeling like you just read the same paragraphs twice. This is no accident. Swift is extremely similar to Rust and has most of the same feature-set. But there is a very big difference in perspective. If you consider the default memory model, this will start to make a lot of sense.

Rust is a low-level sys­tems lan­guage at heart, but it gives you the tools to go higher level. Swift starts at a high level and gives you the abil­ity to go low-level.

The most obvious example of this is the memory management model. Swift uses value types by default with copy-on-write semantics. This is the equivalent of using Cow<> for all your values in Rust. But defaults matter. Rust makes it easy to use “moved” and “borrowed” values but requires extra ceremony to use Cow<> values, as you need to “unwrap” them with .to_mut() to actually use the value within. Swift makes these copy-on-write values easy to use and instead requires extra ceremony for borrowing and moving. Rust is faster by default; Swift is simpler and easier by default.

Swift’s syn­tax is a mas­ter­class in tak­ing awe­some func­tional lan­guage con­cepts and hid­ing them in C-like syn­tax to trick the de­vel­op­ers into ac­cept­ing them.

Consider match state­ments. This is what a match state­ment looks like in Rust:
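
A representative example (illustrative, not the article’s original snippet):

enum Shape {
    Circle(f64),
    Rectangle(f64, f64),
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rectangle(w, h) => w * h,
    }
}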

Here’s how that same code would be writ­ten in Swift:
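
An equivalent Swift version (again illustrative):

enum Shape {
    case circle(Double)
    case rectangle(Double, Double)
}

func area(of shape: Shape) -> Double {
    switch shape {
    case .circle(let r):
        return Double.pi * r * r
    case .rectangle(let w, let h):
        return w * h
    }
}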

Swift doesn’t have a match statement or expression. It has a switch statement that developers are already familiar with. Except this switch statement is actually not a switch statement at all. It’s an expression. It doesn’t “fallthrough”. It does pattern matching. It’s just a match expression with a different name and syntax.

In fact, Swift treats enums as more than just types and lets you put methods directly on them:
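
For example (illustrative):

enum Shape {
    case circle(Double)
    case rectangle(Double, Double)

    func area() -> Double {
        switch self {
        case .circle(let r):
            return Double.pi * r * r
        case .rectangle(let w, let h):
            return w * h
        }
    }
}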

Rust doesn’t have null, but it does have None. Swift has a nil, but it’s really just a None in hiding. Instead of an Option, Swift lets you use T?, but the compiler still forces you to check that the value is not nil before you can use it.

You get the same safety with more con­ve­nience since you can do this in Swift with an op­tional type:

let val: T?
if let val {
    // val is now of type `T`.
}

Also, you’re not forced to wrap every value with a Some(val) be­fore re­turn­ing it. The Swift com­piler takes care of that for you. A T will trans­par­ently be con­verted into a T? when needed.

Rust does­n’t have try-catch. Instead it has a Result type which con­tains the suc­cess and er­ror types.

Swift does­n’t have a try-catch ei­ther, but it does have do-catch and you have to use try be­fore call­ing a func­tion that could throw. Again, this is just de­cep­tion for those de­vel­op­ers com­ing from C-like lan­guages. Swift’s er­ror han­dling works ex­actly like Rust’s be­hind the scenes, but it is hid­den in a clever, fa­mil­iar syn­tax.

func usesErrorThrowingFunction() throws {
    let x = try thisFnCanThrow()
}

func handlesErrors() {
    do {
        let x = try thisFnCanThrow()
    } catch let err {
        // handle the `err` here.
    }
}

This is very similar to how Rust lets you use ? at the end of statements to automatically forward errors, but you don’t have to wrap your success values in Ok().

There are many com­mon prob­lems that Rust’s com­piler will catch at com­pile time and even sug­gest so­lu­tions for you. The ex­am­ple that por­trays this well is self-ref­er­enc­ing enums.

Consider an enum that represents a tree. Since it is a recursive type, Rust will force you to use something like Box<> for referencing a type within itself.
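
For example, a binary tree might be declared like this (an illustrative definition):

enum Tree {
    Leaf(i32),
    Node(Box<Tree>, Box<Tree>),
}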

This makes the problem explicit and forces you to deal with it directly. Swift is a little more automatic.
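
The Swift equivalent of the tree above (illustrative):

indirect enum Tree {
    case leaf(Int)
    case node(Tree, Tree)
}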

Note that you still have to annotate this enum with the indirect keyword to indicate that it is recursive. But once you’ve done that, Swift’s compiler takes care of the rest. You don’t have to think about Box<> or Rc<>. The values just work normally.

Swift was designed to replace Objective-C and needed to be able to interface with existing code. So, it has made a lot of pragmatic choices that make it a much less “pure” and “minimalist” language. Swift is a pretty big language compared to Rust and has many more features built-in. However, Swift is designed with “progressive disclosure” in mind, which means that just as soon as you think you’ve learned the language, a little more of the iceberg pops out of the water.

Here are just some of the lan­guage fea­tures:

Swift is a far eas­ier lan­guage to get started and pro­duc­tive with. The syn­tax is more fa­mil­iar and a lot more is done for you au­to­mat­i­cally. But this re­ally just makes Swift a higher-level lan­guage and it comes with the same trade­offs.

By de­fault, a Rust pro­gram is much faster than a Swift pro­gram. This is be­cause Rust is fast by de­fault, and lets you be slow, while Swift is easy by de­fault and lets you be fast.

Based on this, I would say both lan­guages have their uses. Rust is bet­ter for sys­tems and em­bed­ded pro­gram­ming. It’s bet­ter for writ­ing com­pil­ers and browser en­gines (Servo) and it’s bet­ter for writ­ing en­tire op­er­at­ing sys­tems.

Swift is bet­ter for writ­ing UI and servers and some parts of com­pil­ers and op­er­at­ing sys­tems. Over time I ex­pect to see the over­lap get big­ger.

There is a per­cep­tion that Swift is only a good lan­guage for Apple plat­forms. While this was once true, this is no longer the case and Swift is be­com­ing in­creas­ingly a good cross-plat­form lan­guage. Hell, Swift even com­piles to wasm, and the forks made by the swift-wasm team were merged back into Swift core ear­lier this year.

Swift on Windows is being used by The Browser Company to share code and bring the Arc browser to Windows. Swift on Linux has long been supported by Apple themselves in order to push “Swift on Server”. Apple is directly sponsoring the Swift on Server conference.

This year Embedded Swift was also an­nounced which is al­ready be­ing used on small de­vices like the Panic Playdate.

The Swift website has been highlighting many of these projects:

The Browser Company says that “interoperability is Swift’s super power.”

And the Swift project has been trying to make working with Swift a great experience outside of Xcode, with projects like an open source LSP and funding the VS Code extension.

Compile times are (like Rust) quite bad. There is some amount of fea­ture creep and the lan­guage is larger than it should be. Not all syn­tax feels fa­mil­iar. The pack­age ecosys­tem is­n’t nearly as rich as Rust.

But the “Swift is only for Apple platforms” cliche is old and tired at this point. Swift is already a cross-platform, ABI-stable language with no GC, automatic reference counting, and the option to opt into ownership for even more performance. Swift packages increasingly work on Linux. Foundation was ported to Swift and open sourced. It’s still early days for Swift as a good, more convenient Rust alternative for cross-platform development, but it is here now. It’s no longer a future to wait for.

...

Read the original on nmn.sh »

5 309 shares, 16 trendiness

list animals until failure

You have lim­ited time, but get more time for each an­i­mal listed. When the timer runs out, that’s game over.

No over­lap­ping terms.

For example, if you list “bear” and “polar bear”, you get no point (or time bonus) for the latter. But you can still get a point for a second kind of bear. Order doesn’t matter.

...

Read the original on rose.systems »

6 293 shares, 13 trendiness

Scientist who helped eradicate smallpox dies at age 89

A leader in the global fight against smallpox and a champion of vaccine science, William Foege died last Saturday.

The late physicians and health administrators William Foege (middle), J. Donald Millar (left) and J. Michael Lane (right), all of whom served in the Global Smallpox Eradication Program, in 1980.

William Foege, a leader in the global fight to eliminate smallpox, has died. Foege passed away on Saturday at the age of 89, according to the Task Force for Global Health, a public health organization he co-founded.

Foege headed the U.S. Centers for Disease Control and Prevention’s Smallpox Eradication Program in the 1970s. Before the disease was officially eradicated in 1980, it killed around one in three people who were infected. According to the CDC, there have been no new smallpox cases since 1977.

“If you look at the simple metric of who has saved the most lives, he is right up there with the pantheon,” former CDC director Tom Frieden told the Associated Press. “Smallpox eradication has prevented hundreds of millions of deaths.”

Foege went on to lead the CDC and served as a senior medical adviser and senior fellow at the Bill & Melinda Gates Foundation. In 2012 then president Barack Obama awarded him the Presidential Medal of Freedom.

Foege was a vocal proponent of vaccines for public health, writing with epidemiologist Larry Brilliant in Scientific American in 2013 that the effort to eliminate polio “has never been closer” to success. “By working together,” they wrote, “we will soon relegate polio—alongside smallpox—to the history books.” Polio remains “a candidate for eradication,” according to the World Health Assembly.

And in 2025 Foege, alongside several other former CDC directors, spoke out against the policies of the current secretary of health and human services, Robert F. Kennedy, Jr. In a New York Times op-ed, they wrote that the top health official’s tenure was “unlike anything we had ever seen at the agency.”

In a statement, Task Force for Global Health CEO Patrick O’Carroll remembered Foege as an “inspirational” figure, both for early-career public health workers and veterans of the field. “Whenever he spoke, his vision and compassion would reawaken the optimism that prompted us to choose this field, and re-energize our efforts to make this world a better place,” O’Carroll said.

...

Read the original on www.scientificamerican.com »

7 281 shares, 12 trendiness

In Praise of --dry-run

For the last few months, I have been developing a new reporting application. Early on, I decided to add a --dry-run option to the run command. This turned out to be quite useful — I have used it many times a day while developing and testing the application.

The ap­pli­ca­tion will gen­er­ate a set of re­ports every week­day. It has a loop that checks pe­ri­od­i­cally if it is time to gen­er­ate new re­ports. If so, it will read data from a data­base, ap­ply some logic to cre­ate the re­ports, zip the re­ports, up­load them to an sftp server, check for er­ror re­sponses on the sftp server, parse the er­ror re­sponses, and send out no­ti­fi­ca­tion mails. The files (the gen­er­ated re­ports, and the down­loaded feed­back files) are moved to dif­fer­ent di­rec­to­ries de­pend­ing on the step in the process. A sim­ple and straight­for­ward ap­pli­ca­tion.

Early in the development process, when testing the incomplete application, I remembered that Subversion (the version control system after CVS, before Git) had a --dry-run option. Other Linux commands have this option too. If a command is run with the argument --dry-run, it will print what would happen, but no changes will be made. This lets the user see what will happen if the command is run without the --dry-run argument.

I remembered how helpful that was, so I decided to add it to my command as well. When I run the command with --dry-run, it prints out the steps that will be taken in each phase: which reports will be generated (and which will not be), which files will be zipped and moved, which files will be uploaded to the sftp server, and which files will be downloaded from it (it logs on and lists the files).

Looking back at the project, I realized that I ended up using the --dry-run option pretty much every day.

I am surprised how useful I found it to be. I often used it as a check before getting started. Since I know --dry-run will not change anything, it is safe to run without thinking. I can immediately see that everything is accessible, that the configuration is correct, and that the state is as expected. It is a quick and easy sanity check.

I also used it quite a bit when testing the complete system. For example, if I changed a date in the report state file (the date for the last successful report of a given type), I could immediately see from the output whether it would now be generated or not. Without --dry-run, the actual report would also be generated, which takes some time. So I could test the behavior and receive very quick feedback.

The downside is that the dryRun flag pollutes the code a bit. In all the major phases, I need to check if the flag is set, and only print the action that will be taken, but not actually do it. However, this doesn’t go very deep. For example, none of the code that actually generates the report needs to check it. I only need to check if that code should be invoked in the first place.
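
The pattern in each phase is roughly this (a sketch in TypeScript, since the post doesn’t show the application’s actual code):

// Sketch of the pattern described above: each major phase checks the flag once.
function uploadReports(files: string[], dryRun: boolean): void {
  for (const file of files) {
    if (dryRun) {
      console.log(`[dry-run] would upload ${file} to the SFTP server`);
      continue;
    }
    uploadToSftp(file); // the real work; never reached in a dry run
  }
}

function uploadToSftp(file: string): void {
  // ... actual SFTP upload logic, unaware of the dry-run flag
}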

The type of application I have been writing is ideal for --dry-run. It is invoked by a command, and it may create some changes, for example generating new reports. More reactive applications (that wait for messages before acting) don’t seem to be a good fit.

I added --dry-run on a whim early on in the project. I was surprised at how useful I found it to be. Adding it early was also good, since I got the benefit of it while developing more functionality.

The --dry-run flag is not for every situation, but when it fits, it can be quite useful.

...

Read the original on henrikwarne.com »

8 229 shares, 29 trendiness

Adventure Game Studio


Adventure Game Studio (AGS) is open-source soft­ware for cre­at­ing graph­i­cal point-and-click ad­ven­ture games. It is free, stand­alone, and re­quires no sub­scrip­tion.

The Windows-based IDE streamlines game creation by integrating tools for importing graphics, writing scripts, and testing. Games created with AGS can be played on multiple platforms, including Linux, iOS, and Android.

Suitable for all skill lev­els, AGS fea­tures an ac­tive com­mu­nity for sup­port and so­cial­is­ing.

Showcase your games by up­load­ing them to this web­site.

Rot your brain by con­sum­ing AI slop and ser­vices in this clas­sic ar­cade style game cre­ated for the MAGS January 2026 game jam in the AGS fo­rums. Move […]

You awaken alone on a cold, rocky shore be­neath a moon­less sky, dragged from the sea through a sewer pipe with no mem­ory of who you are, how you […]

A dead man’s soul cries out against the force of a fe­ro­cious bliz­zard. He cries for help. He cries for an­swers. Then he screams as he is torn apart […]

The jury of Los Angeles County District has ruled in fa­vor of four po­lice of­fi­cers ac­cused of abus­ing their power against coloured cit­i­zen Rodney […]



AGS has an ac­tive and friendly com­mu­nity, with many ways of keep­ing in touch and get­ting help with your pro­ject or games made with AGS.

These in­clude our lo­cal fo­rums, Facebook page, Discord server, in-per­son meet-ups, and many more.

The AGS com­mu­nity is run by a team of ded­i­cated vol­un­teers, who put their time and ef­forts into keep­ing it run­ning as a wel­com­ing, friendly and in­for­ma­tive place to be. The AGS server and fo­rums are paid for out of our own pock­ets, so in ef­fect it costs us money to pro­vide a free ser­vice to AGS users.

If you ap­pre­ci­ate the work we do, and would like to give a lit­tle some­thing back, please use the be­low link to do­nate via PayPal. Any profit made af­ter cov­er­ing server costs will be put back into host­ing com­mu­nity events such as Mittens.


...

Read the original on www.adventuregamestudio.co.uk »

9 226 shares, 9 trendiness

Generative AI and Wikipedia editing

Like many or­ga­ni­za­tions, Wiki Education has grap­pled with gen­er­a­tive AI, its im­pacts, op­por­tu­ni­ties, and threats, for sev­eral years. As an or­ga­ni­za­tion that runs large-scale pro­grams to bring new ed­i­tors to Wikipedia (we’re re­spon­si­ble for about 19% of all new ac­tive ed­i­tors on English Wikipedia), we have deep un­der­stand­ing of what chal­lenges face new con­tent con­trib­u­tors to Wikipedia — and how to sup­port them to suc­cess­fully edit. As many peo­ple have be­gun us­ing gen­er­a­tive AI chat­bots like ChatGPT, Gemini, or Claude in their daily lives, it’s un­sur­pris­ing that peo­ple will also con­sider us­ing them to help draft con­tri­bu­tions to Wikipedia. Since Wiki Education’s pro­grams pro­vide a co­hort of con­tent con­trib­u­tors whose work we can eval­u­ate, we’ve looked into how our par­tic­i­pants are us­ing GenAI tools.

We are choos­ing to share our per­spec­tive through this blog post be­cause we hope it will help in­form dis­cus­sions of GenAI-created con­tent on Wikipedia. In an open en­vi­ron­ment like the Wikimedia move­ment, it’s im­por­tant to share what you’ve learned. In this case, we be­lieve our learn­ings can help Wikipedia ed­i­tors who are try­ing to pro­tect the in­tegrity of con­tent on the en­cy­clo­pe­dia, Wikipedians who may be in­ter­ested in us­ing gen­er­a­tive AI tools them­selves, other pro­gram lead­ers glob­ally who are try­ing to on­board new con­trib­u­tors who may be in­ter­ested in us­ing these tools, and the Wikimedia Foundation, whose prod­uct and tech­nol­ogy team builds soft­ware to help sup­port the de­vel­op­ment of high-qual­ity con­tent on Wikipedia.

Our fun­da­men­tal con­clu­sion about gen­er­a­tive AI is: Wikipedia ed­i­tors should never copy and paste the out­put from gen­er­a­tive AI chat­bots like ChatGPT into Wikipedia ar­ti­cles.

Let me ex­plain more.

Since the launch of ChatGPT in November 2022, we’ve been pay­ing close at­ten­tion to GenAI-created con­tent, and how it re­lates to Wikipedia. We’ve spot-checked work of new ed­i­tors from our pro­grams, pri­mar­ily fo­cus­ing on ci­ta­tions to en­sure they were real and not hal­lu­ci­nated. We ex­per­i­mented with tools our­selves, we led video ses­sions about GenAI for our pro­gram par­tic­i­pants, and we closely tracked on-wiki pol­icy dis­cus­sions around GenAI. Currently, English Wikipedia pro­hibits the use of gen­er­a­tive AI to cre­ate im­ages or in talk page dis­cus­sions, and re­cently adopted a guide­line against us­ing large lan­guage mod­els to gen­er­ate new ar­ti­cles.

As our Wiki Experts Brianda Felix and Ian Ramjohn worked with pro­gram par­tic­i­pants through­out the first half of 2025, they found more and more text bear­ing the hall­marks of gen­er­a­tive AI in ar­ti­cle con­tent, like bolded words or bul­leted lists in odd places. But the use of gen­er­a­tive AI was­n’t nec­es­sar­ily prob­lem­atic, as long as the con­tent was ac­cu­rate. Wikipedia’s open edit­ing process en­cour­ages styl­is­tic re­vi­sions to fac­tual text to bet­ter fit Wikipedia’s style.

This finding led us to invest significant staff time into cleaning up these articles — far more than these editors had likely spent creating them. Wiki Education’s core mission is to improve Wikipedia, and when we discover our program has unknowingly contributed to misinformation on Wikipedia, we are committed to cleaning it up. In the clean-up process, Wiki Education staff moved more recent work back to sandboxes, we stub-ified articles that passed notability but mostly failed verification, and we PRODed some articles that from our judgment weren’t salvageable. All these are ways of addressing Wikipedia articles with flaws in their content. (While there are many grumblings about Wikipedia’s deletion processes, we found several of the articles we PRODed due to their fully hallucinated GenAI content were then de-PRODed by other editors, showing the diversity of opinion about generative AI among the Wikipedia community.)

Given what we found through our in­ves­ti­ga­tion into the work from prior terms, and given the in­creas­ing us­age of gen­er­a­tive AI, we wanted to proac­tively ad­dress gen­er­a­tive AI us­age within our pro­grams. Thanks to in-kind sup­port from our friends at Pangram, we be­gan run­ning our par­tic­i­pants’ Wikipedia ed­its, in­clud­ing in their sand­boxes, through Pangram nearly in real time. This is pos­si­ble be­cause of the Dash­board course man­age­ment plat­form Sage built, which tracks ed­its and gen­er­ates tick­ets for our Wiki Experts based on on-wiki ed­its.

We created a brand-new training module on Using generative AI tools with Wikipedia. This training emphasizes where participants could use generative AI tools in their work, and where they should not. The core message of these trainings is: do not copy and paste anything from a GenAI chatbot into Wikipedia.

We crafted a va­ri­ety of au­to­mated emails to par­tic­i­pants who Pangram de­tected were adding text cre­ated by gen­er­a­tive AI chat­bots. Sage also recorded some videos, since many young peo­ple are ac­cus­tomed to learn­ing via video rather than read­ing text. We also pro­vided op­por­tu­ni­ties for en­gage­ment and con­ver­sa­tion with pro­gram par­tic­i­pants.

In to­tal, we had 1,406 AI edit alerts in the sec­ond half of 2025, al­though only 314 of these (or 22%) were in the ar­ti­cle name­space on Wikipedia (meaning ed­its to live ar­ti­cles). In most cases, Pangram de­tected par­tic­i­pants us­ing GenAI in their sand­boxes dur­ing early ex­er­cises, when we ask them to do things like choose an ar­ti­cle, eval­u­ate an ar­ti­cle, cre­ate a bib­li­og­ra­phy, and out­line their con­tri­bu­tion.

Pangram strug­gled with false pos­i­tives in a few sand­box sce­nar­ios:

* Bibliographies, which are of­ten a com­bi­na­tion of hu­man-writ­ten prose (describing a source and its rel­e­vance) and non-prose text (the ci­ta­tion for a source, in some stan­dard for­mat)

* Outlines with a high por­tion of non-prose con­tent (such as bul­let lists, sec­tion head­ers, text frag­ments, and so on)

We also had a hand­ful of cases where sand­boxes were flagged for AI af­ter a par­tic­i­pant copied an AI-written sec­tion from an ex­ist­ing ar­ti­cle to use as a start­ing point to edit or to ex­pand. (This is­n’t a flaw of Pangram, but a re­minder of how much AI-generated con­tent ed­i­tors out­side our pro­grams are adding to Wikipedia!)

In broad strokes, we found that Pangram is great at an­a­lyz­ing plain prose — the kind of sen­tences and para­graphs you’ll find in the body of a Wikipedia ar­ti­cle — but some­times it gets tripped up by for­mat­ting, markup, and non-prose text. Early on, we dis­abled alert emails for par­tic­i­pants’ bib­li­og­ra­phy and out­line ex­er­cises, and through­out the end of 2025, we re­fined the Dashboard’s pre­pro­cess­ing steps to ex­tract the prose por­tions of re­vi­sions and con­vert them to plain text be­fore send­ing them to Pangram.
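
A much-simplified sketch of that kind of preprocessing (illustrative only; the Dashboard’s actual pipeline is more involved):

// Keep only prose-like lines and strip common wiki markup, so the detector
// sees plain sentences rather than headers, bullets, templates, or citations.
function extractProse(wikitext: string): string {
  return wikitext
    .split("\n")
    .filter(line => {
      const t = line.trim();
      return t.length > 0 && !t.startsWith("==") && !t.startsWith("*") &&
             !t.startsWith("{{") && !t.startsWith("{|");
    })
    .map(line => line
      .replace(/<ref[^>]*>[\s\S]*?<\/ref>/g, "")          // inline citations
      .replace(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/g, "$1")   // [[target|text]] -> text
      .replace(/'{2,}/g, ""))                             // bold/italic markup
    .join("\n");
}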

Many participants also reported just using Grammarly to “copy edit.” In our experience, however, the smallest fixes done with Grammarly never trigger Pangram’s detection, but if you use its more advanced content creation features, the resulting text registers as being AI generated.

But over­whelm­ingly, we were pleased with Pangram’s re­sults. Our early in­ter­ven­tions with par­tic­i­pants who were flagged as us­ing gen­er­a­tive AI for ex­er­cises that would not en­ter main­space seemed to head off their fu­ture use of gen­er­a­tive AI. We sup­ported 6,357 new ed­i­tors in fall 2025, and only 217 of them (or 3%) had mul­ti­ple AI alerts. Only 5% of the par­tic­i­pants we sup­ported had main­space AI alerts. That means thou­sands of par­tic­i­pants suc­cess­fully edited Wikipedia with­out us­ing gen­er­a­tive AI to draft their con­tent.

For those who did add GenAI-drafted text, we en­sured that the con­tent was re­verted. In fact, par­tic­i­pants some­times self-re­verted once they re­ceived our email let­ting them know Pangram had de­tected their con­tri­bu­tions as be­ing AI cre­ated. Instructors also jumped in to re­vert, as did some Wikipedians who found the con­tent on their own. Our tick­et­ing sys­tem also alerted our Wiki Expert staff, who re­verted the text as soon as they could.

While some instructors in our Wikipedia Student Program had concerns about AI detection, we had a lot of success focusing the conversation on the concept of verifiability. If the instructor, as subject matter expert, could attest that the information was accurate and could find the specific facts in the cited sources, we permitted the text to come back to Wikipedia. However, the process of attempting to verify student-created work (which in many cases the students swore they'd written themselves) led many instructors to realize what we had found in our own assessment: in their current state, GenAI-powered chatbots cannot write factually accurate, verifiable text for Wikipedia.

We believe our Pangram-based detection interventions led to fewer participants adding GenAI-created content to Wikipedia. Following the trend lines, we had anticipated that about 25% of participants would add GenAI content to Wikipedia articles; instead, only 5% did, and our staff were able to revert all problematic content.

I'm deeply appreciative of everyone who made this success possible this term: the participants who followed our recommendations, Pangram, which gave us access to its detection service, the Wiki Education staff who did the heavy lifting of working through all of the positive detections, and the Wikipedia community, some of whom got to the problematic work from our program participants before we did.

So far, I’ve fo­cused on the prob­lems with gen­er­a­tive AI-created con­tent. But that’s not all these tools can do, and we did find some ways they were use­ful. Our train­ing mod­ule en­cour­ages ed­i­tors — if their in­sti­tu­tion’s poli­cies per­mit it — to con­sider us­ing gen­er­a­tive AI tools for:

* Identifying articles to work on that were relevant to the course they were taking

* Highlighting gaps within existing articles, including missing sections or more recent information

* Finding reliable sources they hadn't already located

* Pointing to which database a certain journal article could be found in

* Evaluating a draft against a checklist of requirements, when prompted with both the drafted text and the checklist

* Identifying categories they could add to the article they'd edited

To evaluate the success of these use scenarios, we worked directly with 7 of the classes we supported in fall 2025 in our Wikipedia Student Program. We asked students to anonymously fill out a survey every time they used generative AI tools in their Wikipedia work: what tool they used, what prompt they used, how they used the output, and whether they found it helpful. Some students filled the survey out multiple times; others filled it out once. In total, we received 102 responses reporting usage at various stages of the project. Overwhelmingly, 87% of the responses that reported using generative AI said it was helpful for the task. The most popular tool by far was ChatGPT, with Grammarly a distant second and the others in the single digits of usage.

Critically, no par­tic­i­pants re­ported us­ing AI tools to draft text for their as­sign­ments. One stu­dent re­ported: I pasted all of my writ­ing from my sand­box and said Put this in a ca­sual, less aca­d­e­mic tone’ … I fig­ured I’d try this but it did­n’t sound like what I nor­mally write and I did­n’t feel that it cap­tured what I was try­ing to get across so I scrapped it.”

While this was an in­for­mal re­search pro­ject, we re­ceived enough pos­i­tive feed­back from it to be­lieve us­ing ChatGPT and other tools can be help­ful in the re­search stage if ed­i­tors then crit­i­cally eval­u­ate the out­put they get, in­stead of blindly ac­cept­ing it. Even par­tic­i­pants who found AI help­ful re­ported that they did­n’t use every­thing it gave them, as some was ir­rel­e­vant. Undoubtedly, it’s cru­cial to main­tain the hu­man think­ing com­po­nent through­out the process.

My con­clu­sion is that, at least as of now, gen­er­a­tive AI-powered chat­bots like ChatGPT should never be used to gen­er­ate text for Wikipedia; too much of it will sim­ply be un­ver­i­fi­able. Our staff would spend far more time at­tempt­ing to ver­ify facts in AI-generated ar­ti­cles than if we’d sim­ply done the re­search and writ­ing our­selves.

That be­ing said, AI tools can be help­ful in the re­search process, es­pe­cially to help iden­tify con­tent gaps or sources, when used in con­junc­tion with a hu­man brain that care­fully eval­u­ates the in­for­ma­tion. Editors should never sim­ply take a chat­bot’s sug­ges­tion; in­stead, if they want to use a chat­bot, they should use it as a brain­storm part­ner to help them think through their plans for an ar­ti­cle.

To date, Wiki Education’s in­ter­ven­tions as our pro­gram par­tic­i­pants edit Wikipedia show promise for keep­ing un­ver­i­fi­able, GenAI-drafted con­tent off Wikipedia. Based on our ex­pe­ri­ences in the fall term, we have high con­fi­dence in Pangram as a de­tec­tor of AI con­tent, at least in Wikipedia ar­ti­cles. We will con­tinue our cur­rent strat­egy in 2026 (with more small ad­just­ments to make the sys­tem as re­li­able as we can).

More gen­er­ally, we found par­tic­i­pants had less AI lit­er­acy than pop­u­lar dis­course might sug­gest. Because of this, we cre­ated a sup­ple­men­tal large lan­guage mod­els train­ing that we’ve of­fered as an op­tional mod­ule for all par­tic­i­pants. Many par­tic­i­pants in­di­cated that they found our guid­ance re­gard­ing AI to be wel­come and help­ful as they at­tempt to nav­i­gate the new com­plex­i­ties cre­ated by AI tools.

We are also looking forward to more research on our work. A team of researchers — Francesco Salvi and Manoel Horta Ribeiro at Princeton University, Robert Cummings at the University of Mississippi, and Wiki Education's Sage Ross — has been looking into how Wiki Education's Wikipedia Student Program editors have used generative AI over time. Preliminary results have backed up our anecdotal understanding, while also revealing nuances of how the text our students produce has changed with the introduction of GenAI chatbots. The research also confirmed our belief in Pangram: after running student edits from 2015 up until the launch of ChatGPT through Pangram, without providing any date information, the team found that Pangram correctly identified all of it as human written. This research will continue into the spring, as the team explores ways of unpacking the effects of AI on different aspects of article quality.

And, of course, gen­er­a­tive AI is a rapidly chang­ing field. Just be­cause these were our find­ings in 2025 does­n’t mean they will hold true through­out 2026. Wiki Education re­mains com­mit­ted to mon­i­tor­ing, eval­u­at­ing, it­er­at­ing, and adapt­ing as needed. Fundamentally, we are com­mit­ted to en­sur­ing we add high qual­ity con­tent to Wikipedia through our pro­grams. And when we miss the mark, we are com­mit­ted to clean­ing up any dam­age.

While I’ve fo­cused this post on what Wiki Education has learned from work­ing with our pro­gram par­tic­i­pants, the lessons are ex­tend­able to oth­ers who are edit­ing Wikipedia. Already, 10% of adults world­wide are us­ing ChatGPT, and draft­ing text is one of the top use cases. As gen­er­a­tive AI us­age pro­lif­er­ates, its us­age by well-mean­ing peo­ple to draft con­tent for Wikipedia will as well. It’s un­likely that long­time, daily Wikipedia ed­i­tors would add con­tent copied and pasted from a GenAI chat­bot with­out ver­i­fy­ing all the in­for­ma­tion is in the sources it cites. But many ca­sual Wikipedia con­trib­u­tors or new ed­i­tors may un­know­ingly add bad con­tent to Wikipedia when us­ing a chat­bot. After all, it pro­vides what looks like ac­cu­rate facts, cited to what are of­ten real, rel­e­vant, re­li­able sources. Most ed­its we ended up re­vert­ing seemed ac­cept­able with a cur­sory re­view; it was only af­ter we at­tempted to ver­ify the in­for­ma­tion that we un­der­stood the prob­lems.

Because this unverifiable content often seems okay at first pass, it's critical for Wikipedia editors to be equipped with tools like Pangram that help them detect when they should take a closer look at an edit. Automating review of text for generative AI usage — as Wikipedians have done for copyright violation text for years — would help protect the integrity of Wikipedia content. In Wiki Education's experience, Pangram is a tool that could provide accurate assessments of text for editors, and we would love to see a larger-scale version of the tool we built to evaluate edits from our programs deployed across all edits on Wikipedia. Currently, editors can add a warning banner indicating that text might be LLM generated, but this is based solely on the assessment of the person adding the banner. Our experience suggests that judging by tone alone isn't enough; instead, tools like Pangram can flag highly problematic information that should be reverted immediately but that might sound okay.
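As a sketch of what a larger-scale version might look like, the snippet below listens to Wikimedia's public EventStreams feed of recent changes and hands mainspace edits to a placeholder review step. The stream URL and event fields are real; the detection and ticketing steps are stand-ins for whatever classifier and queue a production tool would actually use.

```python
# Sketch: watch English Wikipedia's recent changes and queue mainspace
# edits for AI-text review. The EventStreams endpoint and event fields are
# Wikimedia's real public API; flag_for_review() is a placeholder.
import json
import requests

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchanges"

def flag_for_review(change: dict) -> None:
    rev = change.get("revision", {}).get("new")
    print(f"Review candidate: {change['title']} (revision {rev})")

def watch_recent_changes() -> None:
    with requests.get(STREAM_URL, stream=True, timeout=60) as resp:
        for raw in resp.iter_lines():
            if not raw or not raw.startswith(b"data: "):
                continue  # skip SSE comments, event names, and keep-alives
            change = json.loads(raw[len(b"data: "):])
            if change.get("wiki") != "enwiki" or change.get("namespace") != 0:
                continue  # only mainspace edits on English Wikipedia
            # A full pipeline would fetch the added text, run it through an
            # AI-text detector like the one sketched earlier, and open a
            # review ticket only when the classifier fires.
            flag_for_review(change)
```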

We’ve also found suc­cess in the train­ing mod­ules and sup­port we’ve cre­ated for our pro­gram par­tic­i­pants. Providing clear guid­ance — and the rea­son why that guid­ance ex­ists — has been key in help­ing us head off poor us­age of gen­er­a­tive AI text. We en­cour­age Wikipedians to con­sider re­vis­ing guid­ance to new con­trib­u­tors in the wel­come mes­sages to em­pha­size the pit­falls of adding GenAI-drafted text. Software aimed at new con­trib­u­tors cre­ated by the Wikimedia Foundation should cen­ter start­ing with a list of sources and draw­ing in­for­ma­tion from them, us­ing hu­man in­tel­lect, in­stead of gen­er­a­tive AI, to sum­ma­rize in­for­ma­tion. Providing guid­ance up­front can help well-mean­ing con­trib­u­tors steer clear of bad GenAI-created text.

Wikipedia re­cently cel­e­brated its 25th birth­day. For it to sur­vive into the fu­ture, it will need to adapt as tech­nol­ogy around it changes. Wikipedia would be noth­ing with­out its corps of vol­un­teer ed­i­tors. The con­sen­sus-based de­ci­sion-mak­ing model of Wikipedia means change does­n’t come quickly, but we hope this deep-dive will help spark a con­ver­sa­tion about changes that are needed to pro­tect Wikipedia into the fu­ture.

...

Read the original on wikiedu.org »

10 210 shares, 2 trendiness

US Has Investigated Claims WhatsApp Chats Aren’t Private

US law en­force­ment has been in­ves­ti­gat­ing al­le­ga­tions by for­mer Meta Platforms Inc. con­trac­tors that Meta per­son­nel can ac­cess WhatsApp mes­sages, de­spite the com­pa­ny’s state­ments that the chat ser­vice is pri­vate and en­crypted, ac­cord­ing to in­ter­views and an agen­t’s re­port seen by Bloomberg News.

The former contractors' claims — that they and some Meta staff had “unfettered” access to WhatsApp messages — were being examined by special agents with the US Department of Commerce, according to the law enforcement records, as well as a person familiar with the matter and one of the contractors. Similar claims were also the subject of a 2024 whistleblower complaint to the US Securities and Exchange Commission, according to the records and the person, who spoke on the condition that they not be identified out of concern for potential retaliation. The investigation and whistleblower complaint haven't been previously reported.

...

Read the original on www.bloomberg.com »
