10 interesting stories served every morning and every evening.




1 387 shares, 92 trendiness

mitchellh/vouch: A contributor trust management system based on explicit vouches to participate.

A project trust management system. People must be vouched for before interacting with certain parts of a project (the exact parts are configurable for each project to enforce). People can also be explicitly denounced to block them from interacting with the project.

The implementation is generic and can be used by any project on any code forge, but we provide GitHub integration out of the box via GitHub Actions and the CLI.

The vouch list is maintained in a single flat file using a minimal format that can be trivially parsed with standard POSIX tools and any programming language without external libraries.

Vouch lists can also form a web of trust. You can configure Vouch to read other projects' lists of vouched or denounced users. This way, projects with shared values can share their trust decisions with each other and create a larger, more comprehensive web of trust across the ecosystem. Users already proven trustworthy in one project can automatically be assumed trustworthy in another, and so on.

Open source has always worked on a system of trust and verify.

Historically, the effort required to understand a codebase, implement a change, and submit that change for review was high enough that it naturally filtered out many low-quality contributions from unqualified people. For over 20 years of my life, this was enough for my projects, as well as for most others.

Unfortunately, the landscape has changed, particularly with the advent of AI tools that allow people to trivially create plausible-looking but extremely low-quality contributions with little to no true understanding. Contributors can no longer be trusted based on the minimal barrier to entry of simply submitting a change.

But open source still works on trust! And every project has a definite group of trusted individuals (maintainers) and a larger group of probably trusted individuals (active members of the community in any form). So, let's move to an explicit trust model where trusted individuals can vouch for others, and those vouched individuals can then contribute.

Who is vouched or denounced, and how, is left entirely up to the project integrating the system. Likewise, what consequences follow for a vouched or denounced person are fully up to the project. Implement a policy that works for your project and community.

Integrating vouch into a GitHub project is easy with the provided GitHub Actions. By choosing which actions to use, you can fully control how users are vouched and what they can or can't do.

For an example, look at this repository! It fully integrates vouch.

Below is a list of the actions and a brief description of their function. See the linked README in the action directory for full usage details.

The CLI is implemented as a Nushell module and only requires Nushell to run. There are no other external dependencies.

This is Nushell, so you can get help on any command:

use vouch *

help add
help check
help denounce
help gh-check-pr
help gh-manage-by-issue

vouch check

# Preview new file contents (default)
vouch add someuser

# Write the file in-place
vouch add someuser --write

# Preview new file contents (default)
vouch denounce badactor

# With a reason
vouch denounce badactor --reason "Submitted AI slop"

# Write the file in-place
vouch denounce badactor --write

Requires the GITHUB_TOKEN environment variable. If not set and gh is available, the token from gh auth token is used.
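
For illustration, one way to wire that up before running the commands below (this export line is only a hedged sketch, not taken from the vouch docs; it assumes the gh CLI is installed and authenticated):

# Hypothetical example: supply the token explicitly from the gh CLI
export GITHUB_TOKEN="$(gh auth token)"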

# Check PR author status (dry run)
vouch gh-check-pr 123 --repo owner/repo

# Auto-close unvouched PRs (dry run)
vouch gh-check-pr 123 --repo owner/repo --auto-close

# Actually close unvouched PRs
vouch gh-check-pr 123 --repo owner/repo --auto-close --dry-run=false

# Allow unvouched users, only block denounced
vouch gh-check-pr 123 --repo owner/repo --require-vouch=false --auto-close

# Dry run (default)
vouch gh-manage-by-issue 123 456789 --repo owner/repo

# Actually perform the action
vouch gh-manage-by-issue 123 456789 --repo owner/repo --dry-run=false

Responds to comments from collaborators with write access:

* vouch — vouches for the issue author with a reason

Keywords are customizable via --vouch-keyword and --denounce-keyword.

The module also exports a lib submodule for scripting:

use vouch/lib.nu *

let records = open VOUCHED.td

$records | check-user "mitchellh" --default-platform github  # "vouched", "denounced", or "unknown"
$records | add-user "newuser"  # returns updated table
$records | denounce-user "badactor" "reason"  # returns updated table
$records | remove-user "olduser"  # returns updated table

The vouch list is stored in a .td file. See VOUCHED.example.td for an example. The file is looked up at VOUCHED.td or .github/VOUCHED.td by default.

* One handle per line (without @), sorted alphabetically.

* Optionally add details after a space following the handle.

The from td and to td commands are exported by the module, so Nushell's open command works natively with .td files to decode into structured tables and encode back to the file format with comments and whitespace preserved.
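
As a purely illustrative sketch (these handles and details are invented; VOUCHED.example.td in the repository is the authoritative reference), a list following the rules above could be as simple as:

alice
mitchellh long-time maintainer
someuser vouched at the 2024 meetup

And because it is one handle per line, a membership check with standard POSIX tools really is a one-liner:

# Does someuser appear as a handle (start of line, followed by a space or end of line)?
grep -qE '^someuser( |$)' VOUCHED.td && echo listed || echo unknown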

...

Read the original on github.com »

2 368 shares, 42 trendiness

AI fatigue is real and nobody talks about it

You're using AI to be more productive. So why are you more exhausted than ever? The paradox every engineer needs to confront.


I shipped more code last quarter than any quarter in my career. I also felt more drained than any quarter in my career. These two facts are not unrelated.

I build AI agent infrastructure for a living. I'm one of the core maintainers of OpenFGA (CNCF Incubating), I built agentic-authz for agent authorization, I built Distill for context deduplication, I shipped MCP servers. I'm not someone who dabbles with AI on the side. I'm deep in it. I build the tools that other engineers use to make AI agents work in production.

And yet, I hit a wall. The kind of exhaustion that no amount of tooling or workflow optimization could fix.

If you're an engineer who uses AI daily - for design reviews, code generation, debugging, documentation, architecture decisions - and you've noticed that you're somehow more tired than before AI existed, this post is for you. You're not imagining it. You're not weak. You're experiencing something real that the industry is aggressively pretending doesn't exist. And if someone who builds agent infrastructure full-time can burn out on AI, it can happen to anyone.

I want to talk about it honestly. Not the "AI is amazing and here's my workflow" version. The real version. The one where you stare at your screen at 11pm, surrounded by AI-generated code you still need to review, wondering why the tool that was supposed to save you time has consumed your entire day.

Here's the thing that broke my brain for a while: AI genuinely makes individual tasks faster. That's not a lie. What used to take me 3 hours now takes 45 minutes. Drafting a design doc, scaffolding a new service, writing test cases, researching an unfamiliar API. All faster.

But my days got harder. Not easier. Harder.

The reason is simple once you see it, but it took me months to figure out. When each task takes less time, you don't do fewer tasks. You do more tasks. Your capacity appears to expand, so the work expands to fill it. And then some. Your manager sees you shipping faster, so the expectations adjust. You see yourself shipping faster, so your own expectations adjust. The baseline moves.

Before AI, I might spend a full day on one design problem. I'd sketch on paper, think in the shower, go for a walk, come back with clarity. The pace was slow but the cognitive load was manageable. One problem. One day. Deep focus.

Now? I might touch six different problems in a day. Each one "only takes an hour with AI." But context-switching between six problems is brutally expensive for the human brain. The AI doesn't get tired between problems. I do.

This is the paradox: AI reduces the cost of production but increases the cost of coordination, review, and decision-making. And those costs fall entirely on the human.

Before AI, my job was: think about a problem, write code, test it, ship it. I was the creator. The maker. That's what drew most of us to engineering in the first place - the act of building.

After AI, my job increasingly became: prompt, wait, read output, evaluate output, decide if output is correct, decide if output is safe, decide if output matches the architecture, fix the parts that don't, re-prompt, repeat. I became a reviewer. A judge. A quality inspector on an assembly line that never stops.

This is a fundamentally different kind of work. Creating is energizing. Reviewing is draining. There's research on this - the psychological difference between generative tasks and evaluative tasks. Generative work gives you flow states. Evaluative work gives you decision fatigue.

I noticed it first during a week where I was using AI heavily for a new microservice. By Wednesday, I couldn't make simple decisions anymore. What should this function be named? I didn't care. Where should this config live? I didn't care. My brain was full. Not from writing code - from judging code. Hundreds of small judgments, all day, every day.

The cruel irony is that AI-generated code requires more careful review than human-written code. When a colleague writes code, I know their patterns, their strengths, their blind spots. I can skim the parts I trust and focus on the parts I don't. With AI, every line is suspect. The code looks confident. It compiles. It might even pass tests. But it could be subtly wrong in ways that only surface in production, under load, at 3am.

So you read every line. And reading code you didn't write, that was generated by a system that doesn't understand your codebase's history or your team's conventions, is exhausting work.

This is also why I think agent security and authorization matter so much. If we can't review everything AI produces - and we can't, not at scale - then we need systems that constrain what agents can do in the first place. Least-privilege access, scoped tokens, audit trails. The less you have to worry about "did the AI do something dangerous," the more cognitive budget you have for the work that actually matters. This isn't just a security problem. It's a human sustainability problem.

Engineers are trained on determinism. Same input, same output. That's the contract. That's what makes debugging possible. That's what makes reasoning about systems possible.

I had a prompt that worked perfectly on Monday. Generated clean, well-structured code for an API endpoint. I used the same prompt on Tuesday for a similar endpoint. The output was structurally different, used a different error handling pattern, and introduced a dependency I didn't ask for.

Why? No reason. Or rather, no reason I can access. There's no stack trace for "the model decided to go a different direction today." There's no log that says "temperature sampling chose path B instead of path A." It just… happened differently.

For someone whose entire career is built on "if it broke, I can find out why," this is deeply unsettling. Not in a dramatic way. In a slow, grinding, background-anxiety way. You can never fully trust the output. You can never fully relax. Every interaction requires vigilance.

I tried to fight this. I version-controlled my prompts. I built elaborate system messages. I created templates. Some of it helped. None of it solved the fundamental problem: you are collaborating with a probabilistic system, and your brain is wired for deterministic ones. That mismatch is a constant, low-grade source of stress.

This frustration is actually what led me to build Distill - deterministic context deduplication for LLMs. No LLM calls, no embeddings, no probabilistic heuristics. Pure algorithms that clean your context in ~12ms. I wanted at least one part of the AI pipeline to be something I could reason about, debug, and trust. If the model's output is going to be nondeterministic, the least I can do is make sure the input is clean and predictable.

The engineers I've talked to who handle this best are the ones who've made peace with it. They treat AI output like a first draft from a smart but unreliable intern. They expect to rewrite 30% of it. They budget time for that rewriting. They don't get frustrated when the output is wrong because they never expected it to be right. They expected it to be useful. There's a difference.

Take a breath and try to keep up with just the last few months. Claude Code ships sub-agents, then skills, then an Agent SDK, then Claude Cowork. OpenAI launches Codex CLI, then GPT-5.3-Codex - a model that literally helped code itself. New coding agents announce background mode with hundreds of concurrent autonomous sessions. Google drops Gemini CLI. GitHub adds an MCP Registry. Acquisitions happen weekly. Amazon Q Developer gets agentic upgrades. CrewAI, AutoGen, LangGraph, MetaGPT - pick your agent framework, there's a new one every week. Google announces A2A (Agent-to-Agent protocol) to compete with Anthropic's MCP. OpenAI ships its own Swarm framework. Kimi K2.5 drops with agent swarm architecture orchestrating 100 parallel agents. "Vibe coding" becomes a thing. OpenClaw launches a skills marketplace and within one week, researchers find 400+ malicious agent skills uploaded to ClawHub. And somewhere in the middle of all this, someone on LinkedIn posts "if you're not using AI agents with sub-agent orchestration in 2026, you're already obsolete."

That's not a year. That's a few months. And I'm leaving stuff out.

I fell into this trap hard. I was spending weekends evaluating new tools. Reading every changelog. Watching every demo. Trying to stay at the frontier because I was terrified of falling behind.

Here's what that actually looked like: I'd spend Saturday afternoon setting up a new AI coding tool. By Sunday I'd have a basic workflow. By the following Wednesday, someone would post about a different tool that was "way better." I'd feel a pang of anxiety. By the next weekend, I'd be setting up the new thing. The old thing would sit unused. One coding assistant to the next to the next and back to the first one. Each migration cost me a weekend and gave me maybe a 5% improvement that I couldn't even measure properly.

Multiply this by every category - coding assistants, chat interfaces, agent frameworks, multi-agent orchestration platforms, MCP servers, context management tools, prompt libraries, swarm architectures, skills marketplaces - and you get a person who is perpetually learning new tools and never getting deep with any of them. The Hacker News front page alone is enough to give you whiplash. One day it's "Show HN: Autonomous Research Swarm" and the next it's "Ask HN: How will AI swarms coordinate?" Nobody knows. Everyone's building anyway.

The worst part is the knowledge decay. I spent two weeks building a sophisticated prompt engineering workflow in early 2025. Carefully crafted system prompts, few-shot examples, chain-of-thought templates. It worked well. Three months later, the model updated, the prompting best practices shifted, and half my templates produced worse results than a simple one-liner. Those two weeks were gone. Not invested. Spent. The same thing happened with my MCP server setup - I built five custom servers (Dev.to publisher, Apple Notes integration, Python and TypeScript sandboxes, more), then the protocol evolved, then the MCP Registry launched on GitHub and suddenly there were thousands of pre-built ones. Some of my custom work became redundant overnight.

The agent framework churn is even worse. I watched teams go from LangChain to CrewAI to AutoGen to custom orchestration in the span of a year. Each migration meant rewriting integrations, relearning APIs, rebuilding workflows. The people who waited and did nothing often ended up in a better position than the people who adopted early and had to migrate twice.

I've since adopted a different approach. Instead of chasing every new tool, I go deep on the infrastructure layer underneath them. Tools come and go. The problems they solve don't. Context efficiency, agent authorization, audit trails, runtime security - these are durable problems regardless of which framework is trending this month. That's why I built agentic-authz on OpenFGA instead of tying it to any specific agent framework. That's why Distill works at the context level, not the prompt level. Build on the layer that doesn't churn.

I still track the landscape closely - you have to when you're building infrastructure for it. But I track it to understand where the ecosystem is going, not to adopt every new thing. There's a difference between being informed and being reactive.

This one is insidious. You're trying to get AI to generate something specific. The first output is 70% right. So you refine your prompt. The second output is 75% right but broke something the first one had correct. Third attempt: 80% right but now the structure is different. Fourth attempt: you've been at this for 45 minutes and you could have written the thing from scratch in 20.

I call this the prompt spiral. It's the AI equivalent of yak shaving. You started with a clear goal. Thirty minutes later you're debugging your prompt instead of debugging your code. You're optimizing your instructions to a language model instead of solving the actual problem.

The prompt spiral is especially dangerous because it feels productive. You're iterating. You're getting closer. Each attempt is slightly better. But the marginal returns are diminishing fast, and you've lost sight of the fact that the goal was never "get the AI to produce perfect output." The goal was to ship the feature.

I now have a hard rule: three attempts. If the AI doesn't get me to 70% usable in three prompts, I write it myself. No exceptions. This single rule has saved me more time than any prompting technique I've ever learned.

Engineers tend toward perfectionism. We like clean code. We like tests that pass. We like systems that behave predictably. This is a feature, not a bug - it's what makes us good at building reliable software.

AI output is never perfect. It's always "pretty good." 70-80% there. The variable names are slightly off. The error handling is incomplete. The edge cases are ignored. The abstraction is wrong for your codebase. It works, but it's not right.

For a perfectionist, this is torture. Because "almost right" is worse than "completely wrong." Completely wrong, you throw away and start over. Almost right, you spend an hour tweaking. And tweaking AI output is uniquely frustrating because you're fixing someone else's design decisions - decisions that were made by a system that doesn't share your taste, your context, or your standards.

I had to learn to let go. Not of quality - I still care about quality. But of the expectation that AI would produce quality. I now treat every AI output as a rough draft. A starting point. Raw material. I mentally label it "draft" the moment it appears, and that framing change alone reduced my frustration by half.

The engineers who struggle most with AI are often the best engineers. The ones with the highest standards. The ones who notice every imperfection. AI rewards a different skill: the ability to extract value from imperfect output quickly, without getting emotionally invested in making it perfect.

This is the one that scares me most.

I noticed it during a design review meeting. Someone asked me to reason through a concurrency problem on the whiteboard. No laptop. No AI. Just me and a marker. And I struggled. Not because I didn't know the concepts - I did. But because I hadn't exercised that muscle in months. I'd been outsourcing my first-draft thinking to AI for so long that my ability to think from scratch had degraded.

It's like GPS and navigation. Before GPS, you built mental maps. You knew your city. You could reason about routes. After years of GPS, you can't navigate without it. The skill atrophied because you stopped using it.

The same thing is happening with AI and engineering thinking. When you always ask AI first, you stop building the neural pathways that come from struggling with a problem yourself. The struggle is where learning happens. The confusion is where understanding forms. Skip that, and you get faster output but shallower understanding.

I now deliberately spend the first hour of my day without AI. I think on paper. I sketch architectures by hand. I reason through problems the slow way. It feels inefficient. It is inefficient. But it keeps my thinking sharp, and that sharpness pays dividends for the rest of the day when I do use AI - because I can evaluate its output better when my own reasoning is warmed up.

Social media is full of people who seem to have AI figured out. They post their workflows. Their productivity numbers. Their "I built this entire app in 2 hours with AI" threads. And you look at your own experience - the failed prompts, the wasted time, the code you had to rewrite - and you think: what's wrong with me?

Nothing is wrong with you. Those threads are highlight reels. Nobody posts "I spent 3 hours trying to get Claude to understand my database schema and eventually gave up and wrote the migration by hand." Nobody posts "AI-generated code caused a production incident because it silently swallowed an error." Nobody posts "I'm tired."

The comparison trap is amplified by the fact that AI skill is hard to measure. With traditional engineering, you can look at someone's code and roughly gauge their ability. With AI, the output depends on the model, the prompt, the context, the temperature, the phase of the moon. Someone's impressive demo might not reproduce on your machine with your codebase.

I became much more selective about AI content on social media. I still follow the space closely - I have to, it's my job. But I shifted from consuming everyone's hot takes to focusing on people who are actually building and shipping, not just demoing. The ratio of signal to anxiety matters. If a feed is making you feel behind instead of informed, it's not serving you.

I'll be specific about what changed my relationship with AI from adversarial to sustainable.

Time-boxing AI sessions. I don't use AI in an open-ended way anymore. I set a timer. 30 minutes for this task with AI. When the timer goes off, I ship what I have or switch to writing it myself. This prevents the prompt spiral and the perfectionism trap simultaneously.

Separating AI time from thinking time. Morning is for thinking. Afternoon is for AI-assisted execution. This isn't rigid - sometimes I break the rule. But having a default structure means my brain gets both exercise and assistance in the right proportions.

Accepting 70% from AI. I stopped trying to get perfect output. 70% usable is the bar. I'll fix the rest myself. This acceptance was the single biggest reducer of AI-related frustration in my workflow.

Being strategic about the hype cycle. I track the AI landscape because I build infrastructure for it. But I stopped adopting every new tool the week it launches. I use one primary coding assistant and know it deeply. I evaluate new tools when they've proven themselves over months, not days. Staying informed and staying reactive are different things.

Logging where AI helps and where it doesn't. I kept a simple log for two weeks: task, used AI (yes/no), time spent, satisfaction with result. The data was revealing. AI saved me significant time on boilerplate, documentation, and test generation. It cost me time on architecture decisions, complex debugging, and anything requiring deep context about my codebase. Now I know when to reach for it and when not to.

Not reviewing everything AI produces. This was hard to accept. But if you're using AI to generate large amounts of code, you physically cannot review every line with the same rigor. I focus my review energy on the parts that matter most - security boundaries, data handling, error paths - and rely on automated tests and static analysis for the rest. Some roughness in non-critical code is acceptable.

The tech industry has a burnout problem that predates AI. AI is making it worse, not better. Not because AI is bad, but because AI removes the natural speed limits that used to protect us.

Before AI, there was a ceiling on how much you could produce in a day. That ceiling was set by typing speed, thinking speed, the time it takes to look things up. It was frustrating sometimes, but it was also a governor. You couldn't work yourself to death because the work itself imposed limits.

AI removed the governor. Now the only limit is your cognitive endurance. And most people don't know their cognitive limits until they've blown past them.

I burned out in late 2025. Not dramatically - I didn't quit or have a breakdown. I just stopped caring. Code reviews became rubber stamps. Design decisions became "whatever AI suggests." I was going through the motions, producing more than ever, feeling less than ever. It took me a month to realize what had happened and another month to recover.

The recovery wasn't about using less AI. It was about using AI differently. With boundaries. With intention. With the understanding that I am not a machine and I don't need to keep pace with one. Working at Ona helped me see this clearly - when you're building AI agent infrastructure for enterprise customers, you see the human cost of unsustainable AI workflows at scale. The problems aren't just personal. They're systemic. And they need to be solved at the tooling level, not just the individual level.

Ironically, the burnout period is when some of my best work happened. When I stopped trying to use every AI tool and started thinking about what was actually broken, I saw the problems clearly for the first time. Context windows filling up with garbage - that became Distill. Agents with all-or-nothing API key access - that became agentic-authz. The inability to audit what an agent actually did - that's becoming AgentTrace. The fatigue forced me to stop consuming and start building. Not building more features faster, but building the right things deliberately.

Here's what I think the real skill of the AI era is. It's not prompt engineering. It's not knowing which model to use. It's not having the perfect workflow.

It's knowing when to stop.

Knowing when the AI output is good enough. Knowing when to write it yourself. Knowing when to close the laptop. Knowing when the marginal improvement isn't worth the cognitive cost. Knowing that your brain is a finite resource and that protecting it is not laziness - it's engineering.

We optimize our systems for sustainability. We add circuit breakers. We implement backpressure. We design for graceful degradation. We should do the same for ourselves.

AI is the most powerful tool I've ever used. It's also the most draining. Both things are true. The engineers who thrive in this era won't be the ones who use AI the most. They'll be the ones who use it the most wisely.

If you're tired, it's not because you're doing it wrong. It's because this is genuinely hard. The tool is new, the patterns are still forming, and the industry is pretending that more output equals more value. It doesn't. Sustainable output does.

I'm still building in this space every day. Agent authorization, context engineering, audit trails, runtime security - the infrastructure that makes AI agents actually work in production. I'm more committed to AI than ever. But I'm committed on my terms, at my pace, building things that matter instead of chasing things that trend.

Take care of your brain. It's the only one you've got, and no AI can replace it.

If this resonated, I'd love to hear your experience. What does AI fatigue look like for you? Find me on X or LinkedIn, or join the discussion on Hacker News.

I write about AI agent infrastructure, security, context engineering, and the human side of building with AI. You can find all my writing on my writing page.

...

Read the original on siddhantkhare.com »

3 359 shares, 26 trendiness

Open Source — DoNotNotify

We're excited to announce that DoNotNotify has been open sourced. The full source code for the app is now publicly available for anyone to view, study, and contribute to.

You can find the source code on GitHub:

...

Read the original on donotnotify.com »

4 333 shares, 31 trendiness

I Am Happier Writing Code by Hand

I felt the familiar feeling of depression and lethargy creep in while my eyes darted between watching claude-code work and my phone. "What's the point of it all?" I thought. LLMs can generate decent-ish and correct-ish looking code while I have more time to do what? Doomscroll? This was the third time I gave claude-code a try. I felt the same feelings every single time and ended up deleting claude-code after 2-3 weeks, and whaddyouknow? Every. Single. Time. I rediscovered the joy of coding.

Yes, coding is not software engineering, but for me, it is a fun and essential part of it. In order to be effective at software engineering, you must be familiar with the problem space, and this requires thinking and wrestling with the problem. You can't truly know the pain of using an API by just reading its documentation or implementation. You have to use it to experience it. The act of writing code, despite being slower, was a way for me to wrestle with the problem space, a way for me to find out that my initial ideas didn't work, a way of thinking. Vibe coding interfered with that.

If you're thinking without writing, you only think you're thinking.

The other major part of the job is to ensure correctness. For me, it is much harder to verify the correctness of code I didn't write compared to code I wrote. The process of writing code helps internalize the context and makes it easier for my brain to think deeply about it. If I outsource this to an LLM, I skip over the process of internalizing the problem domain and I can't be certain that the generated code is correct.

By design, vibe coding has an addictive nature to it: you write some instructions, and code that looks correct is generated. Bam! Dopamine hit! If the code isn't correct, then it's just one prompt away from being correct, right? right?

Vibe coding also has the profound effect of turning my brain off and leaving me passively accepting changes. When it is time to use my brain, the inertia is much harder to overcome and it is easy to choose the lazy way out. At my lowest point, I even asked it to do a find-and-replace in a file. Something that takes a few seconds now took minutes and a network call.

Even if I generate a 1,000-line PR in 30 minutes, I still need to understand and review it. Since I am responsible for the code I ship, this makes me the bottleneck.

The common view of vibe coding is that it is neither good nor bad, it is a tool. But tools shape your workflow and your thought process, and if a tool prevents you from thinking deeply, I don't think it is a good tool. If you are a knowledge worker, your core competency is your ability to think, and if a tool interferes with that, be afraid, be very afraid.

Now, I would be lying if I said I didn't use LLMs to generate code. I still use Claude, but I do so in a more controlled manner. I copy-paste files that I think are necessary to provide the context, and then I copy-paste code and ask it to make changes to it or write tests for it. This friction has several benefits. I can't make changes that span multiple files, which means the generated diff isn't too large, and if I have to manually change other files I know how the code fits in. Manually giving Claude the context forces me to be familiar with the codebase myself, rather than telling it to just "cook". It turns code generation from a passive action into a deliberate, thoughtful action. It also keeps my brain engaged and active, which means I can still enter the flow state. I have found this to be the best of both worlds and a way to preserve my happiness at work.

Ultimately, life is too short not to optimize for happiness. Maybe (a big maybe) generating entire features would make me more productive, but if it causes existential dread and makes me depressed, I don't see it being productive in the long run. Maybe you relate to some of these feelings. Maybe you don't. But don't be afraid to choose differently.

...

Read the original on abhinavomprakash.com »

5 308 shares, 15 trendiness

localgpt-app/localgpt

A local device focused AI assistant built in Rust — persistent memory, autonomous tasks, ~27MB binary. Inspired by and compatible with OpenClaw.

* Local device focused — runs entirely on your machine, your memory data stays yours

* Autonomous heartbeat — delegate tasks and let it work in the background

# Full install (includes desktop GUI)
cargo install localgpt

# Headless (no desktop GUI — for servers, Docker, CI)
cargo install localgpt --no-default-features

# Initialize configuration
localgpt config init

# Start interactive chat
localgpt chat

# Ask a single question
localgpt ask "What is the meaning of life?"

# Run as a daemon with heartbeat, HTTP API and web UI
localgpt daemon start

LocalGPT uses plain markdown files as its memory:

Files are indexed with SQLite FTS5 for fast keyword search, and sqlite-vec for semantic search with local embeddings.
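
For readers unfamiliar with FTS5, a keyword lookup over such an index looks roughly like the sketch below; the database path, table name, and column layout are invented for illustration and are not LocalGPT's actual schema:

# Hypothetical FTS5 query: find memory files mentioning "heartbeat" and show a short snippet
sqlite3 ~/.localgpt/index.db \
  "SELECT path, snippet(memory_fts, 1, '[', ']', '…', 8)
     FROM memory_fts
    WHERE memory_fts MATCH 'heartbeat'
    ORDER BY rank;"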

[agent]
default_model = "claude-cli/opus"

[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"

[heartbeat]
enabled = true
interval = "30m"
active_hours = { start = "09:00", end = "22:00" }

[memory]
workspace = "~/.localgpt/workspace"

# Chat
localgpt chat  # Interactive chat
localgpt chat --session

When the daemon is running:

Why I Built LocalGPT in 4 Nights — the full story with commit-by-commit breakdown.

...

Read the original on github.com »

6 304 shares, 30 trendiness

(AI) Slop Terrifies Me – ezhik.jp

What if this is as good as software is ever going to be? What if AI stops getting better and what if people stop caring?

Imagine if this is as good as AI gets. If this is where it stops, you'd still have models that can almost code a web browser, almost code a compiler—and can even present a pretty cool demo if allowed to take a few shortcuts. You'd still get models that can kinda-sorta simulate worlds and write kinda-sorta engaging stories. You'd still get self-driving cars that almost work, except when they don't. You get AI that can make you like 90% of a thing!

90% is a lot. Will you care about the last 10%?

I'm terrified of the good enough to ship—and I'm terrified of nobody else caring. I'm less afraid of AI agents writing apps that they will never experience than I am of the AI herders who won't care enough to actually learn what they ship. And I sure as hell am afraid of the people who will experience the slop and will be fine with it.

As a woodworking enthusiast I am slowly making my peace with standing in the middle of an IKEA. But at the rate things are going in this dropshipping hell, IKEA would be the dream. Software temufication stings much more than software commoditization.

I think Claude and friends can help with crafting good software and with learning new technologies and programming languages—though I sure as hell move slower when I stop to learn and understand than the guy playing Dwarf Fortress with 17 agents. But at the same time AI models seem to constantly nudge towards that same median Next-React-Tailwind, good enough app. These things just don't handle going off the beaten path well.

Spend all the tokens you want, trying to make something unique like Paper by FiftyThree with AI tools will just end up looking normal and uninspired.

Mind you, it's not like slop is anything new. A lot of human decisions had to happen before your backside ended up in an extremely uncomfortable chair, your search results got polluted by poorly-written SEO-optimized articles, and your brain had to deal with a ticket booking website with a user interface so poorly designed that it made you cry. So it's a people problem. Incentives just don't seem to align to make good software. Move fast and break things, etc, etc. You'll make a little artisan app, and if it's any good, Google will come along with a free clone, kill you, then kill its clone—and the world will be left with net zero new good software. And now, with AI agents, it gets even worse as agent herders can do the same thing much faster.

Developers aside, there's also the users. AI models can't be imaginative, and the developers can't afford to, but surely with AI tools, the gap between users and developers will be bridged, ChatGPT will become the new HyperCard and people will turn their ideas into reality with just a few sentences? There's so many people out there who are coding without knowing it, from Carol in Accounting making insane Excel spreadsheets to all the kids on TikTok automating their phones with Apple Shortcuts and hacking up cool Notion notebooks.

But what if those people are an aberration? What if this state of tech learned helplessness cannot be fixed? What if people really do just want a glorified little TV in their pocket? What if most people truly just don't care about tech problems, about privacy, about Liquid Glass, about Microsoft's upsells, about constantly dealing with apps and features which just don't work? What if there will be nobody left to carry the torch? What if the future of computing belongs not to artisan developers or Carol from Accounting, but to whoever can churn out the most software the fastest? What if good enough really is good enough for most people?

I'm terrified that our craft will die, and nobody will even care to mourn it.

...

Read the original on ezhik.jp »

7 239 shares, 15 trendiness

The world heard JD Vance being booed at the Olympics. Except for viewers in the US

The modern Olympics sell themselves on a simple premise: the whole world, watching the same moment, at the same time. On Friday night in Milan, that illusion fractured in real time.

When Team USA entered the San Siro during the parade of nations, the speed skater Erin Jackson led the delegation into a wall of cheers. Moments later, when cameras cut to US vice-president JD Vance and second lady Usha Vance, large sections of the crowd responded with boos. Not subtle ones, but audible and sustained ones. Canadian viewers heard them. Journalists seated in the press tribunes in the upper deck, myself included, clearly heard them. But as I quickly realized from a group chat with friends back home, American viewers watching NBC did not.

On its own, the situation might once have passed unnoticed. But the defining feature of the modern sports media landscape is that no single broadcaster controls the moment any more. CBC carried it. The BBC liveblogged it. Fans clipped it. Within minutes, multiple versions of the same happening were circulating online — some with boos, some without — turning what might once have been a routine production call into a case study in information asymmetry.

For its part, NBC has denied editing the crowd audio, although it is difficult to resolve why the boos so audible in the stadium and on other broadcasts were absent for US viewers. But in a broader sense, it is becoming harder, not easier, to curate reality when the rest of the world is holding up its own camera angles. And that raises an uncomfortable question as the United States moves toward hosting two of the largest sporting events on the planet: the 2026 men's World Cup and the 2028 Los Angeles Olympics.

If a US administration figure is booed at the Olympics in Los Angeles, or a World Cup match in New Jersey or Dallas, will American domestic broadcasts simply mute or avoid mentioning the crowd audio? If so, what happens when the world feed, or a foreign broadcaster, shows something else entirely? What happens when 40,000 phones in the stadium upload their own version in real time?

The risk is not just that viewers will see through it. It is that attempts to manage the narrative will make American broadcasters look less credible, not more. Because the audience now assumes there is always another angle. Every time a broadcaster makes that trade — credibility for insulation — it is a trade audiences eventually notice.

There is also a deeper structural pressure behind decisions like this. The Trump era has been defined in part by sustained hostility toward media institutions. Broadcasters do not operate in a vacuum; they operate inside regulatory environments, political climates and corporate risk calculations. When presidents and their allies openly threaten or target networks, it is naive to pretend that has no downstream effect on editorial choices — especially in high-stakes live broadcasts tied to billion-dollar rights deals.

But there is a difference between contextual pressure and visible reality distortion. When global audiences can compare feeds in real time, the latter begins to resemble something else entirely: not editorial judgment, but narrative management. Which is why comparisons to Soviet-style state-controlled broadcasting models — once breathless rhetorical exaggerations — are starting to feel less hyperbolic.

The irony is that the Olympics themselves are built around the idea that sport can exist alongside political tension without pretending it does not exist. The International Olympic Committee's own language — athletes should not be punished for governments' actions — implicitly acknowledges that governments are part of the Olympic theater whether organizers like it or not.

Friday night illustrated that perfectly. American athletes were cheered, their enormous contingent given one of the most full-throated receptions of the night. The political emissaries were not universally welcomed. Both things can be true at once. Crowd dissent is not a failure of the Olympic ideal. In open societies, it is part of how public sentiment is expressed. Attempting to erase one side of that equation risks flattening reality into something audiences no longer trust. And if Milan was a warning shot, Los Angeles is the main event.

Since Donald Trump's first term, American political coverage around sport has fixated on the micro-moments: Was the president booed or cheered? Did the broadcast show it? Did he attend or skip events likely to produce hostile crowds? The discourse has often felt like a Rorschach test, filtered through partisan interpretation and selective clips.

The LA Olympics will be something else entirely. There is no hiding from an opening ceremony for Trump. No ducking a stadium when the Olympic Charter requires the host country's head of state to officially declare the Games open. No controlling how 200 international broadcasters carry the moment.

If Trump is still in the White House on 14 July 2028, one month after his 82nd birthday and in the thick of another heated US presidential campaign, he will stand in front of a global television audience as a key part of the opening ceremony. He will do so in California, in a political environment far less friendly than many domestic sporting venues he has appeared in over the past decade. And he will do it in a city synonymous with the political opposition, potentially in the back yard of the Democratic presidential candidate.

There will be some cheers. There will almost certainly be boos. There will be everything in between. And there will be no way to make them disappear. The real risk for American broadcasters is not that dissent will be visible. It is that audiences will start assuming anything they do not show is being hidden. In an era when trust in institutions is already fragile, that is a dangerous place to operate from.

The Olympics have always been political, whether through boycotts, protests, symbolic gestures or crowd reactions. What has changed is not the politics. It is the impossibility of containing the optics.

Milan may ultimately be remembered as a small moment — a few seconds of crowd noise during a long ceremony. But it also felt like a preview of the next phase of global sport broadcasting: one where narrative control is shared, contested and instantly verifiable. The world is watching. And this time, it is also recording.

...

Read the original on www.theguardian.com »

8 230 shares, 12 trendiness

Beyond agentic coding

I'm generally pretty pro-AI with one major exception: agentic coding. My consistent impression is that agentic coding does not actually improve productivity and deteriorates the user's comfort and familiarity with the codebase. I formed that impression from:

Every time I use agentic coding tools I'm consistently unimpressed with the quality of the results.

I allow interview candidates to use agentic coding tools, and candidates who do so consistently performed worse than other candidates, failing to complete the challenge or producing incorrect results. This was a huge surprise to me at first because I expected agentic coding to confer an unfair advantage but … nope!

Studies like the Becker study and Shen study show that users of agentic coding perform no better and sometimes worse when you measure productivity in terms of fixed outcomes rather than code velocity/volume.

I don't believe agentic coding is a lost cause, but I do believe agentic coding in its present incarnation is doing more harm than good to software development. I also believe it is still worthwhile to push on the inadequacies of agentic coding so that it empowers developers and improves code quality.

However, in this post I'm taking a different tack: I want to present other ways to leverage AI for software development. I believe that agentic coding has so captured the cultural imagination that people are sleeping on other good and underexplored solutions to AI-assisted software development.

I like to design tools and interfaces from first principles rather than reacting to industry trends/hype, and I've accrued quite a few general design principles from over a decade of working in DevProd and also an even longer history of open source projects and contributions.

One of those design principles is my personal "master cue", which is:

A good tool or interface should keep the user in a flow state as long as possible

This principle isn't even specific to AI-assisted software development, and yet it still highlights why agentic coding sometimes misses the mark. Both studies and developer testimonials show that agentic coding breaks flow and keeps developers in an idle/interruptible holding pattern more than ordinary coding.

For example, the Becker study took screen recordings and saw that idle time approximately doubled.

I believe we can improve AI-assisted coding tools (agentic or not) if we set our north star to "preserve flow state".

Calm technology is a design discipline that promotes flow state in tools that we build. The design principles most relevant to coding are:

tools should minimize demands on our attention

Interruptions and intrusions on our attention break us out of flow state.

tools should be built to be "pass-through"

A tool is not meant to be the object of our attention; rather the tool should reveal the true object of our attention (the thing the tool acts upon), rather than obscuring it. The more we use the tool the more the tool fades into the background of our awareness while still supporting our work.

tools should create and enhance calm (thus the name: calm technology)

Engineers already use "calm" tools and interfaces as part of our work and here are a couple of examples you're probably already familiar with:

IDEs (like VSCode) can support inlay hints that sprinkle the code with useful annotations for the reader, such as inferred type annotations:

These types of inlay hints embody calm design principles because:

they minimize demands on our attention

They exist on the periphery of our attention, available for us if we're interested but unobtrusive if we're not interested.

they are built to be "pass-through"

They don't replace or substitute the code that we are editing. They enhance the code editing experience but the user is still in direct contact with the edited code. The more we use type hints the more they fade into the background of our awareness and the more the code remains the focus of our attention.

They promote a sense of calm by informing our understanding of the code passively. As one of the Calm Technology principles puts it: "Technology can communicate, but doesn't need to speak".

Tools like VSCode or GitHub's pull request viewer let you preview at a glance changes to the file tree, like this:

You might think to yourself "this is a very uninteresting thing to use as an example" but that's exactly the point. The best tools (designed with the principles of calm technology) are pervasive and boring things that we take for granted (like light switches) and that have faded so strongly into the background of our attention that we forget they even exist as a part of our daily workflow (also like light switches).

They're there if we need the information, but easy to ignore (or even forget they exist) if we don't use them.

are built to be "pass-through"

When we interact with the file tree viewer we are interacting directly with the filesystem and the interaction between the representation (the viewer) and the reality (the filesystem) feels direct, snappy, and precise. The more we use the viewer the more the representation becomes indistinguishable from the reality in our minds.

We do not need to constantly interact with the file tree to gather up-to-date information about our project structure. It passively updates in the background as we make changes to the project and those updates are unobtrusive and not attention-grabbing.

We can think about the limitations of chat-based agentic coding tools through this same lens:

they place high demands on our attention

The user has to either sit and wait for the agent to report back or do something else and run the LLM in a semi-autonomous manner. However, even semi-autonomous sessions prevent the user from entering flow state because they have to remain interruptible.

they are not built to be "pass-through"

Chat agents are a highly mediated interface to the code which is indirect (we interact more with the agent than the code), slow (we spend a lot of time waiting), and imprecise (English is a dull interface).

The user needs to constantly stimulate the chat to gather new information or update their understanding of the code (the chat agent doesn't inform the user's understanding passively or quietly). Chat agents are also fine-tuned to maximize engagement.

One of the earliest examples of an AI coding assistant that begins to model calm design principles is the OG AI-assistant: GitHub Copilot's support for inline suggestions, with some caveats I'll go into.

This does one thing really well:

it's built to be "pass-through"

The user is still interacting directly with the code and the suggestions are reasonably snappy. The user can also ignore or type through the suggestion.

However, by default these inline suggestions violate other calm technology principles:

By default Copilot presents the suggestions quite frequently and the user has to pause what they're doing to examine the output of the suggestion. After enough times the user begins to condition themselves into regularly pausing and waiting for a suggestion, which breaks them out of a flow state. Now instead of being proactive the user's been conditioned by the tool to be reactive.

GitHub Copilot's inline suggestion interface is visually busy and intrusive. Even if the user ignores every suggestion the effect is still disruptive: suggestions appear on the user's screen in the center of their visual focus and the user has to decide on the spot whether to accept or ignore them before proceeding further. The user also can't easily passively absorb information presented in this way: understanding each suggestion requires the user's focused attention.

… buuuuut these issues are partially fixable by disabling the automatic suggestions and requiring them to be explicitly triggered by Alt + \. However, unfortunately that also disables the next feature, which I like even more:

Next edit suggestions (also from GitHub Copilot)

Next edit suggestions are a related GitHub Copilot feature that display related follow-up edits throughout the file/project and let the user cycle between them and possibly accept each suggested change. They behave like a super-charged "find and replace":

These suggestions do an amazing job of keeping the user in a flow state:

they minimize demand on the user's attention

The cognitive load on the user is smaller than inline suggestions because the suggestions are more likely to be bite-sized (and therefore easier for a human to review and accept).

Just like inline suggestions, next edit suggestions still keep the user in close contact with the code they are modifying.

Suggestions are presented in an unobtrusive way: they aren't dumped in the dead center of the user's attention and they don't demand immediate review. They exist on the periphery of the user's attention as code suggestions that the user can ignore or focus on at their leisure.

I be­lieve there is a lot of un­tapped po­ten­tial in AI-assisted cod­ing tools and in this sec­tion I’ll sketch a few small ex­am­ples of how we can em­body calm tech­nol­ogy de­sign prin­ci­ples in build­ing the next gen­er­a­tion of cod­ing tools.

You could browse a project by a tree of semantic facets. For example, if you were editing the Haskell implementation of Dhall, the tree viewer might look like this prototype I hacked up:

The goal here is to not only provide a quick way to explore the project by intent, but to also improve the user’s understanding of the project the more they use the feature. “String interpolation regression” is so much more informative than dhall/tests/format/issue2078A.dhall.

Also, the above video is based on a real tool and not just a mock. You can find the code I used to generate that tree of semantic facets here, and I’ll write up another post soon walking through how that code works.

You could take an ed­i­tor ses­sion, a diff, or a pull re­quest and au­to­mat­i­cally split it into a se­ries of more fo­cused com­mits that are eas­ier for peo­ple to re­view. This is one of the cases where the AI can re­duce hu­man re­view la­bor (most agen­tic cod­ing tools cre­ate more hu­man re­view la­bor).

There is some prior art here but this is still a nascent area of de­vel­op­ment.

You could add two new tools to the user’s toolbar or context menu: “Focus on…” and “Edit as…”.

“Focus on…” would allow the user to specify what they’re interested in changing and present only files and lines of code related to their specified interest. For example, if they want to focus on “command line options” then only related files and lines of code would be shown in the editor and other lines of code would be hidden/collapsed/folded. This would basically be like “Zen mode” but for editing a feature domain of interest.

“Edit as…” would allow the user to edit the file or selected code as if it were a different programming language or file format. For example, someone who was new to Haskell could edit a Haskell file “as Python” and then after finishing their edits the AI attempts to back-propagate their changes to Haskell. Or someone modifying a command-line parser could edit the file as YAML and be presented with a simplified YAML representation of the command line options which they could modify to add new options.

This is ob­vi­ously not a com­pre­hen­sive list of ideas, but I wrote this to en­cour­age peo­ple to think of more in­no­v­a­tive ways to in­cor­po­rate AI into peo­ple’s work­flows be­sides just build­ing yet an­other chat­bot. I strongly be­lieve that chat is the least in­ter­est­ing in­ter­face to LLMs and AI-assisted soft­ware de­vel­op­ment is no ex­cep­tion to this.

Copyright © 2026 Gabriella Gonzalez. This work is li­censed un­der CC BY-SA 4.0

...

Read the original on haskellforall.com »

9 208 shares, 21 trendiness

Why E cores make Apple silicon fast

If you use an Apple silicon Mac I’m sure you have been impressed by its performance. Whether you’re working with images, audio, video or building software, we’ve enjoyed a new turn of speed since the M1 on day 1. While most attribute this to their Performance cores, as the name suggests, much is in truth the result of the unsung Efficiency cores, and how they keep background tasks where they should be.

To see what I mean, start your Apple sil­i­con Mac up from the cold, and open Activity Monitor in its CPU view, with its CPU History win­dow open as well. For the first five to ten min­utes you’ll see its E cores are a wall of red and green with Spotlight’s in­dex­ing ser­vices, CGPDFService, me­di­a­analy­sisd, BackgroundShortcutRunner, Siri com­po­nents, its ini­tial Time Machine backup, and of­ten an XProtect Remediator scan. Meanwhile its P cores are largely idle, and if you were to dive straight into us­ing your work­ing apps, there’s plenty of ca­pac­ity for them to run un­af­fected by all that back­ground may­hem.

It’s this stage that scares those who are still accustomed to using Intel Macs. Seeing processes using more than 100% CPU is terrifying, because they know that Intel cores can struggle under so much load, affecting user apps. But on an Apple silicon Mac, who notices or cares that there are over a dozen mdworker processes each taking a good 50% CPU simultaneously? After all, this is what the Apple silicon architecture is designed for. Admittedly the impression isn’t helped by a dreadful piece of psychology, as those E cores at 100% are probably running at a frequency a quarter of that of P cores shown at the same 100%, making visual comparison completely misleading.*

This is nothing new. Apple brought it to the iPhone 7 in 2016, in its first SoC with separate P and E cores. That’s an implementation of Arm’s big.LITTLE, announced in 2011 and built on development work at Cray and elsewhere in the previous decade. What makes the difference in Apple silicon Macs is how threads are allocated to the two different CPU core types on the basis of a metric known as Quality of Service, or QoS.

As with so much in today’s Macs, QoS has been around since OS X 10.10 Yosemite, six years before it became so central to performance. When all CPU cores are the same, it has limited usefulness over more traditional controls like POSIX’s nice scheduling priority. All those background tasks still have to be completed, and giving them a lower priority only prolongs the time they take on the CPU cores, and the period in which the user’s apps are competing with them for CPU cycles.

With the experience gained from its iPhones and other devices, Apple’s engineers had a better solution for future Macs. In addition to providing priority-based queues, QoS makes a fundamental distinction between threads run in the foreground and those run in the background. While foreground threads will be run on P cores when they’re available, they can also be scheduled on E cores when necessary. But background threads aren’t normally allowed to run on P cores, even if they’re delayed by the load on the E cores they’re restricted to. We know this from our inability to promote existing background threads to run on P cores using St. Clair Software’s App Tamer or the taskpolicy command-line tool.

This is why, even if you sit and watch all those back­ground processes load­ing the E cores im­me­di­ately af­ter start­ing up, leav­ing the P cores mostly idle, ma­cOS won’t try run­ning them on its P cores. If it did, even if you wanted it to, the dis­tinc­tion be­tween fore­ground and back­ground, P and E cores would start to fall apart, our apps would suf­fer as a con­se­quence, and bat­tery en­durance would de­cline. Gone are the days of crash­ing md­worker processes bring­ing our Macs to their knees with a spin­ning beach­ball every few sec­onds.

If seeing all those processes using high % CPU looks scary, the inevitable consequence in terms of software architecture might seem terrifying. Rather than building monolithic apps, many of their tasks are now broken out into discrete processes run in the background on demand, on the E cores when appropriate. The fact that an idle Mac has over 2,000 threads running in over 600 processes is good news, and the more of those that are run on the E cores, the faster our apps will be. The first and last M-series chips to have only two E cores were the M1 Pro and Max, since when every one has had at least four E cores, and some as many as six or eight.

Because Efficiency cores get the back­ground threads off the cores we need for per­for­mance.

* For the record, I have mea­sured those fre­quen­cies us­ing pow­er­met­rics. For an M4 Pro, for ex­am­ple, high QoS threads run­ning on the P cores ben­e­fit from fre­quen­cies close to the P core max of 4,512 MHz. Low QoS threads run­ning on the E cores are run at fre­quen­cies close to idle, typ­i­cally around 1,050 MHz. However, when the E cores run high QoS threads that have over­flowed from the P cores, the E cores are nor­mally run at around their max­i­mum of 2,592 MHz. By my arith­metic, 1,050 di­vided by 4,512 is 0.233, which is slightly less than a quar­ter. Other M-series chips are sim­i­lar.

...

Read the original on eclecticlight.co »

10 197 shares, 44 trendiness

I put a real-time 3D shader on the Game Boy Color

Check out the code, download the ROMs · Making it work on the Game Boy · The Game Boy has no multiply instruction · All scalars and lookups are 8-bit fractions · How fast is it? · An overall failed attempt at using AI


I made a Game Boy Color game that ren­ders im­ages in real time. The player con­trols an or­bit­ing light and spins an ob­ject.

Before re­ally div­ing into this pro­ject, I ex­per­i­mented with the look in Blender to see if it would even look good. IMO it did, so I went ahead with it!

I experimented with a “pseudo-dither” on the Blender monkey by adding a small random vector to each normal.

It does­n’t re­ally mat­ter what soft­ware I used to pro­duce the nor­mal maps. Blender was the path of least re­sis­tance for me, so I chose that.

For the teapot, I sim­ply put in a teapot, ro­tated a cam­era around it, and ex­ported the nor­mal AOV as a PNG se­quence. Pretty straight-for­ward.

For the spinning Game Boy Color, I wanted to ensure that certain colors were solid, so I used cryptomattes in the compositor to identify specific geometry and write hard-coded values for it in the output.

The geom­e­try in the screen was done by ren­der­ing a sep­a­rate scene, then com­posit­ing it in the fi­nal ren­der us­ing a cryp­to­matte for the screen.

The above an­i­ma­tions are nor­mal map frames that are used to solve the value of each pixel

Normal maps are a core con­cept of this pro­ject. They’re al­ready used every­where in 3D graph­ics.

And indeed, normal map images are secretly a vector field. The reason normal maps tend to have a blue-ish baseline color is that everyone likes to associate XYZ with RGB, and +Z is the forward vector by convention.

In a typ­i­cal 3D work­flow, a nor­mal map is used to en­code the nor­mal vec­tor at any given point on a tex­tured mesh.

The simplest way to shade a 3D object is using the dot product:

N · L

where N is the normal vector, and L is the light position when it points towards the origin (or equivalently: the negative light direction).

Expanded out component-wise, this is:

N · L = Nx·Lx + Ny·Ly + Nz·Lz

When the light vector L is constant for all pixels, it models what most 3D graphics software calls a “distant light”, or a “sun light”.

To speed up com­pu­ta­tion on the Game Boy, I use an al­ter­nate ver­sion of the dot prod­uct, us­ing spher­i­cal co­or­di­nates.

A spherical coordinate is a point represented by a radius r, a primary angle “theta” (θ), and a secondary angle “phi” (φ). This is represented as a tuple:

(r, θ, φ)

The dot product of two spherical coordinates A and B:

A · B = rA · rB · (sin θA · sin θB · cos(φA − φB) + cos θA · cos θB)

Because all normal vectors are unit length, and the light vector is unit length, we can just assume the radius is equal to 1. This simplifies to:

sin θA · sin θB · cos(φA − φB) + cos θA · cos θB

And using the previous variable names, we get the formula:

N · L = sin θN · sin θL · cos(φN − φL) + cos θN · cos θL
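
As a quick numeric sanity check (a throwaway Python sketch of mine, not part of the project), the spherical form above agrees with the ordinary Cartesian dot product for unit vectors, assuming theta is the polar angle and phi the azimuth:

import math

def spherical_to_cartesian(theta, phi):
    # Unit vector from polar angle theta and azimuthal angle phi
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def dot_cartesian(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot_spherical(theta_a, phi_a, theta_b, phi_b):
    return (math.sin(theta_a) * math.sin(theta_b) * math.cos(phi_a - phi_b)
            + math.cos(theta_a) * math.cos(theta_b))

tN, pN = 0.7, 1.2  # arbitrary normal angles
tL, pL = 0.4, 2.5  # arbitrary light angles
print(dot_cartesian(spherical_to_cartesian(tN, pN), spherical_to_cartesian(tL, pL)))
print(dot_spherical(tN, pN, tL, pL))  # prints the same value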

In the ROM, I decided to fix “L-theta” (θL) to a constant value for performance reasons. The player gets to control “L-phi” (φL), creating an orbiting light effect.

This means that we can extract constant coefficients C1 = sin θL and C2 = cos θL and rewrite the formula:

N · L = C1 · sin θN · cos(φN − φL) + C2 · cos θN

The ROM encodes each pixel as a 3-byte tuple of (sin θN, cos θN, φN).

Not only does the SM83 CPU not sup­port mul­ti­pli­ca­tion, but it also does­n’t sup­port floats. That’s a real bum­mer.

We have to get re­ally cre­ative when the en­tire math­e­mat­i­cal foun­da­tion of this pro­ject in­volves mul­ti­ply­ing non-in­te­ger num­bers.

What do we do in­stead? We use log­a­rithms and lookup ta­bles!

Logarithms have this nice property of being able to factor products to outside the log: log(x · y) = log(x) + log(y). This way, we can add values instead!

This re­quires two lookups: a log lookup, and a pow lookup.

In pseudocode, mul­ti­ply­ing 0.3 and 0.5 looks like this:

pow = [ … ]  # A 256-entry lookup table

# float_to_logspace() is compile-time. Accepts -1.0 to +1.0.
# x and y are 8-bit values in log-space
x = float_to_logspace(0.3)
y = float_to_logspace(0.5)

result = pow[x + y]

One limitation of this is that it’s not possible to take the log of a negative number; e.g. log(−0.5) has no real solution.

We can overcome this by encoding a “sign” bit in the MSB of the log-space value. When adding two log-space values together, the sign bit is effectively XOR’d (toggled). We just need to ensure the remaining bits don’t overflow into it. We ensure this by keeping the remaining bits small enough.

The pow lookup ac­counts for this bit and re­turns a pos­i­tive or neg­a­tive re­sult based on it.
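
Here is a small Python sketch of that scheme. The base (0.9) and the zero sentinel are stand-ins I picked for illustration, not the ROM’s actual values, but the add-then-look-up multiply and the sign bit in bit 7 work as described above:

import math

BASE = 0.9  # hypothetical base chosen for this sketch only

def float_to_logspace(x):
    # Encode x in [-1.0, +1.0] as a 7-bit log-space magnitude plus a sign bit in bit 7
    sign = 0x80 if x < 0 else 0x00
    mag = abs(x)
    if mag == 0:
        return sign | 0x7F  # treat the largest exponent as "(near) zero"
    return sign | min(round(math.log(mag, BASE)), 0x7F)

# 256-entry pow table mapping a log-space byte back to a signed fraction
pow_table = []
for byte in range(256):
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = byte & 0x7F
    pow_table.append(0.0 if exponent >= 0x7F else sign * BASE ** exponent)

x = float_to_logspace(0.3)
y = float_to_logspace(-0.5)
# Adding the bytes adds the exponents; the sign bits effectively XOR via the carry,
# as long as the magnitudes stay small enough not to spill into bit 7.
result = pow_table[(x + y) & 0xFF]
print(result)  # roughly -0.15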

It’s ad­van­ta­geous to re­strict num­bers to a sin­gle byte, for both run-time per­for­mance and ROM size. 8-bit frac­tions are pretty ex­treme by to­day’s stan­dards, but be­lieve it or not, it works. It’s lossy as hell, but it works!

All scalars we’re work­ing with are be­tween -1.0 and +1.0.

Addition and mul­ti­pli­ca­tion both use… ad­di­tion!

Consider adding the two bytes: 5 + 10 = 15. Interpreted as fractions with a denominator of 127, that’s 5/127 + 10/127 = 15/127, so adding the bytes adds the fractions.

Why is the de­nom­i­na­tor 127 in­stead of 128? It’s be­cause I needed to rep­re­sent both pos­i­tive and neg­a­tive 1. In a two’s-com­ple­ment en­cod­ing, signed pos­i­tive 128 does­n’t ex­ist.
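
In Python terms (my own illustration of the encoding described above):

def frac_to_byte(x):
    # Encode a fraction in [-1.0, +1.0] as a signed byte in [-127, +127]
    return round(x * 127)

def byte_to_frac(b):
    # Decode a signed byte back into a fraction
    return b / 127

# Adding the bytes adds the fractions, as long as the sum stays in range:
print(byte_to_frac(5 + 10))                   # 15/127 ≈ 0.118
print(frac_to_byte(1.0), frac_to_byte(-1.0))  # 127 -127: both +1.0 and -1.0 are representable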

You might notice that the log-space values cycle and become negative at byte 128. The log-space values use bit 7 of the byte to encode the “sign” bit. As mentioned in the previous section, this is important for toggling the sign during multiplication.

The log-space values also use a deliberately small base, chosen so that adding 3 of these log-space magnitudes won’t overflow into the sign bit (42+42+42 = 126). Bytes 43 through 127 are near 0, so in practice the ROM doesn’t encode those values.

The lookup ta­bles look like this:

Reconstructed functions look like this. The precision error is shown in the jagged “staircase” patterns:

It may look like there’s a lot of er­ror, but it’s fast and it’s pass­able enough to look al­right! ;)

It’s basically a combined cosine-and-multiply lookup. This exists because in practice, cosine is always used with a multiplication.

The core cal­cu­la­tion for the shader is:

And we can rewrite it as:

The procedure processes 15 tiles per frame. It can process more if some of a tile’s rows are empty (all 0), but it’s guaranteed to process at least 15.

Figure: Mesen’s “Event Viewer” window, showing a dot for each iteration (tile row) of the shader’s critical loop.

There’s some in­ten­tional vi­sual tear­ing as well. The im­age it­self is more than 15 tiles, so the ROM ac­tu­ally switches to ren­der­ing dif­fer­ent por­tions of the im­age for each frame. The tear­ing is less no­tice­able be­cause of ghost­ing on the LCD dis­play, so I thought it was ac­cept­able.

A pixel takes about 130 cy­cles, and an empty row’s pixel takes about 3 cy­cles.

At one point I had cal­cu­lated 15 tiles ren­der­ing at ex­actly 123,972 cy­cles, in­clud­ing the call and branch over­head. This is an over­es­ti­mate now, be­cause I since added an op­ti­miza­tion for empty rows.

The Game Boy Color’s CPU runs up to 8.388608 MHz, or roughly 139,810 T-cycles per frame (1/60 of a sec­ond).

About 89% of a frame’s avail­able CPU time goes to ren­der­ing the 15 tiles per frame. The re­main­ing time goes to other func­tion­al­ity like re­spond­ing to user in­put and per­form­ing hard­ware IO.
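
As a quick sanity check of those numbers (a throwaway calculation of mine, not from the post):

cpu_hz = 8_388_608                 # GBC CPU clock quoted above (double-speed mode)
cycles_per_frame = cpu_hz / 60
print(round(cycles_per_frame))     # ≈ 139,810 T-cycles per frame

shader_cycles = 123_972            # earlier estimate for rendering 15 tiles
print(f"{shader_cycles / cycles_per_frame:.0%}")  # ≈ 89% of the frame budget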

Figure: A hex rep­re­sen­ta­tion of the shader sub­rou­tine in­struc­tions in RAM. The blue dig­its show a patch to change sub a, 0 into sub a, 8.

The core shader sub­rou­tine con­tains a hot path that processes about 960 pix­els per frame. It’s re­ally im­por­tant to make this as fast as pos­si­ble!

Self-modifying code is a su­per-ef­fec­tive way to make code fast. But most mod­ern de­vel­op­ers don’t do this any­more, and there are good rea­sons: It’s dif­fi­cult, rarely portable, and it’s hard to do it right with­out in­tro­duc­ing se­ri­ous se­cu­rity vul­ner­a­bil­i­ties. Modern de­vel­op­ers are spoiled by an abun­dance of pro­cess­ing power, su­per-scalar proces­sors that take op­ti­mal paths, and mod­ern JIT (Just-In-Time) run­times that gen­er­ate code on the fly. But we’re on the Game Boy, bay­beee, so we don’t have those op­tions.

If you’re a de­vel­oper who uses higher-level lan­guages like Python and JavaScript, the clos­est equiv­a­lent to self-mod­i­fy­ing code is eval(). Think about how ner­vous eval() makes you feel. That’s al­most ex­actly how na­tive de­vel­op­ers feel about mod­i­fy­ing in­struc­tions.

On the Game Boy’s SM83 proces­sor, it’s faster to add and sub­tract by a hard-coded num­ber than it is to load that num­ber from mem­ory.

unsigned char Ltheta = 8;

// Slower
v = (*in++) - Ltheta;

// Faster
v = (*in++) - 8;

In SM83 as­sem­bly, this looks like:

; Slower: 28 cycles
ld a, [Ltheta]  ; 12 cycles: Read variable "Ltheta" from HRAM
ld b, a         ; 4 cycles: Move value to B register
ld a, [hl+]     ; 8 cycles: Read from the HL pointer
sub a, b        ; 4 cycles: A = A - B

; Faster: 16 cycles
ld a, [hl+]     ; 8 cycles: Read from the HL pointer
sub a, 8        ; 8 cycles: A = A - 8

The faster way shaves off 12 cy­cles. If we’re ren­der­ing 960 pix­els, this saves a to­tal of 11,520 cy­cles. This does­n’t sound like a lot, but it’s roughly 10% of the shader’s run­time!

So how can we get the faster sub­trac­tion if the value we’re sub­tract­ing with changes?

2A     ld a, [hl+]
D6 08  sub a, 8
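
The answer is to patch the instruction itself, as the figure above shows. Here’s a toy Python sketch of the idea (the offsets and helper are hypothetical, not the ROM’s actual code): the shader routine lives in RAM, so before running it we overwrite the immediate operand of the sub a, n instruction (opcode 0xD6) in place.

# Hypothetical layout of the hot-loop bytes copied into RAM:
shader_ram = bytearray([
    0x2A,        # ld a, [hl+]
    0xD6, 0x00,  # sub a, 0   <- the 0x00 operand is the byte we patch
    # ... rest of the routine would follow
])

SUB_OPERAND_OFFSET = 2  # position of the immediate operand within this sketch

def set_ltheta(value):
    # Patch the constant so the hot loop executes "sub a, value"
    shader_ram[SUB_OPERAND_OFFSET] = value & 0xFF

set_ltheta(8)
print(shader_ram.hex(" "))  # 2a d6 08 -> now encodes "sub a, 8"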

“AI Will Be Writing 90% of Code in 3 to 6 Months”

— Dario Amodei, CEO of Anthropic (March 2025 - 9 months ago as of writ­ing)

95% of this pro­ject was made by hand. Large lan­guage mod­els strug­gle to write Game Boy as­sem­bly. I don’t blame them.

Update: 2026-02-03: I at­tempted to use AI to try out the process, mostly be­cause 1) the in­dus­try won’t shut up about AI, and 2) I wanted a grounded opin­ion of it for novel pro­jects, so I have a con­crete and per­sonal ref­er­ence point when talk­ing about it in the wild. At the end of the day, this is still a hob­by­ist pro­ject, so AI re­ally is­n’t the point! But still…

I be­lieve in dis­clos­ing all at­tempts or ac­tual uses of gen­er­a­tive AI out­put, be­cause I think it’s un­eth­i­cal to de­ceive peo­ple about the process of your work. Not do­ing so un­der­mines trust, and amounts to dis­in­for­ma­tion or pla­gia­rism. Disclosure also in­vites peo­ple who have dis­agree­ments to en­gage with the work, which they should be able to. I’m open to feed­back, btw.

I’ll prob­a­bly write some­thing about my ex­pe­ri­ences with AI in the fu­ture.

As far as dis­clo­sures go, I used AI for:

Python: Reading OpenEXR lay­ers, as part of a con­ver­sion script to read nor­mal map data

Python/Blender: Some Python scripts for pop­u­lat­ing Blender scenes, to demo the process in Blender

SM83 as­sem­bly: Snippets for Game Boy Color fea­tures like dou­ble-speed and VRAM DMA. Unsurprising, be­cause these are likely avail­able some­where else.

I at­tempted - and failed - to use AI for:

SM83 as­sem­bly: (Unused) Generating an ini­tial re­vi­sion of the shader code

I’ll also choose to dis­close what I did NOT use AI for:

The al­go­rithms, lookups, all other SM83 as­sem­bly

The soul 🌟 (AI tech­bros are groan­ing right now)

Just to see what it would do, I fed pseudocode into Claude Sonnet 4 (the in­dus­try claims that it’s the best AI model for cod­ing in 2025), and got it to gen­er­ate SM83 as­sem­bly:

It was an in­ter­est­ing process. To start, I chewed Claude’s food and gave it pseudocode, be­cause I had a data for­mat in mind, and I as­sumed it’d strug­gle with a higher-level de­scrip­tion.

...

Read the original on blog.otterstack.com »
