10 interesting stories served every morning and every evening.
Another brand backflips and admits that touch-sensitive buttons for frequently used controls were a mistake, but only after a nudge from customers.
Mercedes-Benz joins the growing list of manufacturers listening to customers and admitting that touch-sensitive controls and burying controls in menus were mistakes.
The German brand remains committed to offering large screens in its models, but has listened to its customers and will offer physical buttons for key functions in future.
This approach differs in part from Audi and Volkswagen, which have chosen to reduce the size of their infotainment screens to make room for the returning physical controls.
The upcoming GLC and C-Class will be offered with the 39.1-inch MBUX ‘Hyperscreen’ that covers almost the entire width of the dashboard, but with physical buttons in front of the dual wireless chargers, along with physical buttons and switches returning to the steering wheel.
Speaking to Autocar, Mercedes-Benz sales boss Mathias Geisen said the brand has changed course: “Customers told us two years ago, ‘guys, nice idea, but it just doesn’t work for us’, so we changed that and made it more analogue.”
Physical buttons, switches, and dials will continue to be incorporated into upcoming models, as the brand plans to blend its screen with the required physical controls.
He also explained: “I’m a big believer in screens, because I really believe if you want to connect, you have to make the magic work behind the screen.”
“But in our future products, you will see more hard keys for specific functions that customers want to have direct access to.
“When we do car research clinics, customers are very clear: ‘We love the big screens, but we want to have [hard controls for] specific functionalities.’”
The brand will also offer a customisable wallpaper element for the near metre-wide seamless touchscreen, a choice that its sales boss admits was made because phones are such a huge part of people’s lives and people are used to that level of technology.
“If you want to connect to the customer, you’ve got to find a way to translate this digital experience from your phone to the customer.”
The new-generation GLC SUV will showcase the brand’s new MB.EA electric vehicle platform when it arrives in the fourth quarter of 2026 (October to December); the platform will be shared with the upcoming C-Class, due early next year.
From George Clooney in ER to Noah Wyle in The Pitt, emergency department doctors have long been popular heroes. But will it soon be time to hang up the scrubs?
A groundbreaking Harvard study has found that AI systems outperformed human doctors in high-pressure emergency medicine triage, diagnosing more accurately in the potentially life and death moments when people are first rushed to hospital.
The results were described by independent experts as showing “a genuine step forward” in the clinical reasoning of AIs and came as part of trials that tested the responses of hundreds of doctors against an AI.
The authors said the results, published in the journal Science, showed large language models (LLMs) “have eclipsed most benchmarks of clinical reasoning”.
One experiment focused on 76 patients who arrived at the emergency room of a Boston hospital. An AI and a pair of human doctors were each given the same standard electronic health record to read — typically including vital sign data, demographic information and a few sentences from a nurse about why the patient was there. The AI identified the exact or very close diagnosis in 67% of cases, beating the human doctors, who were right only 50%-55% of the time.
It showed the AIs’ advantage was particularly pronounced in triage circumstances requiring rapid decisions with minimal information. The diagnostic accuracy of the AI — OpenAI’s o1 reasoning model — rose to 82% when more detail was available, compared with the 70-79% accuracy achieved by the expert humans, though this difference was not statistically significant.
It also outperformed a larger cohort of human doctors when asked to provide longer-term treatment plans, such as antibiotic regimens or end-of-life care planning. The AI and 46 doctors were asked to examine five clinical case studies, and the computer made significantly better plans, scoring 89% compared with 34% for humans using conventional resources, such as search engines.
But it is not curtains for emergency doctors yet, the researchers said. The study only tested humans against AIs looking at patient data that can be communicated via text. The AI’s reading of signals, such as the patient’s level of distress and their visual appearance, was not tested. That means the AI was performing more like a clinician producing a second opinion based on paperwork.
“I don’t think our findings mean that AI replaces doctors,” said Arjun Manrai, one of the lead authors of the study who heads an AI lab at Harvard Medical School. “I think it does mean that we’re witnessing a really profound change in technology that will reshape medicine.”
Dr Adam Rodman, another lead author and a doctor at Boston’s Beth Israel Deaconess medical centre where the study took place, said AI LLMs were among “the most impactful technologies in decades”. Over the next decade, he said, AI would not replace physicians but join them in a new “triadic care model … the doctor, the patient, and an artificial intelligence system”.
In one case in the Harvard study, a patient presented with a blood clot in the lungs and worsening symptoms. Human doctors thought the anti-coagulants were failing, but the AI noticed something the humans did not: the patient’s history of lupus meant the disease itself might be causing the lung inflammation. The AI was proved correct.
Nearly one in five US physicians are already using AI to assist diagnosis, according to research published last month. In the UK, 16% of doctors are using the tech daily and a further 15% weekly, with “clinical decision-making” being one of the most common uses, according to a recent Royal College of Physicians survey.
The UK doctors’ biggest concerns were AI error and liability risks. Billions are being invested in AI healthcare companies, but questions remain about the consequences of AI error.
“There is not a formal framework right now for accountability,” said Rodman, who also stressed patients ultimately “want humans to guide them through life or death decisions [and] to guide them through challenging treatment decisions”.
Prof Ewen Harrison, co-director of the University of Edinburgh’s centre for medical informatics, said the study was important and showed that “these systems are no longer just passing medical exams or solving artificial test cases. They are starting to look like useful second-opinion tools for clinicians, particularly when it is important to consider a wider range of possible diagnoses and avoid missing something important.”
Dr Wei Xing, an assistant professor at the University of Sheffield’s school of mathematical and physical sciences, said some of the other findings suggested doctors may unconsciously defer to the AI’s answer rather than thinking independently.
“This tendency could grow more significant as AI becomes more routinely used in clinical settings,” he said. He also highlighted the lack of information about which patients the AI was worse at diagnosing and whether it struggled more with elderly patients or non-English speakers.
He said: “It does not demonstrate that AI is safe for routine clinical use, nor that the public should turn to freely available AI tools as a substitute for medical advice.”
Use Claude Code’s autonomous agent loop with DeepSeek V4 Pro, OpenRouter, or any Anthropic-compatible backend. Same UX, 17x cheaper.
What this does
Claude Code is the best autonomous coding agent — but it costs $200/month with usage caps. DeepSeek V4 Pro scores 96.4% on LiveCodeBench and costs $0.87/M output tokens.
deepclaude swaps the brain while keeping the body:
Your terminal
├── Claude Code CLI (tool loop, file editing, bash, git - unchanged)
└── API calls -> DeepSeek V4 Pro ($0.87/M) instead of Anthropic ($15/M)
Everything works: file reading, editing, bash execution, subagent spawning, autonomous multi-step coding loops. The only difference is which model thinks.
Quick start (2 minutes)
1. Get a DeepSeek API key
Sign up at platform.deepseek.com, add $5 credit, copy your API key.
2. Set environment variables
Windows (PowerShell):
setx DEEPSEEK_API_KEY "sk-your-key-here"
macOS/Linux:
echo 'export DEEPSEEK_API_KEY="sk-your-key-here"' >> ~/.bashrc
source ~/.bashrc
3. Install
Windows:
# Copy the script to a directory in your PATH
Copy-Item deepclaude.ps1 "$env:USERPROFILE\.local\bin\deepclaude.ps1"
# Or add the repo directory to PATH
setx PATH "$env:PATH;C:\path\to\deepclaude"
macOS/Linux:
chmod +x deepclaude.sh
sudo ln -s "$(pwd)/deepclaude.sh" /usr/local/bin/deepclaude
4. Use it
deepclaude                       # Launch Claude Code with DeepSeek V4 Pro
deepclaude --status              # Show available backends and keys
deepclaude --backend or          # Use OpenRouter (cheapest, $0.44/M input)
deepclaude --backend fw          # Use Fireworks AI (fastest, US servers)
deepclaude --backend anthropic   # Normal Claude Code (when you need Opus)
deepclaude --cost                # Show pricing comparison
deepclaude --benchmark           # Latency test across all providers
deepclaude --switch ds           # Switch backend mid-session (no restart)
How it works
Claude Code reads these environment variables to determine where to send API calls:
deepclaude sets these per-session (not permanently), launches Claude Code, then restores your original settings on exit.
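The save/override/restore pattern can be sketched like this. The `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` variable names and the DeepSeek endpoint URL are assumptions for illustration, not taken from the deepclaude source:

```shell
# run_with_deepseek: run a command with the Anthropic endpoint variables
# temporarily pointed at DeepSeek, then restore the previous values.
# Variable names and endpoint URL are assumptions, not confirmed from the repo.
run_with_deepseek() {
  _saved_url="${ANTHROPIC_BASE_URL:-}"
  _saved_tok="${ANTHROPIC_AUTH_TOKEN:-}"
  export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
  export ANTHROPIC_AUTH_TOKEN="${DEEPSEEK_API_KEY:-}"
  "$@"; _status=$?
  # Restore the original settings on exit
  export ANTHROPIC_BASE_URL="$_saved_url"
  export ANTHROPIC_AUTH_TOKEN="$_saved_tok"
  return $_status
}

# e.g. run_with_deepseek claude
```

A real implementation also has to distinguish "previously unset" from "previously empty" (restore-by-unset rather than restore-to-empty), which this sketch glosses over.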
Supported backends
Setup per backend
DeepSeek (default - just needs DEEPSEEK_API_KEY):
setx DEEPSEEK_API_KEY "sk-…"      # Windows
export DEEPSEEK_API_KEY="sk-…"    # macOS/Linux
OpenRouter (optional):
setx OPENROUTER_API_KEY "sk-or-…"      # Windows
export OPENROUTER_API_KEY="sk-or-…"    # macOS/Linux
Fireworks AI (optional):
setx FIREWORKS_API_KEY "fw_…"      # Windows
export FIREWORKS_API_KEY="fw_…"    # macOS/Linux
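A minimal sketch of what a key check like `deepclaude --status` could do: report which backend keys are present in the environment. The output wording here is an assumption, not the tool's actual format:

```shell
# backend_status: print, for each backend, whether its API key is set.
# Backend names mirror the setup steps above; output wording is invented.
backend_status() {
  for pair in "deepseek=DEEPSEEK_API_KEY" "openrouter=OPENROUTER_API_KEY" "fireworks=FIREWORKS_API_KEY"; do
    name="${pair%%=*}"
    var="${pair#*=}"
    eval "_val=\${$var:-}"
    if [ -n "$_val" ]; then
      echo "$name: key set"
    else
      echo "$name: key missing"
    fi
  done
}

backend_status
```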
Cost comparison
DeepSeek’s automatic context caching makes agent loops extremely cheap - after the first request, the system prompt and file context are cached at $0.004/M (vs $0.44/M uncached).
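To make the caching numbers concrete, here is a back-of-envelope calculation. Only the two per-million prices come from the paragraph above; the 2M tokens of repeated context per session is a made-up figure for illustration:

```shell
# Cost of re-sending ~2M tokens of system prompt + file context per session,
# uncached vs cached (prices from above; the 2M figure is illustrative).
uncached=$(awk 'BEGIN { printf "%.3f", 2 * 0.44 }')    # $0.44/M uncached input
cached=$(awk 'BEGIN { printf "%.3f", 2 * 0.004 }')     # $0.004/M on cache hits
ratio=$(awk 'BEGIN { printf "%.0f", 0.44 / 0.004 }')   # per-token price ratio
echo "uncached: \$$uncached  cached: \$$cached  (${ratio}x cheaper)"
```

On those assumptions, cache hits turn an $0.88 context re-send into less than a cent, which is why long agent loops stay cheap after the first request.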
What works and what doesn’t
Works
File reading, writing, editing (Read/Write/Edit tools)
Bash/PowerShell execution
Glob and Grep search
Multi-step autonomous tool loops
Subagent spawning
Git operations
Project initialization (/init)
Thinking mode (enabled by default)
Doesn’t work or degraded
Intelligence difference
Routine tasks (80% of work): DeepSeek V4 Pro is comparable to Claude Opus
Complex reasoning (20%): Claude Opus is stronger - switch with --backend anthropic
Live switching (no restart)
Switch between Anthropic and DeepSeek mid-session - from inside Claude Code itself. No restart, no terminal commands. Just type a slash command.
In Claude Code terminal:
In Claude Code VS Code extension:
How it works
The proxy runs on localhost:3200 and intercepts all API calls. A control endpoint (/_proxy/mode) lets you switch the active backend instantly:
Claude Code -> localhost:3200 (proxy)
  |
  ├── /_proxy/mode   POST -> switch backend
  ├── /_proxy/status GET  -> current backend + uptime
  ├── /_proxy/cost   GET  -> token usage + cost savings
  |
  ├── /v1/messages    -> active backend (DeepSeek/OpenRouter/Anthropic)
  └── everything else -> Anthropic (passthrough)
Option 1: Slash commands (recommended)
Add these files to ~/.claude/commands/:
deepseek.md:
Switch the model proxy to DeepSeek. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=deepseek"
If successful, say: “Switched to DeepSeek.”
anthropic.md:
Switch the model proxy back to Anthropic. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=anthropic"
If successful, say: “Switched to Anthropic.”
openrouter.md:
Switch the model proxy to OpenRouter. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=openrouter"
If successful, say: “Switched to OpenRouter.”
Then type /deepseek, /anthropic, or /openrouter in any Claude Code session to switch instantly.
Option 2: CLI flag
deepclaude --switch deepseek # or: ds, or, fw, anthropic
deepclaude -s anthropic
Option 3: VS Code keyboard shortcuts
Add to .vscode/tasks.json (a minimal sketch; the task labels are illustrative):
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "deepclaude: DeepSeek",
      "type": "shell",
      "command": "curl -sX POST http://127.0.0.1:3200/_proxy/mode -d backend=deepseek"
    },
    {
      "label": "deepclaude: Anthropic",
      "type": "shell",
      "command": "curl -sX POST http://127.0.0.1:3200/_proxy/mode -d backend=anthropic"
    }
  ]
}
Bind the tasks to keys via keybindings.json ("workbench.action.tasks.runTask") for true one-keystroke switching.
“AI does the coding, and the human in the loop is the orchestrator”
This is the sentiment being hyped up around the industry currently: traditional coding is all but dead, and Spec Driven Development (SDD) is the future. You generate a plan, and disconnect from writing any code. The agents know better, and handle all the implementation. You are there as the expert, to provide “good taste”, review the outputs, and constantly steer the agent(s) to execute the plan that you meticulously put together.
The workflow takes many shapes at this point, but in general, it is a process where someone defines the project’s requirements (simultaneously at a micro and macro level), generates a plan, and then pulls the slot machine lever over and over, iterating and reiterating with often multiple agent instances until it’s done. All the while, putting a growing distance between the “orchestrator” and the code that is being generated and committed.
Coding agents are helpful and powerful, but there are already some quantifiable trade-offs that need to be discussed:
An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI’s non-determinism.
Atrophying skills for a wide swath of the population.
Vendor lock-in for individuals and entire teams (Claude Code outages have already brought entire teams to a standstill).
Fluctuating and increasing costs to access the tools. An employee’s cost is fixed; tokens are a constantly moving target.
Being successful with this approach to coding agents hinges on a rather crucial element: only a skilled developer who’s thinking critically, and comfortable operating at the architectural level, can spot issues in the thousands of lines of generated code, before they become a problem.
Yet, in an ironic twist of fate, it’s the individual’s critical thinking skills and cognitive clarity that AI tooling has now been proven to impact negatively.
Not Just Another “Abstraction”
A common refrain we hear in the community is that programmers are just “moving up the stack” and into a different type of abstraction. Whether or not these tools are really an abstraction layer in the first place is not a settled matter; a higher level of ambiguity is not a higher level of abstraction.
If we put that to the side though, it is true that programmers tend to be wary of new languages and new ways of programming. When FORTRAN was released, programmers were skeptical of it, too. They had similar claims: it was likely to introduce more bugs and instability, and writing assembly directly was more efficient. Later, there would be discourse around the integration of compilers introducing too much “magic” into the process. These were normative arguments around a fear of what might be lost if these new technologies were embraced.
The difference with what is happening today is that those previous fears were speculative and theoretical. In just the short few years that AI tooling has existed, we are already seeing significant impacts. Those affected aren’t just junior developers, but even those with a decade (or more) of experience.
Junior developers are faced with an even steeper climb, as we truncate their ability to work with code and replace it with reviewing generated code. Reviewing code is important, but it’s only 50% of the learning process, at best. Without the friction and challenges that come with working with code directly, their ability to learn is seriously diminished.
Studying this phenomenon takes time, so anecdotal evidence is important to gather to get a real-time view of the situation. But it has also been studied, and there are numerous reports reinforcing that this is a real phenomenon.
It actually is different this time.
When a C++ developer moved to Java or Python, they didn’t complain of brain fog. When a sysadmin moved to AWS, they didn’t feel like they were losing their ability to understand networking.
A Senior Engineer losing their coding edge and becoming “rusty” over time as they move into managerial roles and practice coding less is not a new phenomenon. This was the natural progression of expertise: an engineer who had decades of coding, friction, and experience logged would have the time and experience to solidify those skills and wisdom. And they could apply that wisdom when their job became less about syntax, and more about higher-level architectural decisions. Those individuals are not only exceedingly rare, but you won’t get the next wave of seniors if we’re all abdicating the friction of writing, problem-solving, and debugging.
What is happening right now is a trend where developers, who’ve never had that longevity or the 30+ years of friction that led to that deep understanding, are being moved into higher-level workflows requiring the same skills to manage the AI agents that the senior engineer took decades to obtain.
However, Senior Engineers aren’t immune, either. Simon Willison, a developer with nearly 30 years of experience, has reported not having a “firm mental model of what the applications can do and how they work, which means each additional feature becomes harder to reason about”.
The “Skilled” Orchestrator Problem
Buried in a recent study by Anthropic was a surprisingly honest moment when speaking about the risks of engaging with coding agents on a regular basis:
One reason that the atrophy of coding skills is concerning is the “paradox of supervision” … effectively using Claude requires supervision, and supervising Claude requires the very coding skills that may atrophy from AI overuse.
Sandor Nyako, Director of Software Engineering at LinkedIn who oversees 50 engineers, has noticed it proliferating throughout the organization and requested his team not to use them for “tasks that require critical thinking or problem-solving.”
“To grow skills, people need to go through hardship. They need to develop the muscle to think through problems,” he said. “How would someone question if AI is accurate if they don’t have critical thinking?”
There is also the question of what constitutes “overuse”. We already have evidence, both data-driven and anecdotal, that these skills can atrophy and dissipate rather quickly (within months in some cases).
This is the contradiction that has many AI boosters talking out of both sides of their mouths: The use of coding agents is actively diminishing the very skills needed to effectively manage the coding agents.
LLMs accelerate the wrong parts.
Contrary to the current narrative that is being espoused, we didn’t necessarily need to write code faster. Especially code we didn’t fully understand, and particularly in huge swaths that we couldn’t review in reasonable time frames.
Before AI, a (good) developer’s priority list might look like:
Understanding of the code and its relation to the codebase
If the code is aligned with the documented and efficient standards
As few lines of code as needed to accomplish the goal (while maintaining readability)
Turnaround time
Agentic coding, and LLMs in general, completely invert this list.
Their capabilities and usage tend to focus on speed by increasing the amount of code that can be generated in a specified time frame. Speed is a natural byproduct of high aptitude. When it’s forced, it always leads to lower accuracy. The integration of these tools doesn’t tend to focus much on deeper understanding or conciseness.
Can they be used that way? Yes, with determination, they certainly can be.
Are they? No, not really; forced mandates and hype around token usage across organizations demonstrate as much.
Coding === Planning
There is a divide between developers that isn’t highlighted as much: Some of us plan, and think, better with code. Thinking and working in code isn’t just meaningless drudgery; it forces you to think about things on a technical level that involves everything from security to performance to user experience to maintainability.
In a recent interview discussing “Spec Driven Development”, Dax, the creator of OpenCode (an open-source coding agent, no less) was quoted saying:
“When working on something new or something challenging, me typing out code is the process by which I figure out what we should even be doing.
I have a really tough time just sitting there, writing out a giant spec on exactly how the feature should work. I like writing out types. I like writing out how some of the functions might play together. I like playing with folder structure to see what the different concepts should be. And this is all stuff that I think most people—most programmers—have always done. I don’t really see a good reason why I would stop that personally, because it’s how I figure out what to do.”
What you say is often not what you mean, and LLMs fill in ambiguity with assumptions (or hallucinations), which leads to: more review, more agent revisions, more tokens burned, and more disconnection from what is being created. Conversely, you can marvel at the most beautiful, unambiguous, perfectly structured prompt you’ve ever written, and the LLM can still output a hallucinated method, because it is fundamentally a next-token-prediction engine, not a compiler. You cannot replace a deterministic system with a probabilistic one and expect zero ambiguity.
Even the most AI-enthusiastic senior developers are starting to see this disconnection as a looming and growing issue.
Vendor Lock-In
When I was browsing LinkedIn during a recent Claude outage, I noticed numerous posts highlighting that certain developers and engineering teams were at a standstill. Their workflows, and their own coding abilities, had already reached a point where they were largely dependent on these vendors. What used to be a skill they could exercise with just a keyboard and a text editor suddenly required a subscription to an AI model provider.
You can’t predict your token cost.
Model providers are heavily subsidized, and the models themselves are built on shifting sands. Every new model release follows the same pattern of high benchmarks, followed by hype, followed by the reality of usage and everyone complaining of them being “nerfed” and burning through 2x-3x as many tokens to get the same job done.
You know how much your employees cost; you have no idea how much your token costs will be day to day, month to month, year to year. If your entire team is using agentic coding as the default, your expense account will need to remain highly nimble. As Primeagen said recently: “when you use these fully agentic workflows, the model providers essentially own you”.
It’s not unreasonable to play this pattern forward, where we could be creating an industry where you need to pay for token consumption to accomplish something that used to be the product of your own critical thinking and problem-solving abilities. This would resemble a type of “vendor lock-in”, but for an entire industry skillset (and I’m sure the model providers are gleefully rubbing their hands in anticipation for that). The financial, and intellectual, rug-pull could come at any moment, and local LLMs are nowhere near ready to scale to absorb that level of usage.
This isn’t theoretical conjecture; it’s being reported on right now. Even the model providers themselves are bringing it to light. Yet another Anthropic study showed a precipitous 47% drop-off in debugging skills:
“Incorporating AI aggressively into the workplace—especially in software engineering—inevitably comes with trade-offs…developers may lean on AI to deliver quick results at the expense of building critical skills—most notably, the ability to debug when things go wrong.”
There’s a way to avoid all of this, of course. LLMs are a powerhouse technological advancement, and when used responsibly, they can be a stellar tool for learning and upskilling. They enable me to dive deeper and wider into concepts and techniques, expanding understanding and enabling exploration of new ideas that used to be more arduous and time consuming to experiment with. This is where I think they will offer the industry the most long-term value.
My Approach: Demote AI’s role
I’m certainly not advocating for typing code out manually. Programmers have always been looking for ways to create code without having to write code. This is why we even have Emmet, autocomplete, and snippets in the first place. Even COBOL was designed to encapsulate more instructions with less writing by using “English-like” words such as MOVE and WRITE. jQuery’s motto was “write less, do more”. LLMs are another addition to this array of code generation tools.
What I am advocating for, though, is leveraging LLMs and coding agents as secondary processes. A way that doesn’t sacrifice the individual’s skills at the altar of productivity. You can flip the script and lean on them to brainstorm the planning parts of the process while staying actively engaged throughout implementation, delegating to them on an as-needed basis. You can leverage the productivity gains, and mitigate the comprehension debt.
My daily workflow:
I use LLMs to help generate specs and plans, while I facilitate the implementation. This is an inversion of the “orchestration” workflow; I am still manually coding anywhere from 20% to 100%, depending on the task.
I very often am writing pseudo-code when I do engage with the models, closing the distance between the request and the generated code.
I use the models as delegation utilities for ad-hoc code generation and interactive documentation, as well as research tools so that I can constantly ask questions, iterate, refactor, and gain clarity around my approaches.
I never generate more than I can review in a sitting. If it’s too much to review, I slow down and split the task up, manually refactoring where needed to ensure a comprehensive understanding of the end result.
I never ask an LLM or agent to implement something that I’ve never done before or couldn’t do on my own, except perhaps purely for educational or tutorial purposes (and often discarded afterwards).
If I had to TL;DR this list, it would be: Use them like the Ship’s Computer, not Data. (any Star Trek fans should get the reference)
I’m not going faster, but I’m doing better quality work.
The productivity gains from these models are real, and so is the friction and understanding that come from engaging with the work on a tangible and frequent basis.
Despite the countless failed attempts at trying to democratize coding while not understanding coding, we’re faced with the reality that you cannot understand code without engaging with it. And it’s become clear that if you don’t keep engaging and writing it, you can lose touch with that understanding, which will in turn make you a less capable orchestrator in the first place, rendering this phase of AI coding a strange and needlessly stressful interlude.
Perhaps I am worrying too much, but history contains lessons.
This all feels similar, though, like another large experiment we’re running on ourselves. We’ve been through a similar period with the introduction of social media without understanding the long-term implications, and we’re now faced with attention deficit (amongst many other issues) on a wide scale.
This time, we’re gambling with something much riskier.
“People who go all in on AI agents now are guaranteeing their obsolescence. If you outsource all your thinking to computers, you stop upskilling, learning, and becoming more competent.” — Jeremy Howard, creator of fast.ai
Terminal User Interfaces (TUIs) are making a comeback. DHH’s Omarchy is made of three types of user interfaces: TUIs, for immediate feedback and bonus geek points; webapps, because 37signals (his company) sells SaaS web applications; and the unavoidable GNOME-style native applications that really do not fit the style of the distro.
The same pattern occurred around 10 years ago in code editors. We came from the native editors of BBEdit, TextMate (also promoted by DHH), Notepad++ and Sublime to Electron-powered apps like Atom, VSCode and all its forks. The hardcore moved to vim or emacs, trading immediate feedback and higher usability for the steepest learning curve I’ve seen.
Windows
The lesson is clear: native applications are losing. Windows has become the standing joke of GUI libraries: when one API fails to catch on, Microsoft makes up another one, just for that one to fail within the sea of alternatives that already exist.
MFC (1992) wrapped Win32 in C++. If Win32 was inelegant, MFC was Win32 wearing a tuxedo made of other tuxedos. Then came OLE. COM. ActiveX. None of these were really GUI frameworks — they were component architectures — but they infected every corner of Windows development and introduced a level of cognitive complexity that makes Kierkegaard read like Hemingway.
— Jeffrey Snover, in Microsoft hasn’t had a coherent GUI strategy since Petzold
Since then, Microsoft has gone through WinForms, WPF, Silverlight, WinUI, and MAUI without success. Many enterprise and personal desktop applications still rely on Electron, and the last memory I have of coherent visual integration across the whole OS is of Windows 98 or 2000.
It turns out that it’s a lot of work to recreate one’s OS and UI APIs every few years. Coupled with the intermittent attempts at sandboxing and deprecating “too powerful” functionality, the result is that each new layer has gaps, where you can’t do certain things which were possible in the previous framework.
— Domenic Denicola, in Windows Native App Development Is a Mess
Linux
The UI inconsistency in Linux exists by design. Different teams wanted different outcomes, and they had the freedom to pursue them. GTK and Qt became the two reigning frameworks. Both aimed to support cross-platform native development — Qt is the better known for it (once upon a time, I successfully compiled gedit on Windows, learning a lot about C compilation, makefiles and environment variables in the process) — but both are only widely used in Linux land. Luckily, applications built with the different toolkits can look okay-ish next to each other, something the various frameworks on Windows fail to achieve. How many engineer-hours does it take to redo the Windows Control Panel?
Given the difficulty of testing the million different combinations of distros, desktop environments and hardware, most companies do not bother with a native Linux application: they either ship Electron (cementing the lock-in), or they let the open-source community solve it themselves (when they have open APIs).
macOS
Apple used to be a one-book religion. Its Human Interface Guidelines were cited by every user-interface course around the world. Xerox PARC and Apple were the two institutions that studied what it means to have a good human interface. Fast forward a few decades, and Apple is doing its level best (or worst) to break all the guidelines and consistency it was known for.
Now, Apple has been ignoring Fitts’s law, making resizing windows near-impossible (even after trying to fix it) and adding icons to every single menu. macOS is no longer the safe haven where designers can work peacefully.
Electron
Everyone knows the user experience of Electron apps sucks. The most common complaint is memory consumption, which to be fair has been decreasing over the last decade; my main complaints (I usually drive a 64 GB MacBook Pro) are the lack of visual consistency and the lack of keyboard-driven workflows. Looking at my dock, I count 8 native apps (TextMate and macOS system utilities) and 6 Electron apps (Slack, Discord, Mattermost, VS Code, Cursor, Plexamp). And that’s from someone who really wishes he could avoid Electron apps entirely.
Take Cursor as an example (the same is true of VS Code). If you are in the agent panel, requesting your next feature, can you move to the agent list in the side panel with just the keyboard? Can you archive an agent? These actions should work the same across every macOS application, and even where shortcuts exist, they are not announced in the menus. Over the last decade, developers have stopped adding menu entries for the actions available inside their applications (mostly because the application is HTML running in its own sandbox). For the record, Slack does this better than the others, but it’s not perfect.
Restarting from scratch
With Dart, Google wanted to design a new operating system for new devices, without all the legacy of Android, together with a fresh UI toolkit (Flutter), but Google gave up on the project before a real product was launched. It’s one of those situations where having a monopoly (or a large enough slice of the market) is required to succeed.
Meanwhile, Zed did the same thing in Rust: they designed their own cross-platform GPU-rendering library (GPUI). Despite the speed, it lacks integration with the host OS out of the box, requiring the developers to add the right bindings. Personally, I would rather have a slow renderer that integrates with my OS than the extra speed.
TUIs
TUIs are fast, easy to automate (RIP Automator) and work reasonably well across operating systems. You can even run them remotely without any headache-inducing X forwarding. When the native UI toolkits fail, we go back to basics. Claude and Codex have been very successful on the command line: you focus on the interaction and forget about the operating system around you. You can even drive code and apps on cloud machines, or remote into your GPU-powered machine from your iPad. TUIs are filling the void left by Apple and Microsoft in the post-apocalyptic world where every application looks different. Which is fine if you are making art (including computer games), but not if your goal is to get out of the user’s way and let them do their job.
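Part of why TUIs travel so well is that the entire “GUI” is just a byte stream of ANSI escape sequences, which any terminal — local or at the far end of an SSH session — interprets the same way. A minimal illustrative sketch (not any particular tool’s code):

```rust
// A TUI "frame" is only a string of escape sequences, so it moves
// over SSH as plain bytes -- no display server protocol needed.

/// Move the cursor to (row, col), both 1-based (ECMA-48 CUP).
fn move_to(row: u16, col: u16) -> String {
    format!("\x1b[{};{}H", row, col)
}

/// Wrap text in bold-on / attributes-reset (SGR 1 / SGR 0).
fn bold(text: &str) -> String {
    format!("\x1b[1m{}\x1b[0m", text)
}

fn main() {
    // Compose one frame: jump to row 2, column 5, print a bold title.
    let frame = format!("{}{}", move_to(2, 5), bold("Hello, TUI"));
    // Printing is all it takes; the terminal does the rendering.
    print!("{}", frame);
}
```

Libraries like ncurses or crossterm add buffering and input handling on top, but at the bottom it is still this: formatted bytes written to a stream.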
What’s next
A checkbox is also part of an interface. You’re using it to interact with a system by inputting data. Interfaces are better the less thinking they require: whether the interface is a steering wheel or an online form, if you have to spend any amount of time figuring out how to use it, that’s bad. As you interact with many things, you want homogeneous interfaces that give you consistent experiences. If you learn that Command + C is the keyboard shortcut for copy, you want that to work everywhere. You don’t want to have to remember to use CTRL + Shift + C in certain circumstances or right-click → copy in others, that’d be annoying.
— John Loeber in Bring Back Idiomatic Design
We need to go back to basics. Every developer should learn the theory of what makes a good user interface (software or not!) — Nielsen, Norman or Johnson — and we should stop treating user interface design as a soft skill that does not matter in the software engineering curriculum. In any course, if the UI does not make sense, the project should fail. And in the HCI course, we should aim for perfect UIs. It takes work, but that work is mostly about understanding what we need; the programming is already being automated.
Operating system and toolkit authors should drive this investment. They should focus on making accessible toolkits that developers want to use, lowering the barrier to entry and making those platforms last as long as possible. I do not necessarily argue for cross-platform support, but having one such solution would help reduce the dependency on Electron and TUIs.
For the first time in twenty-five years I’m sitting in front of a computer where almost every program I touch was designed by me. One tool at a time, the off-the-shelf option got swapped out for something a little closer to how my hands wanted to work. (I wrote about the start of this a couple of weeks ago — that post laid out the early swaps; this one is the view from the other side of the journey.)
It’s been a crazy few weeks guiding Claude Code in between all the other stuff I’m doing in life. I direct CC, and it works while I do other things. I get a second or two between tasks, I respond, and off it goes adding features or hunting bugs.
Two suites in a happy marriage: CHasm, the bedrock — pure x86_64 assembly, no libc, the layer that paints pixels and reads keys. Fe₂O₃, the application layer in Rust, sitting on a small shared TUI library called crust.
The CHasm layer (assembly)
The Fe₂O₃ layer (Rust on crust)
What’s left? WeeChat for IRC and other chats. Firefox — the only GUI program I still use regularly. That’s it. Everything else is mine.
The vim line
Let me get a bit sentimental about vim, because vim was the one I thought I’d never replace.
I started using it in 2001. For twenty-five years, every email I wrote went through vim. Every article. Every blog post. Every line of code, every HyperList, and every book. It was the one tool I would have called part of how I think. The muscle memory was so deep that I’d open random text fields in browsers and end up typing :w.
Then in three days I had scribe and stopped using vim.
The first commit landed at 00:09 on May 1st. By afternoon today (May 3rd) vim was replaced. Twenty-five years of muscle memory rerouted in seventy-two hours.
Vim is wonderful, but scribe is mine. It’s modal like vim, but missing the ninety percent of features I never used, and carrying the handful of writer-shaped tweaks I always wished vim had. Soft-wrap by default. Reading mode with Limelight-style focus. AI in the prompt without leaving the buffer. HyperList editing with full syntax highlighting and the encryption format the Ruby HyperList app uses. Persistent registers shared across concurrent sessions are a cool feature. None of it is revolutionary, but all of it is shaped to my exact workflow. And whenever I think of an enhancement I want, it’s just minutes away. It used to mean waiting months, years or forever for some developer to have the same idea and introduce it into the tool I use.
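I don’t know how scribe actually implements persistent shared registers, but the idea can be as simple as a register file on disk that every running session reads back and rewrites. A hypothetical sketch (the file format and function names are made up for illustration):

```rust
use std::collections::HashMap;
use std::fs;
use std::path::Path;

// Hypothetical: registers persisted as "name<TAB>contents" lines, so
// concurrent editor sessions share yanked text via the filesystem.

fn save_registers(path: &Path, regs: &HashMap<char, String>) -> std::io::Result<()> {
    let mut out = String::new();
    for (name, contents) in regs {
        // One register per line; embedded newlines are escaped.
        // (Naive: real code would also escape backslashes.)
        out.push_str(&format!("{}\t{}\n", name, contents.replace('\n', "\\n")));
    }
    fs::write(path, out)
}

fn load_registers(path: &Path) -> HashMap<char, String> {
    let mut regs = HashMap::new();
    if let Ok(data) = fs::read_to_string(path) {
        for line in data.lines() {
            if let Some((name, contents)) = line.split_once('\t') {
                if let Some(c) = name.chars().next() {
                    regs.insert(c, contents.replace("\\n", "\n"));
                }
            }
        }
    }
    regs
}

fn main() -> std::io::Result<()> {
    let path = Path::new("/tmp/registers.txt");
    let mut regs = HashMap::new();
    regs.insert('a', String::from("yanked text"));
    save_registers(path, &regs)?;
    // A second session simply loads the same file and sees the register.
    let shared = load_registers(path);
    assert_eq!(shared.get(&'a').map(String::as_str), Some("yanked text"));
    Ok(())
}
```

A real implementation would need file locking or atomic rename to avoid concurrent writers clobbering each other, but the essence — registers as shared state on disk rather than per-process memory — fits in a page.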
Why this is possible now
It used to be that writing your own editor, your own file manager, your own window manager, was a project of years. I know, it took me a few years to get RTFM right. A serious undertaking with a serious cost. The economics of it didn’t work for most people, even programmers. You’d touch a piece of it, get most of the way, run out of weekend, and go back to the off-the-shelf tool.
That barrier is much lower now. With Rust, CC as the workhorse, and the fact that the hard problems of TUI programming have been documented to death… the cost of “build the tool you actually want” has fallen by orders of magnitude.
I don’t think this is a story about AI or about Rust specifically. Both helped. But the deeper point is that the gap between “I wish my editor did X” and “okay, here’s an editor that does X” is now small enough to fit inside a few evenings of focused work.
I’m not selling anything
I should say what this post is not.
It’s not an invitation to use my software. Honestly, please don’t. None of it is built for you. It’s built for me — for the way I hold my hands, the way I think about email, the way I want my calendar to render. I’m sure other people would find a hundred sharp edges I’ve never noticed because they happen to align perfectly with what I do.
It’s also not a request for kudos. The code isn’t novel, nor are the ideas. There’s nothing here that hasn’t been done before by someone with more taste, discipline or talent.
What I want to do is show one specific thing: it is now genuinely feasible to make a desktop computing environment that fits one person. Instead of a configuration of someone else’s tools. This is no longer a heroic decade-long undertaking. This is an actual, weekend-by-weekend, “this thing in my life now does exactly what I want” replacement.
The joy of an audience of one
The best part of building for myself: the relief of not having to care.
I don’t have to think about configurability for someone with different preferences. And I don’t have to support corner cases I’d never personally hit. Nor do I have to write documentation for users who don’t exist. No more arguing on issue trackers about whether a default is the right default — of course it’s the right default, it’s the one I want.
The editor’s \? cheatsheet shows the keys I memorised, in the order I prefer, with the bindings I think are sensible. Arrogance? Nope, it’s design without committee. The audience is one person. Decisions take seconds.
It turns out an enormous amount of software complexity comes from accommodating users who aren’t you. Strip that out and what’s left is small, fast, exactly-shaped, and a quiet pleasure to use.
So
If you’ve ever caught yourself thinking “I wish my editor / file manager / status bar / shell just did this one thing differently” and you’ve been told the answer is to write a plugin, learn an obscure config language, or accept the way it is, then consider that the third option is more available than it used to be: Build Your Own Software (BYOS).
You probably won’t replace your whole desktop. I didn’t plan to either. But the satisfaction of having even one tool in your daily workflow that fits you exactly is worth a weekend.
I’m a rabbit in spring :)
We’ve reported previously on the feats of the skull-and-bones community against Denuvo’s DRM. The cat-and-mouse game has essentially come to a head for now, as the pirate crew has “officially” reported that, as of yesterday, there were zero games with Denuvo that haven’t been cracked or bypassed.
This development should be of little surprise to those following this story along, but here’s a quick recap: in late 2025, the MKDev collective and the prolific DenuvOwO came up with a hypervisor-based bypass (HVB) that installs a kernel-level driver to intercept and respond to Denuvo’s checks. While that’s not an actual crack, it’s good enough for piracy work, as the saying goes. Simultaneously, voices38, a well-known cracker, also fully stripped a few choice titles of Denuvo entirely, including recent releases like Resident Evil: Requiem.