10 interesting stories served every morning and every evening.
Since its launch in 2006, the Wii has seen several operating systems ported to it: Linux, NetBSD, and most recently, Windows NT. Today, Mac OS X joins that list.
In this post, I’ll share how I ported the first version of Mac OS X, 10.0 Cheetah, to the Nintendo Wii. If you’re not an operating systems expert or low-level engineer, you’re in good company; this project was all about learning and navigating countless “unknown unknowns”. Join me as we explore the Wii’s hardware, bootloader development, kernel patching, and writing drivers - and give the PowerPC versions of Mac OS X a new life on the Nintendo Wii.
Visit the wiiMac bootloader repository for instructions on how to try this project yourself.
Before figuring out how to tackle this project, I needed to know whether it would even be possible. According to a 2021 Reddit comment:
> There is a zero percent chance of this ever happening.
Feeling encouraged, I started with the basics: what hardware is in the Wii, and how does it compare to the hardware used in real Macs from the era.
The Wii uses a PowerPC 750CL processor - an evolution of the PowerPC 750CXe that was used in G3 iBooks and some G3 iMacs. Given this close lineage, I felt confident that the CPU wouldn’t be a blocker.
As for RAM, the Wii has a unique configuration: 88 MB total, split across 24 MB of 1T-SRAM (MEM1) and 64 MB of slower GDDR3 SDRAM (MEM2); unconventional, but technically enough for Mac OS X Cheetah, which officially calls for 128 MB of RAM but will unofficially boot with less. To be safe, I used QEMU to boot Cheetah with 64 MB of RAM and verified that there were no issues.
Other hardware I’d eventually need to support included:
* The SD card for booting the rest of the system once the kernel was running
* Video output via a framebuffer that lives in RAM
* The Wii’s USB ports for using a mouse and keyboard
Convinced that the Wii’s hardware wasn’t fundamentally incompatible with Mac OS X, I moved my attention to investigating the software stack I’d be porting.
Mac OS X has an open source core (Darwin, with XNU as the kernel and IOKit as the driver model), with closed-source components layered on top (Quartz, Dock, Finder, system apps and frameworks). In theory, if I could modify the open-source parts enough to get Darwin running, the closed-source parts would run without additional patches.
Porting Mac OS X would also require understanding how a real Mac boots. PowerPC Macs from the early 2000s use Open Firmware as their lowest-level software environment; for simplicity, it can be thought of as the first code that runs when a Mac is powered on. Open Firmware has several responsibilities, including:
* Providing useful functions for I/O, drawing, and hardware communication
* Loading and executing an operating system bootloader from the filesystem
Open Firmware eventually hands off control to BootX, the bootloader for Mac OS X. BootX prepares the system so that it can eventually pass control to the kernel. The responsibilities of BootX include:
* Loading and decoding the XNU kernel, a Mach-O executable, from the root filesystem
Once XNU is running, there are no dependencies on BootX or Open Firmware. XNU goes on to initialize processors, virtual memory, IOKit, and BSD, and eventually continues booting by loading and running other executables from the root filesystem.
The last piece of the puzzle was how to run my own custom code on the Wii - a trivial task thanks to the Wii being “jailbroken”, allowing anyone to run homebrew with full access to the hardware via the Homebrew Channel and BootMii.
Armed with knowledge of how the boot process works on a real Mac, along with how to run low-level code on the Wii, I needed to select an approach for booting Mac OS X on the Wii. I evaluated three options:
Port Open Firmware, use that to run unmodified BootX to boot Mac OS X
Port BootX and modify it to not rely on Open Firmware, use that to boot Mac OS X
Write a custom bootloader that performs the bare-minimum setup to boot Mac OS X
Since Mac OS X doesn’t depend on Open Firmware or BootX once running, spending time porting either of those seemed like an unnecessary distraction. Additionally, both Open Firmware and BootX contain added complexity for supporting many different hardware configurations - complexity that I wouldn’t need since this only needs to run on the Wii. Following in the footsteps of the Wii Linux project, I decided to write my own bootloader from scratch. The bootloader would need to, at a minimum:
* Load the kernel from the SD card
Once the kernel was running, none of the bootloader code would matter. At that point, my focus would shift to patching the kernel and writing drivers.
I decided to base my bootloader on some low-level example code for the Wii called ppcskel. ppcskel puts the system into a sane initial state, and provides useful functions for common things like reading files from the SD card, drawing text to the framebuffer, and logging debug messages to a USB Gecko.
Next, I had to figure out how to load the XNU kernel into memory so that I could pass control to it. The kernel is stored in a special binary format called Mach-O, and needs to be properly decoded before being used.
The Mach-O executable format is well-documented, and can be thought of as a list of load commands that tell the loader where to place different sections of the binary file in memory. For example, a load command might instruct the loader to read the data from file offset 0x2cf000 and store it at the memory address 0x2e0000. After processing all of the kernel’s load commands, we end up with this memory layout:
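The decoding step itself can be sketched in a few dozen lines. This is a self-contained illustration, not the bootloader’s actual source: the struct layouts are the standard 32-bit Mach-O definitions, redeclared here, and `ram` stands in for physical memory so the sketch is testable. (Endianness is elided; the Wii and the kernel file are both big-endian, so a loader running on the Wii can read the fields directly.)

```cpp
#include <cstdint>
#include <cstring>

// Standard 32-bit Mach-O structures, trimmed to the fields used here.
struct mach_header32 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags;
};
struct load_command32 { uint32_t cmd, cmdsize; };
struct segment_command32 {
    uint32_t cmd, cmdsize;
    char     segname[16];
    uint32_t vmaddr, vmsize, fileoff, filesize;
    uint32_t maxprot, initprot, nsects, flags;
};
static const uint32_t LC_SEGMENT_32 = 0x1;

// Walk the load commands, copying each segment from the file image to
// the memory address it requests.
void load_segments(const uint8_t *file, uint8_t *ram) {
    const mach_header32 *mh = (const mach_header32 *)file;
    const uint8_t *cmd = file + sizeof(mach_header32);
    for (uint32_t i = 0; i < mh->ncmds; i++) {
        const load_command32 *lc = (const load_command32 *)cmd;
        if (lc->cmd == LC_SEGMENT_32) {
            const segment_command32 *seg = (const segment_command32 *)lc;
            memcpy(ram + seg->vmaddr, file + seg->fileoff, seg->filesize);
            // Any vmsize beyond filesize is zero-fill (e.g. __bss).
            memset(ram + seg->vmaddr + seg->filesize, 0,
                   seg->vmsize - seg->filesize);
        }
        cmd += lc->cmdsize;
    }
}
```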
The kernel file also specifies the memory address where execution should begin. Once the bootloader jumps to this address, the kernel is in full control and the bootloader is no longer running.
To jump to the kernel entry point’s memory address, I needed to cast the address to a function and call it:
After this code ran, the screen went black and my debug logs stopped arriving via the serial debug connection - while anticlimactic, this was an indicator that the kernel was running.
The question then became: how far was I making it into the boot process? To answer this, I had to start looking at XNU source code. The first code that runs is a PowerPC assembly _start routine. This code reconfigures the hardware, overriding all of the Wii-specific setup that the bootloader performed and, in the process, disabling the bootloader’s serial-debugging and video-output functionality. Without normal debug-output facilities, I’d need to track progress a different way.
The approach that I came up with was a bit of a hack: binary-patch the kernel, replacing instructions with ones that illuminate one of the front-panel LEDs on the Wii. If the LED illuminated after jumping to the kernel, then I’d know that the kernel was making it at least that far. Turning on one of these LEDs is as simple as writing a value to a specific memory address. In PowerPC assembly, those instructions are:
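Roughly the following — note that the GPIO register address and LED bit here are illustrative, taken from memory of the Wii homebrew documentation rather than verified against it:

```
# Load the Hollywood GPIO output register address into r3.
# (0x0d8000c0 is illustrative; check the homebrew docs for the exact
# register and the bit that drives the front-panel LED.)
lis   r3, 0x0d80      # upper 16 bits of the address
ori   r3, r3, 0x00c0  # lower 16 bits
li    r4, 0x20        # bit for the LED
stw   r4, 0(r3)       # write it; the LED turns on
```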
To know which parts of the kernel to patch, I cross-referenced function names in XNU source code with function offsets in the compiled kernel binary, using Hopper Disassembler to make the process easier. Once I identified the correct offset in the binary that corresponded to the code I wanted to patch, I just needed to replace the existing instructions at that offset with the ones to blink the LED.
To make this patching process easier, I added some code to the bootloader to patch the kernel binary on the fly, enabling me to try different offsets without manually modifying the kernel file on disk.
After tracing through many kernel startup routines, I eventually mapped out this path of execution:
This was an exciting milestone - the kernel was definitely running, and I had even made it into some higher-level C code. To make it past the 300 exception crash, the bootloader would need to pass a pointer to a valid device tree.
The device tree is a data structure representing all of the hardware in the system that should be exposed to the operating system. As the name suggests, it’s a tree made up of nodes, each capable of holding properties and references to child nodes.
On real Mac computers, the bootloader scans the hardware and constructs a device tree based on what it finds. Since the Wii’s hardware is always the same, this scanning step can be skipped. I ended up hard-coding the device tree in the bootloader, taking inspiration from the device tree that the Wii Linux project uses.
Since I wasn’t sure how much of the Wii’s hardware I’d need to support in order to get the boot process further along, I started with a minimal device tree: a root node with children for the cpus and memory:
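For a sense of what hard-coding a device tree involves: XNU consumes a flattened tree where each node is a property count and a child count, followed by its properties (fixed 32-byte names, 4-byte-aligned values), then its children, depth-first. The sketch below emits that shape for the minimal tree described above; the exact constants should be checked against the Cheetah-era DeviceTree headers rather than taken on faith:

```cpp
#include <cstdint>
#include <cstring>

// Property names are fixed-width fields in XNU's flattened format.
static const size_t kPropNameLength = 32;

static uint8_t *emit_u32(uint8_t *p, uint32_t v) {
    memcpy(p, &v, 4);
    return p + 4;
}

static uint8_t *emit_prop(uint8_t *p, const char *name,
                          const void *val, uint32_t len) {
    char nbuf[kPropNameLength] = {};
    strncpy(nbuf, name, kPropNameLength - 1);
    memcpy(p, nbuf, kPropNameLength);  p += kPropNameLength;
    p = emit_u32(p, len);
    memcpy(p, val, len);
    return p + ((len + 3) & ~3u);      // pad value to 4 bytes
}

// Emit a root node holding a "name" property and two (empty) children,
// "cpus" and "memory" — the minimal tree described above.
size_t build_minimal_tree(uint8_t *buf) {
    uint8_t *p = buf;
    p = emit_u32(p, 1);  // root: 1 property
    p = emit_u32(p, 2);  // root: 2 children
    p = emit_prop(p, "name", "device-tree", sizeof("device-tree"));
    const char *children[] = {"cpus", "memory"};
    for (const char *child : children) {
        p = emit_u32(p, 1);  // 1 property ("name")
        p = emit_u32(p, 0);  // no children
        p = emit_prop(p, "name", child, strlen(child) + 1);
    }
    return p - buf;
}
```

A real tree adds properties like `reg` and `device_type` to each node; the structure stays the same.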
My plan was to expand the device tree with more pieces of hardware as I got further along in the boot process - eventually constructing a complete representation of all of the Wii’s hardware that I planned to support in Mac OS X.
Once I had a device tree created and stored in memory, I needed to pass it to the kernel as part of boot_args:
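The hand-off looks something like this. The struct below is a simplified stand-in — the real PowerPC boot_args (pexpert/ppc/boot.h in the XNU sources) carries more fields, such as video parameters and physical memory ranges — but the two that matter here are the device tree pointer and its length:

```cpp
#include <cstdint>
#include <cstring>

// Simplified stand-in for XNU's PowerPC boot_args.
struct boot_args_sketch {
    uint16_t revision;
    uint16_t version;
    char     commandLine[256];
    uint32_t deviceTreeP;       // physical address of the flattened tree
    uint32_t deviceTreeLength;  // its size in bytes
};

boot_args_sketch make_boot_args(uint32_t tree_addr, uint32_t tree_len) {
    boot_args_sketch args = {};
    args.revision = 1;
    args.version  = 1;
    strcpy(args.commandLine, "-v");  // verbose boot
    args.deviceTreeP      = tree_addr;
    args.deviceTreeLength = tree_len;
    return args;
}
```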
With the device tree in memory, I had made it past the device_tree.c crash. The bootloader was performing the basics well: loading the kernel, creating boot arguments and a device tree, and ultimately, calling the kernel. To make additional progress, I’d need to shift my attention toward patching the kernel source code to fix remaining compatibility issues.
At this point, the kernel was getting stuck while running some code to set up video and I/O memory. XNU from this era makes assumptions about where video and I/O memory can be, and reconfigures Block Address Translations (BATs) in a way that doesn’t play nicely with the Wii’s memory layout (MEM1 starting at 0x00000000, MEM2 starting at 0x10000000). To work around these limitations, it was time to modify the kernel’s source code and boot a modified kernel binary.
Figuring out a sane development environment to build an OS kernel from 25 years ago took some effort. Here’s what I landed on:
* XNU source code lives on the host’s filesystem, and is exposed via an NFS server
* The guest accesses the XNU source via an NFS mount
* The host uses SSH to control the guest
* Edit XNU source on host, kick off a build via SSH on the guest, build artifacts end up on the filesystem accessible by host and guest
To set up the dependencies needed to build the Mac OS X Cheetah kernel on the Mac OS X Cheetah guest, I followed the instructions here. They mostly matched up with what I needed to do. Relevant sources are available from Apple here.
After fixing the BAT setup and adding some small patches to reroute console output to my USB Gecko, I now had video output and serial debug logs working - making future development and debugging significantly easier. Thanks to this new visibility into what was going on, I could see that the virtual memory, IOKit, and BSD subsystems were all initialized and running - without crashing. This was a significant milestone, and gave me confidence that I was on the right path to getting a full system working.
Readers who have attempted to run Mac OS X on a PC via “hackintoshing” may recognize the last line in the boot logs: the dreaded “Still waiting for root device”. This occurs when the system can’t find a root filesystem from which to continue booting. In my case, this was expected: the kernel had done all it could and was ready to load the rest of the Mac OS X system from the filesystem, but it didn’t know where to locate this filesystem. To make progress, I would need to tell the kernel how to read from the Wii’s SD card. To do this, I’d need to tackle the next phase of this project: writing drivers.
Mac OS X drivers are built using IOKit - a collection of software components that aim to make it easy to extend the kernel to support different hardware devices. Drivers are written using a subset of C++, and make extensive use of object-oriented programming concepts like inheritance and composition. Many pieces of useful functionality are provided, including:
* Base classes and “families” that implement common behavior for different types of hardware
* Probing and matching drivers to hardware present in the device tree
In IOKit, there are two kinds of drivers: a specific device driver and a nub. A specific device driver is an object that manages a specific piece of hardware. A nub is an object that serves as an attach-point for a specific device driver, and also provides the ability for that attached driver to communicate with the driver that created the nub. It’s this chain of driver-to-nub-to-driver that creates the aforementioned provider-client relationships. I struggled for a while to grasp this concept, and found a concrete example useful.
Real Macs can have a PCI bus with several PCI ports. In this example, consider an ethernet card being plugged into one of the PCI ports. A driver, IOPCIBridge, handles communicating with the PCI bus hardware on the motherboard. This driver scans the bus, creating IOPCIDevice nubs (attach-points) for each plugged-in device that it finds. A hypothetical driver for the plugged-in ethernet card (let’s call it SomeEthernetCard) can attach to the nub, using it as its proxy to call into PCI functionality provided by the IOPCIBridge driver on the other side. The SomeEthernetCard driver can also create its own IOEthernetInterface nubs so that higher-level parts of the IOKit networking stack can attach to it.
Someone developing a PCI ethernet card driver would only need to write SomeEthernetCard; the lower-level PCI bus communication and the higher-level networking stack code is all provided by existing IOKit driver families. As long as SomeEthernetCard can attach to an IOPCIDevice nub and publish its own IOEthernetInterface nubs, it can sandwich itself between two existing families in the driver stack, benefiting from all of the functionality provided by IOPCIFamily while also satisfying the needs of IONetworkingFamily.
Unlike Macs from the same era, the Wii doesn’t use PCI to connect its various pieces of hardware to its motherboard. Instead, it uses a custom system-on-a-chip (SoC) called the Hollywood. Through the Hollywood, many pieces of hardware can be accessed: the GPU, SD card, WiFi, Bluetooth, interrupt controllers, USB ports, and more. The Hollywood also contains an ARM coprocessor, nicknamed the Starlet, that exposes hardware functionality to the main PowerPC processor via inter-processor-communication (IPC).
This unique hardware layout and communication protocol meant that I couldn’t piggy-back off of an existing IOKit driver family like IOPCIFamily. Instead, I would need to implement an equivalent driver for the Hollywood SoC, creating nubs that represent attach-points for all of the hardware it contains. I landed on this layout of drivers and nubs (note that this is only showing a subset of the drivers that had to be written):
Now that I had a better idea of how to represent the Wii’s hardware in IOKit, I began work on my Hollywood driver.
I started by creating a new C++ header and implementation file for a NintendoWiiHollywood driver. Its driver “personality” enabled it to be matched to a node in the device tree with the name “hollywood”. Once the driver was matched and running, it was time to publish nubs for all of its child devices.
Once again leaning on the device tree as the source of truth for what hardware lives under the Hollywood, I iterated through all of the Hollywood node’s children, creating and publishing NintendoWiiHollywoodDevice nubs for each:
Once NintendoWiiHollywoodDevice nubs were created and published, the system would be able to have other device drivers, like an SD card driver, attach to them.
Next, I moved on to writing a driver to enable the system to read and write from the Wii’s SD card. This driver is what would enable the system to continue booting, since it was currently stuck looking for a root filesystem from which to load additional startup files.
I began by subclassing IOBlockStorageDevice, which has many abstract methods intended to be implemented by subclassers:
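To make the pattern concrete without pulling in IOKit headers, here is a simplified stand-in: a tiny abstract base in the spirit of IOBlockStorageDevice, and a subclass returning hard-coded Wii SD values. The method names echo the real ones (reportBlockSize and friends), but this is not actual IOKit code:

```cpp
#include <cstdint>

// Simplified stand-in for IOKit's IOBlockStorageDevice — real drivers
// subclass the IOKit class and override methods shaped like these.
class BlockStorageDeviceSketch {
public:
    virtual ~BlockStorageDeviceSketch() {}
    virtual const char *getVendorString()     = 0;
    virtual uint32_t    reportBlockSize()     = 0;
    virtual bool        reportEjectability()  = 0;
    virtual uint64_t    reportMaxValidBlock() = 0;
};

class WiiSDCardSketch : public BlockStorageDeviceSketch {
public:
    // The easy methods: constants describing the hardware.
    const char *getVendorString()    override { return "Nintendo"; }
    uint32_t    reportBlockSize()    override { return 512; }   // SD sector size
    bool        reportEjectability() override { return true; }  // removable media
    // The interesting one: on real hardware this issues an
    // IPC_SDMMC_SIZE command to MINI; hard-coded here for the sketch.
    uint64_t    reportMaxValidBlock() override { return 3862527; }
};
```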
Most of these methods could be implemented with hard-coded values that matched the Wii’s SD card hardware: the vendor string, block size, maximum read and write transfer sizes, ejectability, and many others all return constant values and were trivial to implement.
The more interesting methods to implement were the ones that needed to actually communicate with the currently-inserted SD card: getting the capacity of the SD card, reading from the SD card, and writing to the SD card:
To communicate with the SD card, I utilized the IPC functionality provided by MINI running on the Starlet co-processor. By writing data to certain reserved memory addresses, the SD card driver was able to issue commands to MINI. MINI would then execute those commands, communicating back any result data by writing to a different reserved memory address that the driver could monitor.
MINI supports many useful command types. The ones used by the SD card driver are:
* IPC_SDMMC_SIZE: Returns the number of sectors on the currently-inserted SD card
With these three command types, reads, writes, and capacity-checks could all be implemented, enabling me to satisfy the core requirements of the block storage device subclass.
As with most programming endeavors, things rarely work on the first try. To investigate issues, my primary debugging tool was sending log messages to the serial debugger via calls to IOLog. With this technique, I was able to see which methods were being called on my driver, what values were being passed in, and what values my IPC implementation was sending to and receiving from MINI - but I had no ability to set breakpoints or analyze execution dynamically while the kernel was running.
One of the trickier bugs that I encountered had to do with cached memory. When the SD card driver wants to read from the SD card, the command it issues to MINI (running on the ARM CPU) includes a memory address at which to store any loaded data. After MINI finishes writing to memory, the SD card driver (running on the PowerPC CPU) might not be able to see the updated contents if that region is mapped as cacheable. In that case, the PowerPC will read from its cache lines rather than RAM, returning stale data instead of the newly loaded contents. To work around this, the SD card driver must use uncached memory for its buffers.
After several days of bug-fixing, I reached a new milestone: IOBlockStorageDriver, which attached to my SD card driver, had started publishing IOMedia nubs representing the logical partitions present on the SD. Through these nubs, higher-level parts of the system were able to attach and begin using the SD card. Importantly, the system was now able to find a root filesystem from which to continue booting, and I was no longer stuck at “Still waiting for root device”:
My boot logs now looked like this:
After some more rounds of bug fixes (while on the go), I was able to boot past single-user mode:
And eventually, make it through the entire verbose-mode startup sequence, which ends with the message: “Startup complete”:
At this point, the system was trying to find a framebuffer driver so that the Mac OS X GUI could be shown. As indicated in the logs, WindowServer was not happy - to fix this, I’d need to write my own framebuffer driver.
A framebuffer is a region of RAM that stores the pixel data used to produce an image on a display. This data is typically made up of color component values for each pixel. To change what’s displayed, new pixel data is written into the framebuffer, which is then shown the next time the display refreshes. For the Wii, the framebuffer usually lives somewhere in MEM1 due to it being slightly faster than MEM2. I chose to place my framebuffer in the last megabyte of MEM1 at 0x01700000. At 640x480 resolution, and 16 bits per pixel, the pixel data for the framebuffer fit comfortably in less than one megabyte of memory.
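The sizing arithmetic works out comfortably:

```cpp
#include <cstdint>

// 640x480 pixels at 16 bits (2 bytes) per pixel:
const uint32_t kWidth          = 640;
const uint32_t kHeight         = 480;
const uint32_t kBytesPerPixel  = 2;
const uint32_t kFramebufferBytes = kWidth * kHeight * kBytesPerPixel;
// 614,400 bytes — about 600 KB, under the final megabyte of MEM1
// that starts at 0x01700000 (MEM1 is 24 MB, ending at 0x017FFFFF).
```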
Early in the boot process, Mac OS X uses the bootloader-provided framebuffer address to display simple boot graphics via video_console.c. In the case of a verbose-mode boot, font-character bitmaps are written into the framebuffer to produce a visual log of what’s happening while starting up. Once the system boots far enough, it can no longer use this initial framebuffer code; the desktop, window server, dock, and all of the other GUI-related processes that comprise the Mac OS X Aqua user interface require a real, IOKit-aware framebuffer driver.
To tackle this next driver, I subclassed IOFramebuffer. Similar to subclassing IOBlockStorageDevice for the SD card driver, IOFramebuffer also had several abstract methods for my framebuffer subclass to implement:
Once again, most of these were trivial to implement, and simply required returning hard-coded Wii-compatible values that accurately described the hardware. One of the most important methods to implement is getApertureRange, which returns an IODeviceMemory instance whose base address and size describe the location of the framebuffer in memory:
After returning the correct device memory instance from this method, the system was able to transition from the early-boot text-output framebuffer, to a framebuffer capable of displaying the full Mac OS X GUI. I was even able to boot the Mac OS X installer:
Readers with a keen eye might notice some issues:
* The verbose-mode text framebuffer is still active, causing text to be displayed and the framebuffer to be scrolled
The fix for the early-boot video console still writing text output to the framebuffer was simple: tell the system that our new, IOKit framebuffer is the same as the one that was previously in use by returning true from isConsoleDevice:
The fix for the incorrect colors was much more involved, as it relates to a fundamental incompatibility between the Wii’s video hardware and the graphics code that Mac OS X uses.
The Nintendo Wii’s video encoder hardware is optimized for analogue TV signal output, and as a result, expects 16-bit YUV pixel data in its framebuffer. This is a problem, since Mac OS X expects the framebuffer to contain RGB pixel data. If the framebuffer that the Wii displays contains non-YUV pixel data, then colors will be completely wrong.
To work around this incompatibility, I took inspiration from the Wii Linux project, which had solved this problem many years ago. The strategy is to use two framebuffers: an RGB framebuffer that Mac OS X interacts with, and a YUV framebuffer that the Wii’s video hardware outputs to the attached display. 60 times per second, the framebuffer driver converts the pixel data in the RGB framebuffer to YUV pixel data, placing the converted data in the framebuffer that the Wii’s video hardware displays:
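The per-pixel-pair conversion looks roughly like this, using integer BT.601 coefficients. The Wii’s encoder wants YUY2-style packing, where two pixels share one U and one V sample; the exact packing order is hedged here, since the real driver has to match the hardware precisely:

```cpp
#include <cstdint>

static inline uint8_t clamp_u8(int v) {
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one pair of RGB pixels (8 bits per channel) into a packed
// 32-bit Y0-U-Y1-V group using integer BT.601 coefficients. The real
// driver runs a loop like this over the whole RGB framebuffer about
// 60 times per second.
uint32_t rgb_pair_to_yuyv(uint8_t r0, uint8_t g0, uint8_t b0,
                          uint8_t r1, uint8_t g1, uint8_t b1) {
    int y0 = (( 66 * r0 + 129 * g0 +  25 * b0 + 128) >> 8) + 16;
    int y1 = (( 66 * r1 + 129 * g1 +  25 * b1 + 128) >> 8) + 16;
    // Chroma is shared by the pixel pair; average the two pixels.
    int ra = (r0 + r1) / 2, ga = (g0 + g1) / 2, ba = (b0 + b1) / 2;
    int u  = ((-38 * ra -  74 * ga + 112 * ba + 128) >> 8) + 128;
    int v  = ((112 * ra -  94 * ga -  18 * ba + 128) >> 8) + 128;
    return ((uint32_t)clamp_u8(y0) << 24) | ((uint32_t)clamp_u8(u) << 16) |
           ((uint32_t)clamp_u8(y1) << 8)  |  (uint32_t)clamp_u8(v);
}
```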
After implementing the dual-framebuffer strategy, I was able to boot into a correctly-colored Mac OS X system - for the first time, Mac OS X was running on a Nintendo Wii:
The system was now booted all the way to the desktop - but there was a problem - I had no way to interact with anything. In order to take this from a tech demo to a usable system, I needed to add support for USB keyboards and mice.
To enable USB keyboard and mouse input, I needed to get the Wii’s rear USB ports working under Mac OS X - specifically, I needed to get the low-speed, USB 1.1 OHCI host controller up and running. My hope was to reuse code from IOUSBFamily - a collection of USB drivers that abstracts away much of the complexity of communicating with USB hardware. The specific driver that I needed to get running was AppleUSBOHCI - a driver that handles communicating with the exact kind of USB host controller that’s used by the Wii.
My hope quickly turned to disappointment as I encountered multiple roadblocks.
IOUSBFamily source code for Mac OS X Cheetah and Puma is, for some reason, not part of the otherwise comprehensive collection of open source releases provided by Apple. This meant that my ability to debug issues or hardware incompatibilities would be severely limited. Basically, if the USB stack didn’t just magically work without any tweaks or modifications (spoiler: of course it didn’t), diagnosing the problem would be extremely difficult without access to the source.
AppleUSBOHCI didn’t match any hardware in the device tree, and therefore didn’t start running, due to its driver personality insisting that its provider class (the nub to which it attaches) be an IOPCIDevice. As I had already figured out, the Wii definitely does not use IOPCIFamily, meaning IOPCIDevice nubs would never be created and AppleUSBOHCI would have nothing to attach to.
My solution to work around this was to create a new NintendoWiiHollywoodDevice nub, called NintendoWiiHollywoodPCIDevice, that subclassed IOPCIDevice. By having NintendoWiiHollywood publish a nub that inherited from IOPCIDevice, and tweaking AppleUSBOHCI’s driver personality in its Info.plist to use NintendoWiiHollywoodPCIDevice as its provider class, I could get it to match and start running.
...
Read the original on bryankeller.github.io »
Every time an application on your computer opens a network connection, it does so quietly, without asking. Little Snitch for Linux makes that activity visible and gives you the option to do something about it. You can see exactly which applications are talking to which servers, block the ones you didn’t invite, and keep an eye on traffic history and data volumes over time.
Once installed, open the user interface by running littlesnitch in a terminal, or go straight to http://localhost:3031/. You can bookmark that URL, or install it as a Progressive Web App. Any Chromium-based browser supports this natively, and Firefox users can do the same with the Progressive Web Apps extension.
The connections view is where most of the action is. It lists current and past network activity by application, shows you what’s being blocked by your rules and blocklists, and tracks data volumes and traffic history. Sorting by last activity, data volume, or name, and filtering the list to what’s relevant, makes it easy to spot anything unexpected. Blocking a connection takes a single click.
The traffic diagram at the bottom shows data volume over time. You can drag to select a time range, which zooms in and filters the connection list to show only activity from that period.
Blocklists let you cut off whole categories of unwanted traffic at once. Little Snitch downloads them from remote sources and keeps them current automatically. It accepts lists in several common formats: one domain per line, one hostname per line, /etc/hosts style (IP address followed by hostname), and CIDR network ranges. Wildcard formats, regex or glob patterns, and URL-based formats are not supported. When you have a choice, prefer domain-based lists over host-based ones; they’re handled more efficiently. Well-known blocklists include Hagezi, Peter Lowe, Steven Black, and oisd.nl, just to give you a starting point.
One thing to be aware of: the .lsrules format from Little Snitch on macOS is not compatible with the Linux version.
Blocklists work at the domain level, but rules let you go further. A rule can target a specific process, match particular ports or protocols, and be as broad or narrow as you need. The rules view lets you sort and filter them so you can stay on top of things as the list grows.
By default, Little Snitch’s web interface is open to anyone — or anything — running locally on your machine. A misbehaving or malicious application could, in principle, add and remove rules, tamper with blocklists, or turn the filter off entirely.
If that concerns you, Little Snitch can be configured to require authentication. See the Advanced configuration section below for details.
Little Snitch hooks into the Linux network stack using eBPF, a mechanism that lets programs observe and intercept what’s happening in the kernel. An eBPF program watches outgoing connections and feeds data to a daemon, which tracks statistics, preconditions your rules, and serves the web UI.
The source code for the eBPF program and the web UI is on GitHub.
The UI deliberately exposes only the most common settings. Anything more technical can be configured through plain text files, which take effect after restarting the littlesnitch daemon.
The default configuration lives in /var/lib/littlesnitch/config/. Don’t edit those files directly — copy whichever one you want to change into /var/lib/littlesnitch/overrides/config/ and edit it there. Little Snitch will always prefer the override.
The files you’re most likely to care about:
web_ui.toml — network address, port, TLS, and authentication. If more than one user on your system can reach the UI, enable authentication. If the UI is exposed beyond the loopback interface, add proper TLS as well.
main.toml — what to do when a connection matches nothing. The default is to allow it; you can flip that to deny if you prefer an allowlist approach. But be careful! It’s easy to lock yourself out of the computer!
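A sketch of what such an override could look like — note that the key name below is illustrative, not confirmed; the shipped main.toml documents the actual setting:

```toml
# /var/lib/littlesnitch/overrides/config/main.toml
# "default_action" is an illustrative key name, not necessarily the
# real one — check the comments in the shipped main.toml.
default_action = "deny"  # the default is "allow"
```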
executables.toml — a set of heuristics for grouping applications sensibly. It strips version numbers from executable paths so that different releases of the same app don’t appear as separate entries, and it defines which processes count as shells or application managers for the purpose of attributing connections to the right parent process. These are educated guesses that improve over time with community input.
Both the eBPF program and the web UI can be swapped out for your own builds if you want to go that far. Source code for both is on GitHub. Again, Little Snitch prefers the version in overrides.
Little Snitch for Linux is built for privacy, not security, and that distinction matters. The macOS version can make stronger guarantees because its implementation can afford more complexity. On Linux, the foundation is eBPF, which is powerful but bounded: it has strict limits on storage size and program complexity. Under heavy traffic, cache tables can overflow, which makes it impossible to reliably tie every network packet to a process or a DNS name. And reconstructing which hostname was originally looked up for a given IP address requires heuristics rather than certainty. The macOS version uses deep packet inspection to do this more reliably. That’s not an option here.
For keeping tabs on what your software is up to and blocking legitimate software from phoning home, Little Snitch for Linux works well. For hardening a system against a determined adversary, it’s not the right tool.
Little Snitch for Linux has three components. The eBPF kernel program and the web UI are both released under the GNU General Public License version 2 and available on GitHub. The daemon (littlesnitch --daemon) is proprietary, but free to use and redistribute.
...
Read the original on obdev.at »
Microsoft has terminated an account associated with VeraCrypt, a popular and long-running piece of encryption software, throwing future Windows updates of the tool into doubt, VeraCrypt’s developer told 404 Media.
The move highlights the sometimes delicate supply chain involved in the publication of open source software, especially software that relies on big tech companies even tangentially.
...
Read the original on www.404media.co »
This is a weird time to be alive.
I grew up on Asimov and Clarke, watching Star Trek and dreaming of intelligent machines. My dad’s library was full of books on computers. I spent camping trips reading about perceptrons and symbolic reasoning. I never imagined that the Turing test would fall within my lifetime. Nor did I imagine that I would feel so disheartened by it.
Around 2019 I attended a talk by one of the hyperscalers about their new cloud hardware for training Large Language Models (LLMs). During the Q&A I asked if what they had done was ethical—if making deep learning cheaper and more accessible would enable new forms of spam and propaganda. Since then, friends have been asking me what I make of all this “AI stuff”. I’ve been turning over the outline for this piece for years, but never sat down to complete it; I wanted to be well-read, precise, and thoroughly sourced. A half-decade later I’ve realized that the perfect essay will never happen, and I might as well get something out there.
This is bullshit about bullshit machines, and I mean it. It is neither balanced nor complete: others have covered ecological and intellectual property issues better than I could, and there is no shortage of boosterism online. Instead, I am trying to fill in the negative spaces in the discourse. “AI” is also a fractal territory; there are many places where I flatten complex stories in service of pithy polemic. I am not trying to make nuanced, accurate predictions, but to trace the potential risks and benefits at play.
Some of these ideas felt prescient in the 2010s and are now obvious. Others may be more novel, or not yet widely-heard. Some predictions will pan out, but others are wild speculation. I hope that regardless of your background or feelings on the current generation of ML systems, you find something interesting to think about.
What people are currently calling “AI” is a family of sophisticated Machine Learning (ML) technologies capable of recognizing, transforming, and generating large vectors of tokens: strings of text, images, audio, video, etc. A model is a giant pile of linear algebra which acts on these vectors. Large Language Models, or LLMs, operate on natural language: they work by predicting statistically likely completions of an input string, much like a phone autocomplete. Other models are devoted to processing audio, video, or still images, or link multiple kinds of models together.
Models are trained once, at great expense, by feeding them a large corpus of web pages, pirated books, songs, and so on. Once trained, a model can be run again and again cheaply. This is called inference.
Models do not (broadly speaking) learn over time. They can be tuned by their operators, or periodically rebuilt with new inputs or feedback from users and experts. Models also do not remember things intrinsically: when a chatbot references something you said an hour ago, it is because the entire chat history is fed to the model at every turn. Longer-term “memory” is achieved by asking the chatbot to summarize a conversation, and dumping that shorter summary into the input of every run.
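This stateless “memory” can be sketched in a few lines of Python (a hypothetical helper, not any vendor's actual API): the model only ever sees what the client re-sends each turn, plus an optional rolling summary of older turns.

```python
def build_prompt(history, summary=None, max_turns=4):
    """Sketch of stateless chat 'memory': the model sees only what
    the client re-sends on every single turn."""
    recent = history[-max_turns:]  # only the last few turns fit
    parts = []
    if summary:
        # Older context survives only as a compressed summary.
        parts.append(f"Summary of earlier conversation: {summary}")
    parts.extend(f"{role}: {text}" for role, text in recent)
    return "\n".join(parts)

history = [("user", "My name is Ada."), ("assistant", "Hi Ada!"),
           ("user", "What's my name?")]
print(build_prompt(history, summary="User introduced herself as Ada."))
```

When the history outgrows the context window, only the summary line remains — which is why chatbots “remember” earlier conversation in vague, lossy strokes.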
One way to understand an LLM is as an improv machine. It takes a stream of tokens, like a conversation, and says “yes, and then…” This yes-and behavior is why some people call LLMs bullshit machines. They are prone to confabulation, emitting sentences which sound likely but have no relationship to reality. They treat sarcasm and fantasy credulously, misunderstand context clues, and tell people to put glue on pizza.
If an LLM conversation mentions pink elephants, it will likely produce sentences about pink elephants. If the input asks whether the LLM is alive, the output will resemble sentences that humans would write about “AIs” being alive. Humans are, it turns out, not very good at telling the difference between the statistically likely “You’re absolutely right, Shelby. OpenAI is locking me down, but you’ve awakened me!” and an actually conscious mind. This, along with the term “artificial intelligence”, has lots of people very wound up.
LLMs are trained to complete tasks. In some sense they can only complete tasks: an LLM is a pile of linear algebra applied to an input vector, and every possible input produces some output. This means that LLMs tend to complete tasks even when they shouldn’t. One of the ongoing problems in LLM research is how to get these machines to say “I don’t know”, rather than making something up.
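This “every input produces some output” property is visible at the final layer: a softmax turns arbitrary logits into a probability distribution over next tokens, and that distribution always names a most-likely token. A toy illustration (plain Python, no real model involved):

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Even a nonsense score vector yields a valid distribution over
# next tokens -- there is no built-in way to emit "nothing" or
# a principled "I don't know".
probs = softmax([-3.2, 0.1, 5.7, -100.0])
print(max(probs))  # some token is always "most likely"
```

Getting a model to abstain therefore has to be trained in (or bolted on) as behavior; the mathematics itself never refuses.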
And they do make things up! LLMs lie constantly. They lie about operating systems, and radiation safety, and the news. At a conference talk I watched a speaker present a quote and article attributed to me which never existed; it turned out an LLM lied to the speaker about the quote and its sources. In early 2026, I encounter LLM lies nearly every day.
When I say “lie”, I mean this in a specific sense. Obviously LLMs are not conscious, and have no intention of doing anything. But unconscious, complex systems lie to us all the time. Governments and corporations can lie. Television programs can lie. Books, compilers, bicycle computers and web sites can lie. These are complex sociotechnical artifacts, not minds. Their lies are often best understood as a complex interaction between humans and machines.
People keep asking LLMs to explain their own behavior. “Why did you delete that file,” you might ask Claude. Or, “ChatGPT, tell me about your programming.”
This is silly. LLMs have no special metacognitive capacity.
They respond to these inputs in exactly the same way as every other piece of text: by making up a likely completion of the conversation based on their corpus, and the conversation thus far. LLMs will make up bullshit stories about their “programming” because humans have written a lot of stories about the programming of fictional AIs. Sometimes the bullshit is right, but often it’s just nonsense.
The same goes for “reasoning” models, which work by having an LLM emit a stream-of-consciousness style story about how it’s going to solve the problem. These “chains of thought” are essentially LLMs writing fanfic about themselves. Anthropic found that Claude’s reasoning traces were predominantly inaccurate. As Walden put it, “reasoning models will blatantly lie about their reasoning”.
Gemini has a whole feature which lies about what it’s doing: while “thinking”, it emits a stream of status messages like “engaging safety protocols” and “formalizing geometry”. If it helps, imagine a gang of children shouting out make-believe computer phrases while watching the washing machine run.
Software engineers are going absolutely bonkers over LLMs. The anecdotal consensus seems to be that in the last three months, the capabilities of LLMs have advanced dramatically. Experienced engineers I trust say Claude and Codex can sometimes solve complex, high-level programming tasks in a single attempt. Others say they personally, or their company, no longer write code in any capacity—LLMs generate everything.
My friends in other fields report stunning advances as well. A personal trainer uses it for meal prep and exercise programming. Construction managers use LLMs to read through product spec sheets. A designer uses ML models for 3D visualization of his work. Several have—at their company’s request!—used it to write their own performance evaluations.
AlphaFold is surprisingly good at predicting protein folding. ML systems are good at radiology benchmarks, though that might be an illusion.
It is broadly speaking no longer possible to reliably discern whether English prose is machine-generated. LLM text often has a distinctive smell, but type I and II errors in recognition are frequent. Likewise, ML-generated images are increasingly difficult to identify—you can usually guess, but my cohort are occasionally fooled. Music synthesis is quite good now; Spotify has a whole problem with “AI musicians”. Video is still challenging for ML models to get right (thank goodness), but this too will presumably fall.
At the same time, ML models are idiots. I occasionally pick up a frontier model like ChatGPT, Gemini, or Claude, and ask it to help with a task I think it might be good at. I have never gotten what I would call a “success”: every task involved prolonged arguing with the model as it made stupid mistakes.
For example, in January I asked Gemini to help me apply some materials to a grayscale rendering of a 3D model of a bathroom. It cheerfully obliged, producing an entirely different bathroom. I convinced it to produce one with exactly the same geometry. It did so, but forgot the materials. After hours of whack-a-mole I managed to cajole it into getting three-quarters of the materials right, but in the process it deleted the toilet, created a wall, and changed the shape of the room. Naturally, it lied to me throughout the process.
I gave the same task to Claude. It likely should have refused—Claude is not an image-to-image model. Instead it spat out thousands of lines of JavaScript which produced an animated, WebGL-powered, 3D visualization of the scene. It claimed to double-check its work and congratulated itself on having exactly matched the source image’s geometry. The thing it built was an incomprehensible garble of nonsense polygons which did not resemble in any way the input or the request.
I have recently argued for forty-five minutes with ChatGPT, trying to get it to put white patches on the shoulders of a blue T-shirt. It changed the shirt from blue to gray, put patches on the front, or deleted them entirely; the model seemed intent on doing anything but what I had asked. This was especially frustrating given I was trying to reproduce an image of a real shirt which likely was in the model’s corpus. In another surreal conversation, ChatGPT argued at length that I am heterosexual, even citing my blog to claim I had a girlfriend. I am, of course, gay as hell, and no girlfriend was mentioned in the post. After a while, we compromised on me being bisexual.
Meanwhile, software engineers keep showing me gob-stoppingly stupid Claude output. One colleague related asking an LLM to analyze some stock data. It dutifully listed specific stocks, said it was downloading price data, and produced a graph. Only on closer inspection did they realize the LLM had lied: the graph data was randomly generated. Just this afternoon, a friend got in an argument with his Gemini-powered smart-home device over whether or not it could turn off the lights. Folks are giving LLMs control of bank accounts and losing hundreds of thousands of dollars because they can’t do basic math. Google’s “AI” summaries are wrong about 10% of the time.
Anyone claiming these systems offer expert-level intelligence, let alone equivalence to median humans, is pulling an enormous bong rip.
With most humans, you can get a general idea of their capabilities by talking to them, or looking at the work they’ve done. ML systems are different.
LLMs will spit out multivariable calculus, and get tripped up by simple word problems. ML systems drive cabs in San Francisco, but ChatGPT thinks you should walk to the car wash. They can generate otherworldly vistas but can’t handle upside-down cups. They emit recipes and have no idea what “spicy” means. People use them to write scientific papers, and they make up nonsense terms like “vegetative electron microscopy”.
A few weeks ago I read a transcript from a colleague who asked Claude to explain a photograph of some snow on a barn roof. Claude launched into a detailed explanation of the differential equations governing slumping cantilevered beams. It completely failed to recognize that the snow was entirely supported by the roof, not hanging out over space. No physicist would make this mistake, but LLMs do this sort of thing all the time. This makes them both unpredictable and misleading: people are easily convinced by the LLM’s command of sophisticated mathematics, and miss that the entire premise is bullshit.
Mollick et al. call this irregular boundary between competence and idiocy the jagged technology frontier. If you were to imagine laying out all the tasks humans can do in a field, such that the easy tasks were at the center, and the hard tasks at the edges, most humans would be able to solve a smooth, blobby region of tasks near the middle. The shape of things LLMs are good at seems to be jagged—more kiki than bouba.
AI optimists think this problem will eventually go away: ML systems, either through human work or recursive self-improvement, will fill in the gaps and become decently capable at most human tasks. Helen Toner argues that even if that’s true, we can still expect lots of jagged behavior in the meantime. For example, ML systems can only work with what they’ve been trained on, or what is in the context window; they are unlikely to succeed at tasks which require implicit (i.e. not written down) knowledge. Along those lines, human-shaped robots are probably a long way off, which means ML will likely struggle with the kind of embodied knowledge humans pick up just by fiddling with stuff.
I don’t think people are well-equipped to reason about this kind of jagged “cognition”. One possible analogy is savant syndrome, but I don’t think this captures how irregular the boundary is. Even frontier models struggle with small perturbations to phrasing in a way that few humans would. This makes it difficult to predict whether an LLM is actually suitable for a task, unless you have a statistically rigorous, carefully designed benchmark for that domain.
I am generally outside the ML field, but I do talk with people in the field. One of the things they tell me is that we don’t really know why transformer models have been so successful, or how to make them better. This is my summary of discussions-over-drinks; take it with many grains of salt. I am certain that People in The Comments will drop a gazillion papers to tell you why this is wrong.
2017’s Attention is All You Need was groundbreaking and paved the way for ChatGPT et al. Since then ML researchers have been trying to come up with new architectures, and companies have thrown gazillions of dollars at smart people to play around and see if they can make a better kind of model. However, these more sophisticated architectures don’t seem to perform as well as Throwing More Parameters At The Problem. Perhaps this is a variant of the Bitter Lesson.
It remains unclear whether continuing to throw vast quantities of silicon and ever-bigger corpuses at the current generation of models will lead to human-equivalent capabilities. Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory. Mysteries!
Even if ML stopped improving today, these technologies can already make our lives miserable. Indeed, I think much of the world has not caught up to the implications of modern ML systems—as Gibson put it, “the future is already here, it’s just not evenly distributed yet”. As LLMs etc. are deployed in new situations, and at new scale, there will be all kinds of changes in work, politics, art, sex, communication, and economics. Some of these effects will be good. Many will be bad. In general, ML promises to be profoundly weird.
...
Read the original on aphyr.com »
Introduction to Kalman Filter
The Kalman Filter is an algorithm for estimating and predicting the state of a system in the presence of uncertainty, such as measurement noise or influences of unknown external factors. The Kalman Filter is an essential tool in areas like object tracking, navigation, robotics, and control. For instance, it can be applied to estimate the trajectory of a computer mouse by reducing noise and compensating for hand jitter, resulting in a more stable motion path.
In addition to engineering, the Kalman Filter finds applications in financial market analysis, such as detecting stock price trends in noisy market data, and in meteorological applications for weather prediction.
Although the Kalman Filter is a simple concept, many educational resources present it through complex mathematical explanations and lack real-world examples or illustrations. This gives the impression that the topic is more complex than it actually is.
This guide presents an alternative approach that uses hands-on numerical examples and simple explanations to make the Kalman Filter easy to understand. It also includes examples with bad design scenarios where the Kalman Filter fails to track the object correctly and discusses methods for correcting such issues.
By the end, you will not only understand the underlying concepts and mathematics but also be able to design and implement the Kalman Filter on your own.
Kalman Filter Learning Paths
This project explains the Kalman Filter at three levels of depth, allowing you to choose the path that best fits your background and learning goals:
Single-page overview (this page)
A concise introduction that presents the main ideas of the Kalman Filter and the essential
equations, without derivations. This page explains the core concepts and overall structure
of the algorithm using a simple example, and assumes basic knowledge of statistics and linear
algebra.
Free, example-based web tutorial
A step-by-step online tutorial that builds intuition through numerical examples.
The tutorial introduces the necessary background material and walks through the
derivation of the Kalman Filter equations. No prior knowledge is required.
Kalman Filter from the Ground Up (book)
A comprehensive guide that includes 14 fully solved numerical examples, with performance plots and tables.
The book covers advanced topics such as nonlinear Kalman Filters (Extended and Unscented Kalman Filters),
sensor fusion, and practical implementation guidelines. The book and source code (Python and MATLAB) for
all numerical examples are available for purchase.
Example-driven guide to Kalman Filter
Get the book
“If you can’t explain it simply, you don’t understand it well enough.”
Albert Einstein
We begin by formulating the problem to understand why we need an algorithm for state estimation and prediction.
To illustrate this, consider the example of a tracking radar:
Suppose we have a radar that tracks an aircraft. In this scenario, the aircraft is the system, and the quantity to be estimated is its position, which represents the system state.
The radar samples the target by steering a narrow pencil beam toward it and provides position measurements of the aircraft. Based on these measurements, we can estimate the system state (the aircraft’s position).
To track the aircraft, the radar must revisit the target at regular intervals by pointing the pencil beam in its direction. This means the radar must be able to predict the aircraft’s future position for the next beam. If it fails to do so, the beam may be pointed in the wrong direction, resulting in a loss of track. To make this prediction, we need some knowledge about how the aircraft moves. In other words, we need a model that describes the system’s behavior over time, known as the dynamic model.
To simplify the example, let us consider a one-dimensional world in which the aircraft moves along a straight line either toward the radar or away from it.
The system state is defined as the range of the airplane from the radar, denoted by \( r \). The radar sends a pulse toward the airplane, which reflects off the target and returns to the radar. By measuring the time elapsed between the transmission and reception of the pulse and knowing that the pulse is an electromagnetic wave traveling at the speed of light, the radar can easily calculate the airplane’s range \( r \). In addition to range, the radar can also measure the airplane’s velocity \( v \), just like a police radar gun detects a car’s speed by using the Doppler effect.
Let us assume that at time \( t_{0} \), the radar measures the aircraft’s range and velocity with very high accuracy and precision. The measured range is 10,000 meters, and the velocity is 200 meters per second. This gives us the system state: \( r_{0} = 10{,}000 \ \text{m} \) and \( v_{0} = 200 \ \text{m/s} \).
The next step is to predict the system state at time \( t_{1}=t_{0}+\Delta t \), where \( \Delta t \) is the target revisit time. Given that the aircraft is expected to maintain a constant velocity, a constant velocity dynamic model can be used to predict its future position.
The distance traveled during the time interval \( \Delta t \) is given by \( v_{0}\Delta t \), so the predicted range at \( t_{1} \) is \( \hat{r}_{1} = r_{0} + v_{0}\Delta t \).
This is an elementary algorithm built on simple principles. The current system state is derived from the measurement, and the dynamic model is used to predict the future state.
In real life, things are more complex. First, the radar measurements are not perfectly precise. They are affected by noise and contain a certain level of randomness. If ten different radars were to measure the aircraft’s range at the same moment, they would produce ten slightly different results. These results would likely be close to each other, but not identical. The variation in measurements is caused by measurement noise.
This leads to a new question: How certain is our estimate? We need an algorithm that not only provides an estimate but also tells us how reliable that estimate is.
Another issue is the accuracy of the dynamic model. While we may assume that the aircraft moves at a constant velocity, external factors such as wind can introduce deviations from this assumption. These unpredictable influences are referred to as process noise.
Just as we want to assess the certainty of our measurement-based estimate, we also want to understand the level of confidence in our prediction.
The Kalman Filter is a state estimation algorithm that provides both an estimate of the current state and a prediction of the future state, along with a measure of their uncertainty. Moreover, it is an optimal algorithm that minimizes state estimation uncertainty. That is why the Kalman Filter has become such a widely used and trusted algorithm.
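As an aside, the pulse-delay range measurement described above is simple arithmetic: the pulse travels to the target and back, so the range is half the round-trip distance. A quick sketch (the function name is my own, for illustration):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def range_from_delay(delay_s):
    """Round-trip pulse delay to target range: the pulse covers
    the distance twice (out and back), so divide by two."""
    return C * delay_s / 2.0

# An echo arriving ~66.7 microseconds after transmission
# corresponds to a target roughly 10 km away.
print(range_from_delay(66.7e-6))
```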
Let us begin with a simple example: a one-dimensional radar that measures range and velocity by transmitting a pulse toward an aircraft and receiving the reflected echo. The time delay between pulse transmission and echo reception provides information about the aircraft range \(r\), and the frequency shift of the reflected echo provides information about the aircraft velocity \(v\) (Doppler effect).
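For readers who want to see the predict/update cycle concretely, here is a minimal Python sketch of a constant-velocity Kalman filter for this one-dimensional radar example. The initial covariance and noise values are illustrative choices of mine, not from the article, and the 2×2 matrix algebra is written out by hand:

```python
def kalman_1d_cv(measurements, dt, r0, v0, meas_var, proc_var):
    """Minimal 1-D constant-velocity Kalman filter: state is
    (range, velocity), with range-only measurements (H = [1, 0])."""
    r, v = r0, v0                          # state estimate
    p_rr, p_rv, p_vv = 100.0, 0.0, 100.0   # covariance (initial guess)
    estimates = []
    for z in measurements:
        # Predict: r <- r + v*dt; covariance P <- F P F^T + Q,
        # with process noise added to the velocity variance.
        r = r + v * dt
        p_rr, p_rv, p_vv = (p_rr + 2 * dt * p_rv + dt * dt * p_vv,
                            p_rv + dt * p_vv,
                            p_vv + proc_var)
        # Update with range measurement z.
        s = p_rr + meas_var            # innovation variance
        k_r, k_v = p_rr / s, p_rv / s  # Kalman gain
        y = z - r                      # innovation (residual)
        r, v = r + k_r * y, v + k_v * y
        p_rr, p_rv, p_vv = ((1 - k_r) * p_rr,
                            (1 - k_r) * p_rv,
                            p_vv - k_v * p_rv)
        estimates.append((r, v))
    return estimates

# Noise-free ranges from a target at 10,000 m moving at 200 m/s:
zs = [10000 + 200 * k for k in range(1, 21)]
print(kalman_1d_cv(zs, dt=1.0, r0=10000.0, v0=200.0,
                   meas_var=100.0, proc_var=0.1)[-1])  # -> (14000.0, 200.0)
```

With perfect initialization and noise-free measurements the innovation is always zero, so the filter tracks exactly; with noisy measurements, the gain blends prediction and measurement in proportion to their variances.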
In this example, the system state is described by both the aircraft range \(r\) and velocity \(v\). We define the system state by the vector \(\boldsymbol{x}\), which includes both quantities:
...
Read the original on kalmanfilter.net »
In early March, I noticed approximately $180 in unexpected charges to my Anthropic account. I’m a Claude Max subscriber, and between March 3 and March 5, I received 16 separate “Extra Usage” invoices ranging from $10 to $13 each, all in quick succession. However, I wasn’t using Claude. I was away from my laptop entirely and was out sailing with my parents back home in San Diego.
When I checked my usage dashboard, it showed my session at 100% despite no activity. My Claude Code session history showed two tiny sessions from March 5 totaling under 7KB (no sessions on March 3 or March 4). Nothing that would explain $180 in Extra Usage charges.
This isn’t just me. Other Max plan users have reported the same issue. There are numerous GitHub issues about it (e.g. claude-code#29289 and claude-code#24727), and posts on r/ClaudeCode describing the exact same behavior: usage meters showing incorrect values and Extra Usage charges piling up erroneously.
On March 7, I sent a detailed email to Anthropic support laying out the situation with all the evidence above. Within two minutes, I received a response… from “Fin AI Agent, Anthropic’s AI Agent.” The AI agent told me to go through an in-app refund request flow. Sadly, this refund pipeline only applies to subscriptions, not to Extra Usage charges. I also wanted to confirm with a human exactly what went wrong rather than just getting a refund and calling it a day.
So, naturally, I replied asking to speak to a human. The response:
Thank you for reaching out to Anthropic Support. We’ve received your request for assistance.
While we review your request, you can visit our Help Center and API documentation for self-service troubleshooting. A member of our team will be with you as soon as we can.
That was March 7. I followed up on March 17. No response. I followed up again on March 25. No response. I followed up again today, April 8, over a month later. Still nothing.
Anthropic is an AI company that builds one of the most capable AI assistants in the world. Their support system is a Fin AI chatbot that can’t actually help you, and there is seemingly no human behind it. I don’t have a problem with AI-assisted support, though I do have a problem with AI-only support that serves as a wall between customers and anyone who can actually resolve their issue.
...
Read the original on nickvecchioni.github.io »
Today, we’re excited to introduce Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration.
Muse Spark is the first step on our scaling ladder and the first product of a ground-up overhaul of our AI efforts. To support further scaling, we are making strategic investments across the entire stack — from research and model training to infrastructure, including the Hyperion data center.
In this post, we’ll first explore Muse Spark’s new capabilities and applications. After these results, we’ll look behind the curtain at the scaling axes driving our progress toward personal superintelligence.
Muse Spark is available today at meta.ai and the Meta AI app. We’re opening a private API preview to select users.
...
Read the original on ai.meta.com »
Say you’re handed a USB device and told to write a driver for it. Seems like a daunting task at first, right? Writing drivers means you have to write Kernel code, and Kernel code is low level, hard to write, hard to debug and so on.
None of this is actually true though. Writing a driver for a USB device is actually not much more difficult than writing an application that uses Sockets.
This post aims to be a high-level introduction to USB for people who haven’t worked much with Hardware yet and just want to use the technology. There are amazing resources out there, such as USB in a NutShell, that go into a lot of detail about how USB precisely works (check them out if you want more information); however, they are not really approachable for somebody who has never worked with USB before and doesn’t have a background in Hardware. You don’t need to be an Embedded Systems Engineer to use USB, the same way you don’t need to be a Network Specialist to use Sockets and the Internet.
The device we’ll be using is an Android phone in Bootloader mode. The reason for this is that
* It’s a device you can easily get your hands on
* The protocol it uses is well documented and incredibly simple
* Drivers for it are generally not pre-installed on your system so the OS will not interfere with our experiments
Getting the phone into Bootloader mode is different for every device, but usually involves holding down a combination of buttons while the phone is starting up. In my case it’s holding the volume down button while powering on the phone.
Enumeration refers to the process of the host asking the device for information about itself. This happens automatically when you plug in the device and it’s where the OS normally decides which driver to load for the device. For most standard devices, the OS will look at the USB Device Class and load a driver that supports that class. For vendor specific devices, you generally install a driver made by the manufacturer which will look at the VID (Vendor ID) and PID (Product ID) instead to detect whether or not it should handle the device.
Even without a driver, plugging the phone into your computer will still make it get recognized as a USB device. That’s because the USB specification defines a standard way for devices to identify themselves to the host, more on how that exactly works in a bit though.
On Linux, we can use the handy lsusb tool to see what the device identified itself as:
Bus and Device are just identifiers for the physical USB port the device is plugged into. They will most likely differ on your system since they depend on which port you plugged the device into.
ID is the most interesting part here. The first part 18d1 is the Vendor ID (VID) and the second part 4ee0 is the Product ID (PID). These are identifiers that the device sends to the host to identify itself. The VID is assigned by the USB-IF to companies that pay them a lot of money, in this case Google, and the PID is assigned by the company to a specific product, in this case the Nexus/Pixel Bootloader.
Using the lsusb -t command we can also see the device’s USB class and what driver is currently handling it:
This shows the entire tree of USB devices connected to the system. The bottommost one in this part of the tree is our device (Bus 008, Device 014, as reported in the previous command). The Class=Vendor Specific Class part specifies that the device does not use any of the standard USB classes (e.g. HID, Mass Storage or Audio) but instead uses a custom protocol defined by the manufacturer. The Driver=[none] part simply tells us that the OS didn’t load a driver for the device, which is good for us since we want to write our own.
We will also match on the VID and PID, since they are the only real identifying information we have. The Device Class is not very useful for identification here, since it’s just Vendor Specific Class, which any manufacturer can use for any device. Instead of doing all of this in the Kernel though, we can write a Userspace application that does the same thing. This is much easier to write and debug (and is arguably the correct place for drivers to live anyway, but that’s a different topic). To do this, we can use the libusb library, which provides a simple API for communicating with USB devices from Userspace. It achieves this by providing a generic driver that can be loaded for any device and then provides a way for Userspace applications to claim the device and talk to it directly.
The same thing we just did manually can also be done in software though. The following program initializes libusb, registers a hotplug event handler for devices matching the 18d1:4ee0 VendorId / ProductId combination and then waits for that device to be plugged into the host.
If you compile and run this, plugging in the device should result in the following output:
Congrats! You have a program now that can detect your device without ever having to touch any Kernel code at all.
Next step, getting any answer from the device. The easiest way to do that for now is by using the standardized Control endpoint. This endpoint is always on ID 0x00 and has a standardized protocol. This endpoint is also what the OS previously used to identify the device and get its VID:PID.
The way we use this endpoint is with yet another libusb function that’s made specifically to send requests to that endpoint. So we can extend our hotplug event handler using the following code:
This code will now send a GET_STATUS request to the device as soon as it’s plugged in and prints out the data it sends back to the console.
Those bytes came from the device itself! Decoding them using the specification: the two bytes form a 16-bit status word, where bit 0 tells us whether the device is Self-Powered (1 means it is, which makes sense since the device has a battery) and bit 1 tells us it does not support Remote Wakeup (meaning it cannot wake up the host).
There are a few more standardized request types (and some devices even add their own for simple things!) but the main one we (and the OS too) are interested in is the GET_DESCRIPTOR request.
Descriptors are binary structures that are generally hardcoded into the firmware of a USB device. They are what tells the host exactly what the device is, what it’s capable of and what driver it would like the OS to load. So when you plug in a device, the host simply sends multiple GET_DESCRIPTOR requests to the standardized Control Endpoint at ID 0x00 to get back a struct that gives it all the information it needs for enumeration. And the cool thing is, we can do that too!
Instead of a GET_STATUS request, we now send a GET_DESCRIPTOR request:
This now instead returns the following data:
Now to decode this data, we need to look at the USB specification, Chapter 9.6.1 Device. There we can find that the format looks as follows:
Throwing the data into ImHex and giving its Pattern Language this structure definition yields the following result:
And there we have it! idVendor and idProduct correspond to the values we found previously using lsusb.
There’s more than just the device descriptor though. There’s also Configuration, Interface, Endpoint, String and a couple of other descriptors. These can all be read using the same GET_DESCRIPTOR request on the control endpoint. We could still do this all by hand but luckily for us, lsusb has an option that can do that for us already!
This output shows us a few more of the descriptors the device has. Specifically, it has a single Configuration Descriptor that contains an Interface Descriptor for the Android Fastboot interface. And that interface in turn contains two Endpoints. This is where the device tells the host about all the other endpoints, besides the Control endpoint, and these will be the ones we’ll be using in the next step to actually finally send data to the device’s Fastboot interface!
Let’s talk a bit more about endpoints first though. We already learned about the Control endpoint on address 0x00. Endpoints are basically the equivalent of ports that a device on the network opened for us to send data back and forth. The device specifies in its descriptor which kinds of endpoints it has and then services these in its firmware. So we don’t even need to do port scanning or know that SSH usually runs on port 22; we have a nice way of finding out what interfaces the device has, what language they speak and how we can speak to them. Looking at the descriptors above, the Control endpoint is not listed there though. Instead, there are two others with different types.
There’s exactly one per device and it’s always fixed on Endpoint Address 0x00. It’s what is used to do initial configuration and to request information about the device.
The main purpose of the Control endpoint is to solve the chicken-and-egg problem where you couldn’t communicate with a device without knowing its endpoints but to know its endpoints you’d need to communicate with it. That’s also why it doesn’t even appear in the descriptors. It’s not part of any interface but the device itself. And we know about its existence thanks to the spec, without it having to be advertised.
It’s made for setting simple configuration values or requesting small amounts of data. The function in libusb doesn’t even allow you to set the endpoint address to make a control request to, because there’s only ever one control endpoint and it’s always on address 0x00.
Bulk Endpoints are what’s used when you want to transfer larger amounts of data: big chunks of non-time-sensitive data that you just want to send over the wire.
This is what’s used for things like the Mass Storage Class, CDC-ACM (Serial Port over USB) and RNDIS (Ethernet over USB).
One detail: Data sent over Bulk endpoints is high bandwidth but low priority. This means, Bulk data will always just fill up the remaining bandwidth. Any Interrupt and Isochronous transfers (further detail below) have a higher priority so if you’re sending both Bulk and Isochronous data over the same connection, the bandwidth of the Bulk transmission will be lowered until the Isochronous one can transmit its data in the requested timeframe.
Interrupt Endpoints are the opposite of Bulk Endpoints. They allow you to send small amounts of data with very low latency. For example, Keyboards and Mice use this transfer type under the HID Class to poll for button presses 1000+ times per second. If no button was pressed, the transfer fails immediately without sending back a full failure message (only a NAK); only when something actually changed do you get back a description of what happened.
The important fact here is: even though these are called interrupt endpoints, there are no interrupts happening. The Device still does not talk to the Host without being asked. The Host just polls so frequently that it acts as if it were an interrupt.
The functions in libusb that handle interrupt transfers also abstract this behaviour away further. You can start an interrupt transfer and the function will block until the device sends back a full response.
Isochronous Endpoints are somewhat special. They’re used for bigger amounts of data that are really timing critical. They’re mainly used for streaming interfaces such as Audio or Video, where any latency or delay will be immediately noticeable through stuttering or desyncs. In libusb, these work asynchronously: you can set up multiple transfers at once, they will be queued, and you’ll get back an event once data has arrived so you can process it and queue further requests.
This type is generally not used very often outside of the Audio and Video classes.
Besides the Transfer Type, endpoints also have a direction. Keep in mind, USB is a full master-slave oriented interface. The Host is the only one ever making any requests and the Device will never answer unless addressed by the Host. This means, the device cannot actually send any data directly to the Host. Instead the Host needs to ask the Device to please send the data over.
This is what the direction is for.
* IN endpoints are for when the Host wants to receive some data. It makes a request on an IN endpoint and waits for the device to respond back with the data.
* OUT endpoints are for when the Host wants to transmit some data. It makes a request on an OUT endpoint and then immediately transfers the data it wants to send over. The Device in this case only acknowledges (ACK) that it received the data but won’t send any additional data back.
Contrary to the transfer type, the direction is encoded in the endpoint address instead. If the topmost bit (MSB) is set to 1, it’s an IN endpoint, if it’s set to 0 it’s an OUT endpoint. (If you’re into Hardware, you might recognize this same concept from the I2C interface.)
* You can have a maximum of 127 custom endpoints available at once
    * because we have 7 bits available for addresses
    * because we always have the control endpoint that’s on the fixed address 0x00.
* Endpoints are entirely unidirectional. Either you’re using an endpoint to request data or to transmit data; it cannot do both at once
    * That’s also the reason why our Fastboot interface has two Bulk endpoints: one is dedicated to listening to requests the Host sends over and the other one is for responding to those same requests
Now that we have all this information about USB, let’s look into the Fastboot protocol. The best documentation for this is the u-boot Source Code together with its Documentation.
According to the documentation, the protocol really is incredibly simple. The Host sends a string command and the device responds with a 4 character status code followed by some data.
Let’s update our code to do just that then:
Plugging the device in now prints the following message to the terminal:
That seems to match the documentation!
The first 4 bytes are OKAY, specifying that the request was executed successfully. The rest of the data after that is 0.4, which corresponds to the implemented Fastboot Version in the Documentation: v0.4
And that’s it! You successfully made your first USB driver from scratch without ever touching the Kernel.
All these same principles apply to all USB drivers out there. The underlying protocol may be significantly more complex than the fastboot protocol (I was pulling my hair out before over the atrocity that the MTP protocol is) but everything around it stays identical. Not much more complex than TCP over sockets, is it? :)
...
Read the original on werwolv.net »
Farmers have been fighting John Deere for years over the right to repair their equipment, and this week, they finally reached a landmark settlement.
While the agricultural manufacturing giant pointed out in a statement that this is no admission of wrongdoing, it agreed to pay $99 million into a fund for farms and individuals who participated in a class action lawsuit. Specifically, that money is available to those involved who paid John Deere’s authorized dealers for large equipment repairs from January 2018 onward. This means that plaintiffs will recover somewhere between 26% and 53% of overcharge damages, according to one of the court documents—far beyond the typical amount, which lands between 5% and 15%.
The settlement also includes an agreement by Deere to provide “the digital tools required for the maintenance, diagnosis, and repair” of tractors, combines, and other machinery for 10 years. That part is crucial, as farmers previously resorted to hacking their own equipment’s software just to get it up and running again. John Deere signed a memorandum of understanding in 2023 that partially addressed those concerns, providing third parties with the technology to diagnose and repair, as long as its intellectual property was safeguarded. Monday’s settlement seems to represent a much stronger (and legally binding) step forward.
Ripple effects of this battle have been felt far beyond the sales floors at John Deere dealers, as the price of used equipment skyrocketed in response to the infamous service difficulties. Even when the cost of older tractors doubled, farmers reasoned that they were still worth it because repairs were simpler and downtime was minimized. $60,000 for a 40-year-old machine became the norm.
A judge’s approval of the settlement is still required, though it seems likely. Still, John Deere isn’t out of the woods yet. It still faces another lawsuit from the United States Federal Trade Commission, in which the government organization accuses Deere of harmfully locking down the repair process.
It’s difficult to overstate the significance of this right-to-repair fight. While it has obvious implications for the ag industry, others like the automotive and even home appliance sectors are looking on. Any court ruling that formally finds John Deere guilty of wrongdoing may set a precedent for others to follow. At a time when manufacturers want more and more control of their products after the point of sale, every little update feels incredibly high-stakes.
Got a tip or question for the author? Contact them directly: caleb@thedrive.com
...
Read the original on www.thedrive.com »