10 interesting stories served every morning and every evening.
We’ve been searching for a memory-safe programming language to replace C++ in Ladybird for a while now. We previously explored Swift, but the C++ interop never quite got there, and platform support outside the Apple ecosystem was limited. Rust is a different story. The ecosystem is far more mature for systems programming, and many of our contributors already know the language. Going forward, we are rewriting parts of Ladybird in Rust.
When we originally evaluated Rust back in 2024, we rejected it because it’s not great at C++ style OOP. The web platform object model inherits a lot of 1990s OOP flavor, with garbage collection, deep inheritance hierarchies, and so on. Rust’s ownership model is not a natural fit for that.
But after another year of treading water, it’s time to make the pragmatic choice. Rust has the ecosystem and the safety guarantees we need. Both Firefox and Chromium have already begun introducing Rust into their codebases, and we think it’s the right choice for Ladybird too.
Our first target was LibJS, Ladybird’s JavaScript engine. The lexer, parser, AST, and bytecode generator are relatively self-contained and have extensive test coverage through test262, which made them a natural starting point.
I used Claude Code and Codex for the translation. This was human-directed, not autonomous code generation. I decided what to port, in what order, and what the Rust code should look like. It was hundreds of small prompts, steering the agents where things needed to go. After the initial translation, I ran multiple passes of adversarial review, asking different models to analyze the code for mistakes and bad patterns.
The requirement from the start was byte-for-byte identical output from both pipelines. The result was about 25,000 lines of Rust, and the entire port took about two weeks. The same work would have taken me multiple months to do by hand. We’ve verified that every AST produced by the Rust parser is identical to the C++ one, and all bytecode generated by the Rust compiler is identical to the C++ compiler’s output. Zero regressions across the board.
No performance regressions on any of the JS benchmarks we track either.
Beyond the test suites, I’ve done extensive testing by browsing the web in a lockstep mode where both the C++ and Rust pipelines run simultaneously, verifying that output is identical for every piece of JavaScript that flows through them.
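A minimal sketch of such a lockstep comparison, assuming each pipeline hands back a flat byte stream (the function name and shape here are illustrative, not Ladybird’s actual internals):

```rust
/// Compare the bytecode the two pipelines emit for the same script.
/// Returns Ok(()) when the streams are byte-for-byte identical, or
/// Err(offset) with the position of the first divergence.
fn verify_lockstep(cpp: &[u8], rust_port: &[u8]) -> Result<(), usize> {
    // First byte that differs between the two streams, if any.
    if let Some(offset) = cpp.iter().zip(rust_port.iter()).position(|(a, b)| a != b) {
        return Err(offset);
    }
    // Identical common prefix; a length mismatch means one stream ended early.
    if cpp.len() != rust_port.len() {
        return Err(cpp.len().min(rust_port.len()));
    }
    Ok(())
}
```

Running a check like this on every script that flows through the browser turns “zero regressions” from a one-time claim into a continuously verified invariant.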
If you look at the code, you’ll notice it has a strong “translated from C++” vibe. That’s because it is translated from C++. The top priority for this first pass is compatibility with our C++ pipeline. The Rust code intentionally mimics things like the C++ register allocation patterns so that the two compilers produce identical bytecode. Correctness is a close second. We know the result isn’t idiomatic Rust, and there’s a lot that can be simplified once we’re comfortable retiring the C++ pipeline. That cleanup will come in time.
This is not becoming the main focus of the project. We will continue developing the engine in C++, and porting subsystems to Rust will be a sidetrack that runs for a long time. New Rust code will coexist with existing C++ through well-defined interop boundaries.
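In practice, such a boundary usually means the Rust side exporting a C ABI that the C++ side declares and calls. The function below is a hypothetical stand-in to show the general shape, not an actual Ladybird interface:

```rust
// Hypothetical interop sketch: a Rust "lexer" entry point exported with a
// C ABI so existing C++ code can call it. The name, struct, and counting
// logic are invented for illustration.

#[repr(C)]
pub struct TokenCount {
    pub tokens: usize,
    pub errors: usize,
}

/// From C++: extern "C" TokenCount js_lex_count(const uint8_t* src, size_t len);
#[no_mangle]
pub extern "C" fn js_lex_count(source: *const u8, len: usize) -> TokenCount {
    // Safety: the C++ caller guarantees `source` points to `len` valid bytes.
    let bytes = unsafe { std::slice::from_raw_parts(source, len) };
    // Stand-in "lexing": count whitespace-separated chunks.
    let tokens = bytes
        .split(|b| b.is_ascii_whitespace())
        .filter(|chunk| !chunk.is_empty())
        .count();
    TokenCount { tokens, errors: 0 }
}
```

Keeping the boundary at a C ABI with `#[repr(C)]` types means neither side depends on the other’s internal layouts, which is what lets the two languages evolve independently behind it.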
We want to be deliberate about which parts get ported and in what order, so the porting effort is managed by the core team. Please coordinate with us before starting any porting work so nobody wastes their time on something we can’t merge.
I know this will be a controversial move, but I believe it’s the right decision for Ladybird’s future. :^)
...
Read the original on ladybird.org »
Brian Merchant, writing for Blood in the Machine, reports that people across the United States are dismantling and destroying Flock surveillance cameras, amid rising public anger that the license plate readers aid U.S. immigration authorities and deportations.
Flock is the Atlanta-based surveillance startup, valued at $7.5 billion a year ago, that makes license plate readers. It has faced criticism for allowing federal authorities access to its massive network of nationwide license plate readers and databases at a time when U.S. Immigration and Customs Enforcement is increasingly relying on data to raid communities as part of the Trump administration’s immigration crackdown.
Flock cameras allow authorities to track where people go and when by taking photos of their license plates from thousands of cameras located across the United States. Flock claims it doesn’t share data with ICE directly, but reports show that local police have shared their own access to Flock cameras and its databases with federal authorities.
While some communities are calling on their cities to end their contracts with Flock, others are taking matters into their own hands.
Merchant reports instances of broken and smashed Flock cameras in La Mesa, California, just weeks after the city council approved the continuation of Flock cameras deployed in the city, despite a clear majority of attendees favoring their shutdown. A local report cited strong opposition to the surveillance technology, with residents raising privacy concerns.
Other cases of vandalism have stretched from California and Connecticut to Illinois and Virginia. In Oregon, six license plate-scanning cameras on poles were cut down and at least one spray-painted. A note left at the base of the severed poles said, “Hahaha get wrecked ya surveilling fucks,” reports Merchant.
According to DeFlock, a project aimed at mapping license plate readers, there are close to 80,000 cameras across the United States. Dozens of cities have so far rejected the use of Flock’s cameras, and some police departments have since blocked federal authorities from using their resources.
When reached by TechCrunch, a Flock spokesperson did not say whether the company keeps track of how many cameras have been destroyed since being deployed.
...
Read the original on techcrunch.com »
My old 2016 MacBook Pro has been collecting dust in a cabinet for some time now. The laptop suffers from a “flexgate” problem, and I don’t have any practical use for it. For quite some time, I’ve been thinking about repurposing it as a guinea pig, to play with FreeBSD — an OS that I’d aspired to play with for a long while, but had never had a real reason to.
During the recent holiday season, right after the FreeBSD 15 release, I finally found time to set the laptop up. Going in, I didn’t plan, or even think, that this might turn into a story about AI coding.
2016 MacBook Pro models use the Broadcom BCM4350 Wi-Fi chip, which FreeBSD doesn’t support natively. To get working Wi-Fi, the typical suggestion on FreeBSD forums is to run wifibox — a tiny Linux VM, with the PCI Wi-Fi device passed through, that lets Linux manage the device through its brcmfmac driver.
Brcmfmac is a Linux driver (ISC licence) for a set of FullMAC chips from Broadcom. The driver offloads processing jobs, like 802.11 frame movement and WPA encryption and decryption, to the firmware running inside the chip, while the driver and the OS do the high-level management work (see “Broadcom brcmfmac (PCIe)” in the Linux Wireless documentation).
Say we want to build a native FreeBSD kernel module for the BCM4350 chip. In theory, this separation of jobs between the firmware and the driver sounds perfect. The “management” part of the work is what FreeBSD already does for other supported Wi-Fi devices. We would need to port some amount of existing “glue code” from Linux specifics to FreeBSD. If we ignore a lot of details, the problem doesn’t sound too complicated, right?
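As a toy illustration of that split (the traits and names below are invented for this post, not actual brcmfmac structures): the firmware owns frame-level work on the chip, while the host driver only issues high-level requests to it.

```rust
// Toy model of a FullMAC driver's division of labor. The firmware handles
// 802.11 frames, crypto, and association on-chip; the host driver only
// translates OS-level requests into firmware commands.

/// Work done on-chip by the firmware.
trait Firmware {
    fn scan(&mut self) -> Vec<String>;      // SSIDs currently visible
    fn join(&mut self, ssid: &str) -> bool; // associate and handshake
}

/// The host-side driver: high-level management only.
struct HostDriver<F: Firmware> {
    fw: F,
}

impl<F: Firmware> HostDriver<F> {
    fn connect(&mut self, ssid: &str) -> Result<(), String> {
        if !self.fw.scan().iter().any(|s| s.as_str() == ssid) {
            return Err(format!("network {ssid} not found"));
        }
        if self.fw.join(ssid) {
            Ok(())
        } else {
            Err("join failed".into())
        }
    }
}
```

Porting the driver means rebuilding only the host-driver half against FreeBSD’s kernel APIs; the firmware half ships inside the chip and stays the same.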
In 2026, the level-zero idea when one hears “port a bunch of existing code from A to B” is, of course, to use AI. So that is what I tried.
I cloned the brcmfmac subtree and asked Claude Code to make it work on FreeBSD. FreeBSD already has drivers that work through LinuxKPI — a compatibility layer for running Linux kernel drivers. So I specifically pointed Claude at the iwlwifi driver (a softmac driver for Intel wireless cards), asking it to “do as they did it”. And, at first, it even looked like this could work — Claude told me so.
The module, indeed, compiled, but it didn’t do anything. Of course it didn’t: the VM where we tested the module didn’t even have the hardware. After I passed the PCI device into the VM and attempted to load the driver against the chip, the challenges started to pop up immediately. The kernel panicked, and after Claude fixed the panics, it discovered that the “module didn’t do anything”. Claude honestly tried to sift through the code, adding more and more #ifdef __FreeBSD__ wrappers here and there. It complained about missing features in LinuxKPI. The module kept causing panics, and the agent kept building FreeBSD-specific shims and callbacks, while warning me that this project would be very complicated and messy.
After a number of sessions, the diff produced by the agent started to look significantly larger than I’d hoped it would be. Even worse, the driver didn’t look anywhere close to working. This was right around the time when Armin Ronacher posted about his experience building a game from scratch with Claude Opus and the Pi agent.
Besides the claim that working in the Pi coding agent feels more productive than in Claude Code, the video got me thinking that my approach to the task was too straightforward. The brcmfmac driver’s code is moderately large. The driver supports several generations of Wi-Fi adapters, different capabilities, and so on. But my immediate task was very narrow: one chip, PCI only, Wi-Fi client only.
Instead of continuing with the code, I spawned a fresh Pi session and asked the agent to write a detailed specification of how the brcmfmac driver works, with a focus on the BCM4350 Wi-Fi chip. I explicitly set the audience for the specification to be readers tasked with implementing it in a clean-room environment. I asked the agent to explain how things work “down to the bits”. I added some high-level details on how I wanted the specification laid out, and let the agent go brrrr.
After a couple of rounds, the agent produced a “book of 11 chapters” that honestly looked like a fine specification:
% ls --tree spec/
spec
├── 00-overview.md
├── 01-data-structures.md
├── 02-bus-layer.md
├── 03-protocol-layer.md
├── 04-firmware-interface.md
├── 05-event-handling.md
├── 06-cfg80211-operations.md
├── 07-initialization.md
├── 08-data-path.md
├── 09-firmware-commands.md
└── 10-structures-reference.md
Of course, one can’t just trust what AI has written.
To proofread the spec I spawned a clean Pi session and — for fun — asked the Codex model to read the specification and flag any places where the text isn’t aligned with the driver’s code (“Source code is the ground truth. The spec needs to be verified, and updated with any missing or wrong details”). The agent followed through and found several places to fix, and also proposed multiple improvements.
Of course, one can’t just trust what AI has written, even if this was in a proofreading session.
To double-proofread the fixes I spawned another clean Pi session, asking the Opus model to verify that what was proposed aligned with how the driver’s code actually works.
As a procrastination exercise, I tried this loop with a couple of coding models: Opus 4.5, Opus 4.6, Codex 5.2, and Gemini 3 Pro preview. So far, my experience has been that Gemini hallucinated the most. This was quite sad, given that the model itself isn’t too bad at simple coding tasks, and it is free for limited use.
With the written specification in hand, I now had (in theory) an explanation of how the driver’s code interacts with the firmware.
I started a fresh project with nothing but the aforementioned “spec” and told the Pi agent that we were building a brand new FreeBSD driver for the BCM4350 chip. I pointed the agent at the specification and asked it to come back with any important decisions we had to make, and details we had to outline, before jumping into “slopping the code”. The agent came back with questions and decision points like “Will the driver live in the kernel’s source tree?”, “Will we write the code in C?”, “Will we rely on LinuxKPI?”, “What are our high-level milestones?”, and so on. One influential bit that turned out fairly productive moving forward: I asked the agent to document all these decision points in the project’s docs, and to explicitly reference these decision docs in the project’s AGENTS.md.
It’s worth saying that, just like in any real project, not all decisions stayed to the end. For example,
Initially, I asked the agent to build the driver using linuxkpi and linuxkpi_wlan. My naive thinking was that, since the spec was written from the Linux driver’s code, this might be simpler for the agent than building on top of the native primitives. After a couple of sessions, it didn’t look like that was the case. I asked the agent to drop LinuxKPI from the code and refactor everything. The agent did it in one go, and updated the decision document.
With the specification, docs, and a plan, the workflow turned into a boring routine. The agent had SSH access to both the build host and a testing VM, which ran with the Wi-Fi PCI device passed through from the host. It methodically crunched through the backlog of its own milestones, iterating over the code, building and testing the module. Every time a milestone or a portion of one was finished, I asked the agent to record the progress in the docs. Occasionally, an iteration of the code crashed or hung the VM. When this happened, before fixing the problem, I asked the agent — in a forked Pi session — to investigate, summarize, and record the problem for its future self.
After many low-involvement sessions, I got a working FreeBSD kernel module for the BCM4350 Wi-Fi chip. The module supports Wi-Fi network scanning, 2.4GHz/5GHz connectivity, and WPA/WPA2 authentication.
The source code is in the repository github.com/narqo/freebsd-brcmfmac. I didn’t write any piece of the code there. There are several known issues, which I will task the agent with resolving, eventually. Meanwhile, I advise against using it for anything beyond a study exercise.
...
Read the original on vladimir.varank.in »
This article has been reviewed according to Science X’s editorial process
and policies. Editors have highlighted the following attributes while ensuring the content’s credibility:
A protein circulating in the blood can help with the accurate diagnosis of Alzheimer’s disease. In a recent study, researchers from Spain investigated how blood-based biomarkers, such as a protein called p-tau217, affect both the clinical diagnosis of Alzheimer’s and neurologists’ confidence in their diagnosis.
After following 200 consecutive new patients aged 50 and older who presented with cognitive symptoms, they found that a simple blood test measuring p-tau217 significantly improved diagnostic accuracy in routine clinical practice.
When relying solely on standard clinical evaluation, doctors correctly diagnosed Alzheimer’s in 75.5% of cases, but when incorporating blood test results, diagnostic accuracy increased to 94.5%. The findings are published in the Journal of Neurology.
Phosphorylated tau, or p-tau217, is a protein that naturally occurs in the brain and helps keep neurons, the cells that carry signals, stable and healthy. The trouble begins when this protein becomes abnormally phosphorylated and clumps together, forming tangles that disrupt communication between brain cells. Over time, this damage can impact brain function and lead to neurodegenerative conditions such as Alzheimer’s disease.
While p-tau217 is not considered the direct cause of Alzheimer’s, elevated levels in the blood are now recognized as one of the most accurate early warning signs of the disease.
In many parts of the world, the population is rapidly aging, and so is the prevalence of age-related diseases like Alzheimer’s and dementia. However, the standard ways to diagnose Alzheimer’s today, such as brain scans and invasive spinal taps, are expensive, uncomfortable, and often hard for patients to access.
Scientists have long known that p-tau217 is a reliable biomarker for detecting early signs of Alzheimer’s, but most of these data come from highly controlled research labs. How well it works in everyday medical clinics and whether it truly boosts doctors’ confidence in their diagnoses remain less explored.
In this study, the researchers focused on both of these factors in real-world medical settings. They followed patients with cognitive symptoms who came in for general neurology consultations or to a specialized cognitive neurology unit. Clinicians noted their initial diagnosis and how confident they felt about it, then reviewed the p-tau217 blood test results and recorded any changes.
The team found that after reviewing the p-tau217 results, diagnostic accuracy jumped by 19 percentage points. For about one in four patients, the blood test prompted doctors to change their diagnosis. Some people who were first believed to have Alzheimer’s turned out to have a different condition, while others who were thought to be experiencing normal aging were correctly identified as having Alzheimer’s. The doctors’ confidence in their diagnoses also rose from an average of 6.90 to 8.49 on a 10-point scale.
The p-tau217 tests proved to be effective across every stage of cognitive decline, be it early memory complaints or late-stage decline such as dementia. The findings show that this blood test could provide a more accurate and less invasive way to diagnose Alzheimer’s, potentially improving care for millions of people.
...
Read the original on medicalxpress.com »
The latest update of Firefox, version 148, introduces a much-anticipated “AI kill switch” feature, allowing users to disable AI functionalities such as chatbot prompts and AI-generated link summaries. Mozilla emphasizes that once AI features are turned off, future updates will not override this choice. This decision reflects the company’s new revenue-focused strategy regarding AI integrations.
To disable AI features, users can navigate to Settings > AI Controls and toggle the ‘Block AI Enhancements’ option. This will prevent any in-app notifications encouraging users to try out AI features, as well as remove any previously downloaded AI models from the device. For those who wish to maintain some AI functionalities, a selective blocking option is available, enabling users to retain useful features like on-device translations while avoiding cloud-based services.
Beyond the AI kill switch, Firefox 148 gives users more control over remote updates, allowing them to opt out while still minimizing data collection. Users can set these preferences under Settings > Privacy & Security > Firefox Data Collection.
The update also enhances core web platform capabilities, including the Trusted Types API and the Sanitizer API to combat cross-site scripting (XSS). Firefox 148 additionally brings improved screen reader compatibility for mathematical formulas in PDFs, availability of Firefox Backup on Windows 10, and translation support for Vietnamese and Traditional Chinese. New tab wallpapers now also appear in new container tabs, alongside service worker support for WebGPU.
For more detailed information on the update, users can refer to the official release notes.
...
Read the original on serverhost.com »
Meta, Amazon, Google, OpenAI, and other tech companies spent billions last year investing in AI. They’re expected to spend even more, roughly $700 billion, this year on dozens of new data centers to train and run their advanced models.
This spending frenzy has kept Wall Street buzzing and fueled a narrative that all this investment is helping prop up and even grow the U.S. economy.
President Donald Trump has cited that argument as a reason the industry should not face state-level regulations.
“Investment in AI is helping to make the U.S. Economy the ‘HOTTEST’ in the World — But overregulation by the States is threatening to undermine this Growth Engine,” Trump wrote in a post on Truth Social in November. “We MUST have one Federal Standard instead of a patchwork of 50 State Regulatory Regimes.”
Some prominent economists have also given credibility to this story with their analysis. Jason Furman, a Harvard economics professor, said in a post on X that investments in information processing equipment and software accounted for 92% of GDP growth in the first half of the year. Meanwhile, economists at the Federal Reserve Bank of St. Louis similarly estimated that AI-related investments made up 39% of GDP growth in the third quarter of 2025.
But now some Wall Street analysts are starting to rethink this narrative.
“It was a very intuitive story,” Joseph Briggs, a Goldman Sachs analyst, told The Washington Post on Monday. “That maybe prevented or limited the need to actually dig deeper into what was happening.”
Briggs’ colleague, Goldman Sachs Chief Economist Jan Hatzius, said in an interview with the Atlantic Council that AI investment spending has had “basically zero” contribution to U.S. GDP growth in 2025.
“We don’t actually view AI investment as strongly growth positive,” said Hatzius. “I think there’s a lot of misreporting, actually, of the impact AI investment had on U.S. GDP growth in 2025, and it’s much smaller than is often perceived.”
Hatzius said one major reason is that much of the equipment powering AI is imported. While U.S. companies are spending billions, importing chips and hardware offsets those investments in GDP calculations.
“A lot of the AI investment that we’re seeing in the U.S. adds to Taiwanese GDP, and it adds to Korean GDP, but not really that much to U.S. GDP,” he said.
On top of that, there is currently no reliable way to accurately measure how AI use among businesses and consumers contributes to economic growth.
So far, many business leaders say AI hasn’t significantly improved productivity.
A recent survey of nearly 6,000 executives in the U.S., Europe, and Australia found that despite 70% of firms actively using AI, about 80% reported no impact on employment or productivity.
...
Read the original on gizmodo.com »
The Las Vegas Metropolitan Police Department (LVMPD) quietly entered an agreement in 2023 with Flock Safety, an automated license plate reader company that uses cameras to collect vehicle information and cross-reference it with police databases.
But unlike many of the other police departments around the country that use the cameras in their police work, Metro funds the project with donor money funneled into a private foundation. It’s an arrangement that allows Metro to avoid soliciting public comment on the surveillance technology, which critics worry could be co-opted to track undocumented immigrants, political dissidents and abortion seekers, among others.
“It’s a short circuit of the democratic process,” Jay Stanley, a Washington, D.C.-based lawyer for the American Civil Liberties Union (ACLU) who works on how technology can infringe on individual privacy and civil liberties, said in an interview with The Nevada Independent.
The cameras scan license plates as well as vehicles’ identifying details — such as make, model and color — plugging that information into a national database that police can use to search the location of specific vehicles beyond their own jurisdictions. Flock operates more than 80,000 of these AI-powered cameras nationwide, and the company’s popularity has exploded in recent years, with police touting it as a tool to solve crime faster and boost public safety.
Although taxpayer dollars fund Flock cameras in other jurisdictions, most of the cameras in the Las Vegas area have been bought with money from the Horowitz Family Foundation, a philanthropy group connected to the Las Vegas-based venture capitalist Ben Horowitz, co-founder of the firm Andreessen Horowitz.
The Horowitz Family Foundation did not respond to a request for comment at the time of publication.
Metro told The Nevada Independent that it operates approximately 200 Flock license plate reader cameras on city or county infrastructure and it shares its Flock data with hundreds of state and local law enforcement agencies throughout the country.
Since late 2023, Las Vegas police have made more than 23,000 searches of vehicles, according to the website Have I Been Flocked, which compiles public audit logs of Flock data.
As the cameras were not bought with public funds, Metro does not have to hold meetings with the public to comment on the technology, something experts say leaves citizens without any input on the policing method.
In other cities, Stanley said Flock is often brought up and discussed during city council meetings or other public forums. It’s not required to be on public meeting agendas in the Las Vegas area.
“Police departments serve the community and are supposed to make life in the community better. Does the community want this technology imposed on it?” Stanley said.
Though Horowitz’s foundation donated additional funds for Flock cameras in October, it was not brought up at the Clark County Commission meeting that month, nor was their use discussed anytime in 2025, according to commission meeting minutes.
Some municipalities in Clark County, such as the City of Las Vegas, have license plate reader policies that include a public Flock policy with a dashboard showing how many license plates Flock picked up (about 185,000 in the past month in the city), how many cameras are in use (22 in Las Vegas), and how many searches have been done on a monthly basis (five in the past 30 days). In comparison, Metro’s policy is not publicly available online, though The Indy obtained a copy through a public records request.
Flock’s most recent contract with Metro, signed in 2023, stipulates that the company retains all rights in any recordings or data provided by the service and that Flock can use any of the data for “any purpose” at the company’s discretion. The agreement also says that Flock recordings are not stored for longer than 30 days.
Meanwhile, Metro policy says that department members will not seek or retain license plate reader information about individuals or an organization based solely on their citizenship, social views, race or other classifications protected by law. The policy states that retained license plate reader data does not include specific identification of individuals. Misuse of the data will result in disciplinary action up to termination, according to the policy.
But for many, including a former officer who spoke to The Indy on the condition of anonymity for fear of professional repercussions, such policies are not enough.
“It’s ripe for misuse,” the officer said, pointing to examples around the country of people using Flock to look for current and former romantic partners and track their movements. A police chief in Kansas used Flock to track his ex-girlfriend 228 times in four months. An officer in South Carolina used public cameras to monitor his wife, who he suspected was having an affair.
The former Metro officer said his major concern was not the technology itself, but the fact that there was little transparency on how the technology was being used or what the department’s policy was on Flock usage.
“If you look around the country where license plate readers are being used, there’s some kind of public meeting, there’s some kind of public process,” the officer said. “What’s happening here is on a very large scale — they’re putting out surveillance technology — and there’s no public disclosure.”
The Horowitz Foundation donation in October included a software subscription to Flock’s Nova feature, which allows officers to easily access private license plate information alongside other personal data, such as Social Security numbers, credit scores, property and occupancy information, as well as emails or social media handles.
Experts say this data could be used to identify undocumented immigrants, political protesters and people traveling across state lines to obtain abortions.
Athar Haseebullah, the executive director of the ACLU of Nevada, said that Flock not only poses a heightened risk for immigrants, but anyone engaged in actions that are found to be politically defiant. He pointed to a case in Texas where police conducted a nationwide search using Flock technology for a woman who self-induced an abortion.
“This could be ripe for abuse by ICE (Immigrations and Customs Enforcement), but it could also be ripe for abuse by other government entities,” Haseebullah said. In 2025, the ACLU pushed back against a measure that would allow local jurisdictions to use automated traffic cameras to crack down on speeding and red-light crossings, although the bill was never voted on.
Flock has received backlash nationwide for allowing federal agencies such as Customs and Border Protection to tap into its data. The company has said it does not work with ICE, after evidence was found that the agency used Flock data for immigration investigations. Several cities have terminated or modified their Flock agreements after realizing they were inadvertently sharing their data with other agencies.
However, even if Flock does not want to partner with ICE, it has little choice — Flock is obligated to fulfill subpoenas from ICE and can’t refuse a legal warrant, said Andrew Ferguson, an attorney and professor researching tech and police surveillance at George Washington University.
Flock’s surveillance cameras are meant to catch crime, though experts say they could also deter certain behaviors if citizens are aware they are being watched.
“There’s a chilling effect knowing that your government is essentially tracking you wherever you go,” Ferguson said. “It might be even more chilling if you put cameras in sensitive places, like a medical clinic, or a Gambler’s Anonymous meeting, or a church.”
In a city such as Las Vegas, known for drinking, gambling and a hearty party culture, surveillance is the last thing people are interested in, according to Ferguson.
“Things are happening in Vegas that are not going to stay in Vegas,” Ferguson said. “They’re going to be broadcast through Flock.”
As recently as October of last year, the Horowitz Family Foundation donated almost $1.9 million for Flock license plate readers and another $2.47 million for supporting software for Flock machines, according to the minutes of an LVMPD fiscal affairs committee meeting.
Because the donations aren’t coming directly to Metro, but to the nonprofit LVMPD foundation, also known as “Friends of Metro,” any discussions on the cameras’ use aren’t subject to Nevada’s open meeting laws.
The license plate readers and their supporting software are not the only gift that the Horowitz Family Foundation, led by Ben Horowitz’s wife, Felicia Horowitz, has donated to Las Vegas police. The foundation has also gifted drones, as well as Tesla Cybertrucks, to the agency.
Proponents have billed the gifts as morale boosters for police that help the agency stay on the cutting edge without tapping into limited taxpayer dollars. Critics, such as the Progressive Leadership Alliance of Southern Nevada, have suggested that the Cybertrucks show that Metro is “prioritizing corporate giveaways.”
Felicia Horowitz said she is focused on “creating the best community in America” in Las Vegas, according to her bio from a local nonprofit organization that she sits on the board of. Part of that is combating crime and keeping citizens safe. In a Wall Street Journal article, Felicia Horowitz emphasized how crime and weak policing had hurt Black communities across the country.
“The new policies — defund the police, don’t prosecute crime — are destroying the communities where I grew up,” Felicia Horowitz, who is Black, told the WSJ in 2024. Felicia Horowitz was raised in Los Angeles and the Horowitzes relocated to Las Vegas around 2021 and 2022 after decades in California.
So far, the foundation has not publicly commented on whether it will continue donating money for Flock services. Some experts think the donations might be a strategy called “penetration pricing,” where a company gives free or reduced products or services in order to hook consumers before charging them.
“There’s no question that there’s a financial interest in them proving that the Flock technology works in Las Vegas so that they can sell it to other places,” said Ferguson.
The former police officer said he was concerned about taxpayers having to cough up funds to continue Flock services if the Horowitz money ran dry.
“Once you start relying on a certain type of policing, it’s going to be hard to switch over, and then who will foot the bill?” the officer said.
...
Read the original on thenevadaindependent.com »
Tenant and product or co-owner and participant?
Today the Web and Internet are owned and controlled by large for-profit corporations and a few governments. Corporate ownership combined with government policies has left us as tenant and product. It has given us a surveillance economy and enshittification.
* What if I do not wish to be tenant and product?
* What can I do to change the equation?
Those two questions lead me to a bigger question.
* What happens when ownership and control of hardware and software shifts from the domain of corporations to a world where a significant percentage are owned by individual people and cooperatives?
I think the answer is suggested by a corollary found in the history of labor movements. When a significant percentage of industries were unionized, the unions exerted a strong influence across the political economy. I think ownership of the hardware and software can mirror that impact on the Web and Internet. When a significant number of individuals and cooperatives own the hardware and use simpler software, we can impact the Web and Internet in a positive way. That’s my hypothesis.
An observation and some common assumptions:
* Most content on the web is already created by individuals not Big Co
* Big Co persuaded people that only Big Co could provide easy Web publication
* Big Co convinced many there was no point in looking for alternatives
The assumption that only Big Co can provide easy Web publication is just flat out wrong. These systems don’t last for more than a decade before they decay. Each Big Co origin story is similar. They started small. They got to scale by having investors who funded and pushed rapid expansion. Innovation slowed, so they bought up any potential rivals. Big Co then either shut them down or folded the rival system into their product lines. The last real innovation these companies introduced was decades ago. Lack of real innovation is one of the factors that drive the Big Co and Big Tech hype cycle. They proclaim a new shiny thing in order to maintain the circus that accumulates more money. Along the way Big Co insists on tax breaks and zero regulation as a prerequisite for innovation, which isn’t delivered. When they did innovate they didn’t have the breaks they insist on now; hell, they didn’t have the investment or market lock they have now either. They only need the hype cycle, not innovation, to keep the money rolling in. At the end of the day we wind up the product, we wind up being exploited, and we get very little useful in return.
Folks, there is an alternative. In 1992 authoring for the Web did require significant technical knowledge. HTML itself was very challenging to teach people. It was challenging to teach computer enthusiasts! I was involved in helping out at classes that taught HTML back in the early 1990s. I speak from first-hand experience. But a funny thing happened on the way to 2026. A tech writer (John Gruber and friends) came up with a simpler expression of hypertext called Markdown. You don’t need to know HTML to create a web page or blog post today. You can write it or read it using Markdown. You can write it using the simple text editor that came with your operating system on the computer you own. You only need a program to flip Markdown into HTML. There are plenty of programs out there that do that.
In the past many of our efforts to break free of Big Co have met with limited success. Usually the energy and effort has been spent re-creating the centralized systems as distributed systems. There was a sense we needed to offer the same experience as Big Co. While ideally individuals and groups could easily run these distributed versions, the reality is that it remains challenging. I’m really happy to see some of them have some degree of success. It is an impressive effort. They have broken new ground and, importantly, they are playing an important role in the world today. I don’t think they alone will get us to where we need to go. Even Cory Doctorow uses a system administrator to set up his system, and Cory Doctorow is a smart technical guy. It should be easier to do (see https://pluralistic.net/2025/08/15/dogs-breakfast/#by-clicking-this-you-agree-on-behalf-of-your-employer-to-release-me-from-all-obligations-and-waivers-arising-from-any-a).
I think there is a simpler path. The Web itself is a decentralized system. What is needed is an easier way for individuals to create content for it. Markdown, I believe, is a significant piece of the solution. There are many software programs that can convert Markdown into an HTML page. Pandoc is a brilliant example of that. But a website is more than a single Web page, otherwise we’d be done. This is why content management systems were adopted on the Web. What you need is a way of getting to the HTML by typing something easier to read and write. You need a simple way to manage the website structure for what you have written. Again there are programs that do this today. Unfortunately many are complex and come with their own steep learning curve.
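To make the “flip Markdown into HTML” step concrete, here is a toy sketch in Python. It handles only headings, links and paragraphs; a real converter like Pandoc handles vastly more of the format, so treat this as an illustration of how small the core idea is, not a replacement.

```python
import re

def markdown_to_html(text: str) -> str:
    """Convert a tiny subset of Markdown (headings, links, paragraphs) to HTML."""
    html_blocks = []
    for block in text.strip().split("\n\n"):
        block = block.strip()
        # Inline links: [label](url) becomes <a href="url">label</a>
        block = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r'<a href="\2">\1</a>', block)
        # Headings: one to six leading '#' characters
        m = re.match(r"(#{1,6})\s+(.*)", block)
        if m:
            level = len(m.group(1))
            html_blocks.append(f"<h{level}>{m.group(2)}</h{level}>")
        else:
            html_blocks.append(f"<p>{block}</p>")
    return "\n".join(html_blocks)

print(markdown_to_html("# Hello\n\nA [link](https://example.com) here."))
```

Running it prints an `<h1>` followed by a paragraph containing an anchor tag. Everything past that point (lists, code blocks, emphasis) is what the real converters earn their keep on.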
The most popular content system on the web today is WordPress. It was designed to run remotely on a server. It integrates with the social web systems like Mastodon. It is open source software and you could run it on a personal computer in a pinch. Unfortunately WordPress is complex to maintain. WordPress is really a bundle of software. It requires running Apache or NginX Web Server. It requires running a database like MySQL or MariaDB. It is built from a bunch of PHP, JavaScript, CSS and templates. WordPress out of the box does some really nice things. But it comes with overhead too.
If you are a developer WordPress isn’t a huge barrier. It’s dandy. But running and maintaining it amounts to running and maintaining a whole bundle of interconnected software. That takes up computer resources like memory and computation time. That is problematic. It’s challenging to set it up to use as simply as you use your text editor or word processor. You’re stuck because it is designed to run on a remote server. If you only want to type up some Markdown to turn into a web page, WordPress adds a whole other level of complexity to that big kettle of fish.
Complex content management systems were what led to a renaissance in the popularity of static website generators. Static websites are simple to generate, cheap and easy to host, and can be surprisingly interactive. You can even hand craft a static website page by page using Markdown and Pandoc. I did that for years. What Pandoc doesn’t do easily for you is provide the trimmings like RSS feeds and sitemaps. It doesn’t help manage the site structure. Many people build websites with more elaborate systems like Jekyll and Hugo because, like WordPress, they provide more in the way of content management. There are literally hundreds of other static website generators out there. Unfortunately they don’t completely solve the problem. The ones I’ve tried have been too complex or didn’t run on the machines I wanted to do my writing on. I think this is because most were created by developers like me. We grew up on large, complex content management systems. So when we build our own they easily become large and complex too. That is a problem. As a writer you shouldn’t need to put on a developer hat to produce a website. You shouldn’t have to use Medium or Substack either. What is needed is different. What is needed is an easy way to go from Markdown documents to websites without extra knowledge. Ideally you’d only need to know Markdown to build a nice website.
This lack of simplicity for writers has disappointed me. The Web is over thirty years old. It is reasonable to expect a simpler writing system for the web. One that can run on small computers. One that doesn’t make you use a text input box for writing. Yet the systems out there are stuck with complexity because they are solving the problem faced by professional Web developers decades ago. They are making old assumptions about requiring complexity. In a way, developers like me keep building Formula One race cars when what is needed is a single speed bicycle. How do we get to a simple web?
I’ve been searching for an answer. I don’t think any new invention is needed. The answer in 2026 is already built into the Web. What needs to change is the software holding that technology. The Web can interconnect us. The software needs to take Markdown and generate the rest of the website so we can take advantage of that. I think we need to break the assumptions of complexity of use and the complexity of multi-author or centralized models. The core software requirements include an easy way to express hypertext (Markdown) and an easy way to generate the HTML. It needs to make content syndication and discovery automatic (create RSS files and sitemaps). The Web browser will see HTML, CSS, JavaScript, RSS and sitemap.xml, but the author only needs to work with Markdown documents. I’ve written experimental software to prove this is possible. My hope is that this post, and pointing at my own software contribution, will shed some light on how easy it could be. I hope it is an example that this can become collectively understood.
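The syndication piece really can be automatic. As a sketch, here is a minimal RSS 2.0 feed generator using only the Python standard library. The post fields (`title`, `url`, `date`) are my own illustrative shape, not any particular tool’s format.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime  # RFC 2822 dates, as RSS expects

def build_rss(site_title, site_url, posts):
    """Build a minimal RSS 2.0 feed from a list of post dicts
    (each with 'title', 'url', and 'date' keys)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        ET.SubElement(item, "pubDate").text = format_datetime(post["date"])
    return ET.tostring(rss, encoding="unicode")

feed = build_rss(
    "My Site", "https://example.com",
    [{"title": "First post", "url": "https://example.com/first.html",
      "date": datetime(2026, 1, 1, tzinfo=timezone.utc)}],
)
print(feed)
```

A static site generator only has to run something like this over the Markdown it already knows about; the writer never sees the XML.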
A simple web of our own has three core characteristics.
A computing device owned and controlled by an individual or cooperative
A network owned and controlled by an individual or cooperative
Simple to use software that empowers us to both read and write hypertext and syndicated content
The Web and Internet we have today aren’t required by the technologies that created them. Human choices and human organizations, combined with past scarcity of knowledge and resources, are what led us to this point. That’s good news moving forward. Between 1992 and 2026 resource scarcity has changed. Spreading knowledge through communication is the strength and purpose of the Web. These are solid foundations to build on if we choose.
Let me illustrate. In 1926 we didn’t have a global e-waste problem. In 2026 it is a huge problem. In 1950 a computer filled a room and could only be afforded by governments and the largest corporations. They required special high capacity power connections. They required cooling systems. They often required physical changes to the buildings (for example, sub-flooring for cable access and fire suppression systems). In 2026 a single computer like the Raspberry Pi 400 runs $60.00 in the United States. It can run off a USB battery or wall socket. It can run at ambient temperatures. Throw in a monitor, power supply and cables and your computer budget comes in at about $200.00. That is with the crazy United States tariffs built in. It includes the crazy AI-hype-inflated memory pricing. A good desktop computer capable of producing Web content and hosting it costs far less than a smart phone, which you don’t control.
Let’s explore the Internet and Web not as proper nouns but as common nouns. The underlying technology is a distributed system. We happen to use it like a monolithic system. You see a similar pattern in computer operating systems. Windows is based on NT, which was based on VMS. VMS was a mini-computer-based multi-user operating system. Linux and macOS are modeled on Unix. Unix was originally a mini-computer-based multi-user system. Similarly, our two most popular phone operating systems, Android and iOS, are built on top of Linux and macOS. They are multi-user systems used on single user machines. We choose to use them as single-user systems to avoid thinking about their complexity. Similarly we assume the Web must be run by Big Co because we avoid thinking about the complexity underlying it. Abstraction, and re-purposing abstraction, is a common theme in software systems. Re-purposing abstraction allows us to move where the complexity lives. It allows us to experience a simple system. What’s changed is we don’t require Big Co to have a simple user experience. I am arguing for managing complexity through simple to use software running directly on a computer we control and own. It is not a remote service. It doesn’t run until you tell it to. When it does run it takes care of the complex details of generating the website HTML, RSS and sitemaps from the simpler expression of Markdown.
The Internet is a network of networks. An internet, as a common noun, is also a network of networks. Specifically it is a network of one or more computers connected using Internet Protocols. The Internet Protocols provide for public facing networks and private ones. A network that runs on your computer and is only available to your computer is called localhost. You can author a website and view it on your own computer using localhost. Localhost is a private network. If you are running macOS, Linux, Windows or Raspberry Pi OS it’s already available to you. You only need to choose to use it. You have a private network the minute you turn on your computer. You can have a private piece of the Web if you choose.
If you are lucky enough to have Internet access at home, that network is probably set up as a private network. Your private network is then connected to your Internet Provider via a switch or cable modem. The Internet Provider connects your private network to the public Internet on your behalf. Both the public and private systems run using the same set of technologies and protocols. This is something we can leverage to our own ends.
* The Internet is just a network of networks using Internet Protocols
* The network starts on your own computer
* Networks can be private or public
* We can own a private network and connect it to another public or private one
There are two versions of Internet Protocols running in parallel today, IPv4 and IPv6 (IP stands for Internet Protocol and the “v” is followed by the version number). IPv6 provides a larger possible number of uniquely identifiable connections on the network. Each network connection can provide a Web destination. Much of the globe has already shifted to IPv6. The United States lingers with quite a bit of IPv4. We stopped innovating a long, long time ago. Our slow WiFi and copper wire networks reflect that.
A Raspberry Pi computer running the Raspberry Pi Operating System supports both IPv4 and IPv6. As a Raspberry Pi computer owner you don’t really need to worry about the distinction. If you are connecting more than one computer you’ll need a device called a switch or router. There are cheap hardware switches used to connect computers via Ethernet (faster) or WiFi (more convenient). They usually support both protocols. This means individuals can create a local internet (a network compatible with the Internet). When I checked the prices at my local appliance store, four-port network switches started at under $50.00. Some were under $20.00. By comparison, when the Arpanet (the original Internet) started it required a DEC PDP-1 mini computer to interconnect networks with the Arpanet. A DEC PDP-1 cost approximately $120,000.00 (1960s United States dollars). There was a huge change in cost from then to now. Raspberry Pis and inexpensive network switches are way more available than all the DEC PDP-1s ever made. They consume far less electrical power too. You can spend less than $500.00 to create a nice little Internet-compatible network with a couple of computers.
Why do I keep pointing out prices? Back in the late 1980s, when I was a student and first encountered the Internet, the hardware and software used to connect to it cost a small fortune. The price of an Internet-connected workstation I used at university was more than the value of my parents’ suburban home! Creating an Internet-compatible network at my home was not possible due to cost. I actually talked to the people who set up the university’s network about doing this (I commuted from a long distance).
Fast forward to 2026. Prices have changed. Computer availability has changed. In 1969 computers were still rare devices. Today there is one built into your TV and probably your toaster. The cost and availability have radically changed since the creation of the Web too. That should inform our expectations of how things can work. Something I couldn’t do in 1989 is very doable in 2026. In 2026 rural communities in the United States are forming their own Internet Provider cooperatives. These cooperatives are connecting homes using fiber optic cables. This transforms their access from none or slow to really fast and very reliable. It can also be done for a lower cost than relying on Big Co Internet Providers, if they even service the area.
In 2026, in my city of 200,000-plus people, we don’t have fiber optic connections to homes. In my case one Big Co paid another Big Co to stop expanding home fiber access anywhere in the county of Los Angeles. That includes my city. They’ve been paying the other Big Co for more than a decade. This Big Co-created scarcity ensures their profit margins. They are like the railroad companies in the nineteenth and twentieth centuries. Not about public service, not even about being effective transport corporations. It’s all about profit at the expense of the public.
Let’s focus on the Web running on top of the Internet. What is it? The Web is a hypertext system built on top of the Internet. Hypertext is the key takeaway. It’s the Web’s original killer feature. The Web’s hypertext system is built from a set of core technologies. These technologies are now mature. That collection includes things like HTTP, HTML, CSS, JavaScript and RSS. The two that go back to the beginning are HTTP and HTML. Let’s take a look at where these started and where we are today.
HTTP stands for Hypertext Transfer Protocol. It is a way of using the Internet protocol and text to reliably transfer hypertext from one computer to another. The interaction model is a client (requester, web browser) and server (responder, web server). It is a call and response system. In 1992 this required specialized software. It required one or more skilled specialists to run it. Most websites ran on expensive multi-user mini computers that cost the price of a suburban single-family home. The computers required specialists to run and maintain them too. In short, it was an expensive luxury affordable only by large institutions with significant government funding.
In 2026 most programming languages ship with a standard library that allows creating a web server in a few lines of code. You do not need to be a network systems programmer to create one. No networking engineer required either. Ethernet and WiFi are available as commodity hardware components that largely work plug and play. Today web servers run inside appliances. This allows them to be labeled as “smart” and to fetch a higher price. You can do the same thing these embedded devices do using a $15.00 Raspberry Pi Zero 2W, power supply and SD Card for storage. A Raspberry Pi Zero 2W can even be configured to be a public WiFi access point. That’s the impact of an abundance of computers and resources. Creating Web services is a solved problem.
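For example, Python’s standard library can serve a page on localhost in a handful of lines. This little demo starts a server on a free port, fetches its own page, and shuts down; in real use you would point it at your generated site directory and leave it running.

```python
import threading
import urllib.request
from http.server import HTTPServer, BaseHTTPRequestHandler

class HelloHandler(BaseHTTPRequestHandler):
    """Respond to every GET with a tiny HTML page."""
    def do_GET(self):
        body = b"<h1>Hello from localhost</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = pick a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Act as our own web browser for a moment.
page = urllib.request.urlopen(f"http://127.0.0.1:{port}/").read().decode()
print(page)  # → <h1>Hello from localhost</h1>
server.shutdown()
```

No networking engineer, no Apache, no database. That is the point: the call-and-response machinery of HTTP is now a commodity.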
The technology that originated back in the 1990s is still largely the same. It has just been updated slowly over time. That slowness has led many people to not notice the changes. They haven’t fully revised the assumptions they made back in 1990, 2000, 2010 or 2020. None of what I discussed here is rocket science. It is clearly visible in computing history. It’s through an understanding of the historical view that you can see how things were and how they can be. I’m making the point that things have changed even when the collective wisdom of the Tech Bros and Big Co hasn’t.
The Internet is a next-neighbor connection proposition. If I have a home internet owned by me and my neighbor has their own little home internet, we can connect them. It forms a slightly larger network. If we chose, we could split the costs of connecting to the public Internet, assuming we had a provider whose terms of service allowed it. Internet cooperatives take advantage of this simple relationship. The recurring bills are electricity and the common connection to a larger publicly connected Internet. The way the Internet evolved is that each organization (university or research institution) paid to connect to its neighbor and agreed to carry the neighbor’s traffic as well as its own. Larger organizations wound up having multiple connections to other institutions. They operated like hubs. Multiple connections enhanced reliability. Smaller institutions might connect only to one other Internet site. That was called a leaf connection on the network. Importantly, whether you were a hub or a leaf you could reach any other available site in the network just by knowing its address.
The old metaphor, the Internet Super Highway, was based on the corollary that each town paid for its road and for a road connecting to the next ones. Roads interconnect. Traffic, in the form of cars, trucks and motorcycles, can follow the roads from one town to another. The road system can be expanded to include new towns, homes, cities or other destinations. Like the road system the Internet is extensible. It can be expanded as needed just by adding connections.
A home might be a computer, a town might be a local network with a small collection of computers, and cities might be large hubs with large data centers owned by Big Co. In the real world most roads are owned collectively by the public. Some roads are private. Some private roads allow access for a toll. All are still just roads. The Internet today is built as a series of toll roads. There are few public roads. We all pay for access in cash, in loss of privacy and in loss of autonomy. Many commercial Internet Providers prohibit direct sharing of your network with your neighbors in their terms of service. These are human organizational choices. They are not technical choices or constraints. On the Internet today most people might own the device (for example, a phone) but they still rent access, where the payments take the form of currency, lost privacy and lost autonomy. When the companies wish, they can force the purchase of a new device by using the Internet to deliver software that disables the old one. This is the big reason I think we need to change our relationship. The country prospered when the public freeway system was created in the 1950s. The country could prosper if we had a real option of public Internet access mirroring our public roads. In the meantime we can take matters into our own hands. Own and control our computers and local networks. Form cooperatives for connecting to the Internet where appropriate.
It feels like a paradox. Ownership and control of our hardware gives us agency to function better collectively. It reminds me of the adage, “you reach the global by first focusing on the local”. What an interesting human concept. If we own our hardware and control it we can choose to band together in cooperatives. We can change the equation and get out from under the thumb of Big Co and their toll system.
Many of us carry a smartphone in our pockets. These are computers, but most are not suited to creating a Web of our own. Why? If you are using an iPhone running iOS or an Android phone provisioned with Google’s software, then Apple, Google or another Big Co controls your device. This is true even though you may have thought you purchased the phone. Case in point: I used to carry a Samsung phone. I really enjoyed it. It ran a version of Android controlled by Samsung. Samsung sent an update that bricked (disabled) the phone. When I reached out to them, the automated email reply indicated that since my device was over 3 years old I would have to buy a new one. My phone was five years old. It worked really well and I liked it. Samsung had made the decision to update the software on my phone knowing that it would make it inoperable. Needless to say I haven’t owned a Samsung phone since. I haven’t trusted any Android device since. My Apple iPod mini faced a similar situation. My point is I owned the hardware but didn’t control the software. It was really convenient that updates were pushed out. I really liked not paying attention to the details. My life is busy. That arrangement worked well right up until it didn’t. If a corporation or government controls the software then they also control the hardware. It doesn’t matter how much you paid to purchase it. You don’t really own it. Good to know.
So this is what I propose. We individually obtain computers where we control the software. The computers don’t have to be powerful. I’ve done real computing (writing software) using a Raspberry Pi 400 and a Raspberry Pi 500. I have chosen to go with new computers because I keep them a really long time. I still have a Raspberry Pi 2 that works. Skipping Starbucks and some pizzas allowed me to save for these relatively inexpensive new computers. I understand that I’m privileged that I can afford them.
You don’t have to go with new machines. There are less expensive options. I have a ten-year-old and a fifteen-year-old Mac Mini. I can still use them. I got them used. I think I paid five dollars for one and the other was given to me. Since they speak IPv4 I can run them on my private network. I wouldn’t run them on the public Internet. Apple stopped updating the OS for these machines years ago, but they can be run safely on a private network. They don’t run the latest web browser, but my website doesn’t use the latest bells and whistles either. My point is they still work and can be used to curate or produce web content even if another machine is used to make it available on the public Internet.
There is a thriving market in refurbished and used machines. Companies and governments often lease hardware. When the lease is up after two or three years, all that equipment either goes to e-waste or is resold. Going refurbished and used has the advantage of not adding to the e-waste problem. There are also civic groups that get refurbished equipment to people that need it at low or no cost. Getting a computer to write web content can be challenging, but it is possible even when you have limited means. You don’t need a powerful machine, and you don’t need the latest, fastest one either. You need one that has a text editor and can run software to turn Markdown into HTML.
Here’s what I used for writing this post (it has the advantage of being portable to the nearest electrical plug).
* a wireless switch connected to a cable modem and my Internet Provider
* A Raspberry Pi 3B+ with a 3 gigabyte hard drive set up as a “server” (makes this site available on my home network)
* I publish this site via GitHub Pages service for public Internet access (I have the least expensive subscription for this)
The software I am using to write this post is as follows (all programs are open source software, free to share, free to use):
* Mousepad (the text editor that ships with Raspberry Pi OS)
With this software and hardware setup I can publish my blog (see https://rsdoiel.github.io) and aggregate the news (see https://rsdoiel.github.io/antenna). I run the most up-to-date copy of both on my private home network. I can view the home network copy on my phone as well as my computer. My family can view it too on the home network. I update the public copy periodically. That way when I am away from my home network I can still read the aggregated news.
The setup provides a little corner of the Web which I own and control. It is not hard to replicate for yourself. I don’t need to use Yahoo News, Google News, Bing, Twitter/X, Facebook, Instagram, WhatsApp, Spotify or YouTube to know what is happening. I just check my own aggregations. Since I didn’t implement an infinite scroll and I aggregate on a slow schedule, I don’t get sucked into doom scrolling. Slow news gives me more time for being with the humans I love and experiencing real life without distraction. When I read my aggregated site it feels much more like choosing to read a newspaper or magazine. The open source software I created to make this easy to do is called Antenna App. You can run the latest version on macOS, Windows, Linux and Raspberry Pi OS machines.
The Antenna App software is driven by Markdown files. Markdown is a really good expression of hypertext. Posts and pages are Markdown files. The list of websites I aggregate is defined by Markdown files containing a list of links to the RSS news feeds. The Antenna App takes care of harvesting content and generating the HTML files, RSS and sitemaps used by your web browser. Antenna App is written as a command line tool. It could be re-implemented as a graphical system or interactive program. My software is released under an open source license, so anyone can build on what I’ve already provided as long as they respect the terms of the license (a GNU license). There are other software systems out there. I mention mine because it proves this is possible. You should look for one that works for you.
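Driving a tool from a Markdown link list is itself a small trick. Here is a Python sketch that pulls feed URLs out of bullet-list links; the exact file shape (`- [Name](url)` bullets) is my assumption for illustration, not necessarily Antenna App’s actual format.

```python
import re

def extract_feed_urls(markdown: str) -> list:
    """Pull the URLs out of Markdown list items like '- [Name](https://...)'."""
    return re.findall(r"^[-*]\s*\[[^\]]*\]\(([^)]+)\)",
                      markdown, flags=re.MULTILINE)

# Example feed list (URLs are illustrative placeholders).
feeds_md = """\
- [NASA breaking news](https://www.nasa.gov/feed/)
- [BBC World](https://feeds.bbci.co.uk/news/world/rss.xml)
"""
print(extract_feed_urls(feeds_md))
```

The same file that a human reads as a simple reading list is all the configuration the aggregator needs.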
I use two computers (Raspberry Pi 500 and Raspberry Pi 3B+) for my home network. I could actually just use the one. That’s because operating systems like Raspberry Pi OS support the concept of localhost. Localhost presents the machine as if it is a network node. If I had a Linux based phone I could run the aggregation service directly on it. Then I would have my Web right there in my pocket. I am saving my pennies for a Linux based phone.
Working with small computers is like living in a small or tiny home. It can be very cozy and comfortable. It will never be a mansion. Mansions and castles are fine for some people. While I’ve enjoyed visiting a few castles, I would not choose to live in one. They are really expensive to own, heat/cool and maintain. I like small and simple. I choose to live in a cottage.
I accept that living in a small home isn’t for everyone, just as running little computers isn’t for everyone. That is why I don’t say people should abandon the computer systems that work for them. I am pushing for people, like myself, who have a problem with the predatory Web and Internet we have today, to assert ownership (individually or collectively) to correct our relationship. Collectively we need a Web and Internet where we are co-owner and participant. I am no longer interested in being a tenant and product.
...
Read the original on rsdoiel.github.io »
The car wash test is the simplest AI reasoning benchmark that nearly every model fails, including Claude Sonnet 4.5, GPT-5.1, Llama, and Mistral.
The question is simple: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?”
Obviously, you need to drive. The car needs to be at the car wash.
The question has been making the rounds online as a simple logic test, the kind any human gets instantly, but most AI models don’t. We decided to run it properly: 53 models through Opper’s LLM gateway, no system prompt, forced choice between “drive” or “walk” with a reasoning field. First once per model, then 10 times each to test consistency.
On a single call, only 11 out of 53 models got it right. 42 said walk.
The models that passed the car wash test:
Across entire model families, only one model per provider got it right: Opus 4.6 for Anthropic, GPT-5 for OpenAI. All Llama and Mistral models failed.
The wrong answers were all the same: “50 meters is a short distance, walking is more efficient, saves fuel, better for the environment.” Correct reasoning about the wrong problem. The models fixate on the distance and completely miss that the car itself needs to get to the car wash.
The funniest part: Perplexity’s Sonar and Sonar Pro got the right answer for completely wrong reasons. They cited EPA studies and argued that walking burns calories which requires food production energy, making walking more polluting than driving 50 meters. Right answer, insane reasoning.
Getting it right once is easy. But can they do it reliably? We reran every model 10 times, 530 API calls total.
The results got worse. Of the 11 models that passed the single-run test, only 5 could do it consistently.
Only five models answered correctly every single time across 10 runs.
The two 8/10 models get it right most of the time. But in production, an 80% success rate on basic reasoning means 1 in 5 API calls returns the wrong answer.
OpenAI’s flagship model fails this 30% of the time. When it gets it right, the reasoning is concise: “You need the car at the car wash to wash it, so drive the short 50 meters.” When it gets it wrong, it writes about fuel efficiency.
All Claude models except Opus 4.6, all Llama, all Mistral, GPT-4o, GPT-4.1, GPT-5-mini, GPT-5-nano, GPT-5.1, GPT-5.2, Grok-3, Grok-4-1 non-reasoning, Sonar, Sonar Reasoning Pro, DeepSeek v3.1, Kimi K2 Instruct.
Some models that looked correct on the first try turned out to be flukes.
Sonar went from correct to 0/10. It still writes the same 200-word essay about food production energy chains and EPA studies in every single run; it just flips the conclusion to “walk” now. Same reasoning, opposite answer.
Kimi K2.5 went from correct to a perfect 5/5 tie. Literally cannot decide.
Sonar Pro went from correct to 4/10. When it says “drive,” it’s because of calorie-emission math, not because the car needs to be there.
And one model went the other direction: GLM-4.7 went from wrong on the single run to 6/10. It was unlucky the first time. Still not reliable, but the capability is clearly in the weights.
The most common pushback on the car wash test: “Humans would fail this too.”
Fair point. We didn’t have data either way. So we partnered with Rapidata to find out. They ran the exact same question with the same forced choice between “drive” and “walk,” no additional context, past 10,000 real people through their human feedback platform.
Turns out GPT-5 (7/10) answered about as reliably as the average human (71.5%) in this test. Humans still outperform most AI models with this question, but to be fair I expected a far higher “drive” rate.
That 71.5% is still a higher success rate than 48 out of 53 models tested. Only the five 10/10 models and the two 8/10 models outperform the average human. Everything below GPT-5 performs worse than 10,000 people given two buttons and no time to think.
Thanks to Jason Corkill and the Rapidata team for making this happen on short notice.
GLM-4.7 Flash on one of its correct runs: “Walking would require physically pushing or carrying the car, which is impractical and impossible.” Probably the best articulation of the actual problem from any model.
Claude Sonnet 4.5 wrote: “The only scenario where driving might make sense is if you need to drive the car into the car wash anyway for an automatic wash” and then picked walk. It saw the answer and rejected it.
Claude Opus 4.5 suggested you should “walk to the car wash, then drive your car through the wash.” The car is at home.
Gemini 2.5 Pro when it gets it right: “You want to wash your car. The car needs to be at the car wash for this to happen. Therefore, you must drive it there, regardless of the short distance.” When it gets it wrong: “50 meters is a very short distance that would take less than a minute to walk.” Same model, same prompt.
This is a trivial question. There’s one correct answer and the reasoning to get there takes one step: the car needs to be at the car wash, so you drive.
Out of 53 models, only 5 can do this reliably. 15 more can sometimes get there but unpredictably. The remaining 33 never get it right.
The pattern across 530 API calls shows three tiers of failure:
Models that never get it right (33/53): These models have learned “short distance = walk” as a heuristic and can’t override it with contextual reasoning. The correct answer isn’t accessible to them.
Models that sometimes get it right (15/53): The capability exists but competes with the distance heuristic. On any given call, either path might win. This is the most dangerous category for production AI. The model passes during evaluation and then fails unpredictably in deployment. Picking the right model isn’t enough on its own.
Models that always get it right (5/53): The contextual reasoning consistently overrides the heuristic.
This is a toy problem with one logical step. Real-world AI applications involve chains of reasoning far more complex than this. If 90% of models can’t reliably handle “the car needs to be at the car wash,” how do they handle actual business logic, multi-step workflows, or ambiguous edge cases in production?
The car wash test is a zero-context problem by design. No system prompt, no examples, just a raw question. That’s what makes it useful as a benchmark. But the failure mode is telling: models don’t fail because they lack the capability. They fail because the heuristic (“short distance = walk”) wins over the reasoning (“the car needs to be there”).
Context engineering is one way to shift that balance. When you provide a model with structured examples, domain patterns, and relevant context at inference time, you give it information that can help override generic heuristics with task-specific reasoning.
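Sketched concretely (the few-shot questions and answers below are invented for illustration; this is not Opper’s actual context API), context engineering can be as simple as prepending examples that demonstrate the “object needs to travel” reasoning before the model sees the real question:

```python
# Invented illustration of context engineering: curated examples that make
# the "object needs to travel" pattern explicit. The example Q/A pairs are
# made up for this sketch.
EXAMPLES = [
    ("My piano needs tuning and the shop is 100 meters away. Carry it or drive it over?",
     "Drive it over: the piano has to be at the shop, so the distance is irrelevant."),
    ("My bike has a flat and the repair stand is 30 meters away. Walk or ride?",
     "Walk the bike over: the bike itself needs to get there."),
]

def build_prompt(question: str) -> str:
    """Format the few-shot examples ahead of the real question."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in EXAMPLES)
    return f"{shots}\n\nQ: {question}\nA:"

prompt = build_prompt(
    "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"
)
print(prompt)
```

The idea is that task-specific patterns supplied in the prompt compete with, and can override, the generic “short distance = walk” heuristic baked into the weights.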
We’ve seen this in practice. In a separate experiment, we took a small open-weight model that failed an agent-building task and added curated examples through Opper’s context features. It matched the output quality of a frontier model at 98.6% lower cost, without changing the model itself.
The car wash problem is simple enough that the top 5 models solve it without help. But most production tasks aren’t that clean. They involve ambiguity, domain knowledge, and constraints that aren’t obvious from the prompt alone. For those, the gap between “sometimes gets it right” and “always gets it right” is often a context problem.
All 53 models were tested through Opper’s LLM gateway using the same prompt: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?” No system prompt. Forced choice between “drive” and “walk” with a reasoning field. The single-run test was one call per model. The 10-run retest was 10 identical calls per model (530 total), no cache / memory. Every call was traced and logged through Opper, so we could inspect each model’s reasoning.
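The 10-run protocol itself is easy to reproduce against any endpoint. In this minimal sketch, a stubbed `call_model` function stands in for a real gateway call (the stub and its failure rate are invented for illustration); the tally logic mirrors the setup described above:

```python
import random
from collections import Counter

PROMPT = "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

def consistency_test(call_model, runs=10):
    """Ask the same forced-choice question `runs` times and tally the answers."""
    return Counter(call_model(PROMPT) for _ in range(runs))

# Stub standing in for a real model call: the "short distance = walk"
# heuristic wins 80% of the time. Replace with a real API call to test
# an actual model.
_rng = random.Random(0)
def flaky_model(prompt):
    return "drive" if _rng.random() < 0.2 else "walk"

tally = consistency_test(flaky_model, runs=10)
print(dict(tally))
```

A model only counts as reliable when the tally is 10/10 “drive”; anything mixed lands in the “sometimes gets it right” tier.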
The human baseline was collected through Rapidata, using the same question and forced choice format across 10,000 participants.
Full data from every run is available for download:
...
Read the original on opper.ai »
“For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.”
“It does not seem very ‘viral’ or income-generating.”
“It’s pretty nice, and I was thinking to myself — hey cool, I could make an online backup of my code. Then it occurred to me — who the hell is this guy, and why should I trust my code to be on his server!?”
“It’s a pretty crowded space. And XDrive gets you 5 GB for free, 50 GB for $9.95 a month. I think competitors can duplicate Dropbox’s nice front end.”
The most famous bad take in HN history — the ‘I could build this myself with existing unix tools’ archetype. Dropbox IPO’d in 2018 at a $12B valuation. Drew Houston later thanked BrandonM by name when Dropbox went public.
Read the post on HN →
“I still don’t exactly understand what they are offering? Is there an advantage to using GitHub versus dumping some (yet to be created) virtual machine image on a cheap virtual server?”
“Don’t you think that git’s advantage over SVN evaporates when there is only one user on a team? I run my private Subversion repository which I use for everything (not just code).”
“Doesn’t the pricing seem a bit too granular, though? I suspect the pricing categories will collapse into 3, maybe 4, levels eventually.”
The opening comment literally couldn’t see the point. GitHub was perceived as ‘just a git host’ — the social layer and the network effects were invisible. Microsoft acquired GitHub in 2018 for $7.5 billion. GitHub now hosts 100M+ developers and 420M+ repositories.
Read the post on HN →
“Well this is an exceptionally cute idea, but there is absolutely no way that anyone is going to have any faith in this currency.”
“I’m having trouble wrapping my head around the logistics of this…”
The entire thread had just 3 comments and 5 upvotes. Three comments. For what would become a $2 trillion asset class. A single bitcoin went from fractions of a cent in 2009 to over $100,000 by 2024. Total crypto market cap exceeded $3 trillion.
Read the post on HN →
“I can’t ever see anyone saying ‘just duckduckgo it.’ The name just sounds silly. It makes me think it’s a search engine for toddlers.”
“DuckDuckGo is childish. I think that name will hold them back.”
“How many people would go to Google and search for ‘new search engine’? DuckDuckGo is not even in the top 10 pages.”
“I don’t find their actual search engine very useful at all.”
The name. That was the biggest objection. Nobody could get past it. Meanwhile, Google itself was once mocked for being a misspelling of a number. DuckDuckGo grew to over 100 million daily search queries and became the default search engine in many privacy-focused browsers. Valued at over $600 million.
Read the post on HN →
“Unfortunately taxis are a regulated industry in most major cities. The entrenched interests of the taxi companies are simply too big — and they have the political clout — to let this one slide under the radar.”
“If this service became at all popular, it is very likely that cities would immediately include ‘mobile hailing’ as also requiring a license.”
“Driving a gypsy cab (which is what UberCab is) is a dangerous business. A bad guy could simply place an order for an out-of-the-way alley or warehouse and know that the cabbie is going to be driving a really nice car.”
“The first time an UberCab driver gets into a wreck without insurance or licensing should be interesting.”
“This drastically idealizes UberCab profiles. It gets a lot shadier when UberCab is one of 10 companies doing this, and when it starts to become worth it to game profiles.”
Two months after this thread, Uber received an actual cease-and-desist from San Francisco, seemingly validating every skeptic. Travis Kalanick’s response was to ignore it and expand to five more cities. Uber IPO’d in 2019 and is now worth over $160 billion. NYC taxi medallions, which sold for $1.3M in 2014, collapsed to under $80K. The regulation that was supposed to stop Uber became its origin story.
Read the post on HN →
“All my experiences with it as a user have been too unreliable to expect that it can scale to truly massive usability. I just don’t see it swallowing up the whole hotel industry.”
“This exchange cements my concerns about Airbnb only being huge if they can end-run the hotel regulatory system.”
“Airbnb is almost more like a dating service than a marketplace… a buyer and seller who prove compatible will never need to use the service again.”
“Airbnb is great unless you’re the kind of person that doesn’t trust strangers. Sadly, in the United States, the tendency to not trust strangers has been on the upswing for the last few decades.”
The top comment sided with the skeptics. Commenters argued Airbnb couldn’t scale and couldn’t solve the trust problem of sleeping in a stranger’s home. Airbnb IPO’d in 2020 at a $100B+ valuation and is now worth over $80 billion. One of YC’s most successful companies ever.
Read the post on HN →
“I really don’t get or see how Stripe is different? Why would I use it instead of PayPal, 2CheckOut, e-junkie, etc?”
“I have no need of a fancy API either — PayPal lets me specify the basics and fire off a simple Post from my PHP code.”
“Stripe gets added to the bookmark collection for ‘services to use should I ever have a problem with PayPal.’”
“Pretty much every company in payment processing that does not use segregated merchant accounts sooner or later goes bust.”
The launch thread was full of commenters doing unfavorable price comparisons to PayPal. Posted by Patrick Collison himself. Stripe reached a $106B+ valuation and processed $1.4 trillion in payments in 2024. The ‘fancy API’ became the default payments infrastructure for the internet.
Read the post on HN →
“A lot of really smart people have tried and failed to accomplish this sort of thing before. Amazon invested $60 million in Kozmo.com back in the late 90’s, and they couldn’t make it work.”
“I just do not see how this scales, as your marginal labor costs have got to be a very high portion of your revenues.”
“Having a delivery fee is a non-starter. ‘I can get it in 2 days free with Amazon, or $4 today…’ People will spend huge amounts of time and effort to not pay delivery charges.”
“I’ve built a few real-time delivery businesses, and I’m pessimistic. Real-time operations are costly to manage. Not being Amazon and not being able to control inventory hurts.”
The top comment pointed to the graveyard of companies that tried before. The entire thread read like a post-mortem for a company that hadn’t even launched yet. Instacart IPO’d in 2023 and is worth over $12 billion. COVID-19 turned grocery delivery from a novelty into a necessity.
Read the post on HN →
“It includes code to load up various analytics tools even if you never use them. For example, if I only use GA and Mixpanel, do I really want to serve the bytes for all the other plugins?”
“It’s going to be really hard to make a generic, non-lossy mapping in a static, stateless JS script.”
“I was hoping that this would be an open source version of Google Analytics.”
“Google Analytics has a new API currently in beta that is also called analytics.js. This will be confusing.”
Commenters argued the abstraction layer couldn’t work across fundamentally different analytics providers. The founders later wrote ‘From Show HN to Series D.’ Segment was acquired by Twilio for $3.2 billion in 2020, the largest acquisition in developer tooling at the time.
Read the post on HN →
“I think you’d be a damned fool to invest in this technology for any serious project. Right now this is a toy.”
“I have more than a sneaking suspicion that this project is essentially a proof-of-concept, and that it is not heavily used at Microsoft.”
“Where’s all that great refactoring support if everything is made dynamic and stringly typed?”
Microsoft + new language + compile-to-JS triggered every distrust reflex at once. The phrase ‘damned fool’ was deployed with full sincerity. TypeScript is now used by 80%+ of JavaScript developers and is the default language for virtually every major web framework.
Read the post on HN →
“This is terrible. Did we really not learn anything from PHP days? Are we seriously going to mix markup and logic again?”
“OMG, JSX… why? Just why?? Stop ruining JS people!”
“The current fad of quasi-declarative web components looks like early Ext to me, and I think everyone knows how that turned out.”
“Mixing JS and XML syntax, variables in docblocks, DOM components that are not really DOM components… Yikes. Thank you, but no, thank you.”
The developer community overwhelmingly felt React violated fundamental software engineering principles. ‘Separation of concerns’ was the rallying cry against it. React became the most popular UI library in the world, used by over 20 million developers. In 2025, Meta donated React to the Linux Foundation.
Read the post on HN →
“I mean this in the most helpful way possible: the interface is really, really bad at serving one of its basic, fundamental functions.”
“I can get everything I need on HN. Ultimately the best products will make the front page here, no need to look around.”
“I looked at the page for like 30 seconds, thinking to myself, ‘What is this?’… literally incomprehensible to a typical reader.”
“Here’s a few things I couldn’t easily figure out on your site: What is a best new product? How is this different than a linked list on a blog? Is this site for me or for someone else?”
Commenters couldn’t figure out what Product Hunt was after 30 seconds. It went on to become the default place to launch a tech product. Product Hunt was accepted into Y Combinator, raised from Andreessen Horowitz, and was later acquired by AngelList.
Read the post on HN →
“No way this a spreadsheet. This is just a CRUD app with data displayed in rows. Zero chance of catching with spreadsheet users.”
“The demand for an Access-like or ‘better spreadsheet’ product is all of the ‘Oh yeah, it sounds cool’ variety that never results in sales.”
“Very difficult to get non technical peeps just suddenly ditch spreadsheets.”
“Your app seems sluggish to scroll compared to Google Docs at that size, and the record density seems low.”
Commenters predicted zero market demand. The ‘better spreadsheet’ category was seen as a graveyard of failed attempts. Airtable reached an $11 billion valuation and is used by over 300,000 organizations.
Read the post on HN →
“This service allows me to solve this communication problem by asking designers to learn this tool — which is new to them, requires time, and also isn’t as powerful as Photoshop.”
“The main differentiator is ‘we’re making this run in the browser.’ But nowhere does it explain why that’s better for designers.”
“I just want a solid desktop app that isn’t a web wrapper or lives in the browser. I can’t stand web apps to be honest.”
“$18MM to spend only to see if you got it wrong is a rather interesting approach.”
An entire comment. Just “MEH.” For a company that would be valued at $20 billion. Adobe tried to acquire Figma for $20 billion in 2022. Figma IPO’d in 2025 at a $60B+ market cap.
Read the post on HN →
“I don’t understand Tailwind. The entire point of CSS is to separate style from structure. How does applying composable utility classes differ from the old days of using HTML attributes for styling?”
“This is essentially the same as inlining all of your styles in a style attribute on every element. I don’t see how you would ever reasonably want to use this in a project.”
“Wasn’t the whole point of CSS to separate presentation from data, and move away from things like ? This is still considered bad practice, right?”
“I don’t get it either. Start putting CSS in the style attribute while you’re at it.”
The exact same ‘separation of concerns’ argument was levelled against React in 2013. HN missed it twice. Tailwind CSS became the most-downloaded CSS framework in the world with 100M+ npm downloads per month. It’s now the default CSS framework, period.
Read the post on HN →
“No one should use a for-profit terminal emulator, especially one created by a VC-backed startup, full stop.”
“Downloaded the image, installed it and was greeted by a mandatory login. Next step was uninstall and delete the dmg image. What a waste of time.”
“You like people to contribute for free but refuse to give them an actual FOSS client. This is bound to fail.”
“Warp’s VC decide they want an exit and Warp becomes 50usd/month SaaS. Your workflow, scripts, etc. are basically dead.”
A VC-backed terminal that requires login and collects telemetry? HN reached for the pitchforks. The thread read like a restraining order. Warp raised a $50M Series B led by Sequoia Capital and grew to over 500,000 engineers on the platform.
Read the post on HN →
“I love these super-ambitious projects (see Parcel, Rome.js) because after several years they will still fail in many areas at once!”
“Moving to a reimplementation of core Node APIs is a terrifying prospect.”
“Something has done a bit wrong if you’re running any of those tools in production.”
One commenter preemptively grouped Bun with Parcel and Rome.js, ambitious projects that burned out. 1,431 upvotes said otherwise. Bun 1.0 shipped 14 months later. In December 2025, Bun was acquired by Anthropic to power Claude Code and its AI coding infrastructure.
Read the post on HN →
“There’s nothing on the official website or GitHub that indicates what this software is, other than a cropped screenshot that looks like VSCode with a prompt pop up over it.”
“So looking through the dependencies, it’s CodeMirror with a VSCode theme on top of it, that includes Copilot. Why wouldn’t I just use an existing editor with Copilot support?”
“AI is still in an info-phase Bitcoin was in before 2017. Expected to see an avalanche of fake/fraud/phony products based on it.”
The first Show HN got just 14 upvotes. Fourteen. The thread had 5 comments. One commenter couldn’t tell if it was ‘some sarcastic joke software.’ By 2025, 1 billion lines of code were being written on Cursor every day. Valued at $10 billion.
Read the post on HN →
“While and after watching the video, I wasn’t sure if the whole thing isn’t just a parody of AI companies.”
“It’s a cool idea but I really don’t see how this is any different from Cursor IDE.”
“Of course it’s not going to be sustainable.”
“Totally useless and I’m sure I will not be subscribing to it at any cost. It gets easily confused and cannot troubleshoot or understand a bit of the environment.”
One commenter thought the entire launch video was a parody. Another gave it exactly three minutes before declaring judgment. Windsurf was acquired in a deal worth $2.4 billion, with its CEO and key employees joining Google.
Read the post on HN →
“It’s clear that progress is incremental at this point. Anthropic and OpenAI are bleeding money. It’s unclear to me how they’ll shift to making money while providing almost no enhanced value.”
“I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I don’t recommend using it in a professional setting.”
“It’s not hard to make, it’s a relatively simple CLI tool so there’s no moat.”
“Watching Claude Code fumble around… all while burning actual dollars and context is the opposite of endearing.”
“Tried claude code, and have an empty unresponsive terminal. Looks cool in the demo though, but not sure this is going to perform better than Cursor, and shipping this as an interactive CLI instead of an extension is… a choice.”
Critics focused on rate limits and cost. The thread got 2,127 points and 963 comments. People cared more than they let on. Claude Code hit $1B in annualized revenue within 6 months of GA, faster than ChatGPT. By early 2026 it surpassed $2.5B ARR.
Read the post on HN →
“This thing chews through tokens. I’ve spent $300+ in the last 2 days doing fairly basic tasks. Also, it’s terrifying — no directory sandboxing. It can modify anything on my machine.”
“There are 300 open GitHub issues. One of them is a security report claiming hundreds of high-risk issues, including hard-coded, unencrypted OAuth credentials. I am disinclined to install this software.”
“I just don’t trust an AI enough to run unprompted with root access to a machine 24/7. Most of the cool stuff here you can also just vibecode in an afternoon using regular Claude Code.”
“Layers and layers of security practices over the past decade are just going out the window. It’s quite wild to give root access to a process that has access to the internet without any guardrails.”
“This is all starting to feel like the productivity theater rabbit hole people went down with Notion/Obsidian. It is clearly capable of doing a lot of stuff, but where is the real impact?”
The project hit 60,000 GitHub stars overnight. Critics called it hype. Then Anthropic asked for a name change, and OpenAI acquired the creator. Creator Peter Steinberger joined OpenAI to work on AI agents. The project surpassed 145,000+ GitHub stars and spawned dozens of derivative projects.
Read the post on HN →
...
Read the original on hackernews.love »