uv installs packages faster than pip by an order of magnitude. The usual explanation is “it’s written in Rust.” That’s true, but it doesn’t explain much. Plenty of tools are written in Rust without being notably fast. The interesting question is what design decisions made the difference.
Charlie Marsh’s Jane Street talk and a Xebia engineering deep-dive cover the technical details well. The interesting parts are the design decisions: standards that enable fast paths, things uv drops that pip supports, and optimizations that don’t require Rust at all.
pip’s slowness isn’t a failure of implementation. For years, Python packaging required executing code to find out what a package needed.
The problem was setup.py. You couldn’t know a package’s dependencies without running its setup script. But you couldn’t run its setup script without installing its build dependencies. PEP 518 in 2016 called this out explicitly: “You can’t execute a setup.py file without knowing its dependencies, but currently there is no standard way to know what those dependencies are in an automated fashion without executing the setup.py file.”
This chicken-and-egg problem forced pip to download packages, execute untrusted code, fail, install missing build tools, and try again. Every install was potentially a cascade of subprocess spawns and arbitrary code execution. Installing a source distribution was essentially curl | bash with extra steps.
The fix came in stages:
* PEP 518 (2016) created pyproject.toml, giving packages a place to declare build dependencies without code execution. The TOML format was borrowed from Rust’s Cargo, which makes a Rust tool returning to fix Python packaging feel less like coincidence.
* PEP 517 (2017) separated build frontends from backends, so pip didn’t need to understand setuptools internals.
* PEP 621 (2020) standardized the [project] table, so dependencies could be read by parsing TOML rather than running Python.
* PEP 658 (2022) put package metadata directly in the Simple Repository API, so resolvers could fetch dependency information without downloading wheels at all.
PEP 658 went live on PyPI in May 2023. uv launched in February 2024. uv could be fast because the ecosystem finally had the infrastructure to support it. A tool like uv couldn’t have shipped in 2020. The standards weren’t there yet.
Other ecosystems figured this out earlier. Cargo has had static metadata from the start. npm’s package.json is declarative. Python’s packaging standards finally bring it to parity.
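To make the contrast concrete, here is a minimal sketch (my own illustration, not code from pip or uv) of what static metadata buys: with PEP 621, a resolver can learn a package's dependencies by parsing TOML, and no setup script ever runs. The pyproject.toml content below is a made-up example.

```python
import tomllib  # standard library since Python 3.11

# A toy pyproject.toml: under PEP 621, dependencies are plain data, not code.
PYPROJECT = """
[project]
name = "example"
version = "1.0.0"
requires-python = ">=3.9"
dependencies = ["requests>=2.31", "rich"]
"""

meta = tomllib.loads(PYPROJECT)
print(meta["project"]["dependencies"])  # ['requests>=2.31', 'rich'], no setup.py executed
```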
Speed comes from elimination. Every code path you don’t have is a code path you don’t wait for.
uv’s compatibility documentation is a list of things it doesn’t do:
No .egg support. Eggs were the pre-wheel binary format. pip still handles them; uv doesn’t even try. The format has been obsolete for over a decade.
No pip.conf. uv ignores pip’s configuration files entirely. No parsing, no environment variable lookups, no inheritance from system-wide and per-user locations.
No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.
Virtual environments required. pip lets you install into system Python by default. uv inverts this, refusing to touch system Python without explicit flags. This removes a whole category of permission checks and safety code.
Stricter spec enforcement. pip accepts malformed packages that technically violate packaging specs. uv rejects them. Less tolerance means less fallback logic.
Ignoring requires-python upper bounds. When a package declares a requires-python range, uv ignores the upper bound and only checks the lower. This reduces resolver backtracking dramatically, since upper bounds are almost always wrong: packages declare a cap like "<4" because they haven't tested on Python 4, not because they'll actually break. The constraint is defensive, not predictive.
First-index wins by default. When multiple package indexes are configured, pip checks all of them. uv picks from the first index that has the package, stopping there. This prevents dependency confusion attacks and avoids extra network requests.
Each of these is a code path pip has to execute and uv doesn’t.
Some of uv’s speed comes from Rust. But not as much as you’d think. Several key optimizations could be implemented in pip today:
HTTP range requests for metadata. Wheel files are zip archives, and zip archives put their file listing at the end. uv tries PEP 658 metadata first, falls back to HTTP range requests for the zip central directory, then full wheel download, then building from source. Each step is slower and riskier. The design makes the fast path cover 99% of cases. None of this requires Rust.
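A rough sketch of that fallback order, assuming an index that serves PEP 658 metadata at the wheel URL with ".metadata" appended and supports HTTP Range requests (illustrative only, not uv's actual code):

```python
import urllib.error
import urllib.request

def fetch_metadata_bytes(wheel_url: str) -> bytes:
    """Try PEP 658 metadata first, then a ranged read of the wheel's tail."""
    try:
        # Fast path: the index serves METADATA separately at <wheel_url>.metadata.
        with urllib.request.urlopen(wheel_url + ".metadata") as resp:
            return resp.read()
    except urllib.error.HTTPError:
        # Fallback: fetch only the last 64 KiB of the wheel, where the zip
        # central directory lives; a real client would parse it out of these
        # bytes and issue further ranged reads for the METADATA member.
        req = urllib.request.Request(wheel_url, headers={"Range": "bytes=-65536"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()
```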
Parallel downloads. pip downloads packages one at a time. uv downloads many at once. Any language can do this.
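A minimal sketch of the idea in plain Python, fetching several files concurrently with a thread pool (the URLs would come from the resolver):

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request

def download(url: str) -> tuple[str, bytes]:
    with urllib.request.urlopen(url) as resp:
        return url, resp.read()

def download_all(urls: list[str], workers: int = 16) -> dict[str, bytes]:
    # The work is network-bound, so threads are enough; no Rust required.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(download, urls))
```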
Global cache with hardlinks. pip copies packages into each virtual environment. uv keeps one copy globally and uses hardlinks (or copy-on-write on filesystems that support it). Installing the same package into ten venvs takes the same disk space as one. Any language with filesystem access can do this.
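Again, a small illustrative sketch rather than uv's implementation: link a cached file into an environment, and fall back to copying only when hardlinks aren't possible (for example, across filesystems).

```python
import os
import shutil

def place(cached_path: str, dest_path: str) -> None:
    try:
        os.link(cached_path, dest_path)       # hardlink: zero extra disk space
    except OSError:
        shutil.copy2(cached_path, dest_path)  # cross-device or unsupported FS: copy
```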
Python-free resolution. pip needs Python running to do anything, and invokes build backends as subprocesses to get metadata from legacy packages. uv parses TOML and wheel metadata natively, only spawning Python when it hits a setup.py-only package that has no other option.
PubGrub resolver. uv uses the PubGrub algorithm, originally from Dart’s pub package manager. Both pip and PubGrub use backtracking, but PubGrub applies conflict-driven clause learning from SAT solvers: when it hits a dead end, it analyzes why and skips similar dead ends later. This makes it faster on complex dependency graphs and better at explaining failures. pip could adopt PubGrub without rewriting in Rust.
Zero-copy deserialization. uv uses rkyv to deserialize cached data without copying it. The data format is the in-memory format. Libraries like FlatBuffers achieve this in other languages, but rkyv integrates tightly with Rust’s type system.
Thread-level parallelism. Python’s GIL forces parallel work into separate processes, with IPC overhead and data copying. Rust can parallelize across threads natively, sharing memory without serialization boundaries. This matters most for resolution, where the solver explores many version combinations.
No interpreter startup. Every time pip spawns a subprocess, it pays Python’s startup cost. uv is a single static binary with no runtime to initialize.
Compact version representation. uv packs versions into u64 integers where possible, making comparison and hashing fast. Over 90% of versions fit in one u64. This is a micro-optimization that compounds across millions of comparisons.
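A toy version of the trick (uv's real encoding handles pre-releases, epochs, and more): pack the release segments into one integer so that comparing versions becomes a single integer comparison.

```python
def pack_version(major: int, minor: int, patch: int) -> int:
    # Only "small" versions fit; a real implementation falls back to a slower path.
    assert max(major, minor, patch) < 2**16
    return (major << 32) | (minor << 16) | patch

assert pack_version(1, 2, 3) < pack_version(1, 10, 0)   # 1.2.3 < 1.10.0
assert pack_version(2, 0, 0) > pack_version(1, 99, 99)  # 2.0.0 > 1.99.99
```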
These are real advantages. But they’re smaller than the architectural wins from dropping legacy support and exploiting modern standards.
uv is fast because of what it doesn’t do, not because of what language it’s written in. The standards work of PEP 518, 517, 621, and 658 made fast package management possible. Dropping eggs, pip.conf, and permissive parsing made it achievable. Rust makes it a bit faster still.
pip could implement parallel downloads, global caching, and metadata-only resolution tomorrow. It doesn't, largely because backwards compatibility with fifteen years of edge cases takes precedence. That choice means pip will always be slower than a tool that starts fresh with modern assumptions.
Other package managers could learn from this: static metadata, no code execution to discover dependencies, and the ability to resolve everything upfront before downloading. Cargo and npm have operated this way for years. If your ecosystem requires running arbitrary code to find out what a package needs, you’ve already lost.
...
Read the original on nesbitt.io »
When something is running on a system—whether it is a process, a service, or something bound to a port—there is always a cause. That cause is often indirect, non-obvious, or spread across multiple layers such as supervisors, containers, services, or shells.
Existing tools (ps, top, lsof, ss, systemctl, docker ps) expose state and metadata. They show what is running, but leave the user to infer why by manually correlating outputs across tools.
witr explains where a running thing came from, how it was started, and what chain of systems is responsible for it existing right now, in a single, human-readable output.
* Explain why a process exists, not just that it exists
Ports, services, containers, and commands all eventually map to PIDs. Once a PID is identified, witr builds a causal chain explaining why that PID exists.
How did it start?
What is keeping it running?
What context does it belong to?
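For a sense of the raw material behind such a chain, here is an illustrative Python sketch (not witr's actual implementation) that walks parent PIDs via /proc on Linux:

```python
def ancestry(pid: int) -> list[tuple[int, str]]:
    """Return the (pid, name) chain from `pid` up to PID 1."""
    chain = []
    while pid > 0:
        fields = {}
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        chain.append((pid, fields.get("Name", "?")))
        pid = int(fields.get("PPid", "0"))
    return chain

# Hypothetical output for a made-up PID:
# [(14233, 'node'), (1401, 'bash'), (902, 'sshd'), (1, 'systemd')]
```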
witr node
witr nginx
A single positional argument (without flags) is treated as a process or service name. If multiple matches are found, witr will prompt for disambiguation by PID.
witr --pid 14233
witr --port 5000
What the user asked about.
A causal ancestry chain showing how the process came to exist. This is the core value of witr.
The primary system responsible for starting or supervising the process (best effort).
Only one primary source is selected.
* Restarted multiple times (warning only if above threshold)
* Process has been running for over 90 days
A single positional argument (without flags) is treated as a process or service name.
witr node
witr --port 5000 --short
witr --pid 1482060 --tree
witr node
witr nginx
The easiest way to install witr is via the install script.
curl -fsSL https://raw.githubusercontent.com/pranshuparmar/witr/main/install.sh | bash
curl -fsSL https://raw.githubusercontent.com/pranshuparmar/witr/main/install.sh -o install.sh
cat install.sh
chmod +x install.sh
./install.sh
You may be prompted for your password to write to system directories.
If you prefer manual installation, follow these simple steps for your architecture:
# Download the binary
curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/witr-linux-amd64 -o witr-linux-amd64
# Verify checksum (Optional, should print OK)
curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/SHA256SUMS -o SHA256SUMS
grep witr-linux-amd64 SHA256SUMS | sha256sum -c -
# Rename and install
mv witr-linux-amd64 witr && chmod +x witr
sudo mv witr /usr/local/bin/witr
# Install the man page (Optional)
sudo curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/witr.1 -o /usr/local/share/man/man1/witr.1
sudo mandb >/dev/null 2>&1 || true
# Download the binary
curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/witr-linux-arm64 -o witr-linux-arm64
# Verify checksum (Optional, should print OK)
curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/SHA256SUMS -o SHA256SUMS
grep witr-linux-arm64 SHA256SUMS | sha256sum -c -
# Rename and install
mv witr-linux-arm64 witr && chmod +x witr
sudo mv witr /usr/local/bin/witr
# Install the man page (Optional)
sudo curl -fsSL https://github.com/pranshuparmar/witr/releases/latest/download/witr.1 -o /usr/local/share/man/man1/witr.1
sudo mandb >/dev/null 2>&1 || true
* Download only the binary for your architecture and the SHA256SUMS file.
* Verify the checksum for your binary only (prints OK if valid).
* Rename to witr, make it executable, and move to your PATH.
witr --version
man witr
sudo rm -f /usr/local/bin/witr
sudo rm -f /usr/local/share/man/man1/witr.1
If you use Nix, you can build witr from source and run without installation:
nix run github:pranshuparmar/witr -- --port 5000
witr inspects /proc and may require elevated permissions to explain certain processes.
If you are not seeing the expected information (e.g., missing process ancestry, user, working directory or environment details), try running witr with sudo for elevated permissions:
sudo witr [your arguments]
* A user can answer “why is this running?” within seconds
This project was developed with assistance from AI/LLMs (including GitHub Copilot, ChatGPT, and related tools), supervised by a human who occasionally knew what he was doing.
...
Read the original on github.com »
Great things and people that I discovered, learned, read, met, etc. in 2025. No particular ordering is implied. Not everything is new.
also: see the lists from 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, and 2010
While I've posted a few technical posts on my personal blog, I've taken a lot of time to guest-post on the Wormwoodania blog about weird, macabre, and sardonic fiction and other related, non-technical topics. I hope to continue this trend into next year. Also, my most assiduous readers will have noticed that I've written more about games. I've decided to keep those posts on this blog since my intent for the site has always been about systems and systems-thinking, and games are a great way to study and model systems.
* Mouse, a Language for Microcomputers by Peter Grogono - Mouse is basically an esolang with barely any abstraction facilities, but the book was well-written and the language compelling enough to explore further.
* Notes on Distance Dialing (pdf) by AT&T - Described the telephone systems of the USA and Canada in the mid-1950s. The reading is as dry as it gets, but it was a fascinating dive into a vastly complex system solving extremely hard problems. This is a must-read for folks interested in systems-thinking. That said, I am actively looking for recommendations for books about the process of designing and building the unbelievably complex telephony system over the rudiments of the earlier systems. Recommendations welcomed!
The vast majority of my reading this year was fiction, and I discovered some real gems.
* The Eye of Osiris by R. Austin Freeman - This is the first book that I've read from Freeman and I suspect that I will read many more in the future. The story follows the disappearance of John Bellingham, Egyptologist, and the subsequent investigation. As the investigation stalls, the eminent Dr. Thorndyke digs into the case. The story sets up the mystery nicely and indeed provides enough information to the reader to infer how the disappearance occurred and who or what facilitated it. The book is one of the best whodunits that I've ever read.
* The Mystery of Edwin Drood by Charles Dickens - His final work remains unfinished as he passed away before he could complete it. Further complicating the meta-story is that he also didn't outline the ending nor even put to paper the "villain" of the story. The meta-mystery of the ending has motivated a mountain of speculation, including dozens of continuations of the story from other authors, all deriving their pet endings from textual hints, accounts from Dickens' friends, illustration notes, and even in some cases seances supposedly accompanied by the spirit of Dickens himself. What was written by Dickens is spectacular and a compelling mystery, and although it would be great to know the resolution, in some ways the "Droodiana" that has cropped up over the past 150+ years is reason enough for it to remain a mystery. The whole lore around Edwin Drood is a worthwhile hobby in itself and well worth exploring. The Chiltern Library edition of the book contains the story and a good bit of the lore around the writing and the meta-works available at the time of its publication.
* The Shadow People by Margaret St. Clair - Sadly out of print and difficult to find, but I've had it on my shelves for decades and finally got around to reading it. The book came onto my radar in the 1980s when I learned about it in the Appendix N of the 1st edition Advanced D&D Dungeon Masters Guide. I enjoyed many of the books at the time and have slowly swung around to re-reading them over the past few years. Sadly, most on the list do not stand the test of time for me, but St. Clair's mixture of 60s counter-cultural leanings in a fantasy/sf world still works. The cultural touch-points in the book feel quite dated, but despite the occasional awkwardness, the story is unique even today.
* Lolly Willowes by Sylvia Townsend Warner - The book started as a passable novel of manners focused on a turn-of-the-century British middle-class family. The titular character was mostly background decoration for the first third of the novel and AFAIR was talked about only in the third person. It's only when she makes the choice to move out on her own to the country in her middle age that she gains a central role in the narrative and her inner thoughts are revealed. This is where things really pick up, because I was shocked to learn that this unassuming woman's inner thoughts had a delicious darkness to them. I don't want to give away too much, but I'll just say that you will not expect how the story ends.
* Patience by Daniel Clowes - A profound graphic novel using time travel to explore the idea of enduring love, with a story that proceeds through time, following Jack as he tries to alter the past and save the woman he loves. This well-known science fiction motif is elevated by Clowes' signature psychological complexity.
* Narcissus and Goldmund by Hermann Hesse - I've read most of the books by Hermann Hesse but this one escaped my attention until this year. The story follows the parallel lives of a monk Narcissus and his passionate friend Goldmund as they respectively search for meaning in life through spiritual means and through pleasures of the flesh.
* We Who Are About To… by Joanna Russ - A small group of astronauts crash land on a hostile alien world and quickly realize that
...
Read the original on blog.fogus.me »
This software project accompanies the research paper: Sharp Monocular View Synthesis in Less Than a Second
by Lars Mescheder, Wei Dong, Shiwei Li, Xuyang Bai, Marcel Santos, Peiyun Hu, Bruno Lecouat, Mingmin Zhen, Amaël Delaunoy, Tian Fang, Yanghai Tsin, Stephan Richter and Vladlen Koltun.
We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements. Experimental results demonstrate that SHARP delivers robust zero-shot generalization across datasets. It sets a new state of the art on multiple datasets, reducing LPIPS by 25–34% and DISTS by 21–43% versus the best prior model, while lowering the synthesis time by three orders of magnitude.
We recommend first creating a Python environment:
Afterwards, you can install the project using
The model checkpoint will be downloaded automatically on first run and cached locally at ~/.cache/torch/hub/checkpoints/.
Alternatively, you can download the model directly:
To use a manually downloaded checkpoint, specify it with the -c flag:
The results will be 3D Gaussian splats (3DGS) in the output folder. The 3DGS .ply files are compatible with various public 3DGS renderers. We follow the OpenCV coordinate convention (x right, y down, z forward). The 3DGS scene center is roughly at (0, 0, +z). When dealing with third-party renderers, please scale and rotate to re-center the scene accordingly.
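For example, if a renderer expects an OpenGL-style convention (x right, y up, z toward the viewer), a minimal re-orientation of the Gaussian centers might look like the sketch below. This is an illustrative assumption, not code from this repository, and rotations/covariances would need the same axis flip.

```python
import numpy as np

# OpenCV (x right, y down, z forward) -> OpenGL-style (x right, y up, z backward):
# flip the y and z axes.
CV_TO_GL = np.diag([1.0, -1.0, -1.0])

def opencv_to_opengl(points_xyz: np.ndarray) -> np.ndarray:
    return points_xyz @ CV_TO_GL.T
```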
Additionally, you can render videos with a camera trajectory. While Gaussian prediction works on CPU, CUDA, and MPS, rendering videos via the --render option currently requires a CUDA GPU. The gsplat renderer takes a while to initialize at the first launch.
Please refer to the paper for both quantitative and qualitative evaluations. Additionally, please check out this qualitative examples page containing several video comparisons against related work.
If you find our work useful, please cite the following paper:
@inproceedings{Sharp2025:arxiv,
  title   = {Sharp Monocular View Synthesis in Less Than a Second},
  author  = {Lars Mescheder and Wei Dong and Shiwei Li and Xuyang Bai and Marcel Santos and Peiyun Hu and Bruno Lecouat and Mingmin Zhen and Ama\"{e}l Delaunoy and Tian Fang and Yanghai Tsin and Stephan R. Richter and Vladlen Koltun},
  journal = {arXiv preprint arXiv:2512.10685},
  year    = {2025},
  url     = {https://arxiv.org/abs/2512.10685},
}
Our codebase is built using multiple opensource contributions, please see ACKNOWLEDGEMENTS for more details.
Please check out the repository LICENSE before using the provided code and LICENSE_MODEL for the released models.
...
Read the original on github.com »
Try out the initial release of the QNX Developer Desktop — a self-hosted development environment for QNX. No more cross-compilation!
The team and I are beyond excited to share what we’ve been cooking up over the last little while: a full desktop environment running on QNX 8.0, with support for self-hosted compilation! This environment both makes it easier for newly-minted QNX developers to get started with building for QNX, but it also vastly simplifies the process of porting Linux applications and libraries to QNX 8.0.
This self-hosted target environment is pre-loaded with many of the ports you’ll find on the QNX Open-source Dashboard. (The portal currently includes over 1,400 ports across various targets, QNX versions, and architectures, of which more than 600 are unique ports!)
In this initial release, you can grab a copy of the QEMU image and give it a try for yourself. There’s still so much more to add, but it’s in a great place today for this first release. The team is really passionate about this one, and we’re eagerly looking forward to your feedback!
For the initial release of Desktop, we tried to cover all the basics: windowing, terminal, IDEs, browser, file management, and samples. To that end, here’s what makes up the QNX Developer Desktop:
* The tools you need to compile and/or run your code (clang, gcc, clang++, Python, make, cmake, git, etc)
* A web browser (can you join the QNX Discord from the QNX Desktop? 🏅👀)
* Ports of popular IDEs/editors, like Geany, Emacs, Neovim, and vim
* Preloaded samples, like Hello World in C, C++, and Python, and GTK demos and OpenGL ES demos
* … and of course, a terminal.
This environment runs as a virtual machine, using QEMU on Ubuntu. To try the image, you’ll need:
With a free QNX license, you can find this release in QNX Software Center. On the Available tab of the Manage Installation pane, search for “quick start” and install the “QNX SDP 8.0 Quick Start Target Image for QEMU”.
You’ll find the image in your QNX installation directory, usually ~/qnx800/images by default. Follow the README.md file in the qemu directory to extract & combine the multiple QNX packages downloaded under the hood.
Next, follow the PDF instructions found in the new ./qemu_qsti/docs/ directory to install the required dependencies and boot up.
This is just the very first release! Over the next few months and beyond, we’ll drop more updates of Desktop. You can look forward to:
* QEMU images for Windows & macOS, and native images for x86
* Features to help use this self-hosted environment in CI jobs
* … and more! Have suggestions? Let us know.
Lastly, if you want some help with your QNX journey, you can find the QNX team and community:
...
Read the original on devblog.qnx.com »
In 2024, EFF wrote our initial blog about what could go wrong when police let AI write police reports. Since then, the technology has proliferated at a disturbing rate. Why? The most popular generative AI tool for writing police reports is Axon’s Draft One, and Axon also happens to be the largest provider of body-worn cameras to police departments in the United States. As we’ve written, companies are increasingly bundling their products to make it easier for police to buy more technology than they may need or that the public feels comfortable with.
We have good news and bad news.
Here’s the bad news: AI written police reports are still unproven, untransparent, and downright irresponsible–especially when the criminal justice system, informed by police reports, is deciding people’s freedom. The King County prosecuting attorney’s office in Washington state barred police from using AI to write police reports. As their memo read, “We do not fear advances in technology — but we do have legitimate concerns about some of the products on the market now… AI continues to develop and we are hopeful that we will reach a point in the near future where these reports can be relied on. For now, our office has made the decision not to accept any police narratives that were produced with the assistance of AI.”
In July of this year, EFF published a two-part report on how Axon designed Draft One to defy transparency. Police upload their body-worn camera’s audio into the system, the system generates a report that the officer is expected to edit, and then the officer exports the report. But when they do that, Draft One erases the initial draft, and with it any evidence of what portions of the report were written by AI and what portions were written by an officer. That means that if an officer is caught lying on the stand — as shown by a contradiction between their courtroom testimony and their earlier police report — they could point to the contradictory parts of their report and say, “the AI wrote that.” Draft One is designed to make it hard to disprove that.
In this video of a roundtable discussion about Draft One, Axon’s senior principal product manager for generative AI is asked (at the 49:47 mark) whether or not it’s possible to see after-the-fact which parts of the report were suggested by the AI and which were edited by the officer. His response (bold and definition of RMS added):
“So we don’t store the original draft and that’s by design and that’s really because the last thing we want to do is create more disclosure headaches for our customers and our attorney’s offices—so basically the officer generates that draft, they make their edits, if they submit it into our Axon records system then that’s the only place we store it, if they copy and paste it into their third-party RMS [records management system] system as soon as they’re done with that and close their browser tab, it’s gone. It’s actually never stored in the cloud at all so you don’t have to worry about extra copies floating around.”
All of this obfuscation also makes it incredibly hard for people outside police departments to figure out if their city’s officers are using AI to write reports–and even harder to use public records requests to audit just those reports. That’s why this year EFF also put out a comprehensive guide to help the public make their records requests as tailored as possible to learn about AI-generated reports.
Ok, now here’s the good news: People who believe AI-written police reports are irresponsible and potentially harmful to the public are fighting back.
This year, two states have passed bills that are an important first step in reining in AI police reports. Utah's SB 180 mandates that police reports created in whole or in part by generative AI carry a disclaimer that the report contains content generated by AI. It also requires officers to certify that they checked the report for accuracy. California's SB 524 went even further. It requires police to disclose, on the report, if AI was used in whole or in part to author it. Further, it bans vendors from selling or sharing the information a police agency provided to the AI. The bill also requires departments to retain the first draft of the report so that judges, defense attorneys, or auditors can readily see which portions of the final report were written by the officer and which portions were written by the computer.
In the coming year, anticipate many more states joining California and Utah in regulating, or perhaps even banning, police from using AI to write their reports.
This article is part of our Year in Review series. Read other articles about the fight for digital rights in 2025.
...
Read the original on www.eff.org »
T-Ruby is an open source project. Your contribution makes a difference.
It’s still experimental. The core compiler works, but there’s much to improve.
Feedback and suggestions are always welcome!
...
Read the original on type-ruby.github.io »
If you’ve heard “Grok” thrown around lately, you’re probably thinking of Elon’s chatbot from xAI. We’re not talking about that one. That model isn’t particularly good, but its whole value prop is being politically incorrect so you can get it to say edgy things.
The company Nvidia bought is Groq (with a Q). Totally different beast.
If you’ve used any high quality LLM, you’ve noticed it takes a while to generate a response. Especially for something rapid fire like a conversation, you want high quality AND speed. But speed is often what gets sacrificed. There’s always that “thinking… gathering my notes… taking some time to form the best response” delay.
Groq specialized in hardware and software that makes this way faster. They created a new type of chip called an LPU (Language Processing Unit). It’s based on an ASIC, an application specific integrated circuit. If that’s confusing, don’t worry about it. It’s just a processor that does a specific type of task really well.
So imagine you’re talking to Gemini and it takes a couple seconds to respond. Now imagine it responded instantly, like 10 or 100 times faster. That’s the problem Groq was solving.
To go one level deeper on LPUs versus GPUs (the processors most LLMs run on, typically Nvidia cards): those GPU calculations have to access a lot of things in memory. Nvidia’s chips depend on HBM, high bandwidth memory. But LPUs use something called SRAM that’s much faster to reference.
Think about it like this. Your wife has a grocery list for you. You go to the store but forget the list. Every time you’re in an aisle, you have to call her on the phone: “Hey, do I need anything from the bread aisle?” Get the bread. Put the phone down. Go to the next aisle. “Hey, do I need anything from canned goods?” And so on through produce, meat, pick up the beer, check out, get home. Very inefficient.
Groq’s approach is like you just took the list to the store. Get to a new aisle, look at the list. Next aisle, look at the list. Much faster than a phone call.
That’s the key difference. Nvidia GPUs are phoning out every time they hit a new aisle. Groq’s LPUs mean the shopper has the list in their pocket.
Groq’s main offering is GroqCloud. An engineer like me isn’t going to go out and buy an LPU (I don’t even know if they’re commercially available). What I’m going to do is, if I need lightning fast response in an application I’m building, I’ll use GroqCloud. That inference happens at an extremely fast rate, running on LPUs in a data center somewhere.
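As a rough sketch of what that looks like in practice: GroqCloud exposes an OpenAI-compatible API, so the standard client can be pointed at it. The endpoint and model name below are examples and may well have changed by the time you read this.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible endpoint
    api_key="YOUR_GROQ_API_KEY",
)

resp = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # an open-source model served on Groq hardware
    messages=[{"role": "user", "content": "Summarize this lap's tire telemetry in one sentence."}],
)
print(resp.choices[0].message.content)
```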
Their value prop is: fast, cheap, low energy.
Where they’ve been falling short is they mostly use open source models. Llama, Mistral, GPT-OSS (OpenAI’s open source offering). These are decent models, but nowhere near the quality of something like Anthropic’s Opus 4.5 or Gemini 3 Pro.
Groq positioned themselves for hardcore use cases where milliseconds matter. One of their big industry fits is real time data analysis for Formula 1. When your driver’s out there on the track, you don’t have time to send a query to Gemini and wait 20 to 30 seconds to figure out if you should pit this guy for new tires this lap. You want something like Groq which does pretty good analysis, really really really fast.
This is insider baseball you’re going to miss from mainstream news headlines. The media is paying attention to Grok (Elon’s chatbot), not Groq (the chip company). This isn’t a conspiracy, it’s just that the only people aware of Groq are software developers and tech nuts like me.
This is a canary in the coal mine for worse things to come in 2026.
About a year ago, Groq announced a $1.5 billion infrastructure investment deal with Saudi Arabia. They also secured a $750 million Series D funding round. These are crazy multiples even for a software company that’s somewhat leveraged in hardware. Bubble level projections.
To visualize $1.5 billion: if you cashed that check out in $100 bills and stacked them one on top of another, it would reach a five story building. For ordinary plebeians like us, at the average US salary of around $75K, you’d need to work 20,000 years to earn that.
At that time, the company was valued at $2 billion. Hello bubble.
Then in maybe one of the best rug pulls of all time, in July they quietly changed their revenue projections to $500 million. A 75% cut in four months. I’ve never seen anything like that since the 2008 financial crisis.
This was a company valued at $2 billion, enough that the government of Saudi Arabia was investing at that valuation. Then they took a 75% haircut four months later without anything major happening.
If it can happen to Groq, who else can it happen to? Nvidia? Intel?
The rumors started flying on Christmas Eve. Confirmed the 26th: Nvidia will be buying Groq, their key product line, and key personnel including the CEO, for $20 billion.
Let’s walk through that again:
* About a year ago: a $1.5 billion Saudi infrastructure deal, a $750 million Series D, and a $2 billion valuation
* July: revenue projections quietly cut to $500 million, a 75% haircut in four months
* December: Nvidia buys them for $20 billion after they took a 75% haircut
What is going on? This is a bubble.
The only explanation is this is a fear purchase. Groq was promising faster, cheaper, more efficient, less electricity use for their chips. But they couldn’t scale fast enough to compete with the vendor lock in and buddy system Nvidia has going.
Nvidia’s buying them with their insanely inflated war chest. They don’t want a chunk taken out of their market share. They can’t afford to take that chance. So it’s like they’re just saying: “Shut up, take the $20 billion, walk away from this project.”
This is a sign that in order to succeed, Nvidia needs a monopoly on the market. Otherwise they would not pay ten times the company’s valuation that was then decreased by 75%. This is a desperate move to consolidate the market into “you have to go with us.”
Saudi Arabia didn’t keep that $1.5 billion sitting around. They redirected it to Nvidia and AMD instead. Nvidia still gets paid ofc.
Smaller competitors like Cerebras and Inflection, doing things in Groq’s space or exploring different architectures for AI inference, are canceling IPOs, dropping like flies, seeking emergency funding. The chatter I’m hearing from VCs and friends in that world? Ain’t nobody buying it right now.
Google made their own chip. Microsoft and Amazon are racing to make competition chips that run on less electricity, are more efficient, faster. But no matter what anybody does, the market is consolidating around the Nvidia monopoly for AI hardware. Engineers like me and architects at large enterprises are trying to escape this. Once they consolidate enough of the market, they can set their price for chips and usage. If you don’t own them, you go through Oracle or some cloud computing service, and they can charge whatever they want because there will be no competitors. Even a competitor having a rough time but getting some traction? They just buy them out for $20 billion because with this monopoly going, that’s pocket change.
The AI infrastructure boom ran on one large errant assumption: that power is cheap and plentiful. That assumption was very, very wrong.
We’ve seen parts of the power grid fail, go unmaintained. There’s a great article on Palladium called “The Competency Crisis” that explains some of what’s going on in the US right now. Electricity is now the bottleneck. It’s the constraint. Too expensive or you can’t get enough of it. Who’s paying these costs? The tech companies aren’t. Trump is meeting with Jensen Huang, with Sam Altman. He hasn’t been over my place lately. He hasn’t invited you to the White House to talk about how you can’t afford groceries and eggs cost three times what they did a few years ago.
You and I are going to pay to subsidize the electricity constraints.
US data centers are using about 4% of all electricity right now. If growth continues at the same rate, in ten years they’ll use 9%. Almost one tenth of all electricity generated in the US.
I’m not much of an environmentalist. But even someone like me, pretty jaded to some amount of waste because of industrialization, something like this actually makes my stomach turn. In places with lots of data centers, AI heavy regions, they’re experiencing about a 250% increase in energy costs compared to just five years ago. That’s huge. Even compared to grocery costs, which are out of control. At least with food you have alternatives. Red meat too expensive? Buy chicken. Chicken too expensive? Buy eggs. But electricity? You have to run electricity. It’s a public utility. You can’t just “use a lot less” when your bill goes up 2.5x.
Let me walk you through what happens. Some business development guy from Oracle wants to build a new data center in Rural America. A year before it’s even built, they meet with city officials, maybe the governor. They grease the right people. Get legislation passed with the utility companies that says they’ll pay a preferential rate for electricity because they’re going to use a shit ton.
They’re Oracle or Nvidia, they’re good for it. They pay five years upfront. Then the utility decides what to do with the rest of the electricity. The grid is strained. They want everyone else to use less. They can’t just tell them to use less, so they keep raising the price until naturally you just can’t afford to use more.
You turn the lights off. Turn the TV off. Get those LED bulbs that disrupt your sleep. Do everything to keep electricity costs down. But you and I are left holding the bag. The data center folks aren’t paying for that, they paid upfront at a preferential rate.
Senate Democrats are allegedly investigating this. I’ll believe it when I see it. These tech companies have their hooks so far into influential politicians. It’s a vote grab. I’d be happy to be wrong about that, but I know I’m not going to be.
I talked about this in my last piece on the AI bubble. This computer communism that’s going on, pricing you out of personal computing, keeping it all in the family of these tech companies. But with the Groq deal confirmed, it’s gone one step further. Nvidia is not just selling chips anymore. They are lending money to customers who buy the chips. They are artificially inflating demand.
It’s like if I run a small grocery store and I need to show good signal to investors that I’m bringing cash in the door. So I go downtown during farmer’s market and give everybody a $20 voucher to use at my store. I take a wholesale hit when they come in and buy stuff. But what I can show is: “Hey, people are coming in and spending money. Look at the revenue. Give me more venture capital money.”
This can’t work infinitely, for the same reason any perpetual motion machine can’t work.
Back in September, Nvidia announced a $100 billion investment in OpenAI. Coming through in $10 billion tranches over time. And it’s not equity, it’s lease agreements for Nvidia’s chips. Literally paying them to pay themselves. There’s probably some tax shenanigans going on there since they’re typically offering $xxx in leases of their chips to the company they’re “lending” to. Assuming they do this in part because the depreciation on the physical asset (the chips) can be written off on their taxes somehow. They’re essentially incentivizing another person to use them with money. Even OpenAI’s CFO has admitted, quote: “Most of the money will go back to Nvidia.”
They’re playing both sides. In the OpenAI case, they’re financing software that uses their chips. But they’re also getting their hooks into data centers.
CoreWeave: they have something like 7% share in the company. Worth $3 billion. That’s funded by GPU revenue from CoreWeave.
Lambda: another data center operator. That’s a $1.3 billion spending deal. Renting Nvidia’s own chips back from them.
They’ve pledged to invest £2 billion across British startups, which of course are going to go back to Nvidia chips one way or another.
In 2024, they invested about $1 billion across startups and saw a $24 billion return in chip spend. They 24x’d their billion dollar investment in one year. Nvidia has more power than the Fed right now. More power than the president over the economy. They have their hand on the knob of the economy. They can choose how fast or slow they want it to go. If the Nvidia cash machine stops printing, if they stop funding startups, data centers, hardware companies, software companies, that whole part of the economy slows way down and maybe crashes if investors get spooked.
I’m waiting for somebody to blow the whistle on this. I’m not a finance guy, so it’s strange I’m even talking about it. But their entire success story for the next couple years hinges on their $100 billion investment in OpenAI, which they’re expecting to bring back about $300 billion in chip purchases.
It’s vendor financing. It’s sweetening the pot on your own deals. I cannot believe more people are not talking about this.
OpenAI, the leader of this space, the company whose CEO Sam Altman is invited to the White House numerous times, probably has a direct line to Trump, a lot of the economy hinges on this guy’s strategy, opinions, and public statements.
And he runs a company that is not profitable. Actually insane if you think about it. All he’s done with that company, from an economics point of view, is rack up debt. Spent more than he’s earned.
By that metric, I’m richer than Sam Altman. Not in net worth. But if I consider myself a business and the fact that I bring in any salary at all, even a dollar a year, that would make me earn more than Sam Altman, who has only lost money. In the next few years, they’re expected to burn something like $75 billion a year. Just set that money on fire. I have a credit card with no preset spending limit, but I assume if I run up a $75 billion charge it’s going to get denied. By their projections, they think they’ll become profitable around 2029/2030, and they need $200 billion in annual revenue to offset their debts and losses.
To visualize $200 billion: if you cashed that out in $100 bills and stacked them, it would reach halfway to the International Space Station. That’s how much they’d have to make every single year to just be profitable. Not be a massively successful company. Just to not spend more than they earn.
* 2028 (projected): Will lose $74 billion that year. Just lose it.
They’d need to make $200 billion in a year to offset that. Are they including interest? I have no idea how these things work, but in simple terms: they’re spending a lot more money than they make.
Groq was kind of a one off because Nvidia panic bought the competition. But they also need to figure out how to get prices down or they can’t keep this money machine moving.
Groq is probably one of the last companies that caught a good lifeboat off a sinking ship.
What we're going to see this year: it'll start small, but get major. A company first, then multiple larger ones, that had a 2025 valuation, will go to raise. They'll do a 409A valuation. A bunch of smart analysts will say what it's worth to investors. And you're going to see the valuation drop. They won't be able to raise money.
Then the shit is really going to start to hit the fan.
The dominoes will start falling. That’s probably what kicks off the actual pop, and it’s imminent. Any day now.
Part of the big gamble they’ve sold investors is: we’ve got to replace you. The worker. That’s why we need to spend so much money, go into debt. It’ll all be worth it because then we won’t have to pay people to do these jobs anymore.
We’re going to continue to see massive labor displacement. To a degree this is a shell game, an investor illusion. What these larger enterprise companies are hoping to do: cut a lot of folks, have big layoffs, say it’s because AI is replacing the jobs.
In some cases they’re right. Data entry, for example. I don’t mean to be mean if you do data entry for a living, but there are models very good at that now. You need someone to manage the work, spot check it. But it’s kind of a job AI can do.
Like how grocery stores have automatic checkout lines now with one person monitoring six or eight of them. So some of that’s real. A lot of it isn’t though.
They cut a bunch of American workers under the guise that AI’s replacing workers. A lot of these megacorp execs are actually convinced of it. Americans are expensive, especially tech workers.
Then they see: damn, maybe we could have cut some, but not as many. We got greedy. Now our services are failing in production. AWS in Northern Virginia is flaky, going down again. That just happened, by the way, direct result of these layoffs.
So instead they think, “We’ll look globally! Spin up a campus in Hyderabad!” Pay them way less. The cost of living is less there, they expect less. Bring them over on H-1B when needed. I’ve written about the H-1B program. This is nothing against the individuals on that program. I’ve worked with very talented H-1Bs, and some very inferior ones, just like American citizens.
But the corporate sleight of hand is something like this: we can get those H-1B visas, and they're not going to ask for pesky stuff like Sunday off for church. We can put them on call 24/7 and they can't say no, because if they do, we kick them back to their country. Same thing that's happened with migrant labor in farming over the past century. Even if it doesn't look like it on the surface, corporations know that they can pay H-1B employees less than American citizens. If a US citizen and an H-1B recipient both make $120k but the H-1B works double the hours because they have no room to push back and are under threat of being sent home, they are making 50% less than the American per hour.
Salesforce: My favorite one. 4,000 customer support roles cut. And the CEO is gloating about it in interviews. So great that he can replace workers!
Now Salesforce is admitting that they fucked up. They cut too many people.
I’ve been on the other side of this when I was an engineering director at a Fortune 500 company. They were neurotic about tracking AI use. Spending an exorbitant amount of money on shitty AI tools. Like tools from a year ago. GPT-4o in the GPT-5 era, in the Anthropic dominance era. More or less useless.
Not only would they monitor all usage across employees, specifically who’s using it how much, they could see every single message being sent to the AI. So theoretically you could check somebody’s queries and do a performance evaluation based on where you perceive them to be.
They’re using yesterday’s tools because of regulation and compliance, blowing an absurd amount of money. The CEO just sees a lot going out the door: “I thought these were supposed to save money, what’s going on here?” So his lieutenants have to get a grip on it, monitor everything. Even at Amazon this is being included in performance reviews, how much they’re using AI.
That's why, if you're a software developer, your boss is on you about using AI tools: they're probably getting pressure from their bosses. I want my engineers to be as productive as possible, and I think AI is probably part of that tool belt for everyone at this point. But is tracking it really the best way? I'm going to gauge performance on metrics, how you interacted with the team, what you shipped. It puts the cart before the horse to say you're paid by how much you use these AI tools. If I'm a developer, I'm just spinning up a script to send lorem ipsum text 24 hours a day and collecting maximum ratings because I used GPT-3.5 the most.
MIT did a study in summer 2025. They’re saying 95% of companies report zero measurable ROI on AI products.
Actually not that crazy if you consider that a lot of them did layoffs and subbed in AI. That’s just going according to plan in my estimation.
They estimate about $30 to $40 billion has been spent on enterprise AI. That’s the money your JPMorgan Chase is spending for their engineers to use Claude Code or whatever dated tool they have access to.
The market heard that signal. When that study came out:
That last one isn’t encouraging for an AI bubble pop because it indicates this is a big part of the economy.
January through February: Maybe down valuations, just a very flat market without much growth.
Q1 to Q2: We’ll start to see a couple businesses, or maybe one major one at first like a domino starting to fall, not able to raise capital at their 2025 valuation. We’ll see valuations go down. VCs will be like: “Not touching it, not giving them more money, cutting our losses.”
Then timing a little more indeterminate but these things will happen quickly in succession:
* Debt refinancing pressure builds up in the system
* Nvidia revises its revenue guidance to something at least vaguely linked to reality
And that’s when the big reckoning begins.
I want to be clear. AI is not going anywhere. It’s going to continue being a mainstream part of the world. I like a lot of these tools, I think they’re very helpful.
But the talking heads have really promised us the world. It’s clear the technology cannot deliver above and beyond what it’s doing now.
We've seen progress on these models slow dramatically over the past few releases. Each release is better, but the gap between the current release and the last release is much smaller than it used to be.
Because of that, a lot of these AI companies are going to survive, but their valuations are going to get giga slashed.
Valuations of $500 billion for OpenAI, $42 billion for Anthropic are unsustainable. We’re going to actually see them become unsustainable in 2026 as they’re eventually cut. Smaller companies will face those slashes first. But it’s coming for the major AI labs as well.
This is good news, honestly. This AI hype has really turned tech into something different these days, different than it used to be. While I don’t feel AI is going anywhere, I do feel we’ll get a little more back to normal once this bubble pops and resolves.
What do you think? Drop a comment below. Subscribe if you found this interesting.
...
Read the original on blog.drjoshcsimmons.com »
Last month I came across onemillionscreenshots.com and was pleasantly surprised at how well it worked as a tool for discovery. We all know the adage about judging book covers, but here, …it just kinda works. Skip over the sites with bright flashy colors begging for attention and instead, seek out the negative space in between.
The one nitpick I have though is in how they sourced the websites. They used the most popular websites from Common Crawl which is fine, but not really what I’m interested in…
There are of course exceptions, but the relationship between popularity and quality is loose. McDonald's isn't popular because it serves the best cheeseburger, it's popular because it serves a cheeseburger that meets the minimum level of satisfaction for the maximum number of people. It's a profit-maximizing local minimum on the cheeseburger landscape.
This isn’t limited to just food either, the NYT Best Sellers list, Spotify Top 50, and Amazon review volume are other good examples. For me, what’s “popular” has become a filter for what to avoid. Lucky for us though, there’s a corner of the internet where substance still outweighs click-through rates. A place that’s largely immune to the corrosive influence of monetization. It’s called the small web and it’s a beautiful place.
The timing of this couldn’t have been better. I’m currently working on a couple of tools specifically focused on small web discovery/recommendation and happen to already have most of the data required to pull this off. I just needed to take some screenshots, sooo… you’re welcome armchairhacker!
Because I plan on discussing how I gathered the domains in the near future, I'll skip it for now (it's pretty interesting). Suffice it to say though, once the domains are available, capturing the screenshots is trivial. And once those are ready, we have a fairly well worn path to follow: embed each screenshot, reduce the dimensionality, and assign each site a spot on a 2D grid.
I find the last two steps particularly repetitive so I decided to combine them this time via self-organizing maps (SOMs). I tried using SOMs a few years ago to help solve a TSP problem (well, actually the exact opposite…) but ended up going in a different direction. Anyway, despite their trivial implementation they can be extremely useful. A bare bones SOM clocks in at about 10 lines with torch.
At their core, most SOMs have two elements: a monotonically decreasing learning rate and a neighborhood function with an influence (radius) that is also monotonically decreasing. During training, each step consists of the following: draw a sample, find the grid node whose weights best match it (the best-matching unit, or BMU), and pull the BMU and its grid neighbors toward the sample, scaled by the current learning rate and neighborhood influence.
There are numerous modifications that can be made, but that’s basically it! If I’ve piqued your interest, I highly recommend the book Self-Organizing Maps by Teuvo Kohonen, it’s a fairly quick read and covers the core aspects of SOMs.
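For the curious, here is roughly what such a bare-bones SOM looks like in torch. This is my own illustrative sketch, not the code behind the map.

```python
import torch

def train_som(data, grid_h, grid_w, steps, lr0=0.5, sigma0=None):
    """Minimal rectangular SOM; `data` is an (N, D) tensor of embeddings."""
    n = data.shape[0]
    sigma0 = sigma0 or max(grid_h, grid_w) / 2.0
    ys, xs = torch.meshgrid(torch.arange(grid_h), torch.arange(grid_w), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=1).float()
    weights = data[torch.randint(0, n, (grid_h * grid_w,))].clone()

    for t in range(steps):
        frac = t / steps
        lr = lr0 * (1.0 - frac)               # monotonically decreasing learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3  # shrinking neighborhood radius
        x = data[torch.randint(0, n, (1,))]   # draw a random sample
        bmu = torch.argmin(((weights - x) ** 2).sum(dim=1))     # best-matching unit
        dist2 = ((coords - coords[bmu]) ** 2).sum(dim=1)        # grid distance to BMU
        h = torch.exp(-dist2 / (2 * sigma ** 2)).unsqueeze(1)   # neighborhood influence
        weights += lr * h * (x - weights)     # pull BMU and neighbors toward the sample
    return weights, coords
```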
With dimensionality reduction and assignment resolved, we just need the visual embeddings now. I started with the brand new DinoV3 model, but was left rather disappointed. The progression of Meta’s self-supervised vision transformers has been truly incredible, but the latent space captures waaay more information than what I actually need. I just want to encode the high level aesthetic details of webpage screenshots. Because of this, I fell back on an old friend: the triplet loss on top of a small encoder. The resulting output dimension of 64 afforded ample room for describing the visual range while maintaining a considerably smaller footprint.
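As a hedged sketch of that training signal (the actual encoder architecture isn't described here, so the layers below are placeholders):

```python
import torch
import torch.nn as nn

# Placeholder encoder mapping a flattened screenshot to a 64-d embedding.
encoder = nn.Sequential(nn.Linear(3 * 64 * 64, 256), nn.ReLU(), nn.Linear(256, 64))
criterion = nn.TripletMarginLoss(margin=1.0)

# Anchor and positive should look alike, negative should not (batch of 8 fake inputs).
anchor, positive, negative = (torch.randn(8, 3 * 64 * 64) for _ in range(3))
loss = criterion(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```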
This got me 90% of the way there, but it was still lacking the visual layout I had envisioned. I wanted a stronger correlation with color at the expense of visual similarity. To achieve this, I had to manually enforce this bias by training two SOMs in parallel. One SOM operated on the encoder output (visual), the second on the color distribution, and the two were linked as follows:
When the quantization error is low, the BMU pulling force is dominated by the visual similarity. As quantization error increases, the pulling force due to visual similarity wanes and is slowly overpowered by the pulling force from the color distribution. In essence, the color distribution controls the macro placement while the visual similarity controls the micro placement. The only controllable hyperparameter with this approach is selecting a threshold for where the crossover point occurs.
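The exact linking formula isn't reproduced here, but one plausible way to write that crossover (purely my guess at an illustration, not the author's code) is a quantization-error-gated blend of the two BMU distances:

```python
import torch

def blended_bmu(d_visual, d_color, q_err, q_threshold):
    """Pick a BMU from a blend: visual similarity dominates at low quantization
    error, and the color distribution takes over past the threshold."""
    w = torch.sigmoid(q_err - q_threshold)    # crossover weight in [0, 1]
    score = (1 - w) * d_visual + w * d_color  # per-node blended distance
    return torch.argmin(score)
```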
I didn’t spend much time trying to find the optimal point, it’s currently peak fall and well, I’d much rather be outside. A quick look at the overall quantization error (below left) and the U-matrix (below right) was sufficient.
There’s still a lot of cruft that slipped in (substack, medium.com, linkedin, etc…) but overall, I’d say it’s not too bad for a first pass. In the time since generating this initial map I’ve already crawled an additional ~250k new domains so I suppose this means I’ll be doing an update. What I do know for certain though is that self-organizing maps have earned a coveted spot in my heart for things that are simple to the point of being elegant and yet, deceptively powerful (the others of course being panel methods, LBM, Metropolis-Hastings, and the bicycle).
...