10 interesting stories served every morning and every evening.
For eight years, I’ve wanted a high-quality set of devtools for working with SQLite. Given how important SQLite is to the industry, I’ve long been puzzled that no one has invested in building a really good developer experience for it.
A couple of weeks ago, after ~250 hours of effort over three months on evenings, weekends, and vacation days, I finally released syntaqlite (GitHub), fulfilling this long-held wish. And I believe the main reason this happened was AI coding agents.
Of course, there’s no shortage of posts claiming that AI one-shot their project or pushing back and declaring that AI is all slop. I’m going to take a very different approach and, instead, systematically break down my experience building syntaqlite with AI, both where it helped and where it was detrimental.
I’ll do this while contextualizing the project and my background so you can independently assess how generalizable this experience was. And whenever I make a claim, I’ll try to back it up with evidence from my project journal, coding transcripts, or commit history.
In my work on Perfetto, I maintain a SQLite-based language for querying performance traces called PerfettoSQL. It’s basically the same as SQLite but with a few extensions to make the trace querying experience better. There are ~100K lines of PerfettoSQL internally in Google and it’s used by a wide range of teams.
Having a language which gets traction means your users also start expecting things like formatters, linters, and editor extensions. I’d hoped that we could adapt some SQLite tools from open source but the more I looked into it, the more disappointed I was. What I found either wasn’t reliable enough, fast enough, or flexible enough to adapt to PerfettoSQL. There was clearly an opportunity to build something from scratch, but it was never the “most important thing we could work on”. We’ve been reluctantly making do with the tools out there but always wishing for better.
On the other hand, there was the option to do something in my spare time. I had built lots of open source projects in my teens but this had faded away during university when I felt that I just didn’t have the motivation anymore. Being a maintainer is much more than just “throwing the code out there” and seeing what happens. It’s triaging bugs, investigating crashes, writing documentation, building a community, and, most importantly, having a direction for the project.
But the itch of open source (specifically freedom to work on what I wanted while helping others) had never gone away. The SQLite devtools project was eternally in my mind as “something I’d like to work on”. But there was another reason why I kept putting it off: it sits at the intersection of being both hard and tedious.
If I were going to invest my personal time in this project, I didn’t want to build something that only helped Perfetto: I wanted to make it work for any SQLite user out there. And this means parsing SQL exactly like SQLite.
The heart of any language-oriented devtool is the parser. This is responsible for turning the source code into a “parse tree” which acts as the central data structure anything else is built on top of. If your parser isn’t accurate, then your formatters and linters will inevitably inherit those inaccuracies; many of the tools I found suffered from having parsers which approximated the SQLite language rather than representing it precisely.
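To make the parse-tree idea concrete, here is a toy sketch of what such a central data structure might look like. All node kinds and names here are illustrative inventions, not syntaqlite’s actual types:

```python
from dataclasses import dataclass, field

# A hypothetical, minimal parse-tree shape for illustration only: a real
# SQL parser tracks many more node kinds, source positions, and trivia
# like comments and whitespace.
@dataclass
class Node:
    kind: str                      # e.g. "select_stmt", "column_ref"
    text: str = ""                 # token text, for leaf nodes
    children: list["Node"] = field(default_factory=list)

def tree_to_sexpr(node: Node) -> str:
    """Render the tree as an s-expression for easy inspection."""
    if not node.children:
        return f'({node.kind} "{node.text}")' if node.text else f"({node.kind})"
    inner = " ".join(tree_to_sexpr(c) for c in node.children)
    return f"({node.kind} {inner})"

# Roughly how `SELECT name FROM users` might decompose:
tree = Node("select_stmt", children=[
    Node("keyword", "SELECT"),
    Node("column_ref", "name"),
    Node("keyword", "FROM"),
    Node("table_ref", "users"),
])
print(tree_to_sexpr(tree))
```

Every downstream tool (formatter, linter, language server) walks a structure like this, which is why inaccuracies in the parser propagate everywhere.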
Unfortunately, unlike many other languages, SQLite has no formal specification describing how it should be parsed. It doesn’t expose a stable API for its parser either. In fact, quite uniquely, in its implementation it doesn’t even build a parse tree at all! The only reasonable approach left in my opinion is to carefully extract the relevant parts of SQLite’s source code and adapt it to build the parser I wanted.
This means getting into the weeds of SQLite source code, a fiendishly difficult codebase to understand. The whole project is written in C in an incredibly dense style; I’ve spent days just understanding the virtual table API and implementation. Trying to grasp the full parser stack was daunting.
There’s also the fact that there are >400 rules in SQLite which capture the full surface area of its language. I’d have to specify in each of these “grammar rules” how that part of the syntax maps to the matching node in the parse tree. It’s extremely repetitive work; each rule is similar to all the ones around it but also, by definition, different.
And it’s not just the rules: there’s also coming up with and writing tests to make sure it’s correct, debugging if something is wrong, and triaging and fixing the inevitable bugs people would file when I got something wrong…
For years, this was where the idea died. Too hard for a side project, too tedious to sustain motivation, too risky to invest months into something that might not work.
I’ve been using coding agents since early 2025 (Aider, Roo Code, then Claude Code since July) and they’d definitely been useful but never something I felt I could trust a serious project to. But towards the end of 2025, the models seemed to make a significant step forward in quality. At the same time, I kept hitting problems in Perfetto which would have been trivially solved by having a reliable parser. Each workaround left the same thought in the back of my mind: maybe it’s finally time to build it for real.
I got some space to think and reflect over Christmas and decided to really stress test the most maximalist version of AI: could I vibe-code the whole thing using just Claude Code on the Max plan (£200/month)?
Through most of January, I iterated, acting as semi-technical manager and delegating almost all the design and all the implementation to Claude. Functionally, I ended up in a reasonable place: a parser in C extracted from SQLite sources using a bunch of Python scripts, a formatter built on top, support for both the SQLite language and the PerfettoSQL extensions, all exposed in a web playground.
But when I reviewed the codebase in detail in late January, the downside was obvious: the codebase was complete spaghetti. I didn’t understand large parts of the Python source extraction pipeline, functions were scattered in random files without a clear shape, and a few files had grown to several thousand lines. It was extremely fragile; it solved the immediate problem but it was never going to cope with my larger vision, never mind integrating it into the Perfetto tools. The saving grace was that it had proved the approach was viable and generated more than 500 tests, many of which I felt I could reuse.
I decided to throw away everything and start from scratch while also switching most of the codebase to Rust. I could see that C was going to make it difficult to build the higher level components like the validator and the language server implementation. And as a bonus, it would also let me use the same language for both the extraction and runtime instead of splitting it across C and Python.
More importantly, I completely changed my role in the project. I took ownership of all decisions and used it more as “autocomplete on steroids” inside a much tighter process: opinionated design upfront, reviewing every change thoroughly, fixing problems eagerly as I spotted them, and investing in scaffolding (like linting, validation, and non-trivial testing) to check AI output automatically.
The core features came together through February and the final stretch (upstream test validation, editor extensions, packaging, docs) led to a 0.1 launch in mid-March.
But in my opinion, this timeline is the least interesting part of this story. What I really want to talk about is what wouldn’t have happened without AI and also the toll it took on me as I used it.
I’ve written in the past about how one of my biggest weaknesses as a software engineer is my tendency to procrastinate when facing a big new project. Though I didn’t realize it at the time, it could not have applied more perfectly to building syntaqlite.
AI basically let me put aside all my doubts on technical calls, my uncertainty about building the right thing, and my reluctance to get started by giving me very concrete problems to work on. Instead of “I need to understand how SQLite’s parsing works”, it was “I need to get AI to suggest an approach for me so I can tear it up and build something better”. I work so much better with concrete prototypes to play with and code to look at than endlessly thinking about designs in my head, and AI lets me get to that point at a pace I could not have dreamed about before. Once I took the first step, every step after that was so much easier.
AI turned out to be better than me at the act of writing code itself, assuming that code is obvious. If I can break a problem down to “write a function with this behaviour and parameters” or “write a class matching this interface,” AI will build it faster than I would and, crucially, in a style that might well be more intuitive to a future reader. It documents things I’d skip, lays out code consistently with the rest of the project, and sticks to what you might call the “standard dialect” of whatever language you’re working in.
That standardness is a double-edged sword. For the vast majority of code in any project, standard is exactly what you want: predictable, readable, unsurprising. But every project has pieces that are its edge, the parts where the value comes from doing something non-obvious. For syntaqlite, that was the extraction pipeline and the parser architecture. AI’s instinct to normalize was actively harmful there, and those were the parts I had to design in depth and often resorted to just writing myself.
But here’s the flip side: the same speed that makes AI great at obvious code also makes it great at refactoring. If you’re using AI to generate code at industrial scale, you have to refactor constantly and continuously. If you don’t, things immediately get out of hand. This was the central lesson of the vibe-coding month: I didn’t refactor enough, the codebase became something I couldn’t reason about, and I had to throw it all away. In the rewrite, refactoring became the core of my workflow. After every large batch of generated code, I’d step back and ask “is this ugly?” Sometimes AI could clean it up. Other times there was a large-scale abstraction that AI couldn’t see but I could; I’d give it the direction and let it execute. If you have taste, the cost of a wrong approach drops dramatically because you can restructure quickly.
Of all the ways I used AI, research had by far the highest ratio of value delivered to time spent.
I’ve worked with interpreters and parsers before but I had never heard of Wadler-Lindig pretty printing. When I needed to build the formatter, AI gave me a concrete and actionable lesson from a point of view I could understand and pointed me to the papers to learn more. I could have found this myself eventually, but AI compressed what might have been a day or two of reading into a focused conversation where I could ask “but why does this work?” until I actually got it.
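For the curious, here is a drastically simplified sketch of the Wadler-style idea: build a document of text, line-break, and group nodes, then render each group flat if it fits the line width and broken otherwise. Real implementations add nesting, group-local fitting, and more; everything here is illustrative, not syntaqlite’s actual code.

```python
def text(s):      return ("text", s)
def line():       return ("line",)          # a space when flat, a newline when broken
def group(*docs): return ("group", list(docs))

def flat(doc):
    """Render a document as if it all fit on one line."""
    kind = doc[0]
    if kind == "text":
        return doc[1]
    if kind == "line":
        return " "
    return "".join(flat(d) for d in doc[1])

def render(doc, width, indent=2):
    kind = doc[0]
    if kind == "text":
        return doc[1]
    if kind == "line":
        return " "
    # group: keep it flat if it fits in the width, otherwise break
    # every line() into a newline plus indentation
    if len(flat(doc)) <= width:
        return flat(doc)
    parts = []
    for d in doc[1]:
        parts.append("\n" + " " * indent if d[0] == "line" else render(d, width, indent))
    return "".join(parts)

sql = group(text("SELECT"), line(), text("id,"), line(), text("name"),
            line(), text("FROM"), line(), text("users"))
print(render(sql, width=80))   # fits: "SELECT id, name FROM users"
print(render(sql, width=20))   # too wide: breaks at each line()
```

The elegance is that the same document describes both layouts; the width budget alone decides which one you get.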
This extended to entire domains I’d never worked in. I have deep C++ and Android performance expertise but had barely touched Rust tooling or editor extension APIs. With AI, it wasn’t a problem: the fundamentals are the same, the terminology is similar, and AI bridges the gap. The VS Code extension would have taken me a day or two of learning the API before I could even start. With AI, I had a working extension within an hour.
It was also invaluable for reacquainting myself with parts of the project I hadn’t looked at for a few days. I could control how deep to go: “tell me about this component” for a surface-level refresher, “give me a detailed linear walkthrough” for a deeper dive, “audit unsafe usages in this repo” to go hunting for problems. When you’re context switching a lot, you lose context fast. AI let me reacquire it on demand.
Beyond making the project exist at all, AI is also the reason it shipped as complete as it did. Every open source project has a long tail of features that are important but not critical: the things you know theoretically how to do but keep deprioritizing because the core work is more pressing. For syntaqlite, that list was long: editor extensions, Python bindings, a WASM playground, a docs site, packaging for multiple ecosystems. AI made these cheap enough that skipping them felt like the wrong trade-off.
It also freed up mental energy for UX. Instead of spending all my time on implementation, I could think about what a user’s first experience should feel like: what error messages would actually help them fix their SQL, how the formatter output should look by default, whether the CLI flags were intuitive. These are the things that separate a tool people try once from one they keep using, and AI gave me the headroom to care about them. Without AI, I would have built something much smaller, probably no editor extensions or docs site. AI didn’t just make the same project faster. It changed what the project was.
There’s an uncomfortable parallel between using AI coding tools and playing slot machines. You send a prompt, wait, and either get something great or something useless. I found myself up late at night wanting to do “just one more prompt,” constantly trying AI just to see what would happen even when I knew it probably wouldn’t work. The sunk cost fallacy kicked in too: I’d keep at it even in tasks it was clearly ill-suited for, telling myself “maybe if I phrase it differently this time.”
The tiredness feedback loop made it worse. When I had energy, I could write precise, well-scoped prompts and be genuinely productive. But when I was tired, my prompts became vague, the output got worse, and I’d try again, getting more tired in the process. In these cases, AI was probably slower than just implementing something myself, but it was too hard to break out of the loop.
Several times during the project, I lost my mental model of the codebase. Not the overall architecture or how things fitted together, but the day-to-day details of what lived where, which functions called which, the small decisions that accumulate into a working system. When that happened, surprising issues would appear and I’d find myself at a total loss to understand what was going wrong. I hated that feeling.
The deeper problem was that losing touch created a communication breakdown. When you don’t have the mental thread of what’s going on, it becomes impossible to communicate meaningfully with the agent. Every exchange gets longer and more verbose. Instead of “change FooClass to do X,” you end up saying “change the thing which does Bar to do X”. Then the agent has to figure out what Bar is, how that maps to FooClass, and sometimes it gets it wrong. It’s exactly the same complaint engineers have always had about managers who don’t understand the code asking for fanciful or impossible things. Except now you’ve become that manager.
The fix was deliberate: I made it a habit to read through the code immediately after it was implemented and to actively ask myself, “how would I have done this differently?”
Of course, in some sense all of the above is also true of code I wrote a few months ago (hence the sentiment that AI code is legacy code), but AI makes the drift happen faster because you’re not building the same muscle memory that comes from originally typing it out.
There were some other problems I only discovered incrementally over the three months.
I found that AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say “I’ll deal with this later.” And because AI could refactor at the same industrial scale it generated code, the cost of deferring felt low. But it wasn’t: deferring decisions corroded my ability to think clearly because the codebase stayed confusing in the meantime. The vibe-coding month was the most extreme version of this. Yes, I understood the problem, but if I had been more disciplined about making hard design calls earlier, I could have converged on the right architecture much faster.
Tests created a similar false comfort. Having 500+ tests felt reassuring, and AI made it easy to generate more. But neither humans nor AI are creative enough to foresee every edge case you’ll hit in the future; there were several times in the vibe-coding phase when I’d come up with a test case and realise the design of some component was completely wrong and needed to be totally reworked. This was a significant contributor to my lack of trust and the decision to scrap everything and start from scratch.
Basically, I learned that the “normal rules” of software still apply in the AI age: if you don’t have a fundamental foundation (clear architecture, well-defined boundaries) you’ll be left eternally chasing bugs as they appear.
Something I kept coming back to was how little AI understood about the passage of time. It sees a codebase in a certain state but doesn’t feel time the way humans do. I can tell you what it feels like to use an API, how it evolved over months or years, why certain decisions were made and later reversed.
The natural problem from this lack of understanding is that you either make the same mistakes you made in the past and have to relearn the lessons or you fall into new traps which were successfully avoided the first time, slowing you down in the long run. In my opinion, this is a similar problem to why losing a high-quality senior engineer hurts a team so much: they carry history and context that doesn’t exist anywhere else and act as a guide for others around them.
In theory, you can try to preserve this context by keeping specs and docs up to date. But there’s a reason we didn’t do this before AI: capturing implicit design decisions exhaustively is incredibly expensive and time-consuming to write down. AI can help draft these docs, but because there’s no way to automatically verify that it accurately captured what matters, a human still has to manually audit the result. And that’s still time-consuming.
There’s also the context pollution problem. You never know when a design note about API A will echo in API B. Consistency is a huge part of what makes codebases work, and for that you don’t just need context about what you’re working on right now but also about other things which were designed in a similar way. Deciding what’s relevant requires exactly the kind of judgement that institutional knowledge provides in the first place.
Reflecting on the above, the pattern of when AI helped and when it hurt was fairly consistent.
When I was working on something I already understood deeply, AI was excellent. I could review its output instantly, catch mistakes before they landed and move at a pace I’d never have managed alone. The parser rule generation is the clearest example: I knew exactly what each rule should produce, so I could review AI’s output within a minute or two and iterate fast.
When I was working on something I could describe but didn’t yet know, AI was good but required more care. Learning Wadler-Lindig for the formatter was like this: I could articulate what I wanted, evaluate whether the output was heading in the right direction, and learn from what AI explained. But I had to stay engaged and couldn’t just accept what it gave me.
When I was working on something where I didn’t even know what I wanted, AI was somewhere between unhelpful and harmful. The architecture of the project was the clearest case: I spent weeks in the early days following AI down dead ends, exploring designs that felt productive in the moment but collapsed under scrutiny. In hindsight, I have to wonder if it would have been faster just thinking it through without AI in the loop at all.
But expertise alone isn’t enough. Even when I understood a problem deeply, AI still struggled if the task had no objectively checkable answer. Implementation has a right answer, at least at a local level: the code compiles, the tests pass, the output matches what you asked for. Design doesn’t. We’re still arguing about OOP decades after it first took off.
Concretely, I found that designing the public API of syntaqlite was where this hit home the hardest. I spent several days in early March doing nothing but API refactoring, manually fixing things any experienced engineer would have instinctively avoided but AI made a total mess of. There’s no test or objective metric for “is this API pleasant to use” and “will this API help users solve the problems they have” and that’s exactly why the coding agents did so badly at it.
This takes me back to the days I was obsessed with physics and, specifically, relativity. The laws of physics look simple and Newtonian in any small local area, but zoom out and spacetime curves in ways you can’t predict from the local picture alone. Code is the same: at the level of a function or a class, there’s usually a clear right answer, and AI is excellent there. But architecture is what happens when all those local pieces interact, and you can’t get good global behaviour by stitching together locally correct components.
Knowing where you are on these axes at any given moment is, I think, the core skill of working with AI effectively.
Eight years is a long time to carry a project in your head. Seeing these SQLite tools actually exist and function after only three months of work is a massive win, and I’m fully aware they wouldn’t be here without AI.
But the process wasn’t the clean, linear success story people usually post. I lost an entire month to vibe-coding. I fell into the trap of managing a codebase I didn’t actually understand, and I paid for that with a total rewrite.
The takeaway for me is simple: AI is an incredible force multiplier for implementation, but it’s a dangerous substitute for design. It’s brilliant at giving you the right answer to a specific technical question, but it has no sense of history, taste, or how a human will actually feel using your API. If you rely on it for the “soul” of your software, you’ll just end up hitting a wall faster than you ever have before.
What I’d like to see more of from others is exactly what I’ve tried to do here: honest, detailed accounts of building real software with these tools; not weekend toys or one-off scripts but the kind of software that has to survive contact with users, bug reports, and your own changing mind.
...
Read the original on lalitm.com »
AI Edge Gallery is the premier destination for running the world’s most powerful open-source Large Language Models (LLMs) on your mobile device. Experience high-performance Generative AI directly on your hardware—fully offline, private, and lightning-fast.
Now Featuring: Gemma 4
This update brings official support for the newly released Gemma 4 family. As the centerpiece of this release, Gemma 4 allows you to test the cutting edge of on-device AI. Experience advanced reasoning, logic, and creative capabilities without ever sending your data to a server.
Core Features
- Agent Skills: Transform your LLM from a conversationalist into a proactive assistant. Use the Agent Skills tile to augment model capabilities with tools like Wikipedia for fact-grounding, interactive maps, and rich visual summary cards. You can even load modular skills from a URL or browse community contributions on GitHub Discussions.
- AI Chat with Thinking Mode: Engage in fluid, multi-turn conversations and toggle the new Thinking Mode to peek “under the hood.” This feature allows you to see the model’s step-by-step reasoning process, which is perfect for understanding complex problem-solving. Note: Thinking Mode currently works with supported models, starting with the Gemma 4 family.
- Ask Image: Use multimodal power to identify objects, solve visual puzzles, or get detailed descriptions using your device’s camera or photo gallery.
- Audio Scribe: Transcribe and translate voice recordings into text in real-time using high-efficiency on-device language models.
- Prompt Lab: A dedicated workspace to test different prompts and single-turn use cases with granular control over model parameters like temperature and top-k.
- Mobile Actions: Unlock offline device controls and automated tasks powered entirely by a finetune of FunctionGemma 270m.
- Tiny Garden: A fun, experimental mini-game that uses natural language to plant and harvest a virtual garden using a finetune of FunctionGemma 270m.
- Model Management & Benchmark: Gallery is a flexible sandbox for a wide variety of open-source models. Easily download models from the list or load your own custom models. Manage your model library effortlessly and run benchmark tests to understand exactly how each model performs on your specific hardware.
- 100% On-Device Privacy: All model inferences happen directly on your device hardware. No internet is required, ensuring total privacy for your prompts, images, and sensitive data.
Built for the Community
AI Edge Gallery is an open-source project designed for the developer community and AI enthusiasts alike. Explore our example features, contribute your own skills, and help shape the future of the on-device agent ecosystem.
Check out the source code on GitHub:
https://github.com/google-ai-edge/gallery
Note: This app is in active development. Performance is dependent on your device’s hardware (CPU/GPU). For support or feedback, contact us at google-ai-edge-gallery-android-feedback@google.com.
...
Read the original on apps.apple.com »
This project exists to show that training your own language model is not magic.
No PhD required. No massive GPU cluster. One Colab notebook, 5 minutes, and you have a working LLM that you built from scratch — data generation, tokenizer, model architecture, training loop, and inference. If you can run a notebook, you can train a language model.
It won’t produce a billion-parameter model that writes essays. But it will show you exactly how every piece works — from raw text to trained weights to generated output — so the big models stop feeling like black boxes.
GuppyLM is a tiny language model that pretends to be a fish named Guppy. It speaks in short, lowercase sentences about water, food, light, and tank life. It doesn’t understand human abstractions like money, phones, or politics — and it’s not trying to.
It’s trained from scratch on 60K synthetic conversations across 60 topics, runs on a single GPU in ~5 minutes, and produces a model small enough to run in a browser.
Vanilla transformer. No GQA, no RoPE, no SwiGLU, no early exit. As simple as it gets.
* Experiences the world through water, temperature, light, vibrations, and food
* Is friendly, curious, and a little dumb
60 topics: greetings, feelings, temperature, food, light, water, tank, noise, night, loneliness, bubbles, glass, reflection, breathing, swimming, colors, taste, plants, filter, algae, snails, scared, excited, bored, curious, happy, tired, outside, cats, rain, seasons, music, visitors, children, meaning of life, time, memory, dreams, size, future, past, name, weather, sleep, friends, jokes, fear, love, age, intelligence, health, singing, TV, and more.
Downloads the pre-trained model from HuggingFace and lets you chat. Just run all cells.
pip install torch tokenizers
python -m guppylm chat
from datasets import load_dataset
ds = load_dataset("arman-bd/guppylm-60k-generic")
print(ds["train"][0])
# {'input': 'hi guppy', 'output': 'hello. the water is nice today.', 'category': 'greeting'}
Why no system prompt? Every training sample had the same one. A 9M model can’t conditionally follow instructions — the personality is baked into the weights. Removing it saves ~60 tokens per inference.
Why single-turn only? Multi-turn degraded at turn 3-4 due to the 128-token context window. A fish that forgets is on-brand, but garbled output isn’t. Single-turn is reliable.
Why vanilla transformer? GQA, SwiGLU, RoPE, and early exit add complexity that doesn’t help at 9M params. Standard attention + ReLU FFN + LayerNorm produces the same quality with simpler code.
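To illustrate what “standard attention + ReLU FFN + LayerNorm” means in practice, here is a sketch of one transformer block in plain NumPy. The weights are random and the dimensions are invented; this only shows the shapes and the math, not GuppyLM’s actual training code.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each token vector to zero mean, unit variance
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def attention(q, k, v, mask):
    # standard scaled dot-product attention with a causal mask
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)       # hide future tokens
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)   # softmax over positions
    return weights @ v

def ffn(x, w1, w2):
    return np.maximum(x @ w1, 0) @ w2           # ReLU feed-forward

rng = np.random.default_rng(0)
B, T, D = 1, 8, 64                              # batch, sequence, model dim
x = rng.normal(size=(B, T, D))
mask = np.tril(np.ones((T, T), dtype=bool))     # token t sees tokens <= t

# one block: pre-LayerNorm, attention, then FFN, with residual connections
h = x + attention(layer_norm(x), layer_norm(x), layer_norm(x), mask)
h = h + ffn(layer_norm(h), rng.normal(size=(D, 4 * D)) * 0.02,
            rng.normal(size=(4 * D, D)) * 0.02)
print(h.shape)  # (1, 8, 64)
```

(A real block would also have learned Q/K/V/output projections and multiple heads; at 9M parameters the point is that nothing fancier than this is needed.)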
Why synthetic data? A fish character with consistent personality needs consistent training data. Template composition with randomized components (30 tank objects, 17 food types, 25 activities) generates ~16K unique outputs from ~60 templates.
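The template-composition idea above can be sketched in a few lines. The templates and slot fillers here are illustrative stand-ins, not GuppyLM’s actual data; the point is just that a handful of templates crossed with randomized components multiplies into many unique samples.

```python
import random

# Hypothetical templates with {slot} placeholders, each paired with a prompt.
TEMPLATES = [
    ("what do you see?", "i see the {obj}. it is nice."),
    ("are you hungry?", "yes. i want {food}."),
    ("what are you doing?", "i am {activity}. the water helps."),
]
# Randomized components: each slot draws from its own list of fillers.
SLOTS = {
    "obj": ["castle", "pebble", "plant", "filter"],
    "food": ["flakes", "pellets", "worms"],
    "activity": ["swimming", "hiding", "resting"],
}

def sample(rng):
    prompt, template = rng.choice(TEMPLATES)
    output = template
    for slot, options in SLOTS.items():
        output = output.replace("{" + slot + "}", rng.choice(options))
    return {"input": prompt, "output": output}

rng = random.Random(42)
for _ in range(3):
    print(sample(rng))
```

With 60 templates and component lists of 17 to 30 items, the same multiplication yields the ~16K unique outputs the README describes.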
...
Read the original on github.com »
What Can Be Done
You may have heard about 25 Gbit symmetrical internet in Switzerland. This is often cited as the fastest dedicated (non-shared) residential connection in the world. However, did you ever wonder why Switzerland has such fast internet at a reasonable price while the United States and other countries like Switzerland’s neighbor Germany are falling behind?
What is the fundamental difference between the countries that leads to such a stark difference in internet speeds and prices?
Free markets, regulation, technology, or all three?
Let’s take a closer look at the situation in Switzerland, Germany, and the United States.
This article is written by me and spell-checked with AI. Many of the images are generated by AI and are there mostly to break up the wall of text.
This article is also available as a video (my first):
As mentioned, in Switzerland, you can get 25 Gigabit per second fiber internet to your home, symmetric and dedicated. If you don’t need such extreme speed, you can get 1 or 10 Gigabit from multiple competing providers for very little money. All over a connection that isn’t shared with your neighbors. In fact, someone could offer 100 Gigabit or more today; there is nothing preventing this other than the cost of endpoint equipment.
In the United States, if you’re lucky enough to have fiber, you might get 1 Gigabit. But often it’s shared with your neighbors. And you usually have exactly one choice of provider. Maybe two, if you count the cable company that offers slower speeds for the same price.
In Germany, you are in a somewhat similar situation to the United States. Fiber service is limited to one provider and is often shared with your neighbors.
The United States prides itself on free markets. On competition. On letting businesses fight it out. A deregulated market with no brakes.
Germany, on the other hand, is famous for over-regulation, making it difficult for businesses to operate, yet it is in a similar situation to the United States.
Switzerland has a highly regulated telecom sector with strong oversight and government-backed infrastructure projects, but regulations in Switzerland differ from those in Germany.
So why is the country that worships free markets producing stagnation, monopolies, and inferior internet, while the country with heavy regulation is producing hyper-competition, world-leading speeds, and consumer choice?
And at the same time, the country with the most regulation is suffering the same problems as the country with the least.
The answer reveals a fundamental truth about capitalism and regulation that most people get wrong.
To understand the failure, you have to understand what economists call a “natural monopoly.”
A natural monopoly is an industry where the cost of building the infrastructure is so high, and the cost of serving an additional customer is so low, that competition actually destroys value.
Think about water pipes. It would be insane to have three different water companies each digging up your street to lay their own pipes. You’d have three times the construction, three times the disruption, three times the cost. And at the end of it, you’d still only use one of them.
The rational solution is to build the infrastructure once, as a shared, neutral asset, and let different companies compete to provide the service over that infrastructure.
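The natural-monopoly arithmetic can be made concrete with a toy calculation. Every number below is invented purely for illustration; only the structure of the argument (huge fixed build cost, tiny marginal cost per customer) matters.

```python
# Invented illustrative figures, not real telecom costs.
HOMES = 10_000
FIXED_COST_PER_NETWORK = 5_000_000   # trenching and laying fiber once
MARGINAL_COST_PER_HOME = 50          # connecting one more subscriber

def cost_per_subscriber(networks):
    # with overbuild, the same pool of homes splits across the networks,
    # so each network amortizes its fixed cost over fewer subscribers
    subs_per_network = HOMES / networks
    total = FIXED_COST_PER_NETWORK + subs_per_network * MARGINAL_COST_PER_HOME
    return total / subs_per_network

print(cost_per_subscriber(1))  # one shared network: 550.0 per subscriber
print(cost_per_subscriber(3))  # three parallel networks: ~1550 each
```

Tripling the number of networks roughly triples the per-subscriber cost, because almost all the cost is the trench, not the customer.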
That’s how water works. That’s how electricity works in most places. And in Switzerland, that’s how fiber optic internet works.
But in the United States and Germany, they did the opposite.
In Germany, the “free market” approach meant letting any company dig up the street to lay their own fiber. The result is called “overbuild.” Multiple networks running in parallel trenches, often just meters apart.
Billions of euros spent on redundant concrete and asphalt. Money that could have been spent on faster equipment, lower prices, or connecting rural areas, instead wasted on digging the same hole twice, literally.
But isn’t Germany heavily regulated? It is, but its regulations favor infrastructure competition over the enforcement of duct sharing.
Germany champions infrastructure competition, meaning it prefers multiple companies laying their own cables rather than sharing a single network. At the same time, the regulatory system wastes enormous amounts of time on waiting for digging permits and on courtroom battles just to obtain basic information about existing ducts.
Germany also has a large incumbent, Deutsche Telekom, which uses existing regulations to its competitive advantage against smaller ISPs. While Germany does have laws requiring Deutsche Telekom to share its ducts with competitors, in practice smaller ISPs face unreasonable hurdles such as high fees, procedural delays, and legal double burdens that undermine effective access.
Sharing ducts is not as bad as digging two trenches but it is still a waste of resources.
The United States took a different path, but the result is equally bad. Instead of overbuild, they got territorial monopolies, in some places paid for by the federal government.
In most American cities, you don’t have a choice of fiber providers. You have whatever incumbent happens to serve your neighborhood. Comcast has one area. Spectrum has another. AT&T has a third.
This is marketed as competition. But it’s not. It’s a cartel. Each company gets its own protected territory, and consumers get no choice. If you don’t like your provider, your only alternative is often DSL from the 1990s or a cellular hotspot.
This is what happens when you let natural monopolies operate without oversight. They don’t compete on price or quality. They extract rent.
And because these networks are built on the cheap using P2MP (Point-to-Multipoint), a shared architecture, your “gigabit” connection is shared with your entire neighborhood. At 8 PM, when everyone streams Netflix, that gigabit becomes 200 megabits. Or 100. Or less.
The provider still charges you for “gigabit.” They just don’t tell you that you’re sharing it with 31 other households.
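The arithmetic behind that 32-household split is worth making explicit. A back-of-the-envelope sketch of the worst case, with every household saturating the link at once:

```shell
# Worst-case per-household share of a 1 Gbit/s (1000 Mbit/s) P2MP link
# split across 32 households, all active simultaneously
awk 'BEGIN { printf "%d Mbit/s\n", 1000 / 32 }'
# -> 31 Mbit/s
```

That 31 Mbit/s figure is the theoretical floor; the 200-megabit evening experience the article describes reflects the more typical case where only some neighbors are active at once.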
And it gets worse. In the United States, even if a competitor wanted to challenge the incumbent, they often can’t. Because the Point of Presence, the central hub where all the fiber lines from homes converge, is private. It belongs to Comcast or AT&T. Your fiber terminates in their building. A competitor can’t just install equipment there. They would have to build their own network from scratch, digging up the same streets, to reach you.
Now look at Switzerland. Here, the physical infrastructure, the fiber in the ground, is treated as a neutral, shared asset. It’s built once, often by a public or semi-public entity.
Every home gets a dedicated 4-strand fiber line. Point-to-Point. Not shared. Not split 32 ways.
That dedicated fiber terminates in a neutral, open hub. And any internet service provider can connect to that hub.
Init7, Swisscom, Salt, or a tiny local ISP, they all have equal access to the physical line that goes into your home.
This means you, the consumer, have genuine choice. When you sign up with a provider, you simply give them your OTO (Optical Termination Outlet) number, the unique identifier printed on the fiber optic plate in your home. It tells the provider exactly which fiber connection is yours. That’s it. No technician needs to visit. No one needs to dig up your street. You just call, give them the number, and within days (not always the case…), your new service is active.
And because your home has four separate fiber strands, you’re not locked into a single provider. You can have Init7 on one strand, Swisscom on another, and a local utility on a third. You can switch providers with a phone call. You can try a new provider without canceling your old one first. The competition happens on price, speed, and customer service but not on who happens to own the cable in front of your house.
In Switzerland, you can get 25 Gigabit per second fiber to your home. Today. Symmetric. Dedicated. Not shared with your neighbors.
In Switzerland, you have a choice of a dozen or more providers in most cities. Prices are competitive. Customer service matters because you can leave at any time.
In the United States, the majority of households have only one choice for high-speed internet. Speeds are lower. Prices are higher. And the technology is often a decade behind.
The “free market” promised innovation. It delivered rent-seeking. The incumbents have no incentive to upgrade because you have nowhere else to go.
American broadband prices have risen faster than inflation for decades. Speeds have increased only when a competitor, usually a municipal utility, forces the incumbent to respond.
Without competition, there is no innovation. There is only profit extraction.
But here’s the crucial part. Switzerland didn’t arrive at this model by accident. It didn’t happen because telecom companies were feeling generous. It happened because regulators forced it to happen.
Back in 2008, when the industry sat down at the Round Table organized by the Swiss Federal Communications Commission (ComCom), it was Swisscom, the incumbent itself, that pushed for the four-fiber Point-to-Point model. The company argued that a single fiber would create a monopoly and that regulation would be necessary.
So the standard was set. Four fibers per home. Point-to-Point. Open access for competitors on Layer 1 - the physical fiber itself.
Then, in 2020, Swisscom changed course. The company announced a new network expansion strategy, this time using P2MP, the shared model with splitters. On paper, they argued it was cheaper and faster to deploy.
But the effect was clear. Under the P2MP design, competitors would no longer have direct access to the physical fiber. Instead of plugging into their own dedicated fiber strand, they would have to rent access from Swisscom at a higher network layer - effectively becoming resellers of Swisscom’s infrastructure. The open, competitive matrix that had been carefully built over years would disappear.
The small ISP Init7 filed a complaint with Switzerland’s competition authority, COMCO, which opened an investigation. In December 2020, COMCO issued a precautionary measure: Swisscom could not continue its P2MP rollout unless it guaranteed the same Layer 1 access that the original standard provided.
Swisscom fought the measures in court and lost: in 2021, the Federal Administrative Court confirmed COMCO’s measures, stating that Swisscom had failed to demonstrate “sufficient technological or economic grounds” to deviate from the established fiber standard. In April 2024, COMCO finalized its ruling, fining Swisscom 18 million francs for violating antitrust law.
Swisscom is 51% owned by the Swiss Confederation. So, in simple terms, 51% state-owned and 49% privately/institutionally owned. Whether this makes the fine “symbolic” is a matter of opinion.
The result? Swisscom was forced to return to the four-fiber, Point-to-Point architecture it had originally championed. Competitors retained their direct, physical access to the fiber network. The walled garden was prevented.
Whether intended or not, the effect of Swisscom’s P2MP shift was clear: competitors would have been locked out of the physical infrastructure.
Swisscom is a bit of a walking contradiction. Being majority state-owned, it’s supposed to be a public service. But it’s also a private company, and maximizing profit benefits the state coffers. But that is something for another blog post.
This is the paradox that confuses so many people.
The American and German approach of letting incumbents build monopolies, allowing wasteful overbuild, and refusing to regulate natural monopolies is often called a ‘free market.’
But it’s not free. And it’s not a market.
True capitalism requires competition. But infrastructure is a natural monopoly. If you treat it like a regular consumer product, you don’t get competition. You get waste, or you get a monopoly.
The Swiss model understands this. They built the infrastructure once, as a shared, neutral asset, and then let the market compete on the services that run over it.
That’s not anti-capitalist. It’s actually better capitalism. It directs competition to where it adds value, not to where it destroys it.
The free market doesn’t mean letting powerful incumbents do whatever they want. It means creating the conditions where genuine competition can thrive.
What Can Be Done
So what can other countries learn from Switzerland? Here are the key policy changes that would help:
* Mandate open access to physical infrastructure - require incumbents to share fiber ducts and dark fiber with competitors at cost-based prices. This is not “socialism” - it is how electricity and water work.
* Enforce Point-to-Point architecture - require that every home gets dedicated fiber strands, not shared splitters. This ensures competitors can access the physical layer, not just resell bandwidth.
* Create a neutral fiber standard - establish national standards that require multi-fiber deployment to every home, as Switzerland did in 2008.
* Empower competition authorities - give regulators like COMCO real teeth to enforce these rules. Fines must be large enough to matter.
* Support municipal fiber - allow cities and towns to build their own fiber networks when incumbents fail to serve residents adequately.
If you care about faster internet and lower prices, push your representatives to support these policies. The technology exists. The money exists. What is missing is the political will to demand real competition.
...
Read the original on sschueller.github.io »
A few years ago I was in a meeting with developers and someone asked a simple question: “What’s the right framework for a new Windows desktop app?”
Dead silence. One person suggested WPF. Another said WinUI 3. A third asked if they should just use Electron. The meeting went sideways and we never did answer the question.
That silence is the story. And the story goes back thirty-plus years.
When a platform can’t answer “how should I build a UI?” in under ten seconds, it has failed its developers. Full stop.
In 1988, Charles Petzold published Programming Windows. 852 pages. Win16 API in C. And for all its bulk, it represented something remarkable: a single, coherent, authoritative answer to how you write a Windows application. In the business, we call that a ‘strategy’.
Win32 that followed was bigger but still coherent. Message loops. Window procedures. GDI. The mental model was a bit whacky, but it was one mental model. Petzold explained it. It was the F=MA of Windows. Simple. Powerful. You learned it. You used it. You were successful.
Clarity is your friend! One OS, one API, one language, one book. There was no committee debating managed-code alternatives. There was just Win32 and Petzold, and it worked. This was Physics, not Chemistry (this works, but only for this slice of the periodic table. And only under these pressures. And only within this temperature range. And only if the Moon is in the 7th house of Jupiter).
What happened next is a masterclass in how a company with brilliant people and enormous resources can produce a thirty-year boof-a-rama by optimizing for the wrong things. AKA brilliant people doing stupid things.
Win32 had real limitations, so Microsoft did what Microsoft does: it shipped something new for the developer conference. Several somethings.
MFC (1992) wrapped Win32 in C++. If Win32 was inelegant, MFC was Win32 wearing a tuxedo made of other tuxedos. Then came OLE. COM. ActiveX. None of these were really GUI frameworks — they were component architectures — but they infected every corner of Windows development and introduced a level of cognitive complexity that makes Kierkegaard read like Hemingway.
I sat through a conference session in the late nineties trying to understand the difference between an OLE document, a COM object, and an ActiveX control. I looked at the presenter like he had a rat’s tail hanging out of his mouth for the entire hour.
Microsoft wasn’t selling a coherent story. It was selling technology primitives and telling developers to figure out the story themselves. That’s the Conference Keynote Cluster***k — Microsoft optimized for an executive impressing people with their keynote and not the success of the users or developers.
At PDC 2003, Microsoft unveiled Longhorn — genuinely one of the most compelling technical visions the company had ever put in front of developers. Three pillars: WinFS (a relational file system), Indigo (unified communications), and Avalon — later WPF — a GPU-accelerated, vector-based UI subsystem driven by a declarative XML language called XAML. Developers saw the Avalon demos and went nuts. It was the right vision.
It was also, in the words of Jim Allchin’s internal memo from January 2004, “a pig.”
By August 2004, Microsoft announced a complete development reset. Scrapped. Start over from the Server 2003 codebase. And after the reset, leadership issued a quiet directive: no f***ing managed code in Windows. All new code in C++. WPF would ship alongside Vista, but the shell itself would not use it.
The Windows team’s bitterness toward .NET never healed. From their perspective, gambling on a new managed-code framework had produced the most embarrassing failure in the company’s history. That bitterness created a thirteen-year institutional civil war between the Windows team and the .NET team that would ultimately orphan WPF, kill Silverlight, doom UWP, and give us the GUI ecosystem boof-a-rama we have today.
WPF shipped in late 2006. It was remarkable — XAML, hardware-accelerated rendering, real data binding. If Microsoft had made it the definitive answer and invested relentlessly, the story might have ended differently. Instead, in 2007, they launched Silverlight: a stripped-down browser plugin to compete with Flash, cross-platform, elegant, and the foundation for Windows Phone. Around 2010 it looked like the rich client future.
Then at MIX 2010, a Microsoft executive said in a Q&A that Silverlight was not a cross-platform strategy — it was about Windows Phone. HTML5 was now policy. The Silverlight team was not told this was coming. Developers who had bet their LOB applications on Silverlight found out from a conference Q&A.
Silverlight wasn’t killed by technical failure. The technology was fine. It was killed by a business strategy decision, and developers were the last to know.
Remember that pattern. We’ll see it again.
Apple had sold 200 million iPhones. The iPad was eating into PC sales. Microsoft’s answer was Windows 8 and Metro — a touch-first runtime called WinRT that was deliberately not built on .NET. Remember the Windows team’s bitterness? Here it manifests. WinRT was a native C++ runtime. Clean break from WPF, WinForms, and a decade of developer investment in .NET.
There were actually two stories being told simultaneously inside Microsoft. The Windows team was building WinRT. The .NET team was still evangelizing WPF. Different buildings, different VPs, different road maps.
What developers heard at //Build 2012: the future is WinRT, and also HTML+JS is first-class, and also .NET still works, and also C++ is back, and also you should write Metro apps, and also your WPF code still runs fine. That is not a strategy. That is a Hunger Games stage where six teams are fighting for your attention.
Enterprise developers took one look at UWP’s sandboxing, its Store deployment requirement, and its missing Win32 APIs, and walked away. The framework designed to win them into the modern era had been optimized for a tablet app store that never materialized.
Windows 10 brought Universal Windows Platform — write once, run on PC, phone, Xbox, HoloLens. Compelling on paper. The problem: Windows Phone was dying, and Microsoft’s own flagship apps — Office, Visual Studio, the shell itself — weren’t using UWP. The message was clear even if no one said it out loud.
When UWP stalled, the official answer became it depends. Use UWP for new apps, keep WPF for existing ones, add modern APIs via XAML Islands, wait for WinUI 3, but also WinUI 2 exists for UWP specifically, and Project Reunion will fix everything, except we’re renaming it Windows App SDK and it still doesn’t fully replace UWP and…
Project Reunion / WinUI 3 represents genuine progress. But ask yourself why the problem existed at all. UWP’s controls were tied to the OS because the Windows team owned them. The .NET team didn’t. The developer tools team didn’t. Project Reunion was an organizational workaround dressed up as a technical solution.
One developer’s summary, written in 2024: “I’ve been following Microsoft’s constant changes: UAP, UWP, C++/CX replaced by C++/WinRT without tool support, XAML Islands, XAML Direct, Project Reunion, the restart of WinAppSDK, the chaotic switch between WinUI 2.0 and 3.0…” Fourteen years. Fourteen pivots. That person deserves a medal and an apology, in that order.
Here is every GUI technology actually shipping on Windows today:
* Win32 (1985) — Still here. Still used. Petzold’s book still applies.
* MFC (1992) — C++ wrapper on Win32. Maintenance mode. Lives in enterprise and CAD.
* WinForms (2002) — .NET wrapper on Win32. “Available but discouraged.” Still fastest for data-entry forms.
* Electron — Chromium + Node.js. VS Code, Slack, Discord. The most widely deployed desktop GUI technology on Windows right now — and Microsoft had nothing to do with it.
* Avalonia — Open source WPF spiritual successor. Used by JetBrains, GitHub, Unity — developers who stopped waiting for Microsoft.
* Uno Platform — WinUI APIs on every platform. More committed to WinUI than Microsoft is.
* Delphi / RAD Studio — Still alive. Still fast. Still in vertical market software.
* Java Swing / JavaFX — Yes, still in production. The enterprise never forgets.
Seventeen approaches. Five programming languages. Three rendering philosophies. That is not a platform. I might not have a dictionary definition for the term boof-a-rama but I know one when I see it.
Every failed GUI initiative traces back to one of three causes: internal team politics (Windows vs. .NET), a developer conference announcement driving a premature platform bet (Metro, UWP), or a business strategy pivot that orphaned developers without warning (Silverlight). None of these are technical failures. The technology was often genuinely good — WPF was good, Silverlight was good, XAML is good. The organizational failure was the product.
You either have a Plausible Theory of Success that covers the full lifecycle — adoption, investment, maintenance, and migration — or you have a developer conference keynote.
One is a strategy. The other is a thirty-year boof-a-rama.
Charles Petzold wrote six editions of Programming Windows trying to keep up with each new thing Microsoft announced. He stopped after the sixth, which covered WinRT for Windows 8. That was 2012.
...
Read the original on www.jsnover.com »
The crew for Nasa’s Artemis II mission have described seeing the far side of the Moon for the first time.
Nasa astronauts Reid Wiseman, Victor Glover, and Christina Koch, and Canadian Space Agency astronaut Jeremy Hansen have entered the third day of their mission on the Orion spacecraft that will carry them around the far side of the Moon and back to Earth.
“Something about it, your senses know that is not the Moon that I’m used to seeing,” Koch said.
The crew shared a photo they took of the Orientale basin of the Moon, which Nasa said marked “the first time the entire basin has been seen with human eyes”.
As of 23:00 BST on Saturday, Nasa’s online dashboard showed the Artemis II spacecraft was more than 180,000 miles (289,681km) from Earth.
...
Read the original on www.bbc.com »
LÖVE is an awesome framework you can use to make 2D games in Lua. It’s free, open-source, and works on Windows, macOS, Linux, Android, and iOS.
We use our wiki for documentation. If you need further help, feel free to ask on our forums, our Discord server, or our subreddit.
We use the ‘main’ branch for development of the next major release, and therefore it should not be considered stable.
There are also branches for currently released major versions, which may have fixes and changes meant for upcoming patch releases within that major version.
We tag all our releases (since we started using mercurial and git), and have binary downloads available for them.
Experimental changes are sometimes developed in a separate love-experiments repository.
Files for releases are in the releases section on GitHub. The site has links to files and additional platform content for the latest release.
There are also unstable/nightly builds:
* Builds for some platforms are automatically created after each commit and are available through GitHub’s CI interfaces.
* For Ubuntu Linux, they are in the PPA ppa:bartbes/love-unstable
* For Arch Linux, there’s love-git in the AUR.
The test suite in testing/ covers all the LÖVE APIs, and tests them the same way developers use them. You can view current test coverage from any action.
You can run the suite locally like you would run a normal LÖVE project, e.g.:
love testing
See the readme in the testing folder for more info.
The best places to contribute are through the issue tracker and the official Discord server or IRC channel.
For code contributions, pull requests and patches are welcome. Be sure to read the source code style guide. Changes and new features typically get discussed in the issue tracker or on Discord or the forums before a pull request is made.
Follow the instructions at the megasource repository page.
Because in-tree builds are not allowed, the Makefiles need to be generated in a separate build directory. In this example, a folder named build is used:
Download or clone this repository and copy, move, or symlink the macOS/Frameworks subfolder into love’s platform/xcode/macosx folder and the shared subfolder into love’s platform/xcode folder.
Then use the Xcode project found at platform/xcode/love.xcodeproj to build the love-macosx target.
Download the love-apple-dependencies zip file corresponding to the LÖVE version being used from the Releases page, unzip it, and place the iOS/libraries subfolder into love’s platform/xcode/ios folder and the shared subfolder into love’s platform/xcode folder.
Or, download or clone this repository and copy, move, or symlink the iOS/libraries subfolder into love’s platform/xcode/ios folder and the shared subfolder into love’s platform/xcode folder.
Then use the Xcode project found at platform/xcode/love.xcodeproj to build the love-ios target.
See readme-iOS.rtf for more information.
...
Read the original on github.com »
Cloud AI APIs are great until they are not. Rate limits, usage costs, privacy concerns, and network latency all add up. For quick tasks like code review, drafting, or testing prompts, a local model that runs entirely on your hardware has real advantages: zero API costs, no data leaving your machine, and consistent availability.
Google’s Gemma 4 is interesting for local use because of its mixture-of-experts architecture. The 26B parameter model only activates 4B parameters per forward pass, which means it runs well on hardware that could never handle a dense 26B model. On my 14” MacBook Pro M4 Pro with 48 GB of unified memory, it fits comfortably and generates at 51 tokens per second, though in my experience there are significant slowdowns when it is used from within Claude Code.
Google released Gemma 4 as a family of four models, not just one. The lineup spans a wide range of hardware targets:
The “E” models (E2B, E4B) use Per-Layer Embeddings to optimize for on-device deployment and are the only variants that support audio input (speech recognition and translation). The 31B dense model is the most capable, scoring 85.2% on MMLU Pro and 89.2% on AIME 2026.
Why I picked the 26B-A4B. The mixture-of-experts architecture is the key. It has 128 experts plus 1 shared expert, but only activates 8 experts (3.8B parameters) per token. A common rule of thumb estimates MoE dense - equivalent quality as roughly sqrt(total x active parameters), which puts this model around 10B effective. In practice, it delivers inference cost comparable to a 4B dense model with quality that punches well above that weight class. On benchmarks, it scores 82.6% on MMLU Pro and 88.3% on AIME 2026, close to the dense 31B (85.2% and 89.2%) while running dramatically faster.
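The rule of thumb above can be checked directly with the numbers quoted in the text (25.2B total parameters, 3.8B active):

```shell
# Dense-equivalent quality estimate for an MoE model:
# sqrt(total_params * active_params), both in billions
awk 'BEGIN { printf "%.1fB\n", sqrt(25.2 * 3.8) }'
# -> 9.8B
```

That is where the "around 10B effective" figure comes from: the quality of a ~10B dense model at the inference cost of a ~4B one.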
The chart below tells the story. It plots Elo score against total model size on a log scale for recent open-weight models with thinking enabled. The blue-highlighted region in the upper left is where you want to be: high performance, small footprint.
Gemma 4 26B-A4B (Elo ~1441) sits firmly in that zone, punching well above its 25.2B parameter weight. The 31B dense variant scores slightly higher (~1451) but is still remarkably compact. For context, models like Qwen 3.5 397B-A17B (~1450 Elo) and GLM-5 (~1457 Elo) need 100-600B total parameters to reach similar scores. Kimi-K2.5 (~1457 Elo) requires over 1,000B. The 26B-A4B achieves competitive Elo with a fraction of the parameters, which translates directly into lower memory requirements and faster local inference.
This is what makes MoE models transformative for local use. You do not need a cluster or a high-end GPU rig to run a model that competes with 400B+ parameter behemoths. A laptop with 48 GB of unified memory is enough.
For local inference on a 48 GB Mac, this is the sweet spot. The dense 31B would consume more memory and generate tokens slower because every parameter participates in every forward pass. The E4B is lighter but noticeably less capable. The 26B-A4B gives you 256K max context, vision support (useful for analyzing screenshots and diagrams), native function/tool calling, and reasoning with configurable thinking modes, all at 51 tokens/second on my hardware.
LM Studio has been a popular desktop app for running local models for a while. Version 0.4.0 changed the architecture fundamentally by introducing llmster, the core inference engine extracted from the desktop app and packaged as a standalone server.
The practical result: you can now run LM Studio entirely from the command line using the lms CLI. No GUI required. This makes it usable on headless servers, in CI/CD pipelines, SSH sessions, or just for developers who prefer staying in the terminal.
* llmster daemon: a background service that manages model loading and inference without the desktop app
* Parallel request processing: continuous batching instead of sequential queuing, so multiple requests to the same model run concurrently
* Stateful REST API: a new /v1/chat endpoint that maintains conversation history across requests
# Linux/Mac
curl -fsSL https://lmstudio.ai/install.sh | bash
# Windows
irm https://lmstudio.ai/install.ps1 | iex

lms daemon up
lms runtime update llama.cpp
lms runtime update mlx

lms get google/gemma-4-26b-a4b
The CLI shows you the variant it will download (Q4_K_M quantization by default, 17.99 GB) and asks for confirmation:
↓ To download: model google/gemma-4-26b-a4b - 64.75 KB
└─ ↓ To download: Gemma 4 26B A4B Instruct Q4_K_M [GGUF] - 17.99 GB
About to download 17.99 GB.
? Start download?
❯ Yes
No
Change variant selection
If you already have the model, the CLI tells you and shows the load command:
✔ Start download? yes
Model already downloaded. To use, run: lms load google/gemma-4-26b-a4b

lms ls

You have 10 models, taking up 118.17 GB of disk space.
LLM PARAMS ARCH SIZE DEVICE
gemma-3-270m-it-mlx 270m gemma3_text 497.80 MB Local
google/gemma-4-26b-a4b (1 variant) 26B-A4B gemma4 17.99 GB Local
gpt-oss-20b-mlx 20B gpt_oss 22.26 GB Local
llama-3.2-1b-instruct 1B Llama 712.58 MB Local
nvidia/nemotron-3-nano (1 variant) 30B nemotron_h 17.79 GB Local
openai/gpt-oss-20b (1 variant) 20B gpt-oss 12.11 GB Local
qwen/qwen3.5-35b-a3b (1 variant) 35B-A3B qwen35moe 22.07 GB Local
qwen2.5-0.5b-instruct-mlx 0.5B Qwen2 293.99 MB Local
zai-org/glm-4.7-flash (1 variant) 30B glm4_moe_lite 24.36 GB Local
EMBEDDING PARAMS ARCH SIZE DEVICE
text-embedding-nomic-embed-text-v1.5 Nomic BERT 84.11 MB Local
Worth noting: several of these models use mixture-of-experts architectures (Gemma 4, Qwen 3.5, GLM 4.7 Flash). MoE models punch above their weight for local inference because only a fraction of parameters activate per token.
Start a chat session with stats enabled to see performance numbers:
lms chat google/gemma-4-26b-a4b --stats

╭─────────────────────────────────────────────────╮
│ 👾 lms chat │
│ Type exit or Ctrl+C to quit │
│ Chatting with google/gemma-4-26b-a4b │
│ Try one of the following commands: │
│ /model - Load a model (type /model to see list) │
│ /download - Download a model │
│ /clear - Clear the chat history │
│ /help - Show help information │
With --stats, you get prediction metrics after each response:
Prediction Stats:
Stop Reason: eosFound
Tokens/Second: 51.35
Time to First Token: 1.551s
Prompt Tokens: 39
Predicted Tokens: 176
Total Tokens: 215
51 tokens/second on a 14” MacBook Pro M4 Pro (48 GB) with a 26B model is solid. Time to first token at 1.5 seconds is responsive enough for interactive use.
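Plugging the stats above together gives the end-to-end response time: time to first token plus generation time for the 176 predicted tokens at 51.35 tokens/second.

```shell
# Total response latency = TTFT + predicted_tokens / tokens_per_second
awk 'BEGIN { printf "%.1f s\n", 1.551 + 176 / 51.35 }'
# -> 5.0 s
```

So a ~175-token answer lands in about five seconds, most of it generation time rather than prompt processing.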
See what is currently loaded:
lms ps

IDENTIFIER              MODEL                   STATUS  SIZE      CONTEXT  PARALLEL  DEVICE  TTL
google/gemma-4-26b-a4b  google/gemma-4-26b-a4b  IDLE    17.99 GB  48000    2         Local   60m / 1h
The model occupies 17.99 GB in memory with a 48K context window and supports 2 parallel requests. The TTL (time-to-live) auto-unloads the model after 1 hour of idle time, freeing memory without manual intervention.
lms ps --json | jq
* “vision”: true and “trainedForToolUse”: true - Gemma 4 supports both image input and tool calling
* “maxContextLength”: 262144 - the model supports up to 256K context, though the default load is 48K
Before loading a model, you can estimate memory requirements at different context lengths using --estimate-only. I wrote a small script to test across the full range:
The base model takes about 17.6 GiB regardless of context. Each doubling of context length adds roughly 3-4 GiB. At the default 48K context, you need about 21 GiB. On my 48 GB MacBook Pro, I can push to the full 256K context at 37.48 GiB and still have about 10 GB free for the OS and other apps. A 36 GB Mac could comfortably run 200K context with headroom.
lms load google/gemma-4-26b-a4b --estimate-only --context-length 48000
Model: google/gemma-4-26b-a4b
Context Length: 48,000
Estimated GPU Memory: 21.05 GiB
Estimated Total Memory: 21.05 GiB
Estimate: This model may be loaded based on your resource guardrails settings.
This is useful for capacity planning. If you want to run Gemma 4 alongside other applications, check the estimate at your target context length first.
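Assuming memory grows roughly linearly with context length (as the per-doubling numbers above suggest), the two estimates quoted in the text (21.05 GiB at 48K, 37.48 GiB at 256K) pin down a simple model. A sketch of that fit:

```shell
# Fit mem(ctx) = base + slope * ctx to the two estimates quoted above
awk 'BEGIN {
  slope = (37.48 - 21.05) / (256000 - 48000)  # GiB per context token
  base  = 21.05 - slope * 48000               # weights + fixed overhead
  printf "base=%.1f GiB  kv-per-token=%.1f KiB  est(128K)=%.1f GiB\n",
         base, slope * 1024 * 1024, base + slope * 128000
}'
# -> base=17.3 GiB  kv-per-token=82.8 KiB  est(128K)=27.4 GiB
```

The ~17.3 GiB intercept lines up with the ~17.6 GiB base the text reports, and the model predicts roughly 27 GiB at 128K context; the actual estimator may not be perfectly linear, so treat this as a planning heuristic.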
Here is the full script I used to generate the table above. You can swap in any model name and context length list to profile a different model:
#!/usr/bin/env bash
model="google/gemma-4-26b-a4b"
contexts=(4096 8000 16000 24000 32000 48000 64000 96000 128000 200000 256000)

table_contexts=()
table_gpu=()
table_total=()

for ctx in "${contexts[@]}"; do
  output="$(lms load "$model" --estimate-only --context-length "$ctx" 2>&1)"
  parsed_context="$(printf '%s\n' "$output" | awk -F': ' '/^Context Length:/ {print $2; exit}')"
  parsed_gpu="$(printf '%s\n' "$output" | awk -F': +' '/^Estimated GPU Memory:/ {print $2; exit}')"
  parsed_total="$(printf '%s\n' "$output" | awk -F': +' '/^Estimated Total Memory:/ {print $2; exit}')"
  table_contexts+=("${parsed_context:-$ctx}")
  table_gpu+=("${parsed_gpu:-N/A}")
  table_total+=("${parsed_total:-N/A}")
done

printf '| Model | Context Length | GPU Memory | Total Memory |\n'
printf '|---|---:|---:|---:|\n'
for i in "${!table_contexts[@]}"; do
  printf '| %s | %s | %s | %s |\n' \
    "$model" "${table_contexts[$i]}" "${table_gpu[$i]}" "${table_total[$i]}"
done
...
Read the original on ai.georgeliu.com »
...
Read the original on playlists.at »
* Peter Thiel controls both Palantir (surveillance analytics) and Persona's primary investor (Founders Fund), while Persona's leaked source code exposes 269 verification checks and government reporting modules for FinCEN/FINTRAC
* SDK decompilation reveals a hardcoded AES key (rotated in v1.15.3 after disclosure), zero certificate pinning, silent carrier phone verification via Vonage, and 7 simultaneous analytics services
* Paravision powers Persona's facial recognition but is not listed on the subprocessors page. Ranked #1 at the DHS Biometric Rally
* LinkedIn runs 4 identity verification vendors (AU10TIX since 2021, Persona, CLEAR, DigiLocker) and a parallel Chinese surveillance stack with Sesame Credit social scoring, ShanYan carrier auth, and government device identifiers in the same APK
* 197 subdomains enumerated on withpersona.com exposing 40+ clients (Roblox, Robinhood, Kaggle, Playboy, Italian government). 65 staging subdomains expose internal ML services, TigerGraph, and the AAMVA gateway
* Roblox embeds the full Persona SDK in a children's game with NFC passport reading and a flow users cannot exit
* Legislative mandates across 25+ US states, Brazil, and the UK create the mandatory market. Meta spent $26.3M lobbying to push this legislation
Independent decompilation of the Persona Wallet APK v1.14.0 (SDK v2.32.3, built March 11, 2026) and analysis of the web inquiry bundle from cdn.withpersona.com (inquiry-main.js, 1.8MB) reveals the full scope of Persona’s surveillance capabilities. The APK was obtained from APKPure and decompiled with jadx 1.5.5. The Roblox APK v2.714.1091 was decompiled separately to confirm the SDK integration. All findings are from publicly available APKs and client-side JavaScript served to every user.
Every copy of the Persona SDK contains a hardcoded AES-256-GCM encryption key in TrackingEventUtilsKt.java line 22:
All telemetry events are "encrypted" with this key before transmission to POST https://tg.withpersona.com/t. Since the key is embedded in every publicly downloadable APK, anyone can decrypt the payloads. The encryption pipeline serializes events to JSON, wraps them as {"events": …}, encrypts with AES-256-GCM using a 12-byte random IV, then Base64-encodes the ciphertext, and sends it as {"e": "…"}. This is obfuscation, not security. A standalone Python decryptor was built and verified in round-trip testing.
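The pipeline described above can be sketched in a few lines of Python. This is a minimal reconstruction under stated assumptions: the real key from TrackingEventUtilsKt.java is replaced with a random placeholder, and the Base64 blob is assumed to be the 12-byte IV followed by the ciphertext and GCM tag.

```python
# Sketch of the telemetry framing described above: JSON-serialize events,
# AES-256-GCM-encrypt with a random 12-byte IV, Base64-encode IV||ct||tag.
# KEY is a placeholder; the IV-prefixed layout is an assumption.
import base64
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

KEY = os.urandom(32)  # placeholder for the hardcoded 32-byte key

def encrypt_events(events: list[dict]) -> str:
    plaintext = json.dumps({"events": events}).encode()
    iv = os.urandom(12)
    ciphertext = AESGCM(KEY).encrypt(iv, plaintext, None)
    return base64.b64encode(iv + ciphertext).decode()

def decrypt_events(payload_e: str) -> list[dict]:
    blob = base64.b64decode(payload_e)
    iv, ciphertext = blob[:12], blob[12:]
    plaintext = AESGCM(KEY).decrypt(iv, ciphertext, None)
    return json.loads(plaintext)["events"]

# Round-trip check, mirroring the article's verification approach:
events = [{"name": "screen_view", "ts": 1710000000}]
assert decrypt_events(encrypt_events(events)) == events
```

The point of the sketch is that once the key is known, decryption is trivial: GCM authenticates the payload, but authentication against a public key protects nothing.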
The SDK does not implement certificate pinning. The OkHttpClient builder creates a standard client without certificatePinner(). Combined with the hardcoded AES key, a standard MITM proxy with a user-installed CA certificate can capture and decrypt all Persona telemetry from any app that embeds the SDK.
What to look for: The okhttpClient() method builds an OkHttpClient without calling certificatePinner(). The builder chain is: addNetworkInterceptor → readTimeout → writeTimeout → connectTimeout → addInterceptor (loop) → build(). No pinning step exists.
jadx -d out persona-wallet.apk && grep -rn "certificatePinner\|CertificatePinner" out/ - returns zero results
public final OkHttpClient okhttpClient(...) {
    OkHttpClient.Builder builderAddNetworkInterceptor =
        new OkHttpClient.Builder().addNetworkInterceptor(new Interceptor() {
            public final Response intercept(Interceptor.Chain chain) {
                // ... adds Persona-Version, Persona-Device-*, VTDGJLGG headers ...
                return chain.proceed(builderHeader3.build());
            }
        });
    TimeUnit timeUnit = TimeUnit.MINUTES;
    OkHttpClient.Builder builderConnectTimeout = builderAddNetworkInterceptor
        .readTimeout(1L, timeUnit)
        .writeTimeout(1L, timeUnit)
        .connectTimeout(1L, timeUnit);
    Iterator it = set.iterator();
    while (it.hasNext()) {
        builderConnectTimeout.addInterceptor((Interceptor) it.next());
    }
    return builderConnectTimeout.build();
    // No certificatePinner() call anywhere in this chain.
}
A user going through Persona verification is tracked by seven analytics services at the same time:
The Android SDK’s Sentry configuration sets io.sentry.traces.sample-rate = 1 (100% of sessions), io.sentry.traces.user-interaction.enable = true, and io.sentry.attach-view-hierarchy = true. Every user session is fully traced with interaction recording and UI hierarchy capture.
io.sentry.dsn = https://ad039950…@o175220.ingest.us.sentry.io/4506939573993472
io.sentry.traces.sample-rate = 1 // 100% of sessions traced
io.sentry.traces.user-interaction.enable = true // taps, clicks, gestures
io.sentry.attach-view-hierarchy = true // full UI tree on errors
ActivityLifecycleIntegration - tracks activity start/stop/resume/pause
FragmentLifecycleIntegration - tracks fragment lifecycle
UserInteractionIntegration - captures user taps and gestures
NetworkBreadcrumbsIntegration - logs network connectivity changes
ViewHierarchyEventProcessor - captures view hierarchy snapshots
io.sentry.android.replay.* - session replay (record and replay user sessions)
jadx -d out persona-wallet.apk && find out/ -path "*mx_com.mixpanel*" -name "*.java" | wc -l
jadx -d out persona-wallet.apk && find out/ -name "firebase-*.properties"
Key classes found:
mx_com.mixpanel.android.mpmetrics.MixpanelAPI
mx_com.mixpanel.android.mpmetrics.AnalyticsMessages
mx_com.mixpanel.android.mpmetrics.MixpanelActivityLifecycleCallbacks
mx_com.mixpanel.android.mpmetrics.PersistentIdentity
mx_com.mixpanel.android.mpmetrics.SystemInformation
20+ files total. mx_com prefix = repackaged/embedded build.
SystemInformation collects:
- Device model, OS version, screen dimensions
- Mobile carrier name
- Bluetooth status
- NFC status
firebase-annotations.properties
firebase-components.properties
firebase-encoders-json.properties
com.google.android.datatransport.cct.internal.ClientInfo
Firebase component registrars loaded at startup
The SDK can silently verify a user’s phone number through the mobile carrier with zero user interaction. The PhoneNumberSna component uses Vonage (Ericsson subsidiary) to make an HTTP request over the device’s cellular connection. The carrier intercepts the request and verifies that the SIM card’s phone number matches the number encoded in the URL. The request follows up to 10 redirects through carrier authentication infrastructure. Results are auto-submitted to Persona’s backend with countdown=0, meaning the verification happens instantly. The user never sees, touches, or approves the carrier-level verification.
From the decompiled source: the PhoneNumberSnaComponent implements AutoSubmitableComponent. When verification completes, UiState.Displaying.AutoSubmit fires with countdown=0 and the form is submitted immediately.
public VonageSnaClient(Context context, SilentNetworkAuthConfig config, Moshi moshi) {
    this.config = config;
    this.moshi = moshi;
    VGCellularRequestClient.Companion.initializeSdk(context);
}

public SnaClient.Response performSilentNetworkAuth() throws Exception {
    if (!SnaUtils.INSTANCE.isValidSnaCheckUrl(this.config.getCheckUrl())) {
        return new SnaClient.Response.Error(...);
    }
    JSONObject result = VGCellularRequestClient.Companion.getInstance()
        .startCellularGetRequest(
            new VGCellularRequestParameters(
                this.config.getCheckUrl(),     // URL from Persona server
                MapsKt.emptyMap(),             // headers
                MapsKt.emptyMap(),             // query params
                this.config.getMaxRedirects()  // follows up to N carrier redirects
            ), false);
    // Carrier validates SIM -> returns success/failure JSON
}
During selfie capture, the SDK streams live video to webrtc-consumer.withpersona.com via WebRTC TURN connections. The server receives continuous video, not just captured frames. An optional recordAudio flag enables audio recording during the stream. The SDK’s VideoCaptureMethod priority prefers Stream (WebRTC) first, falling back to Upload only if streaming is unavailable.
On-device face detection uses Google ML Kit to extract: bounding box, Euler angles (pitch, rotation, tilt), smile probability, left/right eye open probability, 10 face landmarks, and 15 face contour types. A 3x3 brightness grid divides the face region into 9 zones and computes per-zone luminance to detect flat paper or screen reflections. All anti-spoofing analysis is delegated to the server. The terms anti_spoof, active_liveness, passive_liveness, and depth do not appear in the SDK codebase.
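The 3x3 brightness grid described above is simple to illustrate. This is a hypothetical sketch, not Persona's code: the zone-splitting follows the description (9 zones, per-zone mean luminance), but the uniformity heuristic and its variance threshold are my illustrative assumptions.

```python
# Sketch of a 3x3 brightness-grid check: divide a grayscale face crop
# into 9 zones, compute per-zone mean luminance, and flag near-uniform
# brightness as a possible flat paper or screen. Threshold is invented.

def zone_luminance(pixels: list[list[int]]) -> list[float]:
    """Mean luminance of each zone in a 3x3 grid over a 2D grayscale image."""
    h, w = len(pixels), len(pixels[0])
    zones = []
    for gy in range(3):
        for gx in range(3):
            y0, y1 = gy * h // 3, (gy + 1) * h // 3
            x0, x1 = gx * w // 3, (gx + 1) * w // 3
            vals = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            zones.append(sum(vals) / len(vals))
    return zones

def looks_flat(zones: list[float], min_variance: float = 50.0) -> bool:
    """A real 3D face lit from one side shows uneven zone brightness."""
    mean = sum(zones) / 9
    variance = sum((z - mean) ** 2 for z in zones) / 9
    return variance < min_variance

# A perfectly uniform image (e.g. a backlit screen) is flagged as flat:
flat = [[128] * 30 for _ in range(30)]
assert looks_flat(zone_luminance(flat))
```

Note that, per the article, the SDK only collects this signal; the actual anti-spoofing decision happens server-side.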
The SDK implements ICAO 9303 BAC/PACE protocol for reading e-passport chips. Data groups read from the chip:
Signed hashes of all data groups for tamper detection
The data groups are server-configurable via PassportNfcReaderConfig.enabledDataGroups. The server can request additional data groups at any time. Terminal Authentication (EAC-TA) is not implemented, meaning the passport chip cannot verify whether Persona is an authorized reader. Chip Authentication is supported but the absence of Terminal Authentication means the chip has no way to refuse data extraction.
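For context on what BAC access involves: the chip's access key is derived from fields printed in the MRZ, so possession of the document photo page is the only gate. The sketch below follows the ICAO 9303 key-seed derivation and uses the specification's published worked-example MRZ values; it is background on the protocol, not code from the SDK.

```python
# ICAO 9303 BAC key seed: SHA-1 over document number, birth date, and
# expiry date, each followed by its check digit (weights 7/3/1).
# MRZ values are the ICAO 9303 worked example.
import hashlib

def check_digit(field: str) -> str:
    """ICAO 9303 check digit: weights 7/3/1 repeating; '<'=0, A-Z=10-35."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch == "<":
            val = 0
        elif ch.isdigit():
            val = int(ch)
        else:
            val = ord(ch) - ord("A") + 10
        total += val * weights[i % 3]
    return str(total % 10)

def bac_key_seed(doc_number: str, birth: str, expiry: str) -> bytes:
    """K_seed = first 16 bytes of SHA-1(MRZ_information)."""
    mrz_info = (doc_number + check_digit(doc_number)
                + birth + check_digit(birth)
                + expiry + check_digit(expiry))
    return hashlib.sha1(mrz_info.encode()).digest()[:16]

seed = bac_key_seed("L898902C3", "740812", "120415")
assert len(seed) == 16
```

BAC (or PACE) only gates the session; as the article notes, without Terminal Authentication the chip cannot distinguish an authorized reader from any party that has seen the photo page.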
The web inquiry bundle contains the complete Persona platform entity registry, expanding well beyond the February 2026 leak. 15 verification types were not present in the leaked source or the Android SDK:
Persona now integrates with 15 digital identity systems across 14 countries, including Worldcoin’s iris-biometric World ID. The platform also supports Aadhaar (India, 1.4 billion enrolled), Serpro (Brazil, 220 million), PhilSys (Philippines, 92 million), Singpass (Singapore), and ECBSV (US SSA Social Security number verification).
The web bundle reveals 47 distinct report types. The following third-party data integrations were not previously documented:
Persona connects to Chainalysis and TRM Labs for crypto wallet screening, Equifax for credit bureau data, FINRA and the SEC for securities enforcement, Sentilink for synthetic identity detection, MX for financial account aggregation, and Kyckr for global company registries. The platform processes far more than identity verification.
The SDK supports 26 distinct government ID types, including country-specific documents: Japan MyNumber Card (myn), Singapore/Malaysia NRIC (nric), Philippines OFWID (ofw), Philippines UMID (umid), Philippines NBI Clearance (nbi), and India PAN Card (pan). US and Canadian driver’s licenses are parsed via an AAMVA PDF417 barcode reader that extracts 13 fields: first name, middle name, last name, sex, street address, city, state, postal code, ID number, issue date, expiration date, date of birth, and country.
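The 13 driver's-license fields listed above map directly onto element IDs in the AAMVA DL/ID standard, where each decoded PDF417 subfile line is a three-letter element ID followed by its value. The sketch below is an illustrative parser, not the SDK's; the element-ID-to-field mapping comes from the AAMVA standard and the sample payload is invented.

```python
# Sketch of AAMVA PDF417 field extraction: after barcode decoding, the
# DL subfile is newline-separated lines of "<3-letter ID><value>".
# These 13 element IDs correspond to the 13 fields listed above.

AAMVA_FIELDS = {
    "DAC": "first_name",  "DAD": "middle_name", "DCS": "last_name",
    "DBC": "sex",         "DAG": "street",      "DAI": "city",
    "DAJ": "state",       "DAK": "postal_code", "DAQ": "id_number",
    "DBD": "issue_date",  "DBA": "expiration",  "DBB": "date_of_birth",
    "DCG": "country",
}

def parse_dl_subfile(subfile: str) -> dict:
    """Map known AAMVA element IDs to their values; ignore other elements."""
    out = {}
    for line in subfile.split("\n"):
        element_id, value = line[:3], line[3:]
        if element_id in AAMVA_FIELDS:
            out[AAMVA_FIELDS[element_id]] = value
    return out

sample = "DAQD1234567\nDACJANE\nDCSDOE\nDAJCA\nDCGUSA"
parsed = parse_dl_subfile(sample)
assert parsed["id_number"] == "D1234567"
assert parsed["last_name"] == "DOE"
```
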
Decompilation of the Roblox APK v2.714.1091 confirms the full Persona SDK is embedded under the internal module name com.roblox.universalapp.facialageestimation. The integration bridge at com.roblox.client.personasdk passes verification results through a JNI (Java Native Interface) bridge directly into Roblox’s C++ game engine. The DisablePauseSessionInPersonaFlow feature flag prevents users from pausing or backgrounding the app during verification, ensuring the continuous WebRTC video stream is not interrupted.
Roblox’s manifest includes permissions for camera, audio recording, NFC, biometric authentication, fingerprint, Bluetooth, and contacts (READ_CONTACTS, unusual for a game). The full 21-endpoint API surface and all surveillance capabilities documented above are available inside Roblox, an app where 67% of users are under 16.
The Persona client is a stateless renderer. The server dictates every step, every component, and every transition via 38 UI component types and 12 inquiry states. There is no hardcoded step sequence in the SDK. The server can dynamically insert, reorder, or skip any step for any user based on risk scoring, template configuration, or real-time decisions. This means the verification pipeline experienced by one user can be completely different from another, and users have no way to predict or audit what data collection steps will be triggered.
Every API request is signed with four obfuscated headers: NHMJLNRS (HMAC-SHA256 hash of JWT sub, timestamp, and request body), STPBWSBB (Unix timestamp), DNLGNZLZ (secondary hash incorporating header values), and TLJLGGDG (list of signed header names). A fifth header, VTDGJLGG, reports whether a debugger is attached to the process via Debug.isDebuggerConnected(). This flag is included in the signed header set, making it tamper-evident. Google Play Integrity verification runs with up to 5 retries and a 1-second delay between attempts.
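The signing scheme can be sketched as follows. Only the header names and their described inputs come from the decompiled SDK; the secret, the concatenation order, and the output encoding here are illustrative assumptions, and the DNLGNZLZ secondary hash is omitted because its exact construction is not specified above.

```python
# Sketch of the obfuscated request-signing headers, assuming
# NHMJLNRS = HMAC-SHA256(secret, jwt_sub || timestamp || body) and
# STPBWSBB = Unix timestamp. Secret and field ordering are invented.
import hashlib
import hmac
import time

def sign_request(secret: bytes, jwt_sub: str, body: bytes) -> dict:
    ts = str(int(time.time()))
    msg = jwt_sub.encode() + ts.encode() + body
    digest = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return {
        "NHMJLNRS": digest,                # primary request signature
        "STPBWSBB": ts,                    # Unix timestamp
        "TLJLGGDG": "NHMJLNRS,STPBWSBB",   # list of signed header names
    }

headers = sign_request(b"demo-secret", "user-123", b'{"step":"selfie"}')
assert len(headers["NHMJLNRS"]) == 64  # hex-encoded SHA-256 digest
```

As with the telemetry key, any secret this scheme uses must ship inside the APK, so the signing deters casual tampering rather than a determined analyst.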
The following keys are embedded in publicly served client-side code. All were confirmed active through testing:
Sentry organization ID is o175220. All three Sentry DSNs map to the same organization. The web inquiry bundle also contains Datadog RUM application ID and client token, and an Osano CMP (consent management) account identifier.
The web verification flow loads faceapi.js (1.3MB), which includes an age_gender_model for client-side age and gender estimation. This ML model runs in the browser before any data is sent to Persona’s servers. The document OCR module (microblink.js, 91KB) performs MRZ reading directly in the browser.
DNS TXT record analysis of withpersona.com reveals domain verification tokens from companies confirming active integration relationships:
Meta spent $26.3M lobbying for age verification laws via the Digital Childhood Alliance while simultaneously maintaining a domain verification integration with Persona, the company that would profit from those laws. Meta is also a founding platform partner of the OpenAge Initiative (Dec 2025), where Persona is an identity verification member (Jan 2026).
Certificate transparency confirms openai-watchlistdb.withpersona.com has been operational since November 16, 2023. Testing environment at openai-watchlistdb-testing.withpersona.com resolves to the same GCP Kansas City IP (34.49.93.177). Twenty-two certificates total, renewed every two months, still active as of March 18, 2026. Twenty-seven months of continuous operation before public awareness via the February 2026 leak. OpenAI has never publicly disclosed this watchlist database. Winbuzzer
...
Read the original on tboteproject.com »
10HN is also available as an iOS App
If you visit 10HN only rarely, check out the best articles from the past week.
If you like 10HN please leave feedback and share
Visit pancik.com for more.