10 interesting stories served every morning and every evening.
So there I was, pedaling my bicycle as fast as I could down a long, straight stretch of road, feeling great. I’d just discovered the pleasures of riding a road bike, and I loved every minute that I could get away. Always a data geek, I tracked my mileage, average speed, heart rate, etc. It was a beautiful Indian summer Sunday afternoon in September. I was in my late 30s, still a baby. Out of nowhere, my chain came off right in the middle of the sprint I was timing. In true masculine fashion, I threw a fit, cursing and hitting the brakes as hard as I could. At this point, I found out that experienced riders don’t do that because I flew right over the handlebars, landing on the pavement amid speeding cars. I momentarily lost consciousness, and when I regained my senses, I knew I’d screwed up badly. The pain in my shoulder was nauseating. I couldn’t move my arm, and I had to just roll off the road onto the shoulder. I just lay there, hurting, unable to think clearly. Within seconds, it seemed, a man materialized beside me.
He was exceptionally calm. He didn’t ask me if I was OK, since I clearly wasn’t. It was obvious that he knew what he was doing. He made certain I could breathe, paused long enough to dial 911, and then started pulling stuff out of a medical bag (WTF?) to clean the extensive road rash I had. In a minute, he asked for my home phone number so he could call my wife to let her know I was going to be riding in an ambulance to the hospital. He told her he was an emergency room doctor who just happened to be right behind me when I crashed. He explained that he would stay with me until the medics arrived and that he would call ahead to make sure one of the doctors on duty would “take good care of me.”
When he hung up, he asked me if I’d heard the conversation. I told him that I had and that I couldn’t believe how lucky I was under the circumstances. He agreed. To keep my mind off the pain, he just kept chatting, telling me that because I was arriving by ambulance, I’d be treated immediately. He told me that I’d be getting the “good drugs” to take care of the pain. That sounded awesome.
I don’t remember telling him goodbye. I certainly didn’t ask him his name or find out anything about him. He briefed the EMTs when they arrived and stood there until the ambulance doors closed. The ER was indeed ready for me when the ambulance got there. They treated me like a VIP. I got some Dilaudid for the pain, and it was indeed the good stuff. They covered the road rash with Tegaderm and took x-rays, which revealed that I’d torn my collarbone away from my shoulder blade. That was going to require a couple of surgeries and lots of physical therapy. I had a concussion and was glad that I had a helmet on.
All of this happened almost 25 years ago. I’ve had plenty of other bike wrecks, but that remains the worst one. My daughter is a nurse, and she’s like a magnet for car crashes, having stopped multiple times to render aid. She doesn’t do it with a smile on her face, though; emergency medicine isn’t her gig, and if anyone asks her if she’s a doctor, her stock answer is “I’m a YMCA member.”
The guy who helped me that day was an absolute angel. I have no idea what I would have done without him. I didn’t even have a cell phone at the time. But he was there at a time when I couldn’t have needed him any more badly. He helped me and then got in his car and completed his trip. I think of that day often, especially when the American medical system makes me mad, which happens regularly these days.
I’ve enjoyed the kindness of a lot of strangers over the years, particularly during the long hike my wife and I did for our honeymoon (2,186 miles) when we hitchhiked to a town in NJ in the rain and got a ride from the first car to pass. Another time, in Connecticut, a man gave us a $100 bill and told us to have a nice dinner at the restaurant atop Mt. Greylock, the highest mountain in Massachusetts. In Virginia, a moth flew into my wife’s ear, and I mean all the way into her ear until it was bumping into her eardrum. We hiked several miles to the road and weren’t there for a minute before a man stopped and took us to urgent care, 30 miles away.
When you get down in the dumps, I hope you have some memories like that to look back on, to restore your faith in humanity. There are a lot of really good people in the world.
...
Read the original on louplummer.lol »
In a large-scale analysis of 20 popular VPNs, IPinfo found that 17 of those VPNs exit traffic from different countries than they claim. Some claim 100+ countries, but many of them point to the same handful of physical data centers in the US or Europe. That means the majority of VPN providers we analyzed don't route your traffic via the countries they claim to, and they claim many more countries than they actually support. Analyzing over 150,000 exit IPs across 137 possible exit countries, and comparing what providers claim to what IPinfo measures, shows that:

* 17 in 20 providers had traffic exiting in a different country.
* 38 countries were "virtual-only" in our dataset (claimed by at least one provider, but never observed as the actual traffic exit country for any provider we tested).
* We were only able to verify all provider-announced locations for 3 providers out of the 20.
* Across ~150,000 VPN exit IPs tested, ProbeNet, our internet measurement platform, detected roughly 8,000 cases where widely used IP datasets placed the server in the wrong country — sometimes thousands of kilometers off.

This report walks through what we saw across VPN and IP data providers, takes a closer look at two particularly interesting countries, explores why measurement-based IP data matters if you care where your traffic really goes, and shares how we ran the investigation.

Which VPNs Matched Reality (And Which Didn't)

Here is the overlap between the number of listed countries each VPN provider claims to offer versus the countries with real VPN traffic that we measured — lower percentages indicate providers whose claimed lists best match our data:
It's important to note that we used the most commonly and widely supported technologies in this research, to make comparison between providers as fair as possible while giving us significant data to analyze, so this will not be the full coverage for each provider. These are some of the most visible names in the market. They also tend to have very long country lists on their websites. Notably, three well-known providers had zero mismatches across all the countries we tested: Mullvad, IVPN, and Windscribe.

A country mismatch doesn't automatically mean a provider offers a "bad VPN," but it does mean that if you're choosing a VPN because it claims "100+ countries," you should know that a significant share of those flags may be labels, or virtual locations.

What "Virtual Locations" Really Mean

When a VPN lets you connect to, for example, "Bahamas" or "Somalia," that doesn't always mean traffic routes through there. In many cases, it's somewhere entirely different, like Miami or London, but presented as if traffic is in the country you picked. This setup is known as a virtual location:

* The IP registry data also says "Country X" — because the provider self-declared it that way.
* But the network measurements (latency and routing) show the traffic actually exits in "Country Y" — often thousands of kilometers away.

The problem? Without active network measurement, most IP datasets will rely on what the IP's owner told the internet registry or published in WHOIS/geofeeds: a self-reported country tag. If that record is wrong or outdated, the mistake spreads everywhere. That's where IPinfo's ProbeNet comes in: by running live RTT tests from 1,200+ points of presence worldwide, we anchor each IP to its real-world location, not just its declared one.

Across the dataset, we found 97 countries where at least one VPN brand only ever appeared as virtual or unmeasurable in our data.
In other words, for a noticeable slice of the world map, some "locations" in VPNs never show up as true exits in our measurements. We also found 38 countries where every mention behaved this way: at least one VPN claimed them, but none ever produced a stable, measurable exit in that country in our sample.

You can think of these 38 as the "unmeasurable" countries in this study — places that exist in server lists, config files, and IP geofeeds, but never once appeared as the actual exit country in our measurements. They're not randomly scattered — they cluster in specific parts of the map.

This doesn't prove there is zero VPN infrastructure in those countries globally. It does show that, across the providers and locations we measured, the dominant pattern is to serve those locations from elsewhere. Here are two of the most interesting examples of how this looks at the IP level.

Case Studies: Two Countries That Only Exist on the Map

To make this concrete, let's look at two countries where every provider in our dataset turned out to be virtual: the Bahamas and Somalia.

Bahamas: All-Inclusive, Hosted in the US

In our measurements, five providers offered locations labeled as "Bahamas": NordVPN, ExpressVPN, Private Internet Access, FastVPN, and IPVanish. For all of them, measured traffic was in the United States, usually with sub-millisecond RTT to US probes.
Somalia: Mogadishu, via France and the UK

Somalia appears in our sample for only two providers: NordVPN and ProtonVPN. Both label Mogadishu explicitly in their naming, but the measured RTTs are exactly what you'd expect for traffic in Western Europe, and completely inconsistent with traffic in East Africa. Both providers go out of their way in the labels (e.g. "SO, Mogadishu"), but the actual traffic is in Nice and London, not Somalia.
When Legacy IP Providers Agree With the Wrong VPN Locations

So far, we've talked about VPN claims versus our measurements. But other IP data providers don't run active RTT tests. They rely on self-declared IP data sources, and often assume that if an IP is tagged as "Country X," it must actually be there. In these cases, legacy IP datasets typically "follow" the VPN provider's story: if the VPN markets the endpoint as Country X, the legacy IP dataset also places it in Country X.

To quantify that, we looked at 736 VPN exits where ProbeNet's measured country disagreed with one or more widely used legacy IP datasets. We then compared the country IPinfo's ProbeNet measured (backed by RTT and routing) with the country reported by these other IP datasets and computed the distance between them. The gaps are large.

How Far Off Were the Other IP Datasets?
The median error between ProbeNet and the legacy datasets was roughly 3,100 km. On the ProbeNet side, we have strong latency evidence that our measured country is the right one:

* The median minimum RTT to a probe in the measured country was 0.27 ms.
* About 90% of these locations had a sub-millisecond RTT from at least one probe.

That's what you expect when traffic is genuinely in that country, not thousands of kilometers away.

An IP Example You Can Test Yourself

This behavior is much more tangible when you can see it on a single IP. Here's one VPN exit IP where ProbeNet places the server in the United Kingdom, backed by sub-millisecond RTT from local probes, while other widely used legacy IP datasets place the same IP in Mauritius, 9,691 kilometers away.

If you want to check this yourself, you can plug it into a public measurement tool like https://ping.sx/ and run pings or traceroutes from different regions. Tools like this provide a clear visual for where latency is lowest. ProbeNet uses the same basic idea, but at a different scale: we maintain a network of 1,200+ points of presence (PoPs) around the world, so we can usually get even closer to the real physical location than public tools with smaller networks.

If you'd like to play with more real IPs (not necessarily VPNs) where ProbeNet and IPinfo get the country right and other datasets don't, you can find a fuller set of examples on our IP geolocation accuracy page.

Why This Happens and How It Impacts Trust

It's worth separating technical reasons from trust issues. There are technical reasons to use virtual or hubbed infrastructure:

* Risk & regulation. Hosting in certain countries can expose both the provider and users to local surveillance or seizure.
* Infrastructure quality. Some regions simply don't have the same density of reliable data centers or high-capacity internet links, so running servers there is harder and riskier.
* Performance & cost. Serving "Bahamas" from Miami or "Cambodia" from Singapore can be cheaper, faster, and easier to maintain.

From this perspective, a virtual location can be a reasonable compromise: you get a regional IP and content unblocking without the downsides of hosting in a fragile environment.

Where It Becomes a Trust Problem

* Lack of disclosure. Marking something clearly as "Virtual Bahamas (US-based)" is transparent. Listing "Bahamas" alongside "Germany" without any hint that one is virtual and the other is physical blurs the line between marketing and reality.
* Scale of the mismatch. It's one thing to have a few virtual locations in hard-to-host places. It's another when dozens of countries exist only as labels across your entire footprint, or when more than half of your tested locations are actually somewhere else.
* Downstream reliance. Journalists, activists, and NGOs may pick locations based on safety assumptions. Fraud systems, compliance workflows, and geo-restricted services may treat "Somalia" vs "France" as a meaningful difference. If both the VPN UI and the IP data say "Somalia" while the traffic is physically in France, everyone is making decisions on a false premise.

That last point leads directly into the IP data problem that we are focused on solving.

So How Much Should You Trust Your VPN?

If you're a VPN user, here are some practical takeaways from this work:

* Treat "100+ countries" as a marketing number, not a guarantee. In our sample, 97 countries existed only as claims, not reality, across 17 providers.
* Check how your provider talks about locations. Do they clearly label "virtual" servers? Document where they're actually hosted? Or do they quietly mix virtual and physical locations in one long list?
* If you rely on IP data professionally, ask where it comes from.
A static "99.x% accurate worldwide" claim doesn't tell you how an IP data provider handles fast-moving, high-stakes environments like VPN infrastructure.

Ultimately, this isn't an argument against VPNs, or even against virtual locations. It's an argument for honesty and evidence. If a VPN provider wants you to trust that map of flags, they should be willing, and able, to show that it matches the real network underneath.

Most legacy IP data providers rely on regional internet registry (RIR) allocation data and heuristics around routing and address blocks. These providers will often accept self-declared data like customer feedback, corrections, and geofeeds, without a clear way to verify them. IPinfo takes a different approach:

* Proprietary ProbeNet with 1,200+ points of presence. We maintain an internet measurement platform of PoPs in locations around the world.
* Active measurements. For each visible IP on the internet, including both IPv4 and IPv6 addresses, we measure RTT from multiple probes.
* Evidence-based geolocation. We combine these measurements with IPinfo's other signals to assign a country (and more granular location) that's grounded in how the internet actually behaves.

This measurement-first approach is unique in the IP data space. Once we realized how much inaccuracy came from self-declared data, we started investing heavily in research and building ProbeNet to use active measurements at scale. Our goal is to make IP data as evidence-based as possible, verified by observing how the internet actually behaves.

Our Methodology for This Report

We approached this VPN investigation the way a skeptical but well-equipped user would: start from the VPNs' own claims, then test them.

Step 1: Collecting the Claims

For each of the 20 VPN providers, we pulled together three kinds of data:

* Marketing promises: the "servers in X countries" claims and country lists from their websites. When a country was clearly listed there, we treated it as a location they actively promote.
* Configurations and location lists: configurations for different protocols like OpenVPN or WireGuard were collected, along with location information available in provider command-line tools, mobile applications, or APIs.
* Unique provider–location entries: we ended up with over 6,000,000 data points and a list of provider + location combinations we could actually try to connect to, with multiple IPs each.

Step 2: Observing Where the Traffic Really Goes

Next, we used IPinfo infrastructure and ProbeNet to dial into those locations and watch what actually happens:

* We connected to each VPN "location" and captured the exit IP addresses.
* For each exit IP address, we used IPinfo + ProbeNet's active measurements to determine a measured country, plus the round-trip time (RTT) from each probe (often under 1 ms), which is a strong hint about physical proximity.

Now we had two views for each location:

* Expected/claimed country: what the VPN claims in its UI/configs/website
* Measured country: where IPinfo + ProbeNet actually see the exit IP

For each location where a country was clearly specified, we asked a very simple question: does the expected country match the measured country? If yes, we counted it as a match. If not, it became a mismatch: a location where the app says one country, but the traffic exits somewhere else.

We deliberately used a very narrow definition of "mismatch." For a location to be counted, two things had to be true: the provider had to clearly claim a specific country (on their website, in their app, or in configs), and we had direct active measurements from ProbeNet for the exit IPs behind that location. We ignored any locations where the marketing was ambiguous, where we hadn't measured the exit directly, or where we only had weaker hints like hostname strings, registry data, or third-party IP databases. Those signals can be useful and true, but we wanted our numbers to be as hard to argue with as possible.

The result is that the mismatch rates we show here are conservative. With a looser methodology that also leaned on those additional hints, the numbers would almost certainly be higher, not lower.
...
Read the original on ipinfo.io »
I’ve started using the term HTML tools to refer to the HTML applications I’ve been building: single files that combine HTML, JavaScript, and CSS to provide useful functionality. I have built over 150 of these in the past two years, almost all of them written by LLMs. This article presents a collection of useful patterns I’ve discovered along the way.
First, some examples to show the kind of thing I’m talking about:
pypi-changelog lets you generate (and copy to clipboard) diffs between different PyPI package releases.
bluesky-thread provides a nested view of a discussion thread on Bluesky.
These are some of my recent favorites. I have dozens more like this that I use on a regular basis.
You can explore my collection on tools.simonwillison.net—the by month view is useful for browsing the entire collection.
If you want to see the code and prompts, almost all of the examples in this post include a link in their footer to “view source” on GitHub. The GitHub commits usually contain either the prompt itself or a link to the transcript used to create the tool.
These are the characteristics I have found to be most productive in building tools of this nature:
A single file: inline JavaScript and CSS in a single HTML file means the least hassle in hosting or distributing them, and crucially means you can copy and paste them out of an LLM response.
Avoid React, or anything with a build step. The problem with React is that JSX requires a build step, which makes everything massively less convenient. I prompt “no react” and skip that whole rabbit hole entirely.
Load dependencies from a CDN. The fewer dependencies the better, but if there’s a well known library that helps solve a problem I’m happy to load it from CDNjs or jsdelivr or similar.
Keep them small. A few hundred lines means the maintainability of the code doesn’t matter too much: any good LLM can read them and understand what they’re doing, and rewriting them from scratch with help from an LLM takes just a few minutes.
The end result is a few hundred lines of code that can be cleanly copied and pasted into a GitHub repository.
The easiest way to build one of these tools is to start in ChatGPT or Claude or Gemini. All three have features where they can write a simple HTML+JavaScript application and show it to you directly.
Claude calls this “Artifacts”; ChatGPT and Gemini both call it “Canvas”. Claude has the feature enabled by default, while ChatGPT and Gemini may require you to toggle it on in their “tools” menus.
Try this prompt in Gemini or ChatGPT:
Build a canvas that lets me paste in JSON and converts it to YAML. No React.
Or this prompt in Claude:
Build an artifact that lets me paste in JSON and converts it to YAML. No React.
I always add “No React” to these prompts, because otherwise they tend to build with React, resulting in a file that is harder to copy and paste out of the LLM and use elsewhere. I find that attempts which use React take longer to display (since they need to run a build step) and are more likely to contain crashing bugs for some reason, especially in ChatGPT.
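The heart of what comes back from a prompt like that is a small chunk of vanilla JavaScript. Here's a rough sketch of the conversion logic such a tool needs (my own simplified version, not output from any of these models; real YAML converters handle many more edge cases like multi-line strings and keys that need quoting):

```javascript
// Minimal JSON -> YAML conversion, the kind of core logic a
// "paste JSON, get YAML" tool needs. Sketch only: handles plain
// objects, arrays, and scalars.
function toYaml(value, indent = 0) {
  const pad = "  ".repeat(indent);
  if (Array.isArray(value)) {
    if (value.length === 0) return pad + "[]";
    return value
      .map((item) =>
        typeof item === "object" && item !== null
          ? pad + "-\n" + toYaml(item, indent + 1)
          : pad + "- " + JSON.stringify(item)
      )
      .join("\n");
  }
  if (typeof value === "object" && value !== null) {
    const keys = Object.keys(value);
    if (keys.length === 0) return pad + "{}";
    return keys
      .map((key) => {
        const v = value[key];
        return typeof v === "object" && v !== null
          ? pad + key + ":\n" + toYaml(v, indent + 1)
          : pad + key + ": " + JSON.stringify(v);
      })
      .join("\n");
  }
  return pad + JSON.stringify(value);
}
```

In the finished tool this function sits inline in the HTML file, wired up to a textarea's input event and a "Copy to clipboard" button.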
All three tools have “share” links that provide a URL to the finished application. Examples:
Coding agents such as Claude Code and Codex CLI have the advantage that they can test the code themselves while they work on it using tools like Playwright. I often upgrade to one of those when I’m working on something more complicated, like my Bluesky thread viewer tool shown above.
I also frequently use asynchronous coding agents like Claude Code for web to make changes to existing tools. I shared a video about that in Building a tool to copy-paste share terminal sessions using Claude Code for web.
Claude Code for web and Codex Cloud run directly against my simonw/tools repo, which means they can publish or upgrade tools via Pull Requests (here are dozens of examples) without me needing to copy and paste anything myself.
Any time I use an additional JavaScript library as part of my tool I like to load it from a CDN.
The three major LLM platforms support specific CDNs as part of their Artifacts or Canvas features, so often if you tell them “Use PDF.js” or similar they’ll be able to compose a URL to a CDN that’s on their allow-list.
Sometimes you’ll need to go and look up the URL on cdnjs or jsDelivr and paste it into the chat.
CDNs like these have been around for long enough that I’ve grown to trust them, especially for URLs that include the package version.
The alternative to CDNs is to use npm and have a build step for your projects. I find this reduces my productivity at hacking on individual tools and makes it harder to self-host them.
I don’t like leaving my HTML tools hosted by the LLM platforms themselves for a couple of reasons. First, LLM platforms tend to run the tools inside a tight sandbox with a lot of restrictions. They’re often unable to load data or images from external URLs, and sometimes even features like linking out to other sites are disabled.
The end-user experience often isn’t great either. They show warning messages to new users, often take additional time to load and delight in showing promotions for the platform that was used to create the tool.
They’re also not as reliable as other forms of static hosting. If ChatGPT or Claude are having an outage I’d like to still be able to access the tools I’ve created in the past.
Being able to easily self-host is the main reason I like insisting on “no React” and using CDNs for dependencies—the absence of a build step makes hosting tools elsewhere a simple case of copying and pasting them out to some other provider.
My preferred provider here is GitHub Pages because I can paste a block of HTML into a file on github.com and have it hosted on a permanent URL a few seconds later. Most of my tools end up in my simonw/tools repository which is configured to serve static files at tools.simonwillison.net.
One of the most useful input/output mechanisms for HTML tools comes in the form of copy and paste.
I frequently build tools that accept pasted content, transform it in some way and let the user copy it back to their clipboard to paste somewhere else.
Copy and paste on mobile phones is fiddly, so I frequently include “Copy to clipboard” buttons that populate the clipboard with a single touch.
Most operating system clipboards can carry multiple formats of the same copied data. That’s why you can paste content from a word processor in a way that preserves formatting, but if you paste the same thing into a text editor you’ll get the content with formatting stripped.
These rich copy operations are available in JavaScript paste events as well, which opens up all sorts of opportunities for HTML tools.
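The paste event exposes those multiple formats through its clipboardData object. A minimal sketch of the enumeration trick (the function takes the DataTransfer-like object as an argument so the logic is easy to test; a real tool reads it from the event):

```javascript
// Collect every representation available for a single paste.
// clipboardData.types lists the MIME types the clipboard carries;
// getData() pulls out each one as a string.
function extractPasteFormats(clipboardData) {
  const formats = {};
  for (const type of clipboardData.types) {
    formats[type] = clipboardData.getData(type);
  }
  return formats;
}

// Browser wiring (illustrative):
// document.addEventListener("paste", (e) => {
//   console.log(extractPasteFormats(e.clipboardData));
// });
```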
hacker-news-thread-export lets you paste in a URL to a Hacker News thread and gives you a copyable condensed version of the entire thread, suitable for pasting into an LLM to get a useful summary.
paste-rich-text lets you copy from a page and paste to get the HTML—particularly useful on mobile where view-source isn’t available.
alt-text-extractor lets you paste in images and then copy out their alt text.
The key to building interesting HTML tools is understanding what’s possible. Building custom debugging tools is a great way to explore these options.
clipboard-viewer is one of my most useful. You can paste anything into it (text, rich text, images, files) and it will loop through and show you every type of paste data that’s available on the clipboard.
This was key to building many of my other tools, because it showed me the invisible data that I could use to bootstrap other interesting pieces of functionality.
keyboard-debug shows the keys (and KeyCode values) currently being held down.
cors-fetch reveals if a URL can be accessed via CORS.
HTML tools may not have access to server-side databases for storage but it turns out you can store a lot of state directly in the URL.
I like this for tools I may want to bookmark or share with other people.
icon-editor is a custom 24x24 icon editor I built to help hack on icons for the GitHub Universe badge. It persists your in-progress icon design in the URL so you can easily bookmark and share it.
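The pattern behind that is simple: serialize the state to JSON and stash it in location.hash. A sketch, with function names of my own invention (real tools may also compress the state to keep URLs short):

```javascript
// Serialize tool state into a URL fragment so a bookmark
// captures the whole document.
function encodeState(state) {
  // encodeURIComponent keeps the JSON safe to embed in a URL
  return "#" + encodeURIComponent(JSON.stringify(state));
}

function decodeState(hash) {
  if (!hash || hash === "#") return null;
  return JSON.parse(decodeURIComponent(hash.slice(1)));
}

// Browser wiring (illustrative):
// location.hash = encodeState({pixels});
// window.addEventListener("hashchange", () =>
//   render(decodeState(location.hash)));
```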
The localStorage browser API lets HTML tools store data persistently on the user’s device, without exposing that data to the server.
I use this for larger pieces of state that don’t fit comfortably in a URL, or for secrets like API keys which I really don’t want anywhere near my server —even static hosts might have server logs that are outside of my influence.
word-counter is a simple tool I built to help me write to specific word counts, for things like conference abstract submissions. It uses localStorage to save as you type, so your work isn’t lost if you accidentally close the tab.
render-markdown uses the same trick—I sometimes use this one to craft blog posts and I don’t want to lose them.
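The underlying autosave pattern is just a few lines. In this sketch the storage object is passed in as a parameter so the logic can run outside a browser; a real tool would use window.localStorage directly, and the key name here is illustrative:

```javascript
const DRAFT_KEY = "draft"; // illustrative key name

// Save the work-in-progress text on every keystroke.
function saveDraft(storage, text) {
  storage.setItem(DRAFT_KEY, text);
}

// Restore it on page load; empty string if nothing was saved.
function restoreDraft(storage) {
  return storage.getItem(DRAFT_KEY) ?? "";
}

// Browser wiring (illustrative):
// textarea.addEventListener("input", () =>
//   saveDraft(localStorage, textarea.value));
// textarea.value = restoreDraft(localStorage);
```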
haiku is one of a number of LLM demos I’ve built that request an API key from the user (via the prompt() function) and then store that in localStorage. This one uses Claude Haiku to write haikus about what it can see through the user’s webcam.
CORS stands for Cross-Origin Resource Sharing. It’s a relatively low-level detail that controls whether JavaScript running on one site is able to fetch data from APIs hosted on other domains.
APIs that provide open CORS headers are a goldmine for HTML tools. It’s worth building a collection of these over time.
Here are some I like:
* iNaturalist for fetching sightings of animals, including URLs to photos
* GitHub, because anything in a public repository has a CORS-enabled anonymous API for fetching that content from the raw.githubusercontent.com domain. It sits behind a caching CDN, so you don’t need to worry too much about rate limits or feel guilty about adding load to their infrastructure.
* Bluesky for all sorts of operations
* Mastodon has generous CORS policies too, as used by applications like phanpy.social
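The raw.githubusercontent.com pattern mentioned above, for example, needs nothing more than a URL template (the helper name is mine):

```javascript
// Build the CORS-enabled raw-content URL for a file in a
// public GitHub repository.
function rawGitHubUrl(owner, repo, ref, path) {
  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${path}`;
}

// In the browser (or any fetch-capable runtime):
// const readme = await fetch(
//   rawGitHubUrl("simonw", "tools", "main", "README.md")
// ).then((r) => r.text());
```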
GitHub Gists are a personal favorite here, because they let you build apps that can persist state to a permanent Gist through making a cross-origin API call.
species-observation-map uses iNaturalist to show a map of recent sightings of a particular species.
zip-wheel-explorer fetches a .whl file for a Python package from PyPI, unzips it (in browser memory) and lets you navigate the files.
github-issue-to-markdown fetches issue details and comments from the GitHub API (including expanding any permanent code links) and turns them into copyable Markdown.
terminal-to-html can optionally save the user’s converted terminal session to a Gist.
bluesky-quote-finder displays quotes of a specified Bluesky post, which can then be sorted by likes or by time.
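Persisting to a Gist boils down to a single authenticated POST to the public GitHub API. A sketch of the request body (the field names follow GitHub's create-a-gist endpoint; the helper function is my own):

```javascript
// Build the JSON body for creating a Gist via the GitHub API.
// files maps each filename to an object with a content key.
function gistPayload(filename, content, description = "") {
  return {
    description,
    public: false, // secret gist by default
    files: {[filename]: {content}},
  };
}

// Browser-only usage (needs a token with the gist scope):
// await fetch("https://api.github.com/gists", {
//   method: "POST",
//   headers: {Authorization: `Bearer ${token}`},
//   body: JSON.stringify(gistPayload("session.html", html)),
// });
```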
All three of OpenAI, Anthropic and Gemini offer JSON APIs that can be accessed via CORS directly from HTML tools.
Unfortunately you still need an API key, and if you bake that key into your visible HTML anyone can steal it and use it to rack up charges on your account.
I use the localStorage secrets pattern to store API keys for these services. This sucks from a user experience perspective—telling users to go and create an API key and paste it into a tool is a lot of friction—but it does work.
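The pattern itself is only a few lines. In this sketch the prompt and storage are injected as parameters so the flow can be tested; a browser tool would pass window.prompt and localStorage:

```javascript
// Ask the user for an API key once, then reuse it on later
// visits. The key never leaves the user's own browser storage.
function getApiKey(storage, ask, keyName = "api-key") {
  let key = storage.getItem(keyName);
  if (!key) {
    key = ask("Enter your API key:");
    if (key) storage.setItem(keyName, key);
  }
  return key;
}

// Browser usage (illustrative):
// const key = getApiKey(localStorage, prompt, "anthropic-key");
```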
haiku uses the Claude API to write a haiku about an image from the user’s webcam.
gemini-bbox demonstrates Gemini 2.5’s ability to return complex shaped image masks for objects in images, see Image segmentation using Gemini 2.5.
You don’t need to upload a file to a server in order to make use of the &lt;input type="file"&gt; element. JavaScript can access the content of the selected file directly, which opens up a wealth of opportunities for useful functionality.
ocr is the first tool I built for my collection, described in Running OCR against PDFs and images directly in your browser. It uses PDF.js and Tesseract.js to allow users to open a PDF in their browser which it then converts to an image-per-page and runs through OCR.
social-media-cropper lets you open (or paste in) an existing image and then crop it to common dimensions needed for different social media platforms—2:1 for Twitter and LinkedIn, 1.4:1 for Substack etc.
ffmpeg-crop lets you open and preview a video file in your browser, drag a crop box within it and then copy out the ffmpeg command needed to produce a cropped copy on your own machine.
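A tool like that never needs to decode the video itself: it just assembles the ffmpeg invocation for the user to run locally. A sketch of that command-building step (the helper name is mine; ffmpeg's crop filter takes width:height:x:y):

```javascript
// Assemble an ffmpeg command that crops a video to the box the
// user dragged out in the browser preview.
function buildCropCommand(input, output, {width, height, x, y}) {
  return `ffmpeg -i ${input} -vf "crop=${width}:${height}:${x}:${y}" -c:a copy ${output}`;
}
```

The browser does the interactive part (previewing the video, dragging the crop box) and hands the heavy lifting back to a local tool via a copyable command.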
An HTML tool can generate a file for download without needing help from a server.
The JavaScript library ecosystem has a huge range of packages for generating files in all kinds of useful formats.
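For plain text formats you don't even need a library: a data: URL plus a temporary anchor element is enough. A minimal sketch, with the browser-only part shown as comments:

```javascript
// Turn a string into a data: URL that a browser will happily
// download as a file, with no server involved.
function textToDataUrl(text, mimeType = "text/plain") {
  return `data:${mimeType};charset=utf-8,${encodeURIComponent(text)}`;
}

// Browser-only wiring (illustrative):
// const a = document.createElement("a");
// a.href = textToDataUrl("hello world");
// a.download = "hello.txt";
// a.click();
```

For binary formats (zips, PDFs, images) the same trick works with a Blob and URL.createObjectURL instead of a data: URL.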
Pyodide is a distribution of Python that’s compiled to WebAssembly and designed to run directly in browsers. It’s an engineering marvel and one of the most underrated corners of the Python world.
It also cleanly loads from a CDN, which means there’s no reason not to use it in HTML tools!
Even better, the Pyodide project includes micropip—a mechanism that can load extra pure-Python packages from PyPI via CORS.
pyodide-bar-chart demonstrates running Pyodide, Pandas and matplotlib to render a bar chart directly in the browser.
numpy-pyodide-lab is an experimental interactive tutorial for Numpy.
apsw-query demonstrates the APSW SQLite library running in a browser, using it to show EXPLAIN QUERY plans for SQLite queries.
Pyodide is possible thanks to WebAssembly. WebAssembly means that a vast collection of software originally written in other languages can now be loaded in HTML tools as well.
Squoosh.app was the first example I saw that convinced me of the power of this pattern—it makes several best-in-class image compression libraries available directly in the browser.
I’ve used WebAssembly for a few of my own tools:
The biggest advantage of having a single public collection of 100+ tools is that it’s easy for my LLM assistants to recombine them in interesting ways.
Sometimes I’ll copy and paste a previous tool into the context, but when I’m working with a coding agent I can reference them by name—or tell the agent to search for relevant examples before it starts work.
The source code of any working tool doubles as clear documentation of how something can be done, including patterns for using editing libraries. An LLM with one or two existing tools in its context is much more likely to produce working code.
And then, after it had found and read the source code for zip-wheel-explorer:
Build a new tool pypi-changelog.html which uses the PyPI API to get the wheel URLs of all available versions of a package, then it displays them in a list where each pair has a “Show changes” clickable in between them - clicking on that fetches the full contents of the wheels and displays a nicely rendered diff representing the difference between the two, as close to a standard diff format as you can get with JS libraries from CDNs, and when that is displayed there is a “Copy” button which copies that diff to the clipboard
See Running OCR against PDFs and images directly in your browser for another detailed example of remixing tools to create something new.
I like keeping (and publishing) records of everything I do with LLMs, to help me grow my skills at using them over time.
For HTML tools I built by chatting with an LLM platform directly, I use the “share” feature of those platforms.
For Claude Code or Codex CLI or other coding agents I copy and paste the full transcript from the terminal into my terminal-to-html tool and share that using a Gist.
In either case I include links to those transcripts in the commit message when I save the finished tool to my repository. You can see those in my tools.simonwillison.net colophon.
I’ve had so much fun exploring the capabilities of LLMs this way over the past year and a half. Building these tools has been invaluable in helping me understand both the potential of HTML tools and the capabilities of the LLMs I’m building them with.
If you’re interested in starting your own collection I highly recommend it! All you need to get started is a free GitHub repository with GitHub Pages enabled (Settings -> Pages -> Source -> Deploy from a branch -> main) and you can start copying in .html pages generated in whatever manner you like.
...
Read the original on simonwillison.net »
I do Advent of Code every year.
For the last seven years, including this one, I have managed to get all the stars. I do not say that to brag. I say it because it explains why I keep coming back.
It is one of the few tech traditions I never get bored of, even after doing it for a long time. I like the time pressure. I like the community vibe. I like that every December I can pick one language and go all in.
Advent of Code is usually 25 days. This year Eric decided to do 12 days instead.
So instead of 50 parts, it was 24.
That sounds like a relaxed year. It was not, but not in a bad way.
The easier days were harder than the easy days in past years, but they were also really engaging and fun to work through. The hard days were hard, especially the last three, but they were still the good kind of hard. They were problems I actually wanted to wrestle with.
It also changes the pacing in a funny way. In a normal year, by day 10 you have a pretty comfy toolbox. This year it felt like the puzzles were already demanding that toolbox while I was still building it.
That turned out to be a perfect setup for learning a new language.
Gleam is easy to like quickly.
The syntax is clean. The compiler is helpful, and the error messages are super duper good. Rust good.
Most importantly, the language strongly nudges you into a style that fits Advent of Code really well. Parse some text. Transform it a few times. Fold. Repeat.
One thing I did not expect was how good the editor experience would be. The LSP worked much better than I expected. It basically worked perfectly the whole time. I used the Gleam extension for IntelliJ and it was great.
I also just like FP.
FP is not always easier, but it is often easier. When it clicks, you stop writing instructions and you start describing the solution.
The first thing I fell in love with was echo.
It is basically a print statement that does not make you earn it. You can echo any value. You do not have to format anything. You do not have to build a string. You can just drop it into a pipeline and keep going.
This is the kind of thing I mean:
You can quickly inspect values at multiple points without breaking the flow.
I did miss string interpolation, especially early on. echo made up for a lot of that.
It mostly hit when I needed to generate text, not when I needed to inspect values. The day where I generated an LP file for glpsol is the best example. It is not hard code, but it is a lot of string building. Without interpolation it turns into a bit of a mess of <>s.
This is a small excerpt from my LP generator:
It works. It is just the kind of code where you really feel the missing interpolation.
Grids are where you normally either crash into out-of-bounds bugs or litter your code with bounds checks you do not care about.
In my day 4 solution I used a dict as a grid. The key ergonomic part is that dict.get gives you an option-like result, which makes neighbour checking safe by default.
This is the neighbour function from my solution:
That last line is the whole point.
No bounds checks. No sentinel values. Out of bounds just disappears.
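The same trick works in any language with an option-returning map lookup. A rough Python rendering of the idea (not the author’s Gleam code; the grid representation here is a plain dict keyed by coordinate):

```python
def neighbours(grid, x, y):
    """grid maps (x, y) -> cell value; dict.get returns None for out of bounds."""
    deltas = [(-1, -1), (0, -1), (1, -1), (-1, 0),
              (1, 0), (-1, 1), (0, 1), (1, 1)]
    # Out-of-bounds lookups yield None and are simply filtered away:
    # no bounds checks, no sentinel values.
    return [v for dx, dy in deltas
            if (v := grid.get((x + dx, y + dy))) is not None]

grid = {(x, y): f"cell{x}{y}" for x in range(3) for y in range(3)}
print(len(neighbours(grid, 0, 0)))  # a corner cell has only 3 neighbours
```

Looking up a coordinate off the edge of the grid never raises; it just produces nothing.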
I expected to write parsers and helpers, and I did. What I did not expect was how often Gleam already had the exact list function I needed.
I read the input, chunked it into rows, transposed it, and suddenly the rest of the puzzle became obvious.
In a lot of languages you end up writing your own transpose yet again. In Gleam it is already there.
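For comparison, the chunk-then-transpose step can be sketched in Python, where `zip(*rows)` plays the role of the built-in transpose (sample data made up for illustration):

```python
def chunk(items, size):
    """Split a flat list into consecutive rows of the given size."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def transpose(rows):
    """Turn rows into columns: zip(*rows) pairs up the i-th element of each row."""
    return [list(col) for col in zip(*rows)]

rows = chunk([1, 2, 3, 4, 5, 6], 3)   # [[1, 2, 3], [4, 5, 6]]
print(transpose(rows))                # [[1, 4], [2, 5], [3, 6]]
```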
Another example is list.combination_pairs.
In day 8 I needed all pairs of 3D points. In an imperative language you would probably write nested loops and then question your off-by-one logic.
In Gleam it is a one liner:
Sometimes FP is not about being clever. It is about having the right function name.
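The Python analogue of `list.combination_pairs` is `itertools.combinations` with size 2; here it is on a few made-up 3D points:

```python
from itertools import combinations

points = [(0, 0, 0), (1, 0, 0), (0, 2, 0)]   # sample points, not puzzle input
pairs = list(combinations(points, 2))        # every unordered pair, no nested loops
print(len(pairs))                            # n*(n-1)/2 pairs for n points
```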
If I had to pick one feature that made me want to keep writing Gleam after AoC, it is fold_until.
Early exit without hacks is fantastic in puzzles.
In day 8 part 2 I kept merging sets until the first set in the list contained all boxes. When that happens, I stop.
The core shape looks like this:
It is small, explicit, and it reads like intent.
I also used fold_until in day 10 part 1 to find the smallest combination size that works.
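Gleam’s `fold_until` folds with an explicit continue-or-stop signal. A rough Python stand-in (not the author’s day 8 code; the step function returns the new accumulator plus a done flag):

```python
def fold_until(items, acc, step):
    """Fold step over items, stopping as soon as step reports it is done."""
    for item in items:
        acc, done = step(acc, item)
        if done:
            break
    return acc

# Example: keep merging sets until the accumulated set covers the target.
target = {1, 2, 3, 4}
sets = [{1}, {2, 3}, {4}, {99}]
merged = fold_until(sets, set(),
                    lambda acc, s: (acc | s, (acc | s) >= target))
print(sorted(merged))  # {99} is never merged: we stopped early
```

The early exit is part of the fold itself, so there is no exception-as-control-flow hack and no mutable “stop” flag threaded through the loop body.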
Even though I enjoyed Gleam a lot, I did hit a few recurring friction points.
None of these are deal breakers. They are just the kind of things you notice when you do 24 parts in a row.
This one surprised me on day 1.
For AoC you read a file every day. In this repo I used simplifile everywhere because you need something. It is fine, I just did not expect basic file IO to be outside the standard library.
Day 2 part 2 pushed me into regex and I had to add gleam_regexp.
This is the style I used, building a regex from a substring:
Again, totally fine. It just surprised me.
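The author’s actual pattern is not shown here, but the style of building a regex from a runtime substring translates directly to Python; this is a made-up repeated-substring check, with `re.escape` guarding against metacharacters in the input:

```python
import re

def contains_repeated(sub: str, text: str) -> bool:
    """True if sub occurs at least twice in a row anywhere in text."""
    pattern = re.compile(f"({re.escape(sub)}){{2,}}")
    return pattern.search(text) is not None

print(contains_repeated("ab", "xxababxx"))  # True: "abab" is there
```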
You can do [first, ..rest] and you can do [first, second].
But you cannot do [first, ..middle, last].
It is not the end of the world, but it would have made some parsing cleaner.
In Gleam a lot of comparisons are not booleans. You get an order value.
This is great for sorting. It is also very explicit. It can be a bit verbose when you just want a simple boolean check.
In day 5 I ended up writing patterns like this:
I used bigi a few times this year.
On the Erlang VM, integers are arbitrary precision, so you usually do not care about overflow. That is one of the nicest things about the BEAM.
If you want your Gleam code to also target JavaScript, you do care. JavaScript has limits, and suddenly using bigi becomes necessary for some puzzles.
I wish that was just part of Int, with a single consistent story across targets.
Day 10 part 1 was my favorite part of the whole event.
The moment I saw the toggling behavior, it clicked as XOR. Represent the lights as a number. Represent each button as a bitmask. Find the smallest combination of bitmasks that XOR to the target.
This is the fold from my solution:
It felt clean, it felt fast, and it felt like the representation did most of the work.
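The approach above, lights as an integer, buttons as bitmasks, smallest combination that XORs to the target, can be sketched in Python (the masks below are invented, not the puzzle’s):

```python
from itertools import combinations
from functools import reduce

def smallest_xor_combo(masks, target):
    """Return the fewest masks whose XOR equals target, or None if impossible."""
    for size in range(1, len(masks) + 1):        # try smaller combinations first
        for combo in combinations(masks, size):
            if reduce(lambda a, b: a ^ b, combo, 0) == target:
                return combo
    return None

buttons = [0b1010, 0b0110, 0b0001]               # hypothetical button masks
print(smallest_xor_combo(buttons, 0b1100))       # the two masks whose XOR is 0b1100
```

Because toggling twice cancels out, XOR captures the whole toggle semantics, and the representation really does do most of the work.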
I knew brute force was out. It was clearly a system of linear equations.
In previous years I would reach for Z3, but there are no Z3 bindings for Gleam. I tried to stay in Gleam, and I ended up generating an LP file and shelling out to glpsol using shellout.
It worked, and honestly the LP format is beautiful.
Here is the call:
It is a hack, but it is a pragmatic hack, and that is also part of Advent of Code.
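In Python the same pragmatic hack would look roughly like this: build the LP text, write it to a file, and shell out to glpsol (the solver invocation is sketched in a comment and assumes glpsol is on your PATH; the problem data is illustrative):

```python
def make_lp(costs, rows, rhs):
    """Emit a tiny CPLEX-LP-format problem: minimize costs . x s.t. rows @ x = rhs."""
    names = [f"x{i}" for i in range(len(costs))]
    obj = " + ".join(f"{c} {n}" for c, n in zip(costs, names))
    cons = "\n".join(
        f" c{j}: " + " + ".join(f"{a} {n}" for a, n in zip(row, names)) + f" = {b}"
        for j, (row, b) in enumerate(zip(rows, rhs)))
    gen = " ".join(names)   # declare every variable as a general integer
    return f"Minimize\n obj: {obj}\nSubject To\n{cons}\nGeneral\n {gen}\nEnd\n"

lp = make_lp([1, 1], [[2, 3]], [12])
print(lp)
# To actually solve it:
#   import subprocess, tempfile
#   with tempfile.NamedTemporaryFile("w", suffix=".lp", delete=False) as f:
#       f.write(lp)
#   subprocess.run(["glpsol", "--lp", f.name, "-o", "solution.txt"])
```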
Day 11 part 2 is where I was happy I was writing Gleam.
The important detail was that the memo key is not just the node. It is the node plus your state.
In my case the key was:
Once I got the memo threading right, it ran instantly.
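Memoising on node-plus-state rather than node alone can be sketched in Python, where the cache key is simply the full argument tuple (the graph and the extra state flag here are invented placeholders, not the puzzle’s):

```python
from functools import lru_cache

GRAPH = {"start": ("a", "end"), "a": ("end",)}

@lru_cache(maxsize=None)                       # cache key = (node, seen_special)
def count_paths(node, seen_special):
    """Count paths to "end"; the memo key must include the traversal state."""
    if node == "end":
        return 1
    total = 0
    for nxt in GRAPH.get(node, ()):
        total += count_paths(nxt, seen_special or nxt == "special")
    return total

print(count_paths("start", False))
```

If the key were just `node`, two visits to the same node under different states would wrongly share a cached answer; including the state makes the memo sound.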
The last day was the only puzzle I did not fully enjoy.
Not because it was bad. It just felt like it relied on assumptions about the input, and I am one of those people who does not love doing that.
I overthought it for a bit, then I learned it was more of a troll problem. The “do the areas of the pieces, when fully interlocked, fit on the board” heuristic was enough.
In my solution it is literally this:
Sometimes you build a beautiful mental model and then the right answer is a single inequality.
I am very happy I picked Gleam this year.
It has sharp edges, mostly around where the standard library draws the line and a few language constraints that show up in puzzle code. But it also has real strengths.
Pipelines feel good. Options and Results make unsafe problems feel safe. The list toolbox is better than I expected. fold_until is incredible. Once you stop trying to write loops and you let it be functional, the solutions start to feel clearer.
I cannot wait to try Gleam in a real project. I have been thinking about using it to write a webserver, and I am genuinely excited to give it a go.
And of course, I cannot wait for next year’s Advent of Code.
If you want to look at the source for all 12 days, it is here:
...
Read the original on blog.tymscar.com »
Given that there would only be one service, it made sense to move all the destination code into one repo, which meant merging all the different dependencies and tests into a single repo. We knew this was going to be messy.
For each of the 120 unique dependencies, we committed to having one version for all our destinations. As we moved destinations over, we’d check the dependencies it was using and update them to the latest versions. We fixed anything in the destinations that broke with the newer versions.
With this transition, we no longer needed to keep track of the differences between dependency versions. All our destinations were using the same version, which significantly reduced the complexity across the codebase. Maintaining destinations now became less time consuming and less risky.
We also wanted a test suite that allowed us to quickly and easily run all our destination tests. Running all the tests was one of the main blockers when making updates to the shared libraries we discussed earlier.
Fortunately, the destination tests all had a similar structure. They had basic unit tests to verify our custom transform logic was correct and would execute HTTP requests to the partner’s endpoint to verify that events showed up in the destination as expected.
Recall that the original motivation for separating each destination codebase into its own repo was to isolate test failures. However, it turned out this was a false advantage. Tests that made HTTP requests were still failing with some frequency. With destinations separated into their own repos, there was little motivation to clean up failing tests. This poor hygiene led to a constant source of frustrating technical debt. Often a small change that should have only taken an hour or two would end up requiring a couple of days to a week to complete.
The outbound HTTP requests to destination endpoints during the test run was the primary cause of failing tests. Unrelated issues like expired credentials shouldn’t fail tests. We also knew from experience that some destination endpoints were much slower than others. Some destinations took up to 5 minutes to run their tests. With over 140 destinations, our test suite could take up to an hour to run.
To solve for both of these, we created Traffic Recorder. Traffic Recorder is built on top of yakbak, and is responsible for recording and saving destinations’ test traffic. Whenever a test runs for the first time, any requests and their corresponding responses are recorded to a file. On subsequent test runs, the request and response in the file are played back instead of requesting the destination’s endpoint. These files are checked into the repo so that the tests are consistent across every change. Now that the test suite no longer depends on HTTP requests over the internet, our tests became significantly more resilient, a must-have for the migration to a single repo.
After we integrated Traffic Recorder, it took milliseconds to run the tests for all 140+ of our destinations. In the past, a single destination could take a couple of minutes to complete. It felt like magic.
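The record-and-replay mechanism can be sketched in a few lines of Python (yakbak itself is a Node.js library; this stand-in uses hypothetical names and a plain callable in place of a real HTTP client):

```python
import json, os, hashlib, tempfile

def replaying_fetch(url, real_fetch, cassette_dir):
    """Record a response the first time; replay it from disk on later runs."""
    os.makedirs(cassette_dir, exist_ok=True)
    name = hashlib.sha256(url.encode()).hexdigest() + ".json"
    tape = os.path.join(cassette_dir, name)
    if os.path.exists(tape):              # replay: no network traffic at all
        with open(tape) as f:
            return json.load(f)["body"]
    body = real_fetch(url)                # first run: hit the real endpoint
    with open(tape, "w") as f:
        json.dump({"url": url, "body": body}, f)  # commit this file to the repo
    return body

# Demo with a fake transport standing in for the real HTTP client:
calls = []
def fake_fetch(url):
    calls.append(url)
    return "recorded-response"

cassettes = tempfile.mkdtemp()
first = replaying_fetch("https://example.test/track", fake_fetch, cassettes)
second = replaying_fetch("https://example.test/track", fake_fetch, cassettes)
print(first == second, len(calls))  # replay means only one real call
```

Checking the tape files into the repo is what makes the suite deterministic: every developer and every CI run plays back the exact same responses.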
Once the code for all destinations lived in a single repo, they could be merged into a single service. With every destination living in one service, our developer productivity substantially improved. We no longer had to deploy 140+ services for a change to one of the shared libraries. One engineer can deploy the service in a matter of minutes.
The proof was in the improved velocity. When our microservice architecture was still in place, we made 32 improvements to our shared libraries. One year later, we’ve made 46 improvements.
The change also benefited our operational story. With every destination living in one service, we had a good mix of CPU and memory-intense destinations, which made scaling the service to meet demand significantly easier. The large worker pool can absorb spikes in load, so we no longer get paged for destinations that process small amounts of load.
Moving from our microservice architecture to a monolith was overall a huge improvement. However, there are trade-offs:
Fault isolation is difficult. With everything running in a monolith, if a bug is introduced in one destination that causes the service to crash, the service will crash for all destinations. We have comprehensive automated testing in place, but tests can only get you so far. We are currently working on a much more robust way to prevent one destination from taking down the entire service while still keeping all the destinations in a monolith.
In-memory caching is less effective. Previously, with one service per destination, our low traffic destinations only had a handful of processes, which meant their in-memory caches of control plane data would stay hot. Now that cache is spread thinly across 3000+ processes so it’s much less likely to be hit. We could use something like Redis to solve for this, but then that’s another point of scaling for which we’d have to account. In the end, we accepted this loss of efficiency given the substantial operational benefits.
Updating the version of a dependency may break multiple destinations. While moving everything to one repo solved the previous dependency mess we were in, it means that if we want to use the newest version of a library, we’ll potentially have to update other destinations to work with the newer version. In our opinion though, the simplicity of this approach is worth the trade-off. And with our comprehensive automated test suite, we can quickly see what breaks with a newer dependency version.
Our initial microservice architecture worked for a time, solving the immediate performance issues in our pipeline by isolating the destinations from each other. However, we weren’t set up to scale. We lacked the proper tooling for testing and deploying the microservices when bulk updates were needed. As a result, our developer productivity quickly declined.
Moving to a monolith allowed us to rid our pipeline of operational issues while significantly increasing developer productivity. We didn’t make this transition lightly though and knew there were things we had to consider if it was going to work.
We needed a rock solid testing suite to put everything into one repo. Without this, we would have been in the same situation as when we originally decided to break them apart. Constant failing tests hurt our productivity in the past, and we didn’t want that happening again. We accepted the trade-offs inherent in a monolithic architecture and made sure we had a good story around each. We had to be comfortable with some of the sacrifices that came with this change.
When deciding between microservices or a monolith, there are different factors to consider with each. In some parts of our infrastructure, microservices work well but our server-side destinations were a perfect example of how this popular trend can actually hurt productivity and performance. It turns out, the solution for us was a monolith.
The transition to a monolith was made possible by Stephen Mathieson, Rick Branson, Achille Roussel, Tom Holmes, and many more.
Special thanks to Rick Branson for helping review and edit this post at every stage.
...
Read the original on www.twilio.com »
Memory safety and sandboxing are two different things. It’s reasonable to think of them as orthogonal: you could have memory safety but not be sandboxed, or you could be sandboxed but not memory safe.
* Example of memory safe but not sandboxed: a pure Java program that opens files on the filesystem for reading and writing and accepts filenames from the user. The OS will allow this program to overwrite any file that the user has access to. This program can be quite dangerous even if it is memory safe. Worse, imagine that the program didn’t have any code to open files for reading and writing, but also had no sandbox to prevent those syscalls from working. If there was a bug in the memory safety enforcement of this program (say, because of a bug in the Java implementation), then an attacker could cause this program to overwrite any file if they succeeded at achieving code execution via weird state.
* Example of sandboxed but not memory safe: a program written in assembly that starts by requesting that the OS revoke all of its capabilities beyond pure compute. If the program later tried to open a file or write to one, the kernel would kill the process, based on the earlier request to have those capabilities revoked. This program could have lots of memory safety bugs (because it’s written in assembly), but even if it did, the attacker cannot make it overwrite any file unless they find some way to bypass the sandbox.
In practice, sandboxes have holes by design. A typical sandbox allows the program to send and receive messages to broker processes that have higher privileges. So, an attacker may first use a memory safety bug to make the sandboxed process send malicious messages, and then use those malicious messages to break into the brokers.
The best kind of defense is to have both a sandbox and memory safety. This document describes how to combine sandboxing and Fil-C’s memory safety by explaining what it takes to port OpenSSH’s seccomp-based Linux sandbox code to Fil-C.
Fil-C is a memory safe implementation of C and C++ and this site has a lot of documentation about it. Unlike most memory safe languages, Fil-C enforces safety down to where your code meets Linux syscalls and the Fil-C runtime is robust enough that it’s possible to use it in low-level system components like init and udevd. Lots of programs work in Fil-C, including OpenSSH, which makes use of seccomp-BPF sandboxing.
This document focuses on how OpenSSH uses seccomp and other technologies on Linux to build a sandbox around its unprivileged sshd-session process. Let’s review what tools Linux gives us that OpenSSH uses:
* chroot to restrict the process’s view of the filesystem.
* Running the process with the sshd user and group, and giving that user/group no privileges.
* setrlimit to prevent opening files, starting processes, or writing to files.
* seccomp-BPF syscall filter to reduce the attack surface by allowlisting only the set of syscalls that are legitimate for the unprivileged process. Syscalls not in the allowlist will crash the process with SIGSYS.
The Chromium developers and the Mozilla developers both have excellent notes about how to do sandboxing on Linux using seccomp. Seccomp-BPF is a well-documented kernel feature that can be used as part of a larger sandboxing story.
Fil-C makes it easy to use chroot and different users and groups. The syscalls that are used for that part of the sandbox are trivially allowed by Fil-C and no special care is required to use them.
Both setrlimit and seccomp-BPF require special care because the Fil-C runtime starts threads, allocates memory, and performs synchronization. This document describes what you need to know to make effective use of those sandboxing technologies in Fil-C. First, I describe how to build a sandbox that prevents thread creation without breaking Fil-C’s use of threads. Then, I describe what tweaks I had to make to OpenSSH’s seccomp filter. Finally, I describe how the Fil-C runtime implements the syscalls used to install seccomp filters.
The Fil-C runtime uses multiple background threads for garbage collection and has the ability to automatically shut those threads down when they are not in use. If the program wakes up and starts allocating memory again, then those threads are automatically restarted.
Starting threads violates the “no new processes” rule that OpenSSH’s setrlimit sandbox tries to achieve (since threads are just lightweight processes on Linux). It also relies on syscalls like clone3 that are not part of OpenSSH’s seccomp filter allowlist.
It would be a regression to the sandbox to allow process creation just because the Fil-C runtime relies on it. Instead, I added a new API to <stdfil.h>:
void zlock_runtime_threads(void);
This forces the runtime to immediately create whatever threads it needs, and to disable shutting them down on demand. Then, I added a call to zlock_runtime_threads() in OpenSSH’s ssh_sandbox_child function before either the setrlimit or seccomp-BPF sandbox calls happen.
Because the use of zlock_runtime_threads() prevents subsequent thread creation from happening, most of the OpenSSH sandbox just works. I did not have to change how OpenSSH uses setrlimit. I did change the following about the seccomp filter:
* Failure results in SECCOMP_RET_KILL_PROCESS rather than SECCOMP_RET_KILL. This ensures that Fil-C’s background threads are also killed if a sandbox violation occurs.
* MAP_NORESERVE is added to the mmap allowlist, since the Fil-C allocator uses it. This is not a meaningful regression to the filter, since MAP_NORESERVE is not a meaningful capability for an attacker to have.
* sched_yield is allowed. This is not a dangerous syscall (it’s semantically a no-op). The Fil-C runtime uses it as part of its lock implementation.
Nothing else had to change, since the filter already allowed all of the futex syscalls that Fil-C uses for synchronization.
The OpenSSH seccomp filter is installed using two prctl calls. First, we PR_SET_NO_NEW_PRIVS:
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1) {
	debug("%s: prctl(PR_SET_NO_NEW_PRIVS): %s",
	    __func__, strerror(errno));
	nnp_failed = 1;
}
This prevents additional privileges from being acquired via execve. It’s required that unprivileged processes that install seccomp filters first set the no_new_privs bit.
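The no_new_privs half of this can be exercised from any language that can reach prctl. A quick Python sketch via ctypes (Linux-only; the constants come from <linux/prctl.h>):

```python
import ctypes

libc = ctypes.CDLL(None, use_errno=True)
PR_SET_NO_NEW_PRIVS = 38   # from <linux/prctl.h>
PR_GET_NO_NEW_PRIVS = 39

# Setting the bit is permitted for unprivileged processes, and it is one-way:
# no execve from this process can ever regain privileges afterwards.
assert libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == 0
assert libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1
print("no_new_privs set")
```

Note that, exactly as the document goes on to discuss, this call only affects the calling thread; a runtime with background threads has to arrange for every thread to make it.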
if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &preauth_program) == -1)
	debug("%s: prctl(PR_SET_SECCOMP): %s",
	    __func__, strerror(errno));
else if (nnp_failed)
	fatal("%s: SECCOMP_MODE_FILTER activated but "
	    "PR_SET_NO_NEW_PRIVS failed", __func__);
This installs the seccomp filter in preauth_program. Note that this will fail in the kernel if the no_new_privs bit is not set, so the fact that OpenSSH reports a fatal error if the filter is installed without no_new_privs is just healthy paranoia on the part of the OpenSSH authors.
The trouble with both syscalls is that they affect the calling thread, not all threads in the process. Without special care, Fil-C runtime’s background threads would not have the no_new_privs bit set and would not have the filter installed. This would mean that if an attacker busted through Fil-C’s memory safety protections (in the unlikely event that they found a bug in Fil-C itself!), then they could use those other threads to execute syscalls that bypass the filter!
To prevent even this unlikely escape, the Fil-C runtime’s wrapper for prctl implements PR_SET_NO_NEW_PRIVS and PR_SET_SECCOMP by handshaking all runtime threads using this internal API:
/* Calls the callback from every runtime thread. */
PAS_API void filc_runtime_threads_handshake(void (*callback)(void* arg), void* arg);
The callback performs the requested prctl from each runtime thread. This ensures that the no_new_privs bit and the filter are installed on all threads in the Fil-C process.
Additionally, because of ambiguity about what to do if the process has multiple user threads, these two prctl commands will trigger a Fil-C safety error if the program has multiple user threads.
The best kind of protection if you’re serious about security is to combine memory safety with sandboxing. This document shows how to achieve this using Fil-C and the sandbox technologies available on Linux, all without regressing the level of protection that those sandboxes enforce or the memory safety guarantees of Fil-C.
...
Read the original on fil-c.org »
Loved reading through Greg Technology’s Anthony Bourdain’s Lost Li.st’s. Seeing the list of lost Anthony Bourdain li.st’s made me think about whether at least some of them could be recovered.
Having worked in the security and crawling space for the majority of my career—though I don’t have the access or permission to use proprietary storage—I thought we might be able to find something in publicly available crawl archives.
All of the code and examples link to the source git repository. This article has also been discussed on Hacker News. Also, a week before I published this, mirandom had the same idea and published their findings—go check them out.
If the Internet Archive had the partial list that Greg published, what about Common Crawl? Reading through their documentation, it seems straightforward enough to get a prefix index for Tony’s lists and grep for any sub-paths.
Putting something together with the help of Claude to prove my theory, we have commoncrawl_search.py, which makes a single index request to a specific dataset and, if any hits are discovered, retrieves them from the public S3 bucket. Since they are small, straight-up HTML documents, this seemed even more feasible than I had initially thought.
Simply have a Python version around 3.14.2 and install the dependencies from requirements.txt. Run the command below and we are in business. Below you’ll find the command I ran, followed by some manual archeological effort to prettify the findings.
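The shape of that index request can be sketched with stdlib-only Python (the dataset label and URL prefix are illustrative; the real script lives in the linked repo):

```python
from urllib.parse import urlencode

CDX_API = "https://index.commoncrawl.org/{dataset}-index"

def build_index_query(dataset, url_prefix):
    """Build a Common Crawl CDX index query matching everything under a prefix."""
    params = urlencode({
        "url": url_prefix + "/*",   # prefix match picks up all sub-paths
        "output": "json",           # one JSON record per hit
    })
    return CDX_API.format(dataset=dataset) + "?" + params

# Each JSON hit carries filename/offset/length fields, which translate into an
# S3 Range request against the public commoncrawl bucket for the raw WARC record.
print(build_index_query("CC-MAIN-2017-13", "li.st/bourdain"))
```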
Images have been lost. Other avenues had struck no luck. I’ll try again later.
Any and all emphasis, missing punctuation, and cool grammar is all by Anthony Bourdain. The only modifications I have made are to the layout, to represent li.st as closely as possible with no changes to the content.
If you see these blocks, that’s me commenting if pictures have been lost.
From Greg’s page, let’s go and try each entry one by one. I’ll put up the table of what I wasn’t able to find in Common Crawl but would assume exists elsewhere; I’d be happy to take another look. And no, none of the above has been written by AI, only the code, since I don’t really care about warcio encoding or writing the same Python requests method for the Nth time. Enjoy!
Things I No Longer Have Time or Patience For
Dinners where it takes the waiter longer to describe my food than it takes me to eat it.
I admit it: my life doesn’t suck. Some recent views I’ve enjoyed
Montana at sunset : There’s pheasant cooking behind the camera somewhere. To the best of my recollection some very nice bourbon. And it IS a big sky .
Puerto Rico: Thank you Jose Andres for inviting me to this beautiful beach!
Naxos: drinking ouzo and looking at this. Not a bad day at the office .
Istanbul: raki and grilled lamb and this ..
Borneo: The air is thick with hints of durian, sambal, coconut..
Chicago: up early to go train #Redzovic
If I Were Trapped on a Desert Island With Only Three Tv Series
Edge of Darkness (with Bob Peck and Joe Don Baker )
The Film Nobody Ever Made
Dreamcasting across time with the living and the dead, this untitled, yet to be written masterwork of cinema, shot, no doubt, by Christopher Doyle, lives only in my imagination.
If you bought these vinyls from an emaciated looking dude with an eager, somewhat distracted expression on his face somewhere on upper Broadway sometime in the mid 80’s, that was me . I’d like them back. In a sentimental mood.
material things I feel a strange, possibly unnatural attraction to and will buy (if I can) if I stumble across them in my travels. I am not a paid spokesperson for any of this stuff .
Vintage Persol sunglasses : This is pretty obvious. I wear them a lot. I collect them when I can. Even my production team have taken to wearing them.
19th century trepanning instruments: I don’t know what explains my fascination with these devices, designed to drill drain-sized holes into the skull often for purposes of relieving “pressure” or “bad humours”. But I can’t get enough of them. Tip: don’t get a prolonged headache around me and ask if I have anything for it. I do.
Montagnard bracelets: I only have one of these but the few that find their way onto the market have so much history. Often given to the indigenous mountain people ’s Special Forces advisors during the very early days of America’s involvement in Vietnam .
Jiu Jitsi Gi’s: Yeah. When it comes to high end BJJ wear, I am a total whore. You know those people who collect limited edition Nikes ? I’m like that but with Shoyoroll . In my defense, I don’t keep them in plastic bags in a display case. I wear that shit.
Voiture: You know those old school, silver plated (or solid silver) blimp like carts they roll out into the dining room to carve and serve your roast? No. Probably not. So few places do that anymore. House of Prime Rib does it. Danny Bowein does it at Mission Chinese. I don’t have one of these. And I likely never will. But I can dream.
Kramer knives: I don’t own one. I can’t afford one . And I’d likely have to wait for years even if I could afford one. There’s a long waiting list for these individually hand crafted beauties. But I want one. Badly. http://www.kramerknives.com/gallery/
R. CRUMB : All of it. The collected works. These Taschen volumes to start. I wanted to draw brilliant, beautiful, filthy comix like Crumb until I was 13 or 14 and it became clear that I just didn’t have that kind of talent. As a responsible father of an 8 year old girl, I just can’t have this stuff in the house. Too dark, hateful, twisted. Sigh…
THE MAGNIFICENT AMBERSONS : THE UNCUT, ORIGINAL ORSON WELLES VERSION: It doesn’t exist. Which is why I want it. The Holy Grail for film nerds, Welles’ follow up to CITIZEN KANE shoulda, coulda been an even greater masterpiece . But the studio butchered it and re-shot a bullshit ending. I want the original. I also want a magical pony.
Four Spy Novels by Real Spies and One Not by a Spy
I like good spy novels. I prefer them to be realistic. I prefer them to be written by real spies. If the main character carries a gun, I’m already losing interest. Spy novels should be about betrayal.
Ashenden–Somerset Maugham
Maugham wrote this bleak, darkly funny, deeply cynical novel in the early part of the 20th century. It was apparently close enough to the reality of his espionage career that MI6 insisted on major excisions. Remarkably ahead of its time in its atmosphere of futility and betrayal.
The Man Who Lost the War–WT Tyler
WT Tyler is a pseudonym for a former “foreign service” officer who could really, really write. This one takes place in post-war Berlin and elsewhere and was, in my opinion, wildly underappreciated. See also his Ants of God.
The Human Factor–Graham Greene
Was Greene thinking of his old colleague Kim Philby when he wrote this? Maybe. Probably. See also Our Man In Havana.
The Tears of Autumn–Charles McCarry
A clever take on the JFK assassination with a Vietnamese angle. See also The Miernik Dossier and The Last Supper.
Agents of Innocence–David Ignatius
Ignatius is a journalist, not a spook, but this one, set in Beirut, hewed all too closely to events still not officially acknowledged. Great stuff.
I wake up in a lot of hotels, so I am fiercely loyal to the ones I love. A hotel where I know immediately where I am when I open my eyes in the morning is a rare joy. Here are some of my favorites:
CHATEAU MARMONT (LA): If I have to die in a hotel room, let it be here. I will work in LA just to stay at the Chateau.
CHILTERN FIREHOUSE (London): Same owner as the Chateau. An amazing Victorian firehouse turned hotel. Pretty much perfection.
EDGEWATER INN (Seattle): Kind of a lumber theme going on… ships slide right by your window. And the Led Zep “Mudshark incident”.
THE METROPOLE (Hanoi): There’s a theme developing: if Graham Greene stayed at a hotel, chances are I will too.
THE MURRAY (Livingston, Montana): You want the Peckinpah suite.
Pictures in each have not been recovered.
5 Photos on My Phone, Chosen at Random
A shame, indeed: no pictures, though there was one for each.
People I’d Like to Be for a Day
I’m Hungry and Would Be Very Happy to Eat Any of This Right Now
Spaghetti alla bottarga. I would really, really like some of this. Al dente, lots of chili flakes.
A street fair sausage and pepper hero would be nice. Though shitting like a mink is an inevitable and near-immediate outcome.
Some uni. Fuck it. I’ll smear it on an English muffin at this point.
I wonder if that cheese is still good?
In which my Greek idyll is suddenly invaded by professional nudists
T-shirt and no pants. Leading one to the obvious question: why bother?
The cheesy crust on the side of the bowl of Onion Soup Gratinee
Before he died, Warren Zevon dropped this wisdom bomb: “Enjoy every sandwich”. These are a few locals I’ve particularly enjoyed:
PASTRAMI QUEEN: (1125 Lexington Ave.) Pastrami sandwich. Also the turkey with Russian dressing is not bad. Also the brisket.
EISENBERG’S SANDWICH SHOP: (174 5th Ave.) Tuna salad on white with lettuce. I’d suggest drinking a lime rickey or an Arnold Palmer with that.
THE JOHN DORY OYSTER BAR: (1196 Broadway) The Carta di Musica with Bottarga and Chili is amazing. Is it a sandwich? Yes. Yes it is.
RANDOM STREET FAIRS: (Anywhere tube socks and stale spices are sold.) New York street fairs suck. The same dreary vendors, same bad food. But those nasty sausage and pepper hero sandwiches are a siren song, luring me, always, towards the rocks. Shitting like a mink almost immediately after is guaranteed but who cares?
BARNEY GREENGRASS: (541 Amsterdam Ave.) Chopped liver on rye. The best chopped liver in NYC.
SIBERIA, in any of its iterations, the one in the subway being the best.
LADY ANNE’S FULL MOON SALOON: a bar so nasty I’d bring out-of-town visitors there just to scare them.
KELLY’S on 43rd and Lex. Notable for 25-cent drafts and for regularly and reliably serving me when I was 15.
BILLY’S TOPLESS (later, Billy’s Stopless): an atmospheric, working class place, perfect for late afternoon drinking, where nobody hustled you for money and everybody knew everybody. Great all-hair-metal jukebox. Naked breasts were not really the point.
THE BAR AT HAWAII KAI: tucked away in a giant tiki-themed nightclub in Times Square with a midget doorman and a floor show. Best place to drop acid EVER.
THE NURSERY: an after-hours bar decorated like a pediatrician’s office. Only the nursery rhyme characters were punk rockers of the day.
It was surprising to see that only one page was not recoverable from the Common Crawl.
I’ve enjoyed this little archeology project tremendously. Can we declare victory for at least this endeavor? Hopefully we will be able to find the images as well, but that’s a little tougher, since that era’s CloudFront content is fully gone.
What else can we work on restoring? Perhaps we can set up some sort of public archive to store these pages. I made this a Git repository for exactly that purpose, so that anyone with an interest in and passion for these kinds of projects can contribute.
Thank you and until next time! ◼︎
...
Read the original on sandyuraz.com »
YouTube’s CEO Neal Mohan is the latest in a line of tech bosses who have admitted to limiting their children’s social media use, as the harms of being online for young people have become more evident.
Mohan, who took the helm of YouTube in 2023, was just named Time’s 2025 CEO of the Year. He said in an interview with the magazine that his children’s use of media platforms is controlled and restricted.
“We do limit their time on YouTube and other platforms and other forms of media. On weekdays we tend to be more strict, on weekends we tend to be less so. We’re not perfect by any stretch,” Mohan said in one TikTok video posted by Time Magazine on Thursday.
He stressed “everything in moderation” is what works best for him and his wife, and that extends to other online services and platforms. Mohan has three children: two sons and one daughter.
Experts have continued to sound the alarm on how excessive smartphone and social media use has harmed children and teenagers. Jonathan Haidt, NYU professor and author of “The Anxious Generation,” has advocated for children to not have smartphones before the age of 14 and no access to social media before the age of 16.
“Let them have a flip phone, but remember, a smartphone isn’t really a phone. They could make phone calls on it, but it’s a multi-purpose device by which the world can get to your children,” Haidt said in an interview with CNBC’s Tania Bryer earlier this year.
This week, Australia became the first country to formally bar users under the age of 16 from accessing major social media platforms. Ahead of the legislation’s passage last year, a YouGov survey found that 77% of Australians backed the under-16 social media ban. Still, the rollout has faced some resistance since becoming law.
Mohan said in a more extensive interview with Time on Wednesday that he feels a “paramount responsibility” to young people and to giving parents greater control over how their kids use the platform. YouTube Kids was launched in 2015 as a child-friendly version of the Google-owned platform.
He said his goal is “to make it easy for all parents” to manage their children’s YouTube use “in a way that is suitable to their household,” especially as every parent has a different approach.
...
Read the original on www.cnbc.com »
Part of the Accepted! series, explaining the upcoming Go changes in simple terms.

The new runtime/secret package lets you run a function in secret mode. After the function finishes, it immediately erases (zeroes out) the registers and stack it used. Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable.

secret.Do(func() {
    // Generate a session key and
    // use it to encrypt the data.
})

This helps make sure sensitive information doesn’t stay in memory longer than needed, lowering the risk of attackers getting to it.

The package is experimental and is mainly for developers of cryptographic libraries, not for application developers.

Cryptographic protocols like WireGuard or TLS have a property called “forward secrecy”. This means that even if an attacker gains access to long-term secrets (like a private key in TLS), they shouldn’t be able to decrypt past communication sessions. To make this work, session keys (used to encrypt and decrypt data during a specific communication session) need to be erased from memory after they’re used. If there’s no reliable way to clear this memory, the keys could stay there indefinitely, which would break forward secrecy.

In Go, the runtime manages memory, and it doesn’t guarantee when or how memory is cleared. Sensitive data might remain in heap allocations or stack frames, potentially exposed in core dumps or through memory attacks. Developers often have to use unreliable “hacks” with reflection to try to zero out internal buffers in cryptographic libraries. Even so, some data might still stay in memory where the developer can’t reach or control it.

The solution is to provide a runtime mechanism that automatically erases all temporary storage used during sensitive operations. This will make it easier for library developers to write secure code without using workarounds.

Add the runtime/secret package with Do and Enabled functions:

// Do invokes f.
// Do ensures that any temporary storage used by f is erased in a
// timely manner. (In this context, “f” is shorthand for the
// entire call tree initiated by f.)
// - Any registers used by f are erased before Do returns.
// - Any stack used by f is erased before Do returns.
// - Any heap allocation done by f is erased as soon as the garbage
// collector realizes that it is no longer reachable.
// - Do works even if f panics or calls runtime.Goexit. As part of
// that, any panic raised by f will appear as if it originates from
// Do itself.
func Do(f func())
// Enabled reports whether Do appears anywhere on the call stack.
func Enabled() bool
The current implementation has several limitations:

- Only supported on linux/amd64 and linux/arm64. On unsupported platforms, Do invokes f directly.
- Protection does not cover any global variables that f writes to.
- Trying to start a goroutine within f causes a panic.
- If f calls runtime.Goexit, erasure is delayed until all deferred functions are executed.
- Heap allocations are only erased if ➊ the program drops all references to them, and ➋ then the garbage collector notices that those references are gone. The program controls the first part, but the second part depends on when the runtime decides to act.
- If f panics, the panicked value might reference memory allocated inside f. That memory won’t be erased until (at least) the panicked value is no longer reachable.
- Pointer addresses might leak into data buffers that the runtime uses for garbage collection. Do not put confidential information into pointers.

The last point might not be immediately obvious, so here’s an example. If an offset in an array is itself secret (you have a data array and the secret key always starts at data[100]), don’t create a pointer to that location (a pointer p to &data[100]). Otherwise, the garbage collector might store this pointer, since it needs to know about all active pointers to do its job. If someone launches an attack to access the GC’s memory, your secret offset could be exposed.

The package is mainly for developers who work on cryptographic libraries. Most apps should use higher-level libraries that use secret.Do behind the scenes.

As of Go 1.26, the runtime/secret package is experimental and can be enabled by setting GOEXPERIMENT=runtimesecret at build time.

Use secret.Do to generate a session key and encrypt a message using AES-GCM:

// Encrypt generates an ephemeral key and encrypts the message.
// It wraps the entire sensitive operation in secret.Do to ensure
// the key and internal AES state are erased from memory.
func Encrypt(message []byte) ([]byte, error) {
	var ciphertext []byte
	var encErr error
	secret.Do(func() {
		// 1. Generate an ephemeral 32-byte key.
		// This allocation is protected by secret.Do.
		key := make([]byte, 32)
		if _, err := io.ReadFull(rand.Reader, key); err != nil {
			encErr = err
			return
		}
		// 2. Create the cipher (expands key into round keys).
		// This structure is also protected.
		block, err := aes.NewCipher(key)
		if err != nil {
			encErr = err
			return
		}
		gcm, err := cipher.NewGCM(block)
		if err != nil {
			encErr = err
			return
		}
		nonce := make([]byte, gcm.NonceSize())
		if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
			encErr = err
			return
		}
		// 3. Seal the data.
		// Only the ciphertext leaves this closure.
		ciphertext = gcm.Seal(nonce, nonce, message, nil)
	})
	return ciphertext, encErr
}
Note that secret.Do protects not just the raw key, but also the cipher.Block structure (which contains the expanded key schedule) created inside the function.

This is a simplified example, of course — it only shows how memory erasure works, not a full cryptographic exchange. In real situations, the key needs to be shared securely with the receiver (for example, through key exchange) so decryption can work.
...
Read the original on antonz.org »
Yesterday I shared a little program called Mark V. Shaney Junior at github.com/susam/mvs. It is a minimal implementation of a Markov text generator inspired by the legendary Mark V. Shaney program from the 1980s. If you don’t know about Mark V. Shaney, read more about it in the Wikipedia article Mark V. Shaney.
It is a very small program with only about 30 lines of Python that favours simplicity over efficiency. As a hobby, I often engage in exploratory programming where I write computer programs not to solve a specific problem but simply to explore a particular idea or topic for the sole purpose of recreation. I must have written small programs to explore Markov chains for various kinds of state spaces over a dozen times by now. Every time, I just pick my last experimental code and edit it to encode the new state space I am exploring. That’s usually my general approach to such one-off programs. I have hundreds of tiny little experimental programs lying on my disk at any given time.
Once in a while, I get the itch to take one of those exploratory programs, give it some finishing touches, wrap it up in a nice Git repo along with a README.md, CHANGES.md and the whole shebang, and share it on github.com/susam and codeberg.org/susam. The Mark V. Shaney Junior program that I shared yesterday happened to be one such exercise.
If you scroll down the README of this project, you’ll find some nice examples of the gibberish produced by this program. The first few examples there are the result of training the model on A Christmas Carol by Charles Dickens, one of my favourite authors. It is often said that Dickens never used fewer words when more would suffice. So I thought there couldn’t be a better piece of text when it comes to testing out my tiny Markov model. I’ll not reproduce the generated text examples here for the sake of brevity. If you are interested to take a look, just head over to the Gibberish section of the README.
Soon after sharing the project, I wondered what kind of gibberish it would produce if I fed all 24 years of my blog posts and pages into the program. Well, here’s one of the results:
$ ./mvs < susam.txt
while a query replace operation is approved by the user. The above
variable defines the build job. It can be incredibly useful while
working on assembly language and machine code. In fact, all internal
resources like the result to refine the search prompt changes from
bck-i-search: to fwd-i-search:. Now type C-SPC (i.e. ctrl+space) to
set a mark causes Emacs to use 32-bit registers like EBP, ESP,
etc. Thus the behaviour is undefined. Such code may behave differently
when compiled with the readily available GNU tools like the shape
of 8. Flipping “P” horizontally makes it a proper quine: cat $0
This is the text that comes out after the program consumes over 200 posts consisting of about 200,000 words. My blog also has a comments section with over 500 comments consisting of about 40,000 words. All comments were excluded while training the model. Here is another output example:
$ ./mvs < susam.txt
enjoy asking “what happens if” and then type M-x zap-up-to-char RET
b. The buffer for this specific video, the actual fare for 8.3 km and
11 are all written from scratch. No prior knowledge is expected to
slow down in future. For now, I will add a statement like x =
0.999… , the inner corner square as discussed in the code segment
into the REPL window. Unlike Slimv, Vlime can work with and debug
executable files, it can be more convenient. M-x: Execute Extended
Command The key sequence M-q invokes the command cat and type TAB to
indent the
Here is a particularly incoherent but amusing one:
$ ./mvs < susam.txt
Then open a new Lisp source file and the exact answer could harm
students’ self-esteem. Scientists have arbitrarily assumed that an
integral domain. However, the string and comment text. To demonstrate
how a build job can trigger itself, pass input to standard output or
standard error), Eshell automatically runs the following command in
Vim and Emacs will copy the message length limit of 512 characters,
etc. For example, while learning to play the game between normal mode
to move the point is on an old dictionary lying around our house and
that is moving to the small and supportive community
No, I have never said anywhere that opening a Lisp source file could harm anyone’s self-esteem. The text generator has picked up the ‘Lisp source file’ phrase from my Lisp in Vim post and the ‘self-esteem’ bit from the From Perl to Pi post.
By default, this program looks at trigrams (all sequences of three adjacent words) and creates a map where the first two words of the trigram are inserted as the key and the third word is appended to its list value. This map is the model. In this way, the model captures each pair of adjacent words along with the words that immediately follow each pair. The text generator then chooses a key (a pair of words) at random and looks for a word which follows. If there are multiple followers, it picks one at random. That is pretty much the whole algorithm. There isn’t much more to it. It is as simple as it gets. For that reason, I often describe a simple Markov model like this as the ‘hello, world’ for language models.
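The whole algorithm described above fits in a few lines of Python. The sketch below is a hypothetical minimal re-implementation of the idea (not the actual mvs source): build a map from word pairs to followers, then walk it, sliding the key window forward one word at a time. The `order` parameter corresponds to the key length discussed later in the post.

```python
import random

def build_model(words, order=2):
    """Map each tuple of `order` adjacent words to the list of words
    that immediately follow that tuple somewhere in the input."""
    model = {}
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        model.setdefault(key, []).append(words[i + order])
    return model

def generate(model, count=30):
    """Pick a random key, then repeatedly append a random follower,
    sliding the key window forward by one word each step."""
    key = random.choice(list(model))
    out = list(key)
    for _ in range(count):
        followers = model.get(key)
        if not followers:
            break  # dead end: this key has no recorded follower
        word = random.choice(followers)
        out.append(word)
        key = key[1:] + (word,)
    return " ".join(out)

words = "the quick brown fox jumps over the lazy dog the quick brown cat".split()
print(generate(build_model(words, order=2)))
```

Increasing `order` makes the keys longer and the output more faithful to the training text, at the cost of variety — exactly the trade-off explored below with orders 2 through 5.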
Of course, in 2025, given the overwhelming popularity of large language models (LLMs), Markov models like this look unimpressive. Unlike LLMs, a simple Markov model cannot capture global structure or long-range dependencies within the text. It relies entirely on local word transition statistics. Also, these days, one hardly needs a Markov model to generate gibberish; social media provides an ample supply. Nevertheless, I think the simplicity of its design and implementation serves as a good entry point into language models.
In my implementation, the number of words in the key of the map can be set via command line arguments. By default, it is 2 as described above. This value is also known as the order of the model. So by default the order is 2. If we increase it to, say, 3 or 4, the generated text becomes a little more coherent. Here is one such example:
$ ./mvs 4 < susam.txt
It is also possible to search for channels by channel names. For
example, on Libera Chat, to search for all channels with ‘python’ in
its name, enter the IRC command: /msg alis list python. Although I
have used Libera Chat in the examples above, there are plenty of
infinite fields, so they must all be integral domains too. Consider
the field of rational numbers Q. Another quick way to arrive at this
fact is to observe that when one knight is placed on a type D square,
only two positions for the second knight such that the two knights
attack
Except for a couple of abrupt and meaningless transitions, the text is mostly coherent. We need to be careful about not increasing the order too much. In fact, if we increase the order of the model to 5, the generated text becomes very dry and factual because it begins to quote large portions of the blog posts verbatim. Not much fun can be had with that.
Before I end this post, let me present one final example where I ask it to generate text from an initial prompt:
$ ./mvs 2 100 'Finally we' < susam.txt
Finally we divide this number by a feed aggregrator for Emacs-related
blogs. The following complete key sequences describe the effects of
previous evaluations shall have taken a simple and small to contain
bad content. This provides an interactive byte-compiled Lisp function
in MATLAB and GNU bash 5.1.4 on Debian is easily reproducible in
Windows XP. Older versions might be able to run that server for me it
played a significant burden on me as soon as possible. C-u F: Visit
the marked files or directories in the sense that it was already
initiated and we were to complete the proof.
Apparently, this is how I would sound if I ever took up speaking gibberish!
...
Read the original on susam.net »