10 interesting stories served every morning and every evening.
Here’s a photo of a Christmas tree, as my camera’s sensor sees it:
This is because while the camera’s analog-to-digital converter (ADC) can theoretically output values from 0 to 16382, the data doesn’t cover that whole range:
The real range of ADC values is ~2110 to ~13600. Let’s set those values as the black and white points of the image:
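As a rough sketch of that levels adjustment (the file name and the exact black and white levels here are placeholders, not the article’s actual values):

```python
import numpy as np

# Hypothetical raw ADC counts loaded as a 2D array, one value per photosite.
raw = np.load("raw_adc_counts.npy").astype(np.float32)

BLACK = 2110.0   # approximate measured black level
WHITE = 13600.0  # approximate measured white level

# Map the real ADC range onto 0..1 and clip anything outside it.
linear = np.clip((raw - BLACK) / (WHITE - BLACK), 0.0, 1.0)
```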
Much better, but it’s still more monochromatic than I remember the tree being. Camera sensors aren’t actually able to see color: they only measure how much light hit each pixel.
In a color camera, the sensor is covered by a grid of alternating color filters:
Let’s color each pixel the same as the filter it’s looking through:
This version is more colorful, but each pixel only has one third of its RGB color.
To fix this, I just averaged each pixel’s values with those of its neighbors:
Applying this process to the whole photo gives the lights some color:
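A very naive version of that neighbor-averaging demosaic might look something like the sketch below. It assumes an RGGB Bayer layout (the article doesn’t spell out the filter order) and uses scipy purely for the windowed averaging:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def naive_demosaic(linear):
    """Fill in the two missing color values at each photosite by averaging
    whatever samples of that color exist in a 3x3 neighborhood."""
    h, w = linear.shape
    rgb = np.zeros((h, w, 3), dtype=np.float32)
    masks = np.zeros((h, w, 3), dtype=np.float32)

    # Assumed RGGB pattern: R at (0,0), G at (0,1) and (1,0), B at (1,1).
    masks[0::2, 0::2, 0] = 1  # red
    masks[0::2, 1::2, 1] = 1  # green
    masks[1::2, 0::2, 1] = 1  # green
    masks[1::2, 1::2, 2] = 1  # blue

    for c in range(3):
        samples = linear * masks[:, :, c]
        # Mean of the masked samples divided by the fraction of pixels present
        # in the window equals the average of just the available samples.
        mean_masked = uniform_filter(samples, size=3)
        coverage = uniform_filter(masks[:, :, c], size=3)
        rgb[:, :, c] = mean_masked / np.maximum(coverage, 1e-6)
    return rgb

rgb_linear = naive_demosaic(linear)  # 'linear' is the leveled data from above
```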
However, the image is still very dark. This is because monitors don’t have as much dynamic range as the human eye, or a camera sensor: Even if you are using an OLED, the screen still has some ambient light reflecting off of it and limiting how black it can get.
There’s also another, sneakier factor causing this:
Our perception of brightness is non-linear.
If brightness values are quantized linearly, most of the ADC’s bins are wasted on nearly identical shades of white while every other tone is crammed into the bottom. Because this is an inefficient use of memory, most color spaces assign extra bins to darker colors:
As a result, if the linear data is displayed directly, it appears much darker than it should.
Both problems can be solved by applying a non-linear curve to each color channel to brighten up the dark areas… but this doesn’t quite work out:
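A common stand-in for that kind of curve is a simple per-channel gamma adjustment. This is only an illustrative sketch, not the transfer function the author actually used:

```python
import numpy as np

def apply_curve(rgb, gamma=2.2):
    # Raising normalized linear values to 1/gamma lifts the shadows far more
    # than the highlights, which is the brightening described above.
    return np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)

bright = apply_curve(rgb_linear)  # applied independently to R, G and B
```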
Some of this green cast is caused by the camera sensor being intrinsically more sensitive to green light, but some of it is my fault: There are twice as many green pixels in the filter matrix. When combined with my rather naive demosaicing, this resulted in the green channel being boosted even higher.
In either case, it can be fixed with proper white balance: equalize the channels by multiplying each one by a constant.
However, because the image is now non-linear, I have to go back a step to do this. Here’s the dark image from before with all the values temporarily scaled up so I can see the problem:
… here’s that image with the green taken down to match the other channels:
… and after re-applying the curve:
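In code, the order of operations looks roughly like this. The per-channel gains are made-up numbers (with green pulled down the hardest), chosen only to illustrate the idea rather than taken from the article:

```python
import numpy as np

# rgb_linear: the demosaiced image, still in linear space (placeholder name).
wb_gains = np.array([1.0, 0.55, 0.9], dtype=np.float32)  # hypothetical R, G, B gains

balanced = np.clip(rgb_linear * wb_gains, 0.0, 1.0)  # white balance on linear data
final = balanced ** (1.0 / 2.2)                      # re-apply the curve last
```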
This is really just the bare minimum: I haven’t done any color calibration, the white balance isn’t perfect, the black points are too high, there’s lots of noise that needs to be cleaned up…
Additionally, applying the curve to each color channel accidentally desaturated the highlights. This effect looks rather good — and is what we’ve come to expect from film — but it has de-yellowed the star. It’s possible to separate the luminance and curve it while preserving color. On its own, this would make the LED Christmas lights into an oversaturated mess, but combining both methods can produce nice results.
For comparison, here’s the image my camera produced from the same data:
This is far from an “unedited” photo: a huge amount of math has gone into making an image that nicely represents what the subject looks like in person.
There’s nothing that happens when you adjust the contrast or white balance in editing software that the camera hasn’t done under the hood. The edited image isn’t “faker” than the original: they are different renditions of the same data.
In the end, replicating human perception is hard, and it’s made harder when constrained to the limitations of display technology or printed images. There’s nothing wrong with tweaking the image when the automated algorithms make the wrong call.
...
Read the original on maurycyz.com »
I’m all-in, baby. I’m committed. If upgrading any distinct component of my PC didn’t require me taking out a loan right now, I’d be seriously considering switching my GPU over to some kind of AMD thing just to make my life slightly, slightly easier.
I’ve had it with Windows and ascended to the sunlit uplands of Linux, where the trees heave with open-source fruits and men with large beards grep things with their minds.
I’m not alone. In last month’s Steam hardware survey, the number of Linux users hit a new all-time high for the second month running, reaching the heady summit of a whopping, ah, 3.2% of overall Steam users. Hey, we’re beating Mac players.
I think that number will only grow as the new year goes by. More and more of us are getting sick of Windows, sure—the AI guff, the constant upselling on Office subs, the middle taskbar*—but also, all my experience goofing about with Linux this year has dispelled a lot of the, frankly, erroneous ideas I had about it. It’s really not hard! Really! I know Linux guys have been saying this for three decades, but it’s true now!
As I’ve already written about, the bulk of my Linux-futzing time this year has been spent in Bazzite, a distro tailor-made for gaming and also tailor-made to stop idiots (me) from doing something likely to detonate their boot drive.
I grew up thinking of Linux as ‘the command-line OS that lets you delete your bootloader’ and, well, I suppose that’s not untrue, but I’ve been consistently impressed at how simple Bazzite has been to run on my PC, even with my persnickety Nvidia GPU.
Everything I’ve played this year has been as easy—if not easier—to run on a free OS put together by a gaggle of passionate nerds as it is on Windows, the OS made by one of the most valuable corporations on planet Earth. I’ve never had to dip into the command line (which is, to be frank, a shame, as the command line is objectively cool).
But to be honest, it’s not as if the Bazzite team has miraculously made Linux pleasant to use after decades of it seeming difficult and esoteric to normie computer users. I think mainstream Linux distros are just, well, sort of good now. Apart from my gaming PC, I also have an old laptop converted into a media server that lives underneath my television. It runs Debian 13 (which I updated to from Debian 12 earlier in the year) and requires essentially zero input from me at all.
What’s more, the only software I have on there is software I actually want on there. Oh for a version of Windows that let me do something as zany as, I don’t know, uninstall Edge.
That’s the true nub of it, I think. The stats can say what they like (and they do! We’ve all heard tales of Windows games actually running better on Linux via Valve’s Proton compatibility layer), but the heart of my fatigue with Windows is that, for every new worthless AI gadget Microsoft crams into it and for every time the OS inexplicably boots to a white screen and implores me to “finish setting up” my PC with an Office 365 subscription, the real problem is a feeling that my computer isn’t mine, that I am somehow renting this thing I put together with my own two hands from an AI corporation in Redmond.
That’s fine for consoles. Indeed, part of the whole pitch of an Xbox or PlayStation is the notion that you are handing off a lot of responsibility for your device to Sony and Microsoft’s teams of techs, but my PC? That I built? Get your grubby mitts off it.
Are there issues? Sure. HDR’s still a crapshoot (plus ça change) and, as you’ve no doubt heard, a lot of live-service games have anticheat software that won’t play with Linux. But I think both of these issues are gradually ticking toward their solutions, particularly with Valve making its own push into the living room.
So I say make 2026 the year you give Linux a try, if you haven’t already. At the very least, you can stick it on a separate boot drive and have a noodle about with it. I suspect you’ll find the open (source) water is a lot more hospitable than you might think.
*I’m actually fine with the middle taskbar. I’m sorry.
...
Read the original on www.pcgamer.com »
If you live in Germany, you have been treated like livestock by Deutsche Bahn (DB). Almost all of my friends have a story: they traveled with DB, got thrown out in the middle of the night in some cow village, and had to wait hours for the next train.
I have something better. I was kidnapped.
I am taking the RE5 (ID 28521) to my grandmother’s house in Meckenheim. Scheduled departure: 15:32. Scheduled arrival in Bonn: 15:54. From there, the S23 to Meckenheim. A journey of 35 kilometers, or, in DB units, somewhere between forty-five minutes and the heat death of the universe.
I wanted to arrive early to spend more time with her. My father, who lives near Troisdorf, was supposed to join us later.
I board the train. It is twenty minutes late. I consider this early. At least the train showed up. In DB’s official statistics, a train counts as “on time” if it’s less than six minutes late. Cancelled trains are not counted at all. If a train doesn’t exist, it cannot be late.
The train starts moving. The driver announces there are “issues around Bonn.” He does not specify what kind. No one asks. We have learned not to ask. He suggests we exit at Cologne South and take the subway, or continue to Troisdorf and catch a bus from there.
I decide to continue to Troisdorf. My father can just pick me up there and we drive together. The plan adapts.
The driver announces the full detour: from Cologne South to Troisdorf to Neuwied to Koblenz. The entire left bank of the Rhine is unavailable. Only then do I notice: the driver has been speaking German only. If you were a tourist who got on in Cologne to visit Brühl, thirteen minutes away, you were about to have a very confusing Christmas in Troisdorf.
A woman near me is holding chocolates and flowers. She is on the phone with her mother. “Sorry Mama, I’ll be late.” Pause. “Deutsche Bahn.” Pause. Her mother understood.
Twenty minutes later. We are approaching Troisdorf. I stand up. I gather my things. My father texts me: he is at the station, waiting.
The driver comes back on: “Hello everyone. Apparently we were not registered at Troisdorf station, so we are on the wrong tracks. We cannot stop.”
He says this the way someone might say “the coffee machine is broken.”
I watch Troisdorf slide past the window. Somewhere in the parking lot outside the station, my father is sitting in his car, watching his son pass by as livestock.
I was trying to travel 35 kilometers. I was now 63 kilometers from my grandmother’s house. Further away than when I started.
There are fifteen stations between Troisdorf and Neuwied. We pass all of them.
At some point you stop being a passenger and start being cargo. A cow transporter. Mooohhhhh. A cow transporter going to a cow village. (Germany has a word for this: Kuhdorf. The cows are metaphorical. Usually.) I reached this point around Oberkassel.
DB once operated a bus to Llucalcari, a Mallorcan village of seventeen people. I wanted to take it home.
An English speaker near the doors is getting agitated. “What is happening? Why didn’t we stop?”
“We are not registered for this track.”
“But where will we stop?”
“Fifty-five minutes.” He said it again, quieter. “I am being kidnapped.”
My seatmate, who had not looked up from his book in forty minutes, turned a page. “Deutsche Bahn.”
I had been kidnapped at a loss.
...
Read the original on www.theocharis.dev »
It’s anecdotal, I know, but my main entertainment business revenue is down 50% over the past 3 months. Our main paid source of leads has been Google Ads, which has served us well over the past 10 years or so — I think I know what I’m doing in AdWords by now.
Once per month I check the analytics, updating keywords and tweaking ad campaigns. Over the past year we increased our budget, and then I started looking at it once per week, running simultaneous campaigns with different settings, just trying to get SOMETHING.
Last month Google gave us a bonus — free money! This was 5x our monthly ad spend, to spend just when we needed it most — over the December holidays. I added another new campaign, updated the budgets for the existing ones. Still no change. The last week there was money to burn, left over from unused ad spend. I increased our budget to 10x. ZERO RETURN.
The money ran out. I am not putting more in. Where do we go from here?
Research shows that many young people are getting their information from short video platforms like TikTok and Instagram. We are trying ads on there.
Our customer base is 50% returning customers (I’m proud of that statistic!). We have an email newsletter, and we started sending it regularly over the past 2 months. Remember us?
We also plan to do some actual physical advertising — I am going to a market next weekend, doing a free show or two, handing out cards.
Also, we are branching out — I have some projects I want to make, related to the Magic Poi project, and hopefully sell. We ordered supplies last week.
Right now, though — I’m broke. Anyone need a website or IoT project built? I’m AI-assisted and very fast!
...
Read the original on www.circusscientist.com »
This is the third in my annual series reviewing everything that happened in the LLM space over the past 12 months. For previous years see Stuff we figured out about AI in 2023 and Things we learned about LLMs in 2024.
It’s been a year filled with a lot of different trends.
OpenAI kicked off the “reasoning” aka inference-scaling aka Reinforcement Learning from Verifiable Rewards (RLVR) revolution in September 2024 with o1 and o1-mini. They doubled down on that with o3, o3-mini and o4-mini in the opening months of 2025 and reasoning has since become a signature feature of models from nearly every other major AI lab.
My favourite explanation of the significance of this trick comes from Andrej Karpathy:
By training LLMs against automatically verifiable rewards across a number of environments (e.g. think math/code puzzles), the LLMs spontaneously develop strategies that look like “reasoning” to humans—they learn to break down problem solving into intermediate calculations and they learn a number of problem solving strategies for going back and forth to figure things out (see DeepSeek R1 paper for examples). […]
Running RLVR turned out to offer high capability/$, which gobbled up the compute that was originally intended for pretraining. Therefore, most of the capability progress of 2025 was defined by the LLM labs chewing through the overhang of this new stage and overall we saw ~similar sized LLMs but a lot longer RL runs.
Every notable AI lab released at least one reasoning model in 2025. Some labs released hybrids that could be run in reasoning or non-reasoning modes. Many API models now include dials for increasing or decreasing the amount of reasoning applied to a given prompt.
It took me a while to understand what reasoning was useful for. Initial demos showed it solving mathematical logic puzzles and counting the Rs in strawberry—two things I didn’t find myself needing in my day-to-day model usage.
It turned out that the real unlock of reasoning was in driving tools. Reasoning models with access to tools can plan out multi-step tasks, execute on them and continue to reason about the results such that they can update their plans to better achieve the desired goal.
A notable result is that AI assisted search actually works now. Hooking up search engines to LLMs had questionable results before, but now I find even my more complex research questions can often be answered by GPT-5 Thinking in ChatGPT.
Reasoning models are also exceptional at producing and debugging code. The reasoning trick means they can start with an error and step through many different layers of the codebase to find the root cause. I’ve found even the gnarliest of bugs can be diagnosed by a good reasoner with the ability to read and execute code against even large and complex codebases.
Combine reasoning with tool-use and you get…
I started the year making a prediction that agents were not going to happen. Throughout 2024 everyone was talking about agents but there were few to no examples of them working, further confused by the fact that everyone using the term “agent” appeared to be working from a slightly different definition from everyone else.
By September I’d got fed up of avoiding the term myself due to the lack of a clear definition and decided to treat them as an LLM that runs tools in a loop to achieve a goal. This unblocked me for having productive conversations about them, always my goal for any piece of terminology like that.
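That “tools in a loop” definition is compact enough to sketch in a few lines. Everything below is hypothetical scaffolding: call_model and the placeholder tools stand in for whichever LLM API and tools you actually wire up.

```python
# Hypothetical sketch of "an LLM that runs tools in a loop to achieve a goal".

def call_model(messages):
    """Stand-in for a real LLM API call. Assumed to return either
    {"tool": name, "args": {...}} or {"answer": text}."""
    raise NotImplementedError

TOOLS = {
    "search": lambda query: f"search results for {query!r}",  # placeholder tool
    "run_code": lambda code: "stdout of running the code",    # placeholder tool
}

def run_agent(goal, max_steps=10):
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "answer" in reply:                  # the model decided it is done
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])  # execute the requested tool
        messages.append({"role": "tool", "content": str(result)})
    return "gave up after max_steps"
```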
I didn’t think agents would happen because I didn’t think the gullibility problem could be solved, and I thought the idea of replacing human staff members with LLMs was still laughable science fiction.
I was half right in my prediction: the science fiction version of a magic computer assistant that does anything you ask of it (Her) didn’t materialize…
But if you define agents as LLM systems that can perform useful work via tool calls over multiple steps then agents are here and they are proving to be extraordinarily useful.
The two breakout categories for agents have been for coding and for search.
The Deep Research pattern—where you challenge an LLM to gather information and it churns away for 15+ minutes building you a detailed report—was popular in the first half of the year but has fallen out of fashion now that GPT-5 Thinking (and Google’s “AI mode”, a significantly better product than their terrible “AI overviews”) can produce comparable results in a fraction of the time. I consider this to be an agent pattern, and one that works really well.
The “coding agents” pattern is a much bigger deal.
The most impactful event of 2025 happened in February, with the quiet release of Claude Code.
I say quiet because it didn’t even get its own blog post! Anthropic bundled the Claude Code release in as the second item in their post announcing Claude 3.7 Sonnet.
Claude Code is the most prominent example of what I call coding agents—LLM systems that can write code, execute that code, inspect the results and then iterate further.
The major labs all put out their own CLI coding agents in 2025.
Vendor-independent options include GitHub Copilot CLI, Amp, OpenCode, OpenHands CLI, and Pi. IDEs such as Zed, VS Code and Cursor invested a lot of effort in coding agent integration as well.
My first exposure to the coding agent pattern was OpenAI’s ChatGPT Code Interpreter in early 2023—a system baked into ChatGPT that allowed it to run Python code in a Kubernetes sandbox.
I was delighted this year when Anthropic finally released their equivalent in September, albeit under the baffling initial name of “Create and edit files with Claude”.
In October they repurposed that container sandbox infrastructure to launch Claude Code for web, which I’ve been using on an almost daily basis ever since.
Claude Code for web is what I call an asynchronous coding agent—a system you can prompt and forget, and it will work away on the problem and file a Pull Request once it’s done. OpenAI “Codex cloud” (renamed to “Codex web” in the last week) launched earlier in May 2025. Gemini’s entry in this category is called Jules, also launched in May.
I love the asynchronous coding agent category. They’re a great answer to the security challenges of running arbitrary code execution on a personal laptop and it’s really fun being able to fire off multiple tasks at once—often from my phone—and get decent results a few minutes later.
I wrote more about how I’m using these in Code research projects with async coding agents like Claude Code and Codex and Embracing the parallel coding agent lifestyle.
In 2024 I spent a lot of time hacking on my LLM command-line tool for accessing LLMs from the terminal, all the time thinking that it was weird that so few people were taking CLI access to models seriously—they felt like such a natural fit for Unix mechanisms like pipes.
Maybe the terminal was just too weird and niche to ever become a mainstream tool for accessing LLMs?
Claude Code and friends have conclusively demonstrated that developers will embrace LLMs on the command line, given powerful enough models and the right harness.
It helps that terminal commands with obscure syntax like sed and ffmpeg and bash itself are no longer a barrier to entry when an LLM can spit out the right command for you.
As-of December 2nd Anthropic credit Claude Code with $1bn in run-rate revenue! I did not expect a CLI tool to reach anything close to those numbers.
With hindsight, maybe I should have promoted LLM from a side-project to a key focus!
The default setting for most coding agents is to ask the user for confirmation for almost every action they take. In a world where an agent mistake could wipe your home folder or a malicious prompt injection attack could steal your credentials this default makes total sense.
Anyone who’s tried running their agent with automatic confirmation (aka YOLO mode—Codex CLI even aliases --dangerously-bypass-approvals-and-sandbox to --yolo) has experienced the trade-off: using an agent without the safety wheels feels like a completely different product.
A big benefit of asynchronous coding agents like Claude Code for web and Codex Cloud is that they can run in YOLO mode by default, since there’s no personal computer to damage.
I run in YOLO mode all the time, despite being deeply aware of the risks involved. It hasn’t burned me yet…
One of my favourite pieces on LLM security this year is The Normalization of Deviance in AI by security researcher Johann Rehberger.
Johann describes the “Normalization of Deviance” phenomenon, where repeated exposure to risky behaviour without negative consequences leads people and organizations to accept that risky behaviour as normal.
This was originally described by sociologist Diane Vaughan as part of her work to understand the 1986 Space Shuttle Challenger disaster, caused by a faulty O-ring that engineers had known about for years. Plenty of successful launches led NASA culture to stop taking that risk seriously.
Johann argues that the longer we get away with running these systems in fundamentally insecure ways, the closer we are getting to a Challenger disaster of our own.
ChatGPT Plus’s original $20/month price turned out to be a snap decision by Nick Turley based on a Google Form poll on Discord. That price point has stuck firmly ever since.
This year a new pricing precedent has emerged: the Claude Pro Max 20x plan, at $200/month.
OpenAI have a similar $200 plan called ChatGPT Pro. Gemini have Google AI Ultra at $249/month with a $124.99/month 3-month starting discount.
These plans appear to be driving some serious revenue, though none of the labs have shared figures that break down their subscribers by tier.
I’ve personally paid $100/month for Claude in the past and will upgrade to the $200/month plan once my current batch of free allowance (from previewing one of their models—thanks, Anthropic) runs out. I’ve heard from plenty of other people who are happy to pay these prices too.
You have to use models a lot in order to spend $200 of API credits, so you would think it would make economic sense for most people to pay by the token instead. It turns out tools like Claude Code and Codex CLI can burn through enormous amounts of tokens once you start setting them more challenging tasks, to the point that $200/month offers a substantial discount.
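A back-of-envelope calculation shows how that happens. The prices and token counts below are illustrative assumptions, not any lab’s actual rates:

```python
# Illustrative only: assume $3 per million input tokens, $15 per million output.
input_price = 3 / 1_000_000
output_price = 15 / 1_000_000

# A heavy day of agentic coding can re-read tens of millions of context tokens
# and emit a few million output tokens.
daily_cost = 30_000_000 * input_price + 2_000_000 * output_price
print(f"${daily_cost:.0f}/day -> ~${daily_cost * 30:.0f}/month at API pricing")
# roughly $120/day, ~$3,600/month, versus a $200/month plan
```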
2024 saw some early signs of life from the Chinese AI labs mainly in the form of Qwen 2.5 and early DeepSeek. They were neat models but didn’t feel world-beating.
This changed dramatically in 2025. My ai-in-china tag has 67 posts from 2025 alone, and I missed a bunch of key releases towards the end of the year (GLM-4.7 and MiniMax-M2.1 in particular.)
GLM-4.7, Kimi K2 Thinking, MiMo-V2-Flash, DeepSeek V3.2, MiniMax-M2.1 are all Chinese open weight models. The highest non-Chinese model in that chart is OpenAI’s gpt-oss-120B (high), which comes in sixth place.
The Chinese model revolution really kicked off on Christmas day 2024 with the release of DeepSeek 3, supposedly trained for around $5.5m. DeepSeek followed that on 20th January with DeepSeek R1 which promptly triggered a major AI/semiconductor selloff: NVIDIA lost ~$593bn in market cap as investors panicked that AI maybe wasn’t an American monopoly after all.
The panic didn’t last—NVIDIA quickly recovered and today are up significantly from their pre-DeepSeek R1 levels. It was still a remarkable moment. Who knew an open weight model release could have that kind of impact?
DeepSeek were quickly joined by an impressive roster of Chinese AI labs. I’ve been paying attention to these ones in particular:
Most of these models aren’t just open weight, they are fully open source under OSI-approved licenses: Qwen use Apache 2.0 for most of their models, DeepSeek and Z.ai use MIT.
Some of them are competitive with Claude 4 Sonnet and GPT-5!
Sadly none of the Chinese labs have released their full training data or the code they used to train their models, but they have been putting out detailed research papers that have helped push forward the state of the art, especially when it comes to efficient training and inference.
One of the most interesting recent charts about LLMs is Time-horizon of software engineering tasks different LLMs can complete 50% of the time from METR:
The chart shows tasks that take humans up to 5 hours, and plots the evolution of models that can achieve the same goals working independently. As you can see, 2025 saw some enormous leaps forward here with GPT-5, GPT-5.1 Codex Max and Claude Opus 4.5 able to perform tasks that take humans multiple hours—2024’s best models tapped out at under 30 minutes.
METR conclude that “the length of tasks AI can do is doubling every 7 months”. I’m not convinced that pattern will continue to hold, but it’s an eye-catching way of illustrating current trends in agent capabilities.
The most successful consumer product launch of all time happened in March, and the product didn’t even have a name.
One of the signature features of GPT-4o in May 2024 was meant to be its multimodal output—the “o” stood for “omni” and OpenAI’s launch announcement included numerous “coming soon” features where the model output images in addition to text.
Then… nothing. The image output feature failed to materialize.
In March we finally got to see what this could do—albeit in a shape that felt more like the existing DALL-E. OpenAI made this new image generation available in ChatGPT with the key feature that you could upload your own images and use prompts to tell it how to modify them.
This new feature was responsible for 100 million ChatGPT signups in a week. At peak they saw 1 million account creations in a single hour!
Tricks like “ghiblification”—modifying a photo to look like a frame from a Studio Ghibli movie—went viral time and time again.
OpenAI released an API version of the model called “gpt-image-1”, later joined by a cheaper gpt-image-1-mini in October and a much improved gpt-image-1.5 on December 16th.
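For reference, calling that image API from the official OpenAI Python SDK looks roughly like the sketch below; treat the parameter and response field names as assumptions to check against the current docs rather than gospel:

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

result = client.images.generate(
    model="gpt-image-1",
    prompt="A Christmas tree photographed at night, lights in sharp focus",
)

# gpt-image-1 returns base64-encoded image data.
with open("tree.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```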
The most notable open weight competitor to this came from Qwen with their Qwen-Image generation model on August 4th followed by Qwen-Image-Edit on August 19th. This one can run on (well equipped) consumer hardware! They followed with Qwen-Image-Edit-2511 in November and Qwen-Image-2512 on 30th December, neither of which I’ve tried yet.
The even bigger news in image generation came from Google with their Nano Banana models, available via Gemini.
Google previewed an early version of this in March under the name “Gemini 2.0 Flash native image generation”. The really good one landed on August 26th, where they started cautiously embracing the codename “Nano Banana” in public (the API model was called “Gemini 2.5 Flash Image”).
Nano Banana caught people’s attention because it could generate useful text! It was also clearly the best model at following image editing instructions.
In November Google fully embraced the “Nano Banana” name with the release of Nano Banana Pro. This one doesn’t just generate text, it can output genuinely useful detailed infographics and other text and information-heavy images. It’s now a professional-grade tool.
Max Woolf published the most comprehensive guide to Nano Banana prompting, and followed that up with an essential guide to Nano Banana Pro in December.
I’ve mainly been using it to add kākāpō parrots to my photos.
Given how incredibly popular these image tools are, it’s a little surprising that Anthropic haven’t released or integrated anything similar into Claude. I see this as further evidence that they’re focused on AI tools for professional work, but Nano Banana Pro is rapidly proving itself to be of value to anyone whose work involves creating presentations or other visual materials.
In July reasoning models from both OpenAI and Google Gemini achieved gold medal performance in the International Math Olympiad, a prestigious mathematical competition held annually (bar 1980) since 1959.
This was notable because the IMO poses challenges that are designed specifically for that competition. There’s no chance any of these were already in the training data!
It’s also notable because neither of the models had access to tools—their solutions were generated purely from their internal knowledge and token-based reasoning capabilities.
Turns out sufficiently advanced LLMs can do math after all!
In September OpenAI and Gemini pulled off a similar feat for the International Collegiate Programming Contest (ICPC)—again notable for having novel, previously unpublished problems. This time the models had access to a code execution environment but otherwise no internet access.
I don’t believe the exact models used for these competitions have been released publicly, but Gemini’s Deep Think and OpenAI’s GPT-5 Pro should provide close approximations.
With hindsight, 2024 was the year of Llama. Meta’s Llama models were by far the most popular open weight models—the original Llama kicked off the open weight revolution back in 2023 and the Llama 3 series, in particular the 3.1 and 3.2 dot-releases, were huge leaps forward in open weight capability.
Llama 4 had high expectations, and when it landed in April it was… kind of disappointing.
There was a minor scandal where the model tested on LMArena turned out not to be the model that was released, but my main complaint was that the models were too big. The neatest thing about previous Llama releases was that they often included sizes you could run on a laptop. The Llama 4 Scout and Maverick models were 109B and 400B, so big that even quantization wouldn’t get them running on my 64GB Mac.
They were trained using the 2T Llama 4 Behemoth which seems to have been forgotten now—it certainly wasn’t released.
It says a lot that none of the most popular models listed by LM Studio are from Meta, and the most popular on Ollama is still Llama 3.1, which is low on the charts there too.
Meta’s AI news this year mainly involved internal politics and vast amounts of money spent hiring talent for their new Superintelligence Labs. It’s not clear if there are any future Llama releases in the pipeline or if they’ve moved away from open weight model releases to focus on other things.
Last year OpenAI remained the undisputed leader in LLMs, especially given o1 and the preview of their o3 reasoning models.
This year the rest of the industry caught up.
OpenAI still have top tier models, but they’re being challenged across the board.
In image models they’re still being beaten by Nano Banana Pro. For code a lot of developers rate Opus 4.5 very slightly ahead of GPT-5.2 Codex. In open weight models their gpt-oss models, while great, are falling behind the Chinese AI labs. Their lead in audio is under threat from the Gemini Live API.
Where OpenAI are winning is in consumer mindshare. Nobody knows what an “LLM” is but almost everyone has heard of ChatGPT. Their consumer apps still dwarf Gemini and Claude in terms of user numbers.
Their biggest risk here is Gemini. In December OpenAI declared a Code Red in response to Gemini 3, delaying work on new initiatives to focus on the competition with their key products.
Google posted their own victorious 2025 recap here. 2025 saw Gemini 2.0, Gemini 2.5 and then Gemini 3.0—each model family supporting audio/video/image/text input of 1,000,000+ tokens, priced competitively and proving more capable than the last.
They also shipped Gemini CLI (their open source command-line coding agent, since forked by Qwen for Qwen Code), Jules (their asynchronous coding agent), constant improvements to AI Studio, the Nano Banana image models, Veo 3 for video generation, the promising Gemma 3 family of open weight models and a stream of smaller features.
Google’s biggest advantage lies under the hood. Almost every other AI lab trains with NVIDIA GPUs, which are sold at a margin that props up NVIDIA’s multi-trillion dollar valuation.
Google use their own in-house hardware, TPUs, which they’ve demonstrated this year work exceptionally well for both training and inference of their models.
...
Read the original on simonwillison.net »
Today, Michał Kiciński, one of the co-founders of CD PROJEKT, and the co-founder of GOG, has acquired GOG from CD PROJEKT.
We believe the games that shaped us deserve to stay alive: easy to find, buy, download, and play forever. But time is annoyingly good at erasing them. Rights get tangled, compatibility breaks, builds disappear, and a nostalgic evening often turns into a troubleshooting session. That’s the difference between “I’m playing today” (the game lives on) and “I’ll play someday” (the game dies).
As Michał put it: “GOG stands for freedom, independence, and genuine control.”
GOG has always been built on strong values and clear principles. When Marcin Iwiński and Michał Kiciński first came up with the idea for GOG in 2007, the vision was simple: bring classic games back to players, and make sure that once you buy a game, it truly belongs to you, forever. In a market increasingly defined by mandatory clients and closed ecosystems, that philosophy feels more relevant than ever.
This new chapter is about doubling down on that vision. We want to do more to preserve the classics of the past, celebrate standout games of today, and help shape the classics of tomorrow, including new games with real retro spirit.
First of all, DRM-free is more central to GOG than ever. Your library stays yours to enjoy: same access, same offline installers, same sense of ownership. Your data stays with GOG, and GOG GALAXY remains optional.
We’ll keep our relationship with CD PROJEKT. CD PROJEKT RED games will continue to be available on GOG, and upcoming titles from the studio will also be released on the platform.
If you’re a GOG Patron, or you donate to support the Preservation Program, those funds stay within GOG. Your support has been huge this year, and we think that with your help, we can undertake even more ambitious rescue missions in 2026 and 2027. We’ll have more to say about that sometime in 2026.
GOG will remain independent in its operations. We will continue building a platform that’s ethical, non-predatory, and made to last, while helping indie developers reach the world. We’re also committed to giving the community a stronger voice, with new initiatives planned for 2026.
Thanks for being the reason this all matters.
A lot of companies sell games. Fewer do the unglamorous work of making sure the games that shaped people’s lives don’t quietly rot into incompatibility.
Thanks for caring about this mission with us. We’ll keep you posted as we ship, and in the meantime, you can dig into the full FAQ for the detailed answers.
...
Read the original on www.gog.com »
TL;DR: 2026 is going to be The Year of The Linux Desktop for me. I haven’t booted into Windows in over 3 months on my tower and I’m starting to realize that it’s not worth wasting the space for. I plan to unify my three SSDs and turn them all into btrfs drives on Fedora.
I’ve been merely tolerating Windows 11 for a while but recently it’s gotten to the point where it’s just absolutely intolerable. Somehow Linux on the desktop has gotten so much better by not even doing anything differently. Microsoft has managed to actively sabotage the desktop experience through years of active disregard and spite against their users. They’ve managed to take some of their most revolutionary technological innovations (the NT kernel’s hybrid design allowing it to restart drivers, NTFS, ReFS, WSL, Hyper-V, etc.) then just shat all over them with start menus made with React Native, control-alt-delete menus that are actually just webviews, and forcing Copilot down everyone’s throats to the point that I’ve accidentally gotten stuck in Copilot in a handheld gaming PC and had to hard reboot the device to get out of it. It’s as if the internal teams at Microsoft have had decades of lead time in shooting each other in the head with predictable results.
To be honest, I’ve had enough. I’m going to go with Fedora on my tower and Bazzite (or SteamOS) on my handhelds.
I think that Linux on the desktop is ready for the masses now, not because it’s advanced in huge leaps and bounds. It’s ready for the masses because Windows has gotten so much actively worse that continuing to use it is an active detriment to user experience and stability. Not to mention that with the price of RAM lately, you need every gigabyte you can get, and desktop Linux lets you waste less of it on superfluous bullshit that very few people actually want.
At the very least, when something goes wrong on Linux you have log messages that can let you know what went wrong so you can search for it.
Facts and circumstances may have changed since publication. Please contact me before jumping to conclusions if something seems wrong or unclear.
...
Read the original on xeiaso.net »
...
Read the original on substack.com »
Well, the Internet mostly feels bad these days. We were given this vast, holy realm of self-discovery and joy and philosophy and community; a thousand thousand acres of digital landscape, on which to grow our forests and grasslands of imagination, plant our gardens of learning, explore the caves of our making. We were given the chance to know anything about anything, to be our own Prometheus, to make wishes and to grant them.

But that’s not what we use the Internet for anymore. These days, instead of using it to make ourselves, most of us are using it to waste ourselves: we’re doom-scrolling brain-rot on the attention-farm, we’re getting slop from the feed.

Instead of turning freely in the HTTP meadows we grow for each other, we go to work: we break our backs at the foundry of algorithmic content as this earnest, naïve, human endeavoring to connect our lives with others is corrupted. Our powerful drive to learn about ourselves, each other, and our world, is broken into scant remnants — hollow, clutching phantasms of Content Creation, speed-cut vertical video, listicle thought-leadership, ragebait and the thread emoji.
It used to feel way better to Go Online, and some of us will remember.

We used to be able to learn about our hobbies and interests from hundreds of experts on a wealth of websites whose only shared motivation was their passion. Some of those venerable old educational blogs, forums, and wikis still stand, though most have been bulldozed.

Now, Learning On The Internet often means fighting ads and endless assaults on one’s attention — it means watching part-1-part-2-part-3 short-form video clips, taped together by action movie psychology hacks, narrated gracelessly by TTS AI voices. We’re down from a thousand and one websites to three, and each of those remaining monolith websites is just a soullessly-regurgitated, compression-down-scaled, AI-up-scaled version of the next.

We used to make lasting friendships with folks all over the world on shared interest and good humor.

But now those social networks, once hand-built and hand-tended, vibrant and organic, are unceremoniously swallowed by social media networks, pens built for trapping us and our little piggy attentions, turning us all into clout-chasers & content-creators, and removing us from what meaningful intimacy & community felt like.

Even coding for the web used to be different: One could Learn To Code™ to express oneself creatively, imbue one’s online presence with passion and meaning, and for some of us, build a real career.

These days, however, we write increasing amounts of complicated, unsecure code to express less and less meaning, in order to infinitely generate shareholder value. We don’t think about the art of our craft and the discipline of its application, we think about throughput and scale.
you are not immune to nostalgia.
To be very clear: I’m not trying to Good Old Days the internet. None of this is meant to make you feel nostalgic — the Internet used to be slow and less populated and less diverse, and its access was limited to those of a certain class. The Web For All is a marked improvement, widespread global internet access is a marked improvement, and what I’m asking you to consider is what it used to feel like to use these tools, and what we’ve lost in the Big Tech, Web 2.0 and web3 devouring of the ’Net.
The onset of the automobile was a revelation for access and personal liberty. With the advent of cars, members of society could travel farther, get more done in their day, and bend their limited time more to their creative will!

But as time wore on and the industrialization & proliferation of the automobile progressed, its marginal utility diminished — the industry started to offer society fewer & fewer benefits, and take more & more in exchange. In American cities, for example: though at first the automobile enabled humans to travel further distances, it now demanded that humans travel those distances, and demanded infrastructure be created & maintained to enable it. Many now must use an automobile to get everything done in their town in a day, and must pay & take time for that automobile’s fueling & maintenance.

Further than that, the automobile asks all of us to chip in tax revenue to protect its infrastructure, but only certain classes can afford an automobile with which to use that infrastructure, and those classes who can’t afford to do so are relegated to underfunded public transit systems.

No longer a tool to serve our societies, our societies now serve the automobile.
Consider the markers of a decaying ’Net I mentioned before, with convivial tooling in mind:
Monolithic platforms like YouTube, TikTok, Medium, and Substack draw a ton of creators and educators because of the promise of monetization and large audiences, but they’ve shown time and time again how the lack of ownership creates a problem. When those platforms fail, when they change their rules, when they demand creators move or create a particular way to maintain their access to those audiences, they pit creators or their audiences against the loss of the other. Without adhering to the algorithm’s requirements, writers may not write an impactful document, and without bypassing a paywall, readers can’t read it.
When those promises of exorbitant wealth and a life of decadence through per-click monetization ultimately dry up (or come with a steep moral or creative cost), creators and learners must look for new solutions for how educational content is shared on the Internet. The most self-evident, convivial answer is an old one: blogs. HTML is free to access by default, RSS has worked for about 130 years[citation needed], and combined with webmentions, it’s never been easier to read new ideas, experiment with ideas, and build upon & grow those ideas with other strong thinkers on the web, owning that content all along.
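To make the “blogs are enough” point concrete, here is a minimal hand-rolled RSS 2.0 feed in a few lines of Python. The site name, URLs and posts are placeholders:

```python
from datetime import datetime, timezone
from email.utils import format_datetime  # RFC 2822 dates, as RSS expects

posts = [
    {"title": "Hello, web", "url": "https://example.com/hello",
     "date": datetime(2025, 1, 1, tzinfo=timezone.utc)},
]

items = "".join(
    f"<item><title>{p['title']}</title><link>{p['url']}</link>"
    f"<pubDate>{format_datetime(p['date'])}</pubDate></item>"
    for p in posts
)

feed = (
    '<?xml version="1.0"?><rss version="2.0"><channel>'
    "<title>My site</title><link>https://example.com/</link>"
    "<description>Posts</description>" + items + "</channel></rss>"
)

with open("feed.xml", "w", encoding="utf-8") as f:
    f.write(feed)
```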
Connecting with friends on the Web

Social media apps have imprisoned us all in this weird content prison — in order to connect with friends we’re sort of forced to create or be vanished by capricious black box algorithms, and all that we do create is, as we’ve already alluded to, subsequently owned by whatever platform we’ve created it on. If Instagram goes away overnight, or decides to pivot catastrophically, your stories and your network of friends go with it.
With the advent and development of tools & methodologies like POSSE (Publish On your Own Site, Syndicate Elsewhere), ActivityPub, microformats, and ATProto, it’s becoming quite achievable to generate your own social network, interoperable with other networks like Bluesky or Mastodon. That network, designed for ownership and decentralization, is durable, designed around storytelling instead of engagement, and free of the whims of weird tech billionaires.
With some basic HTML knowledge and getting-stuff-online knowledge, a handful of scrappy protocols, and a free afternoon or two, one can build their own home to post bangers for the tight homies, make friends, and snipe those new friends with those hits of dopamine they so fiendishly rely on.
Lastly, consider the discipline of web engineering:
We have been asked to build the same B2B SaaS website with the same featureset n^∞ times, and our answers for the optimal way to do that are increasingly limited. We’ve penned all of our markup into JavaScript templates just in case a product manager needs the wrapper component to post JSON somewhere down the line, and we’ve whittled away at style code until it’s just a mechanism for deploying one of two border-radius-drop-shadow combos to divs. It’s an industrial, production-minded way of approaching a discipline that has all the hallmarks of being a great craft, and that’s understandably uninspiring to many of us.
Yet our young React shepherds have no need to fear: there are countless more colors than blurple out there, and countless more fonts than Inter. HTML and CSS are better and more generative technologies than they’ve ever been: Thanks to the tireless work of the CSS working groups and browser implementers, etc, there is an unbelievable amount of creative expression possible with basic web tools in a text editor. Even JavaScript is more progressively-enhanceable than ever, and enables interfacing with a rapidly-growing number of exciting browser APIs (still fuck Brendan Eich though). ${new Date().getFullYear()} is a veritable renaissance of web code, and it asks of authors only curiosity and a drive to experiment.
Sunrise on the Matterhorn, (after 1875)
You’re not crazy. The internet does feel genuinely so awful right now, and for about a thousand and one reasons. But the path back to feeling like you have some control is to un-spin yourself from the Five Apps of the Apocalypse and reclaim the Internet as a set of tools you use to build something you can own & be proud of — or in most of our cases, be deeply ashamed of. Godspeed and good luck.

That’s all for me. If you find any issues with this post, please reach out to me by email. Thanks eternally for your time and patience, and thanks for reading. Find me here online at one of my personal websites like henry.codes or strange.website or stillness.digital or strangersbyspring.com, or sometimes on Bluesky and Mastodon.

As ever, unionize, free Palestine, trans rights are human rights, fix your heart or die.
...
Read the original on henry.codes »