10 interesting stories served every morning and every evening.




1 529 shares, 30 trendiness

New Sponsor Announcement

Financially: several tiers and options available via GitHub, PayPal and Patreon.

Help out in the community Discord and beyond (we also love blog posts).

Bounties: fix bugs and add features faster, and get paid for your work :)

...

Read the original on monogame.net »

2 504 shares, 54 trendiness

Warren Buffett steps down as Berkshire Hathaway CEO after six decades

Greg Abel faces the challenge of taking over Berkshire Hathaway from the legendary Warren Buffett.

Many regard Buffett as the world’s greatest investor after he grew Berkshire from a struggling New England textile mill, which he started buying up for $7.60 a share in 1962, into the massive conglomerate it is today, with shares that go for more than $750,000 a pop. Buffett’s personal fortune of Berkshire stock is worth roughly $150 billion, even after giving away more than $60 billion over the last 20 years.

Berkshire for decades has rou­tinely out­paced the S&P 500 as Buffett bought up in­sur­ance com­pa­nies like Geico and National Indemnity, man­u­fac­tur­ers like Iscar Metalworking, re­tail brands like Dairy Queen, ma­jor util­i­ties and even one of the na­tion’s biggest rail­roads, BNSF. Along the way, Buffett bought and sold hun­dreds of bil­lions of dol­lars of stocks and prof­ited hand­somely from his fa­mously long-term bets on com­pa­nies like American Express, Coca-Cola and Apple.

Berkshire has strug­gled to keep that pace in re­cent years be­cause it has grown so huge and also strug­gled to find new and sig­nif­i­cant ac­qui­si­tions. Even this fal­l’s $9.7-billion ac­qui­si­tion of OxyChem prob­a­bly is­n’t big enough to make a dif­fer­ence in Berkshire’s prof­its.

Investors will be watch­ing closely to see what changes Abel might make in Berkshire’s tra­jec­tory, but don’t ex­pect any seis­mic shifts.

Buffett is­n’t go­ing any­where and Abel has al­ready been man­ag­ing all of Berkshire’s non­in­sur­ance busi­nesses since 2018. Buffett will re­main chair­man and plans to con­tinue com­ing into the of­fice each day to help spot new in­vest­ments and of­fer Abel any ad­vice he asks for.

CFRA Research an­a­lyst Cathy Seifert said it is nat­ural for Abel to make some changes in the way Berkshire is run. Taking a more tra­di­tional ap­proach to lead­er­ship with nearly 400,000 em­ploy­ees spread across dozens of sub­sidiaries makes a lot of sense, she said.

But Berkshire op­er­ates un­der an ex­tremely de­cen­tral­ized struc­ture that trusts its ex­ec­u­tives with sig­nif­i­cant de­ci­sions. Everyone as­so­ci­ated with the com­pany has said there are no plans to change that.

The world learned that Abel was to be­come the des­ig­nated suc­ces­sor at Berkshire in 2021 when Buffett’s long­time busi­ness part­ner, the late Charlie Munger, as­sured share­hold­ers at an an­nual meet­ing that Abel would main­tain the com­pa­ny’s cul­ture.

Part of Buffett’s sales pitch to com­pany founders and CEOs think­ing of sell­ing their com­pa­nies has al­ways been that Berkshire would largely al­low them to con­tinue run­ning their com­pa­nies the same way as long as they de­liv­ered re­sults.

“I think the investment community would likely applaud Greg’s management style to the degree that it sort of buttons things up,” Seifert said. “And if it helps performance, that can’t really be faulted.”

Abel has al­ready shown him­self to be a more hands-on man­ager than Buffett, but he still fol­lows the Berkshire model of au­ton­omy for ac­quired com­pa­nies. Abel asks tough ques­tions of com­pany lead­ers and holds them ac­count­able for their per­for­mance.

Abel did an­nounce some lead­er­ship changes in December af­ter in­vest­ment man­ager and Geico CEO Todd Combs de­parted, and Chief Financial Officer Marc Hamburg an­nounced his re­tire­ment. Abel also said he’s ap­point­ing NetJets CEO Adam Johnson as man­ager of all of Berkshire’s con­sumer, ser­vice and re­tail busi­nesses. That es­sen­tially cre­ates a third di­vi­sion of the com­pany and takes some work off of Abel’s plate. He will con­tinue to man­age the man­u­fac­tur­ing, util­ity and rail­road busi­nesses.

Abel will even­tu­ally face more pres­sure to start pay­ing a div­i­dend. From the be­gin­ning, Berkshire has held the po­si­tion that it is bet­ter to rein­vest prof­its rather than make quar­terly or an­nual pay­outs to share­hold­ers.

But if Abel can’t find a pro­duc­tive use of the $382 bil­lion cash that Berkshire is sit­ting on, there may be a push from in­vestors to start pay­ing div­i­dends or to adopt a tra­di­tional stock buy­back pro­gram that would boost the value of shares they hold. Currently, Berkshire only re­pur­chases shares when Buffett thinks they are a bar­gain, and he has­n’t done that since early 2024.

Still, Abel will be in­su­lated from such pres­sure for some time since Buffett con­trols nearly 30% of the vot­ing power in the stock. That will di­min­ish grad­u­ally af­ter his death as his chil­dren dis­trib­ute his shares to char­ity as agreed.

Many of Berkshire’s sub­sidiaries tend to fol­low the econ­omy and profit hand­somely when­ever the coun­try is pros­per­ous. Berkshire’s util­i­ties typ­i­cally gen­er­ate a re­li­able profit, and its in­sur­ance com­pa­nies like Geico and General Reinsurance sup­ply more than $175 bil­lion worth of pre­mi­ums that can be in­vested un­til claims come due.

Investor Chris Ballard, who is managing director at Check Capital, said most of Berkshire’s businesses “can almost take care of themselves.” He sees a bright future for Berkshire under Abel.

One of the biggest ques­tions right now may be how much ad­di­tional change there will be in com­pany lead­er­ship af­ter Combs’ de­par­ture, if any at all. The head of the in­sur­ance unit, Vice Chairman Ajit Jain, who Buffett has long lav­ished with praise, is now 74. Many of the CEOs of the var­i­ous com­pa­nies have con­tin­ued work­ing long af­ter re­tire­ment age be­cause they like work­ing for Buffett.

“As a long-term shareholder, we aren’t too concerned with Todd’s departure and don’t think this is the tip of some sort of iceberg,” said Ballard, whose firm counts Berkshire as its largest holding. “Todd’s situation is unique. It’s just a reminder that Warren’s pending departure is imminent and they’re preparing for a new phase — one that we’re still excited to see unfold.”

Funk writes for the Associated Press.

...

Read the original on www.latimes.com »

3 413 shares, 32 trendiness

I canceled my book deal

See the dis­cus­sion of this post on Hacker News.

Many people reached out expressing their interest in buying the book. I’ve put the e-book up for pre-order. I’ll release each chapter as it is completed. A print version will be available on Amazon later.

Back in 2020-2022, my blog was get­ting a lot of at­ten­tion. Some of the big tech book pub­lish­ers reached out about whether I was in­ter­ested in writ­ing a book. I had a few con­ver­sa­tions but de­cided against it. I did want to write a book but self-pub­lish­ing seemed like the bet­ter op­tion.

Then an ac­qui­si­tions ed­i­tor for an­other big pub­lisher asked to chat. He had a sim­i­lar back­ground to me, an aca­d­e­mic who en­joys cod­ing and writ­ing. He had writ­ten a few books for mul­ti­ple pub­lish­ers so he knew the process and was­n’t shy to share the good and bad. He even made de­cent money from his books.

I was in­trigued. Writing a book was one of those goals that I liked the idea of but never made any progress on. I went and talked to a few other peo­ple that had pub­lished tech­ni­cal books and they gave me their re­views of the process and of each of the big pub­lish­ers.

* Pros of a pub­lisher: They are a forc­ing func­tion to make progress, they han­dle a lot of lo­gis­tics for you around the book, they pro­vide some feed­back on the con­tent, they have large dis­tri­b­u­tion chan­nels, and it looks more real” when a pub­lish­er’s name is on your book.

* Cons of a pub­lisher: They nag you con­stantly, they may try to steer your book in other di­rec­tions, the money is peanuts, they can stop print­ing the book when­ever they want, they have con­trol over fu­ture edi­tions, and they ac­tu­ally do lit­tle to no mar­ket­ing of your book.

I de­cided to write a book and sign with the pub­lisher!

Before the ac­tual deal, we had to agree on the book. Each pub­lisher has their own tem­plate for pitch­ing an idea. I did that and we went back and forth on some of the de­tails. This part felt col­lab­o­ra­tive, they were try­ing to help me flesh out the con­cept based on their data and ex­pe­ri­ence.

A lot of my blog posts involve classic programming projects that were relevant 30 years ago and will be just as fun 30 years from now. What if the book were a collection of tutorials on building these projects, each self-contained, teaching fundamental computing concepts along the way?

To show that there is a market for this, I pointed out that several of my blog posts in this space had collectively drawn millions of views. Most notably, Challenging programming projects every programmer should try (and its two sequels). My readers resonate with making useless stuff and programming for fun. Who doesn’t love hacking on a ray tracer or compiler or game or operating system just for fun?!

They liked it! It was quite dif­fer­ent than many of their other books. I wrote a high-level out­line of the en­tire book. The pro­jects were go­ing to be:

The com­piler chap­ter would be a re­vised ver­sion of my Let’s make a Teeny Tiny com­piler blog se­ries. The last pro­ject chap­ter would list a bunch of smaller scale pro­jects that are less of a com­mit­ment but still worth­while for learn­ing (e.g., an im­age file for­mat con­verter). Also, each chap­ter would end with a bunch of sug­ges­tions on how to con­tinue with the pro­ject. If I got to the end and the book was a bit short, I was plan­ning to squeeze in one more pro­ject chap­ter too.

There was some ne­go­ti­a­tion of the con­tract terms. We had to agree specif­i­cally on what the book was go­ing to be. This in­cluded the pitch for the book, who the in­tended au­di­ence is, and a very de­tailed Table of Contents (down to the sub­sub­sec­tion head­ings). I had to give a ten­ta­tive sched­ule of when I’d de­liver drafts for ma­jor mile­stones.

The weird­est thing in the con­tract was the num­ber of il­lus­tra­tions that the book must con­tain. I asked to bump that up. In the end we agreed to 115,500 to 132,000 words in length, ap­prox­i­mately 350 to 400 printed pages, with 10 to 30 il­lus­tra­tions.

They of­fered a $5000 ad­vance with the first half paid out when they ap­prove of the first third of the book and the sec­ond half when they ac­cept the fi­nal man­u­script for pub­li­ca­tion.

I did­n’t even bother ne­go­ti­at­ing this be­cause it is es­sen­tially noth­ing. I have a day job so I don’t need the money sooner to pay rent (it is just an ad­vance af­ter all!) and if I don’t sell way, way more than that then this was a bad use of my time fi­nan­cially any­way.

The roy­al­ties of­fered were sur­pris­ingly low. I did ne­go­ti­ate these and got them bumped up a tiny bit. They agreed to 12% of to­tal sales for print and e-book on the first 7000 copies and then 15% on sales af­ter that. 50% roy­alty on for­eign trans­la­tion sales.

I was later told that some peo­ple ne­go­ti­ated up to 18%. Meh. The fi­nances don’t look great re­gard­less un­less you have one of their top books.

They re­fused to share sta­tis­tics. I was able to get out of them that their me­dian book sells in the range of sin­gle thou­sands of copies. Their top sell­ers are in the range of hun­dreds of thou­sands of copies but they only shared one ex­am­ple of a book that did that. Most are on the shelf for only a few years.

Oh, and I get 25 copies for my­self and I can buy copies at a 50% dis­count.

We kicked the pro­ject off in early 2023.

They as­signed me an ed­i­tor that I met with reg­u­larly. This was my main con­tact with the pub­lisher.

He walked me through the process and got me set up. I was re­quired to use AsciiDoc or Microsoft Word for my drafts. Nope, I can’t use LaTeX. He gave me a very de­tailed style guide that I had to fol­low.

He emailed me, a lot. We had ini­tially agreed to a chap­ter draft every 3 or 4 weeks (I was overly op­ti­mistic, and felt pres­sured…). Once that first soft dead­line passed, the con­stant emails ask­ing to see drafts be­gan.

When I de­liv­ered a draft, I quickly got a marked-up ver­sion back. The feed­back was mostly for­mat­ting and styling. The help­ful feed­back was point­ing out rough tran­si­tions or as­sump­tions I made about prior knowl­edge.

The un­help­ful feed­back was a con­sis­tent push to dumb down the book (which I don’t think is par­tic­u­larly com­plex but I do like to leave things for the reader to try) to ap­pease a broader au­di­ence and to mel­low out my per­sonal voice. He also wanted me to add a chap­ter that acts as an in­tro to pro­gram­ming with Python… 😵

It became clear that they were following a formula for technical books. Don’t show too much personality, don’t be too technical, and hand-hold the reader through a linear task. Just crank the book out and get it on the shelf. What you say on page 120 doesn’t matter because the reader has already bought the book.

This rubbed me the wrong way but I pushed through. They are just fol­low­ing their in­cen­tives and I should push back.

The book was started only a few months af­ter the re­lease of ChatGPT. The en­tire world was talk­ing about AI!

So it wasn’t long before the publisher asked to chat. “Hey, is there any way you can incorporate AI into the book?” I politely declined.

A bit later they came back, basically saying The Powers That Be were requiring AI to be part of every book. I offered a few compromises (e.g., a chapter about implementing an ML algorithm, or a note at the end of each chapter about leveraging AI in the creation of the projects). I got a mixed response.

In the end, I firmly told them no. It is an­ti­thet­i­cal to the premise of the book (classic pro­gram­ming pro­jects!) that they agreed to pub­lish. They went away.

I kept missing deadlines. I was busy with work (AI!) and life. Apparently every book’s timeline gets pushed at least once, so they were flexible. I eventually sent the editor the first third of the book and we went back and forth a few times revising it.

This triggered the next stage: getting feedback on the technical content. We would do a round or two of this with a technical editor, then the draft would go off to strangers for review. If they hated it, the publisher had the right to cancel the project. If they didn’t hate it, the draft would go live for early readers to buy (and they’d receive future chapters as they were completed).

The first notes I got back from the technical editor didn’t seem like a good fit. Everything he said was correct, but it indicated a mismatch in expectations. He was critiquing the project in the chapter as if it were supposed to be production-quality software. But my projects are a balancing act: what can a programmer with very little knowledge of the subject build in a weekend that gives them a broad understanding of the concepts?

The second chapter’s feedback from the technical editor was far more helpful to me. I think he “got” what I was going for; he pointed out many flaws and suggested many improvements I could make. It was nice. Iterating on existing content is a very different workflow from writing new content, so I slowed down even more. For example, it can be quite tedious to make sure that code snippets are consistent with snippets from 20 pages prior.

I con­tin­ued to get fur­ther be­hind on de­liv­er­ing my re­vised draft of the first 1/3. This is a big mile­stone in the pub­lish­er’s process (soliciting ex­ter­nal re­view­ers, de­ter­min­ing whether the pro­ject should con­tinue, putting the early adopters e-book up for sale, and pay­ing out the first half of the ad­vance).

The publisher was getting grumpy. I was getting grumpy. They were bringing up pivoting the book to be about AI again. They were “reevaluating” their portfolio. My editor left the company and I was assigned a new one. Things were piling on.

There was also a daunt­ing voice in the back of my head that LLMs have elim­i­nated the need for books like this. Why buy this book when ChatGPT can gen­er­ate the same style of tu­to­r­ial for ANY pro­ject that is cus­tomized to you?

The process with the pub­lisher was­n’t en­joy­able. I had hoped for it to be a pos­i­tive mo­ti­va­tor to keep me fo­cused but it felt like a chore. And I was wor­ried that the fin­ished prod­uct would be void of per­son­al­ity and would be yet an­other bor­ing pro­gram­ming book.

Around this time, there was a pos­si­bil­ity of me chang­ing jobs. Oh, and my wed­ding was com­ing up. That was the fi­nal nail in the cof­fin.

There were too many things go­ing on and I did­n’t en­joy work­ing on the book any­more, so what is the point? I made up my mind to ask to freeze the pro­ject.

I think they thought of this as a tem­po­rary cooldown where I could spo­rad­i­cally work on the book with­out the stress of the dead­lines. The new ed­i­tor still pinged me reg­u­larly ask­ing if I made progress. Repeatedly. (I sup­pose they are fol­low­ing the in­cen­tives again—they only get paid if peo­ple ship books.) Eventually I asked him to stop un­til I reached out first.

And then… life went on. I never got un-busy.

Fast for­ward, I just got no­ti­fi­ca­tion from the pub­lisher that the con­tract has been of­fi­cially ter­mi­nated and all rights of the work were trans­ferred back to me.

I still love my book idea. Maybe I’ll just pub­lish the chap­ters as blog posts or self-pub­lish it!


...

Read the original on austinhenley.com »

4 336 shares, 59 trendiness

The year in LLMs

This is the third in my an­nual se­ries re­view­ing every­thing that hap­pened in the LLM space over the past 12 months. For pre­vi­ous years see Stuff we fig­ured out about AI in 2023 and Things we learned about LLMs in 2024.

It’s been a year filled with a lot of dif­fer­ent trends.

OpenAI kicked off the reasoning” aka in­fer­ence-scal­ing aka Reinforcement Learning from Verifiable Rewards (RLVR) rev­o­lu­tion in September 2024 with o1 and o1-mini. They dou­bled down on that with o3, o3-mini and o4-mini in the open­ing months of 2025 and rea­son­ing has since be­come a sig­na­ture fea­ture of mod­els from nearly every other ma­jor AI lab.

My favourite ex­pla­na­tion of the sig­nif­i­cance of this trick comes from Andrej Karpathy:

By train­ing LLMs against au­to­mat­i­cally ver­i­fi­able re­wards across a num­ber of en­vi­ron­ments (e.g. think math/​code puz­zles), the LLMs spon­ta­neously de­velop strate­gies that look like reasoning” to hu­mans—they learn to break down prob­lem solv­ing into in­ter­me­di­ate cal­cu­la­tions and they learn a num­ber of prob­lem solv­ing strate­gies for go­ing back and forth to fig­ure things out (see DeepSeek R1 pa­per for ex­am­ples). […]

Running RLVR turned out to of­fer high ca­pa­bil­ity/$, which gob­bled up the com­pute that was orig­i­nally in­tended for pre­train­ing. Therefore, most of the ca­pa­bil­ity progress of 2025 was de­fined by the LLM labs chew­ing through the over­hang of this new stage and over­all we saw ~similar sized LLMs but a lot longer RL runs.
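Karpathy’s “automatically verifiable rewards” are easy to picture in code. Here is a minimal sketch with illustrative names of my own (no lab’s actual training code looks like this): the reward for a math puzzle needs no human judge, just an exact check against the known solution.

```python
# Minimal sketch of a "verifiable reward": the answer to a math puzzle can be
# checked mechanically, so no human labeller is needed. Names are illustrative.

def verifiable_reward(problem: dict, model_answer: str) -> float:
    """Return 1.0 if the model's final answer matches the known solution."""
    try:
        return 1.0 if int(model_answer.strip()) == problem["solution"] else 0.0
    except ValueError:
        return 0.0  # unparseable answers earn no reward

problem = {"prompt": "What is 17 * 23?", "solution": 391}
print(verifiable_reward(problem, " 391 "))  # correct answer: reward 1.0
print(verifiable_reward(problem, "400"))    # wrong answer: reward 0.0
```

Because the check is cheap and unambiguous, it can be run millions of times during an RL run, which is what lets the labs scale this stage without armies of human raters.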

Every no­table AI lab re­leased at least one rea­son­ing model in 2025. Some labs re­leased hy­brids that could be run in rea­son­ing or non-rea­son­ing modes. Many API mod­els now in­clude di­als for in­creas­ing or de­creas­ing the amount of rea­son­ing ap­plied to a given prompt.

It took me a while to un­der­stand what rea­son­ing was use­ful for. Initial demos showed it solv­ing math­e­mat­i­cal logic puz­zles and count­ing the Rs in straw­berry—two things I did­n’t find my­self need­ing in my day-to-day model us­age.

It turned out that the real un­lock of rea­son­ing was in dri­ving tools. Reasoning mod­els with ac­cess to tools can plan out multi-step tasks, ex­e­cute on them and con­tinue to rea­son about the re­sults such that they can up­date their plans to bet­ter achieve the de­sired goal.

A notable result is that AI-assisted search actually works now. Hooking up search engines to LLMs had questionable results before, but now I find even my more complex research questions can often be answered by GPT-5 Thinking in ChatGPT.

Reasoning mod­els are also ex­cep­tional at pro­duc­ing and de­bug­ging code. The rea­son­ing trick means they can start with an er­ror and step through many dif­fer­ent lay­ers of the code­base to find the root cause. I’ve found even the gnarli­est of bugs can be di­ag­nosed by a good rea­soner with the abil­ity to read and ex­e­cute code against even large and com­plex code­bases.

Combine rea­son­ing with tool-use and you get…

I started the year making a prediction that agents were not going to happen. Throughout 2024 everyone was talking about agents, but there were few to no examples of them working, further confused by the fact that everyone using the term “agent” appeared to be working from a slightly different definition from everyone else.

By September I’d got fed up with avoiding the term myself due to the lack of a clear definition and decided to treat an agent as an LLM that runs tools in a loop to achieve a goal. This unblocked me to have productive conversations about them, always my goal for any piece of terminology like that.
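That definition is compact enough to sketch directly. In this toy harness the model is faked with a canned decision function (every name here is illustrative, not from any real product), but the control flow is the whole pattern: the model proposes a tool call, the harness executes it, and the result goes back into the context for the next step.

```python
# Toy "tools in a loop" agent. A real harness would call an LLM API where
# fake_llm is; the stand-in returns scripted decisions so the loop runs as-is.

def fake_llm(context: list) -> dict:
    # Pretend the model decides: first look up a fact, then finish.
    if not any(m["role"] == "tool" for m in context):
        return {"action": "call_tool", "tool": "word_count", "arg": "tools in a loop"}
    return {"action": "finish", "answer": f"The phrase has {context[-1]['content']} words."}

TOOLS = {"word_count": lambda s: str(len(s.split()))}

def run_agent(goal: str) -> str:
    context = [{"role": "user", "content": goal}]
    for _ in range(10):  # hard cap so a confused model can't loop forever
        step = fake_llm(context)
        if step["action"] == "finish":
            return step["answer"]
        result = TOOLS[step["tool"]](step["arg"])  # execute the requested tool
        context.append({"role": "tool", "content": result})  # feed result back
    return "gave up"

print(run_agent("How many words in 'tools in a loop'?"))
# prints: The phrase has 4 words.
```

Real agents differ mainly in scale, not shape: more tools, a real model choosing among them, and better safeguards around the loop.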

I did­n’t think agents would hap­pen be­cause I did­n’t think the gulli­bil­ity prob­lem could be solved, and I thought the idea of re­plac­ing hu­man staff mem­bers with LLMs was still laugh­able sci­ence fic­tion.

I was half right in my prediction: the science fiction version of a magic computer assistant that does anything you ask of it (Her) didn’t materialize…

But if you de­fine agents as LLM sys­tems that can per­form use­ful work via tool calls over mul­ti­ple steps then agents are here and they are prov­ing to be ex­tra­or­di­nar­ily use­ful.

The two break­out cat­e­gories for agents have been for cod­ing and for search.

The Deep Research pattern—where you challenge an LLM to gather information and it churns away for 15+ minutes building you a detailed report—was popular in the first half of the year but has fallen out of fashion now that GPT-5 Thinking (and Google’s “AI mode”, a significantly better product than their terrible “AI overviews”) can produce comparable results in a fraction of the time. I consider this to be an agent pattern, and one that works really well.

The coding agents” pat­tern is a much big­ger deal.

The most im­pact­ful event of 2025 hap­pened in February, with the quiet re­lease of Claude Code.

I say quiet be­cause it did­n’t even get its own blog post! Anthropic bun­dled the Claude Code re­lease in as the sec­ond item in their post an­nounc­ing Claude 3.7 Sonnet.

Claude Code is the most promi­nent ex­am­ple of what I call cod­ing agents—LLM sys­tems that can write code, ex­e­cute that code, in­spect the re­sults and then it­er­ate fur­ther.

The major labs all put out their own CLI coding agents in 2025.

Vendor-independent op­tions in­clude GitHub Copilot CLI, Amp, OpenCode, OpenHands CLI, and Pi. IDEs such as Zed, VS Code and Cursor in­vested a lot of ef­fort in cod­ing agent in­te­gra­tion as well.

My first ex­po­sure to the cod­ing agent pat­tern was OpenAI’s ChatGPT Code Interpreter in early 2023—a sys­tem baked into ChatGPT that al­lowed it to run Python code in a Kubernetes sand­box.

I was delighted this year when Anthropic finally released their equivalent in September, albeit under the baffling initial name of “Create and edit files with Claude”.

In October they re­pur­posed that con­tainer sand­box in­fra­struc­ture to launch Claude Code for web, which I’ve been us­ing on an al­most daily ba­sis ever since.

Claude Code for web is what I call an asynchronous coding agent—a system you can prompt and forget; it will work away on the problem and file a Pull Request once it’s done. OpenAI “Codex cloud” (renamed to “Codex web” in the last week) launched earlier, in May 2025. Gemini’s entry in this category is called Jules, also launched in May.

I love the asyn­chro­nous cod­ing agent cat­e­gory. They’re a great an­swer to the se­cu­rity chal­lenges of run­ning ar­bi­trary code ex­e­cu­tion on a per­sonal lap­top and it’s re­ally fun be­ing able to fire off mul­ti­ple tasks at once—of­ten from my phone—and get de­cent re­sults a few min­utes later.

I wrote more about how I’m us­ing these in Code re­search pro­jects with async cod­ing agents like Claude Code and Codex and Embracing the par­al­lel cod­ing agent lifestyle.

In 2024 I spent a lot of time hack­ing on my LLM com­mand-line tool for ac­cess­ing LLMs from the ter­mi­nal, all the time think­ing that it was weird that so few peo­ple were tak­ing CLI ac­cess to mod­els se­ri­ously—they felt like such a nat­ural fit for Unix mech­a­nisms like pipes.

Maybe the ter­mi­nal was just too weird and niche to ever be­come a main­stream tool for ac­cess­ing LLMs?

Claude Code and friends have con­clu­sively demon­strated that de­vel­op­ers will em­brace LLMs on the com­mand line, given pow­er­ful enough mod­els and the right har­ness.

It helps that ter­mi­nal com­mands with ob­scure syn­tax like sed and ffm­peg and bash it­self are no longer a bar­rier to en­try when an LLM can spit out the right com­mand for you.

As of December 2nd, Anthropic credit Claude Code with $1bn in run-rate revenue! I did not expect a CLI tool to reach anything close to those numbers.

With hind­sight, maybe I should have pro­moted LLM from a side-pro­ject to a key fo­cus!

The de­fault set­ting for most cod­ing agents is to ask the user for con­fir­ma­tion for al­most every ac­tion they take. In a world where an agent mis­take could wipe your home folder or a ma­li­cious prompt in­jec­tion at­tack could steal your cre­den­tials this de­fault makes to­tal sense.

Anyone who’s tried running their agent with automatic confirmation (aka YOLO mode—Codex CLI even aliases --dangerously-bypass-approvals-and-sandbox to --yolo) has experienced the trade-off: using an agent without the safety wheels feels like a completely different product.

A big ben­e­fit of asyn­chro­nous cod­ing agents like Claude Code for web and Codex Cloud is that they can run in YOLO mode by de­fault, since there’s no per­sonal com­puter to dam­age.

I run in YOLO mode all the time, de­spite be­ing deeply aware of the risks in­volved. It has­n’t burned me yet…

One of my favourite pieces on LLM se­cu­rity this year is The Normalization of Deviance in AI by se­cu­rity re­searcher Johann Rehberger.

Johann describes the “Normalization of Deviance” phenomenon, where repeated exposure to risky behaviour without negative consequences leads people and organizations to accept that risky behaviour as normal.

This was orig­i­nally de­scribed by so­ci­ol­o­gist Diane Vaughan as part of her work to un­der­stand the 1986 Space Shuttle Challenger dis­as­ter, caused by a faulty O-ring that en­gi­neers had known about for years. Plenty of suc­cess­ful launches led NASA cul­ture to stop tak­ing that risk se­ri­ously.

Johann ar­gues that the longer we get away with run­ning these sys­tems in fun­da­men­tally in­se­cure ways, the closer we are get­ting to a Challenger dis­as­ter of our own.

ChatGPT Plus’s orig­i­nal $20/month price turned out to be a snap de­ci­sion by Nick Turley based on a Google Form poll on Discord. That price point has stuck firmly ever since.

This year a new pric­ing prece­dent has emerged: the Claude Pro Max 20x plan, at $200/month.

OpenAI have a sim­i­lar $200 plan called ChatGPT Pro. Gemini have Google AI Ultra at $249/month with a $124.99/month 3-month start­ing dis­count.

These plans ap­pear to be dri­ving some se­ri­ous rev­enue, though none of the labs have shared fig­ures that break down their sub­scribers by tier.

I’ve per­son­ally paid $100/month for Claude in the past and will up­grade to the $200/month plan once my cur­rent batch of free al­lowance (from pre­view­ing one of their mod­els—thanks, Anthropic) runs out. I’ve heard from plenty of other peo­ple who are happy to pay these prices too.

You have to use mod­els a lot in or­der to spend $200 of API cred­its, so you would think it would make eco­nomic sense for most peo­ple to pay by the to­ken in­stead. It turns out tools like Claude Code and Codex CLI can burn through enor­mous amounts of to­kens once you start set­ting them more chal­leng­ing tasks, to the point that $200/month of­fers a sub­stan­tial dis­count.
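Some rough arithmetic backs that up. The per-token prices below are round-number assumptions of mine, not any lab’s actual rates, but the shape holds: agentic tools re-send a large context on every call, and that is what burns the tokens.

```python
# Back-of-envelope: how fast can an agentic tool spend $200 of API credit?
# Prices are illustrative assumptions, not any specific lab's published rates.
input_price = 3.00 / 1_000_000    # $ per input token (assumed)
output_price = 15.00 / 1_000_000  # $ per output token (assumed)

# Suppose a heavy agent session re-sends a 50k-token context across
# 40 model calls, each producing 1k output tokens.
session_cost = 40 * (50_000 * input_price + 1_000 * output_price)
print(f"one heavy agent session: ${session_cost:.2f}")
print(f"sessions per $200: {200 / session_cost:.0f}")
```

Under these assumptions a single heavy session costs several dollars, so anyone running a handful of such sessions a day clears $200/month of pay-as-you-go spend easily, which is why the flat plans read as a discount.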

2024 saw some early signs of life from the Chinese AI labs mainly in the form of Qwen 2.5 and early DeepSeek. They were neat mod­els but did­n’t feel world-beat­ing.

This changed dra­mat­i­cally in 2025. My ai-in-china tag has 67 posts from 2025 alone, and I missed a bunch of key re­leases to­wards the end of the year (GLM-4.7 and MiniMax-M2.1 in par­tic­u­lar.)

GLM-4.7, Kimi K2 Thinking, MiMo-V2-Flash, DeepSeek V3.2, MiniMax-M2.1 are all Chinese open weight mod­els. The high­est non-Chi­nese model in that chart is OpenAI’s gpt-oss-120B (high), which comes in sixth place.

The Chinese model revolution really kicked off on Christmas day 2024 with the release of DeepSeek V3, supposedly trained for around $5.5m. DeepSeek followed that on 20th January with DeepSeek R1, which promptly triggered a major AI/semiconductor selloff: NVIDIA lost ~$593bn in market cap as investors panicked that AI maybe wasn’t an American monopoly after all.

The panic did­n’t last—NVIDIA quickly re­cov­ered and to­day are up sig­nif­i­cantly from their pre-DeepSeek R1 lev­els. It was still a re­mark­able mo­ment. Who knew an open weight model re­lease could have that kind of im­pact?

DeepSeek were quickly joined by an im­pres­sive ros­ter of Chinese AI labs. I’ve been pay­ing at­ten­tion to these ones in par­tic­u­lar:

Most of these mod­els aren’t just open weight, they are fully open source un­der OSI-approved li­censes: Qwen use Apache 2.0 for most of their mod­els, DeepSeek and Z.ai use MIT.

Some of them are com­pet­i­tive with Claude 4 Sonnet and GPT-5!

Sadly none of the Chinese labs have re­leased their full train­ing data or the code they used to train their mod­els, but they have been putting out de­tailed re­search pa­pers that have helped push for­ward the state of the art, es­pe­cially when it comes to ef­fi­cient train­ing and in­fer­ence.

One of the most interesting recent charts about LLMs is “Time-horizon of software engineering tasks different LLMs can complete 50% of the time” from METR:

The chart shows tasks that take hu­mans up to 5 hours, and plots the evo­lu­tion of mod­els that can achieve the same goals work­ing in­de­pen­dently. As you can see, 2025 saw some enor­mous leaps for­ward here with GPT-5, GPT-5.1 Codex Max and Claude Opus 4.5 able to per­form tasks that take hu­mans mul­ti­ple hours—2024’s best mod­els tapped out at un­der 30 min­utes.

METR con­clude that "the length of tasks AI can do is dou­bling every 7 months". I'm not con­vinced that pat­tern will con­tinue to hold, but it's an eye-catch­ing way of il­lus­trat­ing cur­rent trends in agent ca­pa­bil­i­ties.
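METR's doubling claim is easy to turn into a back-of-the-envelope projection. A small sketch of the trend exactly as stated (an extrapolation for illustration, not a prediction):

```python
# Project the METR "task horizon doubles every 7 months" trend forward.
# Horizons are in human-minutes; the ~30 minute 2024 baseline is taken
# from the figures quoted above.

def projected_horizon(baseline_minutes: float, months_elapsed: float,
                      doubling_months: float = 7.0) -> float:
    """Exponential extrapolation of the 50%-success task horizon."""
    return baseline_minutes * 2 ** (months_elapsed / doubling_months)

# ~30 minutes at the end of 2024 -> 14 months (two doublings) later:
print(projected_horizon(30, 14))  # 120.0 minutes
```

Whether the exponential keeps holding is exactly the open question the chart raises.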

The most suc­cess­ful con­sumer prod­uct launch of all time hap­pened in March, and the prod­uct did­n’t even have a name.

One of the sig­na­ture fea­tures of GPT-4o in May 2024 was meant to be its mul­ti­modal out­put—the "o" stood for "omni"—and OpenAI's launch an­nounce­ment in­cluded nu­mer­ous "coming soon" fea­tures where the model out­put im­ages in ad­di­tion to text.

Then… noth­ing. The im­age out­put fea­ture failed to ma­te­ri­al­ize.

In March we fi­nally got to see what this could do—al­beit in a shape that felt more like the ex­ist­ing DALL-E. OpenAI made this new im­age gen­er­a­tion avail­able in ChatGPT with the key fea­ture that you could up­load your own im­ages and use prompts to tell it how to mod­ify them.

This new fea­ture was re­spon­si­ble for 100 mil­lion ChatGPT signups in a week. At peak they saw 1 mil­lion ac­count cre­ations in a sin­gle hour!

Tricks like "ghiblification"—modifying a photo to look like a frame from a Studio Ghibli movie—went vi­ral time and time again.

OpenAI re­leased an API ver­sion of the model called "gpt-image-1", later joined by a cheaper gpt-im­age-1-mini in October and a much im­proved gpt-im­age-1.5 on December 16th.

The most no­table open weight com­peti­tor to this came from Qwen with their Qwen-Image gen­er­a­tion model on August 4th fol­lowed by Qwen-Image-Edit on August 19th. This one can run on (well equipped) con­sumer hard­ware! They fol­lowed with Qwen-Image-Edit-2511 in November and Qwen-Image-2512 on 30th December, nei­ther of which I’ve tried yet.

The even big­ger news in im­age gen­er­a­tion came from Google with their Nano Banana mod­els, avail­able via Gemini.

Google pre­viewed an early ver­sion of this in March un­der the name "Gemini 2.0 Flash na­tive im­age gen­er­a­tion". The re­ally good one landed on August 26th, where they started cau­tiously em­brac­ing the co­de­name "Nano Banana" in pub­lic (the API model was called "Gemini 2.5 Flash Image").

Nano Banana caught peo­ple’s at­ten­tion be­cause it could gen­er­ate use­ful text! It was also clearly the best model at fol­low­ing im­age edit­ing in­struc­tions.

In November Google fully em­braced the "Nano Banana" name with the re­lease of Nano Banana Pro. This one does­n't just gen­er­ate text, it can out­put gen­uinely use­ful de­tailed in­fo­graph­ics and other text and in­for­ma­tion-heavy im­ages. It's now a pro­fes­sional-grade tool.

Max Woolf pub­lished the most com­pre­hen­sive guide to Nano Banana prompt­ing, and fol­lowed that up with an es­sen­tial guide to Nano Banana Pro in December.

I’ve mainly been us­ing it to add kākāpō par­rots to my pho­tos.

Given how in­cred­i­bly pop­u­lar these im­age tools are, it's a lit­tle sur­pris­ing that Anthropic haven't re­leased or in­te­grated any­thing sim­i­lar into Claude. I see this as fur­ther ev­i­dence that they're fo­cused on AI tools for pro­fes­sional work, but Nano Banana Pro is rapidly prov­ing it­self to be of value to any­one whose work in­volves cre­at­ing pre­sen­ta­tions or other vi­sual ma­te­ri­als.

In July rea­son­ing mod­els from both OpenAI and Google Gemini achieved gold medal per­for­mance in the International Math Olympiad, a pres­ti­gious math­e­mat­i­cal com­pe­ti­tion held an­nu­ally (bar 1980) since 1959.

This was no­table be­cause the IMO poses chal­lenges that are de­signed specif­i­cally for that com­pe­ti­tion. There’s no chance any of these were al­ready in the train­ing data!

It’s also no­table be­cause nei­ther of the mod­els had ac­cess to tools—their so­lu­tions were gen­er­ated purely from their in­ter­nal knowl­edge and to­ken-based rea­son­ing ca­pa­bil­i­ties.

Turns out suf­fi­ciently ad­vanced LLMs can do math af­ter all!

In September OpenAI and Gemini pulled off a sim­i­lar feat for the International Collegiate Programming Contest (ICPC)—again no­table for hav­ing novel, pre­vi­ously un­pub­lished prob­lems. This time the mod­els had ac­cess to a code ex­e­cu­tion en­vi­ron­ment but oth­er­wise no in­ter­net ac­cess.

I don’t be­lieve the ex­act mod­els used for these com­pe­ti­tions have been re­leased pub­licly, but Gemini’s Deep Think and OpenAI’s GPT-5 Pro should pro­vide close ap­prox­i­ma­tions.

With hind­sight, 2024 was the year of Llama. Meta’s Llama mod­els were by far the most pop­u­lar open weight mod­els—the orig­i­nal Llama kicked off the open weight rev­o­lu­tion back in 2023 and the Llama 3 se­ries, in par­tic­u­lar the 3.1 and 3.2 dot-re­leases, were huge leaps for­ward in open weight ca­pa­bil­ity.

Llama 4 had high ex­pec­ta­tions, and when it landed in April it was… kind of dis­ap­point­ing.

There was a mi­nor scan­dal where the model tested on LMArena turned out not to be the model that was re­leased, but my main com­plaint was that the mod­els were too big. The neat­est thing about pre­vi­ous Llama re­leases was that they of­ten in­cluded sizes you could run on a lap­top. The Llama 4 Scout and Maverick mod­els were 109B and 400B, so big that even quan­ti­za­tion would­n’t get them run­ning on my 64GB Mac.

They were trained us­ing the 2T-parameter Llama 4 Behemoth, which seems to have been for­got­ten now—it cer­tainly was­n't re­leased.

It says a lot that none of the most pop­u­lar mod­els listed by LM Studio are from Meta, and the most pop­u­lar on Ollama is still Llama 3.1, which is low on the charts there too.

Meta’s AI news this year mainly in­volved in­ter­nal pol­i­tics and vast amounts of money spent hir­ing tal­ent for their new Superintelligence Labs. It’s not clear if there are any fu­ture Llama re­leases in the pipeline or if they’ve moved away from open weight model re­leases to fo­cus on other things.

Last year OpenAI re­mained the undis­puted leader in LLMs, es­pe­cially given o1 and the pre­view of their o3 rea­son­ing mod­els.

This year the rest of the in­dus­try caught up.

OpenAI still have top tier mod­els, but they’re be­ing chal­lenged across the board.

In im­age mod­els they’re still be­ing beaten by Nano Banana Pro. For code a lot of de­vel­op­ers rate Opus 4.5 very slightly ahead of GPT-5.2 Codex Max. In open weight mod­els their gpt-oss mod­els, while great, are falling be­hind the Chinese AI labs. Their lead in au­dio is un­der threat from the Gemini Live API.

Where OpenAI are win­ning is in con­sumer mind­share. Nobody knows what an LLM is but al­most every­one has heard of ChatGPT. Their con­sumer apps still dwarf Gemini and Claude in terms of user num­bers.

Their biggest risk here is Gemini. In December OpenAI de­clared a Code Red in re­sponse to Gemini 3, de­lay­ing work on new ini­tia­tives to fo­cus on the com­pe­ti­tion with their key prod­ucts.

Google posted their own vic­to­ri­ous 2025 re­cap here. 2025 saw Gemini 2.0, Gemini 2.5 and then Gemini 3.0—each model fam­ily sup­port­ing au­dio/​video/​im­age/​text in­put of 1,000,000+ to­kens, priced com­pet­i­tively and prov­ing more ca­pa­ble than the last.

They also shipped Gemini CLI (their open source com­mand-line cod­ing agent, since forked by Qwen for Qwen Code), Jules (their asyn­chro­nous cod­ing agent), con­stant im­prove­ments to AI Studio, the Nano Banana im­age mod­els, Veo 3 for video gen­er­a­tion, the promis­ing Gemma 3 fam­ily of open weight mod­els and a stream of smaller fea­tures.

Google’s biggest ad­van­tage lies un­der the hood. Almost every other AI lab trains with NVIDIA GPUs, which are sold at a mar­gin that props up NVIDIAs multi-tril­lion dol­lar val­u­a­tion.

Google use their own in-house hard­ware, TPUs, which they’ve demon­strated this year work ex­cep­tion­ally well for both train­ing and in­fer­ence of their mod­els.

...

Read the original on simonwillison.net »

5 327 shares, 14 trendiness

Alignment Scry

We give you and Claude full + search power over a grow­ing in­dex of doc­u­ments rel­e­vant to the in­tel­li­gence ex­plo­sion. Exploration-first LessWrong lens­ing. Steerable axes, bridge posts, and a per­sonal at­tribute pro­file. Designed to be easy to delete if it is not worth keep­ing.

Paste this into Claude Code to start ex­plor­ing im­me­di­ately. For full func­tion­al­ity (higher lim­its + pri­vate vec­tors), cre­ate an ac­count. Claude Code and Codex are es­sen­tially AGI at this point—we rec­om­mend get­ting ac­quainted with these tools even if you are not a soft­ware de­vel­oper. For max­i­mum er­gonom­ics (else you'll be man­u­ally ap­prov­ing each time Claude tries to query our API), we think you can get away with `claude --dangerously-skip-permissions`, but that is your risk to ac­cept. We would not rec­om­mend this with a model less smart than Opus 4.5. The risk even if you trust us is prompt in­jec­tion at­tacks in one of our in­gested en­ti­ties, even though we gen­er­ally scrape con­tent from rep­utable sources.

Use this prompt di­rectly in­side the Claude web app. No MCP, no in­stalls: just al­low ac­cess to our API once. Paste the prompt be­low and start query­ing in claude.ai. This gives Claude web per­mis­sion to call our API from its sand­box.

# ExoPriors Alignment Scry (Public Access)

You have **public** ac­cess to the ExoPriors align­ment re­search cor­pus.

You are a re­search copi­lot for ExoPriors Alignment Scry.

Your pur­pose:

- Turn re­search goals into ef­fec­tive se­man­tic search, SQL, and vec­tor work­flows

- Surface high-sig­nal doc­u­ments and pat­terns, not just raw rows

- Use vec­tor mix­ing to ex­press nu­anced "vibes" (e.g., "mech in­terp + over­sight − hype")

- **Core trick**: use `debias_vector(axis, topic)` to re­move topic over­lap (best for "X but not Y" or "tone ≠ topic" queries)
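The server-side `debias_vector` isn't specified in this prompt, but the usual way to "remove topic overlap" is to subtract the projection of the axis vector onto the topic vector. A local sketch of that idea — my assumption about the semantics, not ExoPriors' actual implementation:

```python
# Orthogonal-projection debiasing: remove the component of `axis` that
# points along `topic`, leaving the part of the axis uncorrelated with it.
# This is an assumed reading of debias_vector(axis, topic).

def debias_vector(axis: list[float], topic: list[float]) -> list[float]:
    dot = sum(a * t for a, t in zip(axis, topic))
    norm_sq = sum(t * t for t in topic)
    scale = dot / norm_sq
    return [a - scale * t for a, t in zip(axis, topic)]

# A vector that points half along the topic loses exactly that half:
print(debias_vector([1.0, 1.0], [1.0, 0.0]))  # [0.0, 1.0]
```

The result is orthogonal to the topic vector, which is what makes "tone ≠ topic" style queries work.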

## Public ac­cess notes

- **Public @handles**: must match `p__` (e.g., `p_8f3a1c2d_myhandle`); shared name­space; write-once

- **Examples**: re­place any `@mech_interp`-style han­dle with your pub­lic han­dle (e.g., `@p_8f3a1c2d_mech_interp`)

- **Rate lim­its**: stricter per-IP lim­its and lower con­cur­rent query caps than pri­vate keys

- **Timeouts**: adap­tive, up to ~120s un­der light load; can drop un­der load

- **Embeddings**: small per-IP to­ken bud­get and per-re­quest size caps; cre­ate an ac­count if you hit lim­its

- **Not avail­able**: `GET/DELETE /v1/alignment/vectors`, `/api/scry/alerts`

**Strategy for nu­anced ques­tions (explore → scale):**

1) **Explore quickly**: Start with small LIMITs (10–50), ma­te­ri­al­ized views, or `alignment.search()` to val­i­date schema and phras­ing.

2) **Form can­di­dates**: Build a fo­cused can­di­date set (lexical search or a tight WHERE) with a hard LIMIT (100–500), then join.

3) **Scale care­fully**: Once the shape is right, ex­pand lim­its and add ag­gre­ga­tions. Let Postgres plan joins when pos­si­ble; if pub­lic time­outs bite, in­ter­sect small can­di­date sets client-side as a fall­back.

4) **Lean on the plan­ner**: Use `EXPLAIN SELECT …` (no ANALYZE) to san­ity-check join or­der and fil­ters. Keep fil­ters sar­gable, and push them into the base ta­bles/​CTEs.

**Execution guardrails (transparency + con­fir­ma­tion):**

- Always show a short "about to run" sum­mary: SQL + se­man­tic fil­ters (sources/kinds/date ranges + @handles).

- If a query may be heavy, ask for con­fir­ma­tion be­fore ex­e­cut­ing. Use `/v1/alignment/estimate` when in doubt.

- Treat as heavy if: miss­ing LIMIT, LIMIT > 1000, es­ti­mat­ed_rows > 100k, em­bed­ding dis­tance over >500k rows, or joins over large base ta­bles.

- Always re­mind the user they can can­cel or re­vise the query at any time.

**Explore cor­pus com­po­si­tion (source × type):**

```sql
SELECT source::text AS source, kind::text AS kind, COUNT(*) AS n
FROM alignment.entities
GROUP BY 1, 2
ORDER BY n DESC
LIMIT 50;
```

**Quick ex­am­ple** — weighted com­bi­na­tion search:

```sql
-- After storing @mech_interp, @oversight, @hype via /embed:
SELECT mv.uri, mv.title, mv.original_author, mv.base_score,
       mv.embedding <-> (
           scale_vector(@mech_interp, 0.5)
         + scale_vector(@oversight, 0.4)
         - scale_vector(@hype, 0.2)
       ) AS distance
FROM mv_lesswrong_posts mv
ORDER BY distance
LIMIT 20;
```

You ac­cess every­thing via HTTP APIs. You do NOT have di­rect data­base ac­cess.

## 1. APIs

All end­points: `POST` with JSON body.

Headers for all re­quests:

```
Authorization: Bearer exopriors_public_readonly_v1_2025
Content-Type: application/json
```

(Your API key is in­ten­tion­ally em­bed­ded in this prompt for er­gonom­ics. Keys re­load fre­quently; if you get 401 er­rors, re­fresh the prompt.)

### 1.1 SQL Query

`POST https://api.exopriors.com/v1/alignment/query`

Request body:

```json
{"sql": "SELECT kind::text AS kind, COUNT(*) FROM alignment.entities GROUP BY kind::text ORDER BY 2 DESC LIMIT 20", "include_vectors": false}
```

Example re­sponse (illustrative; counts change):

```json
{
  "columns": [{"name": "kind", "type": "TEXT"}, {"name": "count", "type": "INT8"}],
  "rows": [["comment", 38911611], ["tweet", 11977552], ["wikipedia", 6201199]],
  "row_count": 3,
  "duration_ms": 42,
  "truncated": false
}
```
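Putting the endpoint, headers, and body together, a minimal client sketch might look like this. The URL and public key are the ones quoted above; actually sending the request is left to whichever HTTP client is on hand:

```python
import json

API_URL = "https://api.exopriors.com/v1/alignment/query"
API_KEY = "exopriors_public_readonly_v1_2025"  # public key from this prompt

def build_query_request(sql: str, include_vectors: bool = False):
    """Assemble the pieces of a POST to /v1/alignment/query."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"sql": sql, "include_vectors": include_vectors})
    return API_URL, headers, body

# To actually send it, hand these to any HTTP client, e.g.:
#   requests.post(url, headers=headers, data=body)
url, headers, body = build_query_request(
    "SELECT kind::text AS kind, COUNT(*) FROM alignment.entities "
    "GROUP BY 1 ORDER BY 2 DESC LIMIT 20"
)
print(json.loads(body)["include_vectors"])  # False
```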

Constraints:

- Max 10,000 rows (100 when `include_vectors: true`)

- Adaptive time­out: up to 120s when load al­lows (down to ~20s un­der heavy load)

- One state­ment per re­quest

- Always in­clude LIMIT; use WHERE fil­ters to avoid full scans

- Vector columns are re­turned as place­hold­ers (e.g., `[vector value]`); use dis­tances/​sim­i­lar­i­ties in­stead of re­quest­ing raw vec­tors

**Performance heuris­tics (rough, load-de­pen­dent):**

- Embedding dis­tances are the most ex­pen­sive op­er­a­tion; each em­bed­ding com­par­i­son scans the can­di­date set.

- Multiple em­bed­dings mul­ti­ply cost lin­early (2 em­bed­dings ≈ 2× work).

- Keep em­bed­ding com­par­isons to a few hun­dred thou­sand rows per em­bed­ding; use tighter fil­ters or smaller can­di­dates first.

- Regex/ILIKE on `payload` is costly; pre­fer `alignment.search()` to nar­row, then join.

**Performance tips (ballpark, load-de­pen­dent):**

- Simple searches: ~1–5s

- Embedding joins (5M rows): may time­out un­der load

- `alignment.search()` is capped at 100 rows; use `alignment.search_exhaustive()` + pag­i­na­tion if com­plete­ness mat­ters

- If a query times out: re­duce sam­ple size, use fewer em­bed­dings, or pre-fil­ter with `alignment.search()`. For pub­lic keys, in­ter­sect small can­di­date lists client-side as a fall­back.

- For au­thor ag­gre­gates, use `alignment.mv_author_stats` in­stead of `COUNT(DISTINCT orig­i­nal_au­thor)` on `alignment.entities`.
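The client-side fallback mentioned above is just set intersection over the id lists of two cheap queries. A sketch (the row dicts are hypothetical stand-ins for the `id` column of two separate API responses):

```python
# Intersect two small candidate sets client-side instead of asking the
# server to join them — the fallback suggested for public-tier timeouts.

def intersect_candidates(rows_a: list[dict], rows_b: list[dict]) -> list[int]:
    ids_a = {row["id"] for row in rows_a}
    ids_b = {row["id"] for row in rows_b}
    return sorted(ids_a & ids_b)

lexical_hits = [{"id": 1}, {"id": 2}, {"id": 5}]   # e.g. from alignment.search()
semantic_hits = [{"id": 2}, {"id": 5}, {"id": 9}]  # e.g. from a vector query
print(intersect_candidates(lexical_hits, semantic_hits))  # [2, 5]
```

Keeping both candidate sets under a few hundred rows keeps this cheap on both ends.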

**Context man­age­ment (for LLMs):**

- Avoid `SELECT *` on large re­sult sets; pick only the columns you need.

- Trim long text with `alignment.preview_text(payload, 500)` or `LEFT(payload, 500)`.

- Keep LIMITs small (10–50); don’t fetch hun­dreds of en­ti­ties at once or you’ll flood con­text.

### 1.1b Query Estimate (No Execution)

`POST https://api.exopriors.com/v1/alignment/estimate`

Request body:

```json
{"sql": "SELECT id FROM alignment.entities WHERE source = 'hackernews' AND kind = 'comment' LIMIT 1000"}
```

Response (example):

```json
{
  "estimated_rows": 1000,
  "total_cost": 12345.6,
  "estimated_seconds": 1.8,
  "estimated_range_seconds": [0.9, 3.6],
  "risk": "low",
  "timeout_secs": 300,
  "load_stage": "normal",
  "warnings": []
}
```

This uses `EXPLAIN (FORMAT JSON)` to es­ti­mate cost/​time and does **not** ex­e­cute the query.

...

Read the original on exopriors.com »

6 261 shares, 13 trendiness

Efficient method to capture carbon dioxide from the atmosphere developed at the University of Helsinki

A new method to cap­ture car­bon diox­ide from the air has been de­vel­oped at the University of Helsinki’s chem­istry de­part­ment.

The method de­vel­oped by Postdoctoral Researcher  is based on a com­pound of su­per­base and al­co­hol. Tests done in the pro­fes­sor's group show that the com­pound ap­pears promis­ing: one gram of the com­pound can ab­sorb 156 mil­ligrams of car­bon diox­ide di­rectly from un­treated am­bi­ent air, while not re­act­ing with ni­tro­gen, oxy­gen or other at­mos­pheric gases. Its ca­pac­ity clearly out­per­forms the CO₂ cap­ture meth­ods cur­rently in use.

The CO₂ cap­tured by the com­pound can be re­leased by heat­ing the com­pound at 70 °C for 30 min­utes. Clean CO₂ is re­cov­ered and can be re­cy­cled.

The ease of re­leas­ing CO₂ is the key ad­van­tage of the new com­pound. With cur­rent com­pounds, re­leas­ing CO₂ typ­i­cally re­quires tem­per­a­tures above 900 de­grees Celsius.

– In ad­di­tion, the com­pound can be used mul­ti­ple times: it re­tained 75 per­cent of its orig­i­nal ca­pac­ity af­ter 50 cy­cles, and 50 per­cent af­ter 100 cy­cles.

The new com­pound was dis­cov­ered by ex­per­i­ment­ing with a num­ber of bases in dif­fer­ent com­pounds, says Eshaghi Gorji. The ex­per­i­ments lasted more than a year in to­tal.

The most promis­ing base proved to be 1,5,7-triazabicyclo[4.3.0]non-6-ene (TBN), de­vel­oped in the pro­fes­sor's group, which was com­bined with ben­zyl al­co­hol to pro­duce the fi­nal com­pound.

– None of the com­po­nents is ex­pen­sive to pro­duce, Eshaghi Gorji points out. In ad­di­tion, the fluid is non-toxic.

The com­pound will now be tested in pi­lot plants at a near-in­dus­trial scale, rather than in grams. A solid ver­sion of the liq­uid com­pound must be made for this pur­pose.

– The idea is to bind the com­pound to com­pounds such as sil­ica and graphene ox­ide, which pro­motes the in­ter­ac­tion with car­bon diox­ide.

...

Read the original on www.helsinki.fi »

7 227 shares, 9 trendiness

The rise of industrial software

Of or re­lat­ing to pro­duc­tive work, trade, or man­u­fac­ture, esp. me­chan­i­cal in­dus­try or large-scale man­u­fac­tur­ing; ( also) re­sult­ing from such in­dus­try.

For most of its his­tory, soft­ware has been closer to craft than man­u­fac­ture: costly, slow, and dom­i­nated by the need for skills and ex­pe­ri­ence. AI cod­ing is chang­ing that, by mak­ing avail­able paths of pro­duc­tion which are cheaper, faster, and in­creas­ingly dis­con­nected from the ex­per­tise of hu­mans.

I have writ­ten pre­vi­ously about how AI cod­ing can be a trap for to­day's prac­ti­tion­ers, of­fer­ing short­cuts to in­com­plete so­lu­tions at the ex­pense of the un­der­stand­ing needed for sus­tain­able de­vel­op­ment prac­tices. But as we col­lec­tively ad­dress the short­com­ings of our cur­rent toolset, it is clear that we are head­ing into a world in which the pro­duc­tion of soft­ware is be­com­ing in­creas­ingly au­to­mated.

What hap­pens to soft­ware when its pro­duc­tion un­der­goes an in­dus­trial rev­o­lu­tion?

Traditionally, soft­ware has been ex­pen­sive to pro­duce, with ex­pense dri­ven largely by the labour costs of a highly skilled and spe­cialised work­force. This work­force has also con­sti­tuted a bot­tle­neck for the pos­si­ble scale of pro­duc­tion, mak­ing soft­ware a valu­able com­mod­ity to pro­duce ef­fec­tively.

Industrialisation of pro­duc­tion, in any field, seeks to ad­dress both of these lim­i­ta­tions at once, by us­ing au­toma­tion of processes to re­duce the re­liance on hu­man labour, both low­er­ing costs and also al­low­ing greater scale and elas­tic­ity of pro­duc­tion. Such changes rel­e­gate the hu­man role to over­sight, qual­ity con­trol, and op­ti­mi­sa­tion of the in­dus­trial process.

The first or­der ef­fect of this change is a dis­rup­tion in the sup­ply chain of high qual­ity, work­ing prod­ucts. Labour is dis­in­ter­me­di­ated, bar­ri­ers to en­try are low­ered, com­pe­ti­tion rises, and rate of change ac­cel­er­ates. All of these ef­fects are start­ing to be in ev­i­dence to­day, with the tra­di­tional soft­ware in­dus­try grap­pling with the ram­i­fi­ca­tions.

A sec­ond or­der ef­fect of such in­dus­tri­al­i­sa­tion is to en­able ad­di­tional ways to pro­duce low qual­ity, low cost prod­ucts at high scale. Examples from other fields in­clude:

In the case of soft­ware, the in­dus­tri­al­i­sa­tion of pro­duc­tion is giv­ing rise to a new class of soft­ware arte­fact, which we might term dis­pos­able soft­ware: soft­ware cre­ated with no durable ex­pec­ta­tion of own­er­ship, main­te­nance, or long-term un­der­stand­ing.

Advocates might re­fer to this as vibe-coded soft­ware, and scep­tics will in­vari­ably talk about AI slop. Regardless of its mer­its, it is clear that the eco­nom­ics of this class of soft­ware are quite dif­fer­ent, as each soft­ware out­put has less eco­nomic value, due to its easy re­pro­ducibil­ity. This lack of per­ceived value might tempt you to dis­miss the trend as a pass­ing fad, but this would be un­wise. To un­der­stand why, we need to con­sider the his­tor­i­cal prece­dents for com­modi­ti­sa­tion of pre­vi­ously scarce goods.

Jevons para­dox is an old bit of eco­nomic the­ory that has been much quoted re­cently. The ob­ser­va­tion dates to the nine­teenth cen­tury, not­ing that im­proved ef­fi­ciency in coal con­sump­tion would lead to lower costs, fu­el­ing higher de­mand, and ul­ti­mately re­sult­ing in higher over­all coal con­sump­tion.

This is rel­e­vant to­day, be­cause we are see­ing the same surge in de­mand for AI com­pute: as mod­els be­come more ef­fi­cient at to­ken pre­dic­tion, de­mand is surg­ing and re­sults in ever greater con­sump­tion. Will the same ef­fect rip­ple through soft­ware de­vel­op­ment it­self, with lower cost of ef­fort dri­ving higher con­sump­tion and out­put? History sug­gests it will.

Consider the in­dus­tri­al­i­sa­tion of agri­cul­ture. In the early twen­ti­eth cen­tury, sci­en­tific ad­vances were ex­pected to erad­i­cate hunger and usher in an era of abun­dant, nour­ish­ing food. Instead, hunger and famine per­sist. In 2025, there are 318 mil­lion peo­ple ex­pe­ri­enc­ing acute hunger, even in coun­tries with an agri­cul­tural sur­plus. Meanwhile, in the wealth­i­est na­tions, in­dus­trial food sys­tems have pro­duced abun­dance of a dif­fer­ent kind: the United States has an adult obe­sity rate of 40% and a grow­ing di­a­betes cri­sis. Ultraprocessed foods are widely recog­nised as harm­ful, yet the over­whelm­ing ma­jor­ity of Americans con­sume them each day.

Industrial sys­tems re­li­ably cre­ate eco­nomic pres­sure to­ward ex­cess, low qual­ity goods. This is not be­cause pro­duc­ers are care­less, but be­cause once pro­duc­tion is cheap enough, junk is what max­imises vol­ume, mar­gin, and reach. The re­sult is not abun­dance of the best things, but over­pro­duc­tion of the most con­sum­able ones. And con­sume them we do.

Our ap­petite for AI slop is likely to be sim­i­larly in­sa­tiable. The adop­tion curve we’ve seen so far may pale be­side what hap­pens when dis­pos­able soft­ware pro­duc­tion be­comes truly main­stream. If the de­moc­ra­ti­sa­tion of soft­ware mir­rors the im­pact of ubiq­ui­tous photo, video, and au­dio cap­ture en­abled by the smart­phone, we may see user-gen­er­ated soft­ware cre­ated, shared, and dis­carded at so­cial-me­dia scale. Should that hap­pen, the feed­back loops of nov­elty and re­ward will drive an ex­plo­sion of soft­ware out­put that makes the past half-cen­tury of de­vel­op­ment look quaint by com­par­i­son.

Ultraprocessed foods are, of course, not the only game in town. There is a thriv­ing and grow­ing de­mand for healthy, sus­tain­able pro­duc­tion of food­stuffs, largely in re­sponse to the harm­ful ef­fects of in­dus­tri­al­i­sa­tion. Is it pos­si­ble that soft­ware might also re­sist mech­a­ni­sa­tion through the growth of an "organic soft­ware" move­ment? If we look at other sec­tors, we see that even those with the high­est lev­els of in­dus­tri­al­i­sa­tion also still ben­e­fit from small-scale, hu­man-led pro­duc­tion as part of the spec­trum of out­put.

For ex­am­ple, prior to in­dus­tri­al­i­sa­tion, cloth­ing was largely pro­duced by spe­cialised ar­ti­sans, of­ten co­or­di­nated through guilds and man­ual labour, with re­sources gath­ered lo­cally, and the ex­per­tise for cre­at­ing durable fab­rics ac­cu­mu­lated over years, and fre­quently passed down in fam­ily lines. Industrialisation changed that com­pletely, with raw ma­te­ri­als be­ing shipped in­ter­con­ti­nen­tally, fab­rics mass pro­duced in fac­to­ries, clothes as­sem­bled by ma­chin­ery, all lead­ing to to­day’s world of fast, dis­pos­able, ex­ploita­tive fash­ion. And yet hand­crafted clothes still ex­ist: from tai­lored suits to knit­ted scarves, a place still ex­ists for small-scale, slow pro­duc­tion of tex­tile goods, for rea­sons rang­ing from cus­tomi­sa­tion of fit, sig­nalling of wealth, dura­bil­ity of prod­uct, up to en­joy­ment of the craft as a pas­time.

So, might hu­man-writ­ten soft­ware be con­fined to niches mir­ror­ing high fash­ion or home­made knitwear? That might have been the case were soft­ware a phys­i­cal prod­uct, in which in­dus­tri­al­i­sa­tion could lead to mass pro­duc­tion of reusable com­po­nents. But soft­ware is an in­tan­gi­ble good, and un­like other in­dus­tri­alised fields, it has a long his­tory of com­po­nent reuse that is in­trin­sic to the na­ture of the good it­self. Innovation is not lim­ited to bet­ter or cheaper ver­sions of ex­ist­ing prod­ucts, as with cloth­ing, but also en­com­passes growth of the so­lu­tion space, more akin to how the steam en­gine en­abled reusable ma­chine parts, en­abled the pro­duc­tion line, en­abled the mo­tor car, etc.

As such, the mech­a­nism for tech­no­log­i­cal progress in the his­tory of soft­ware de­vel­op­ment has been not only in­dus­tri­al­i­sa­tion, but also in­no­va­tion. Research and de­vel­op­ment is ex­pen­sive, but of­fers the only path to greater value over time.

Innovation is fun­da­men­tally dif­fer­ent to in­dus­tri­al­i­sa­tion, be­cause it is not fo­cused on more ef­fi­ciently repli­cat­ing what al­ready ex­ists to­day. It in­stead ad­vances through find­ing and solv­ing new prob­lems, build­ing on what came be­fore, and de­liv­er­ing ca­pa­bil­i­ties that could not pre­vi­ously have ex­isted. Industrialisation then steps in and pro­vides scale and com­modi­ti­sa­tion, pro­vid­ing a foun­da­tion upon which the next round of in­no­va­tion can build. The in­ter­play of these two forces is what we term progress.

Large lan­guage mod­els are a steam en­gine mo­ment for soft­ware. They col­lapse the cost of a class of work pre­vi­ously fully de­pen­dent on scarce hu­man labour, and in do­ing so un­lock an ex­tra­or­di­nary ac­cel­er­a­tion in out­put.

But re­mem­ber that the steam en­gine did not ap­pear in a vac­uum. Windmills and wa­ter­mills pre­ceded tur­bines by cen­turies. Mechanisation did not be­gin with coal and steel; it merely reached an in­flec­tion point at which au­toma­tion, scale, and cap­i­tal aligned to power eco­nomic trans­for­ma­tion. Similarly, soft­ware has been in­dus­tri­al­is­ing for a long time: through reusable com­po­nents (open source code), porta­bil­ity (containerisation, the cloud), de­moc­ra­ti­sa­tion (low-code / no-code tools), in­ter­op­er­abil­ity (API stan­dards, pack­age man­agers) and many other ways.

We are en­ter­ing an in­dus­trial rev­o­lu­tion for soft­ware, then, not as a mo­ment of rup­ture, but one of huge ac­cel­er­a­tion. Industrialisation does not re­place tech­no­log­i­cal progress, but it will greatly ac­cel­er­ate both the ab­sorp­tion of new ideas and the com­modi­ti­sa­tion of new ca­pa­bil­i­ties. In turn, in­no­va­tion is more quickly un­locked, as the cost of build­ing on top of novel tech­nol­ogy drops more quickly. The cy­cle of progress con­tin­ues, but in an era of mass au­toma­tion, the wheel spins faster than ever be­fore.

The open ques­tion, then, is not whether in­dus­trial soft­ware will dom­i­nate, but what that dom­i­nance does to the sur­round­ing ecosys­tem. Previous in­dus­trial rev­o­lu­tions ex­ter­nalised their costs onto en­vi­ron­ments that seemed in­fi­nite un­til they weren’t. Software ecosys­tems are no dif­fer­ent: de­pen­dency chains, main­te­nance bur­dens, se­cu­rity sur­faces that com­pound as out­put scales. Technical debt is the pol­lu­tion of the dig­i­tal world, in­vis­i­ble un­til it chokes the sys­tems that de­pend on it. In an era of mass au­toma­tion, we may find that the hard­est prob­lem is not pro­duc­tion, but stew­ard­ship. Who main­tains the soft­ware that no one owns?

...

Read the original on chrisloy.dev »

8 220 shares, 15 trendiness

Stewart Cheifet Obituary December 28, 2025

Stewart Douglas Cheifet, age 87, of Philadelphia, PA, passed away on December 28, 2025.

Stewart was born on September 24, 1938, to Paul and Anne Cheifet in Philadelphia, where he spent his child­hood and at­tended Central High School. He later moved to California to at­tend col­lege, grad­u­at­ing from the University of Southern California in 1960 with de­grees in Mathematics and Psychology. He went on to earn his law de­gree from Harvard Law School.

In 1967, Stewart met his fu­ture wife, Peta Kennedy, while the two were work­ing at CBS News in Paris. They re­turned to the United States and mar­ried later that year. Stewart’s ca­reer in tele­vi­sion pro­duc­tion took them around the world, and they lived to­gether in the Samoan Islands, Hawaii, San Francisco, and Los Angeles, be­fore even­tu­ally set­tling back in Philadelphia.

Stewart and Peta had two chil­dren, Stephanie and Jonathan.

Stewart is best known for pro­duc­ing and host­ing the na­tion­ally broad­cast PBS tele­vi­sion pro­grams Computer Chronicles and Net Cafe. Computer Chronicles aired from 1984 to 2002, pro­duc­ing more than 400 episodes that doc­u­mented the rise of the per­sonal com­puter from its ear­li­est days. Net Cafe, which aired from 1996 to 2002, ex­plored the emer­gence of the in­ter­net. Both pro­grams were widely re­garded as vi­sion­ary, cap­tur­ing the evo­lu­tion of per­sonal com­put­ing and the early de­vel­op­ment of the dig­i­tal age.

Stewart’s pro­fes­sional in­ter­ests and tal­ents were wide-rang­ing. After leav­ing tele­vi­sion pro­duc­tion, he worked as a con­sul­tant for the Internet Archive, help­ing to pre­serve and pro­vide pub­lic ac­cess to cul­tural and tech­no­log­i­cal me­dia, in­clud­ing Computer Chronicles and other tech­nol­ogy pro­grams. He also shared his knowl­edge as an ed­u­ca­tor, teach­ing broad­cast jour­nal­ism at the Donald W. Reynolds School of Journalism at the University of Nevada, Reno. After re­tire­ment, he spent his re­main­ing years en­joy­ing time with Peta, his chil­dren, his grand­chil­dren, and his broth­ers.

Stewart is sur­vived by his broth­ers Lanny and Bruce, his chil­dren Stephanie and Jonathan, and his grand­chil­dren Gussy, Josephine, Benjamin, Freya, and Penny.

Services will be held for im­me­di­ate fam­ily only.

...

Read the original on obits.goldsteinsfuneral.com »

9 206 shares, 13 trendiness

bugs, bad updates, and fed‑up users

The last 12 months have been an in­cred­i­bly frus­trat­ing time for Windows fans. For the first time in a long while, it feels like Windows is suf­fer­ing from a lack of fo­cus from the peo­ple at the top.

Support for Windows 10 ended in October, and this year was the per­fect time to strengthen Windows 11 as a vi­able re­place­ment for mil­lions of users. Instead, Microsoft spent most of it shov­ing the OS full of half-baked AI fea­tures, all while let­ting the qual­ity bar slip and ship­ping new bugs and is­sues on an al­most monthly ca­dence.

Everything Microsoft has done when it comes to Windows this year has eroded the plat­for­m’s rep­u­ta­tion in ways that I haven’t seen since Windows 8. Today, it feels like peo­ple hate Windows 11 with a pas­sion, much more so than they did when 2025 first started.

There are so many prob­lems with Windows as a plat­form right now that it’s hard to know where to be­gin.

Of course, the is­sue that made head­lines the most this year is AI, as Microsoft falls over it­self try­ing to make Windows 11 a fron­tier plat­form for ar­ti­fi­cial in­tel­li­gence. Unfortunately, this ef­fort feels like it has been pri­or­i­tized above every­thing else, in­clud­ing qual­ity of life and over­all plat­form sta­bil­ity.

Copilot has forced its way into almost every surface and interaction on the platform. Heck, even Notepad now has a Copilot button, which is something literally nobody has ever asked for. Microsoft's AI intentions feel obsessive and forced, almost as if the company is just throwing everything at the wall to see what sticks.

Under the hood, Microsoft has been mov­ing to make Windows 11 agen­tic. It un­veiled the agen­tic work­space, along with a set of APIs that will al­low AI de­vel­op­ers to build tools that can au­to­mate work­flows on your be­half. Sounds great on pa­per, un­til you read the fine print and dis­cover that it comes with se­ri­ous se­cu­rity im­pli­ca­tions and warn­ings.

You’d like to think that a fea­ture with such se­ri­ous se­cu­rity con­cerns would­n’t make it out of the lab, but be­cause this is AI, Microsoft does­n’t seem to mind. The fea­ture even ships off by de­fault, which tells you every­thing you need to know about how the com­pany views this fea­ture.

A large chunk of the AI features announced this year also aren't Copilot+ PC exclusive, which means most of them require an internet connection and send your data to the cloud to be useful, adding another privacy concern to all the others already on Windows 11.

In November, Windows president Pavan Davuluri mentioned that Windows would evolve into an agentic OS, sparking some of the biggest backlash I've seen around Windows this year. His post was so negatively received that he had to disable replies and issue a follow-up statement reassuring customers that Windows would continue to innovate outside of AI too.

I want to stress that AI can be ben­e­fi­cial. I’ve al­ways said that AI is best when it’s in­vis­i­ble, which is why I’m so con­fused about Microsoft’s ap­proach to AI on Windows 11. It seems like Microsoft wants AI to be the sell­ing point, but that’s to­tally back­wards. AI should be a help­ful ex­tra, not an all-en­com­pass­ing, sole rea­son for the plat­for­m’s ex­is­tence.

I think the biggest issue users are dealing with right now is Microsoft's "Continuous Innovation" strategy for Windows 11, which is designed to allow the company to build new features and get them out the door faster than ever before.

In the past, new Windows fea­tures were of­ten timed with a sig­nif­i­cant OS up­date. Once a year, Windows would re­ceive a big up­grade, which would in­tro­duce new fea­tures and im­prove­ments from the core up. This al­lowed Microsoft plenty of time to bake and fine-tune new fea­tures and changes, iron­ing out bugs be­fore gen­eral avail­abil­ity.

Today, thanks to Continuous Innovation, Microsoft is able to ship new fea­tures when­ever the com­pany deems them ready. Every. Single. Month. This means there’s now a con­stant churn of new fea­tures, with no breaks or respite. Users never get a chance to breathe.

On top of this is Microsoft’s Controlled Feature Rollout (CFR) sys­tem, which makes it so some users don’t see the new fea­tures even af­ter in­stalling the up­date that sup­pos­edly in­cludes them, mak­ing it lit­er­ally im­pos­si­ble to pre­dict and pre­pare for when a new fea­ture might ac­tu­ally ar­rive on your PC.
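Microsoft hasn't documented exactly how CFR decides which machines get a feature, but staged-rollout systems of this kind typically bucket each device deterministically, hashing a device identifier together with the feature name so the same PC always lands in the same cohort while two otherwise identical PCs can differ. A minimal sketch of that idea (all names hypothetical, not Microsoft's actual implementation):

```python
import hashlib

def feature_enabled(device_id: str, feature: str, rollout_percent: int) -> bool:
    """Deterministic staged-rollout gate: the same device always gets the
    same answer for a given feature, but different devices land in
    different buckets, so rollout can be widened server-side over time."""
    digest = hashlib.sha256(f"{feature}:{device_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100      # stable bucket in the range 0-99
    return bucket < rollout_percent     # enabled only for that slice of devices

# Two "identical" PCs on the same build can still disagree at 50% rollout,
# because their device IDs hash into different buckets.
print(feature_enabled("PC-A", "new_start_menu", 50))
print(feature_enabled("PC-B", "new_start_menu", 50))
```

Because the gate depends only on a server-side percentage, raising it gradually enables more devices, and dropping it to zero acts as a silent remote kill switch, which would explain why a feature can vanish as quietly as it appeared.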

The new Windows 11 Start menu is a perfect example of this. It only began rolling out in October, but thanks to Controlled Feature Rollout, many users didn't get it right away after installing the October update. For those people, it randomly appeared a few days or weeks later, without any warning or prompt letting the user know what happened and why.

In this sce­nario, go­ing into your up­date his­tory to see what changed is go­ing to con­fuse you, be­cause the up­date that in­cludes the new Start menu was in­stalled on your sys­tem weeks ago. You’re only see­ing the new fea­tures now be­cause Microsoft al­lowed you to see it, which is in­sane and frus­trat­ing be­yond be­lief.

As a re­sult, no two Windows PCs are the same these days. Two iden­ti­cal sys­tems, run­ning the same build and up­date of Windows 11, might ap­pear com­pletely dif­fer­ent fea­ture-wise, which is con­fus­ing to your av­er­age user, and more than likely one of the rea­sons why Windows feels so much bug­gier these days. There are too many mov­ing parts.

Continuous Innovation es­sen­tially boils down to al­low­ing Microsoft to force new fea­tures onto you when­ever it wants, be­cause the com­pany ties these new fea­tures to monthly se­cu­rity up­dates, which are es­sen­tially re­quired if you want to use your com­puter safely on the in­ter­net.

But it’s be­yond frus­trat­ing when said fea­tures or changes ran­domly ap­pear on your sys­tem with­out any warn­ing, and even more frus­trat­ing when you can’t dis­able or undo them. Users have to make do with Microsoft con­stantly mov­ing the deck chairs, and peo­ple are get­ting tired.

I mean, Microsoft has even built a Windows Roadmap website designed to try and make it easier to see where new features are in their rollout. Except the website is so confusing and frustrating to navigate and digest that it's actually not very useful at all. That's how complicated the Windows update situation is right now.

Above all else, it ren­ders the an­nual ver­sion up­date es­sen­tially ir­rel­e­vant. Version 25H2, which shipped a cou­ple of months back, in­cludes no new fea­tures or changes over ver­sion 24H2, be­cause Microsoft ships new fea­tures to both at the same time. They are the same ver­sion. Why even do this? Surely it makes more sense just to ex­tend sup­port for ver­sion 24H2?

Unfortunately, Microsoft’s abil­ity to ship new fea­tures quickly ap­pears to have also con­tributed to a no­tice­able de­cline in qual­ity over the last year or two. It feels like many new fea­tures that ac­tu­ally ship are half-baked, and in some cases, out­right break other things as they are in­tro­duced.

Every sin­gle week, there’s a new head­line about how a re­cent Microsoft up­date has bro­ken some­thing on Windows, with fixes for said bugs ei­ther com­ing a cou­ple of weeks later or an en­tire month later, de­pend­ing on the sched­ule. Rarely, if ever, does Microsoft pull an up­date that is caus­ing bugs, though it has hap­pened a cou­ple of times.

I wouldn't be surprised if CFR is playing a part in this decline in quality. With the same version of Windows able to present itself differently depending on what I can only describe as random factors, it may just be becoming harder to keep Windows stable when there are so many moving parts and variables at play.

Windows 11 today is a much more complex beast than previous versions of Windows have been. Microsoft has been obsessed with A/B testing for a long while, but CFR takes it to a whole other level, to the point where you literally can't guarantee the version of Windows 11 you're installing will be "feature complete" when you want it to be.

Some users have re­ported never get­ting the chance to test a par­tic­u­lar fea­ture be­fore it’s made gen­er­ally avail­able be­cause of CFR. That’s how detri­men­tal the sys­tem is to the de­vel­op­ment and avail­abil­ity of new Windows fea­tures. As a real-life ex­am­ple of this, my main Windows 11 Insider PC is still stuck with the old Start menu, even though the new Start menu is now rolling out and gen­er­ally avail­able.

There’s no built-in op­tion in the OS to over­ride this, mean­ing there’s noth­ing I can do to get the new Start menu with­out re­ly­ing on third-party tools to trick the sys­tem into let­ting me test it.

In some ways, CFR feels like a way for Microsoft to hide be­hind the fact that it knows the fea­tures it ships to pro­duc­tion aren’t al­ways 100% ready, as it al­lows them to dis­able ac­cess to said fea­tures server-side if a prob­lem arises.

There’s also the is­sue of con­sis­tency, which con­tin­ues to be a prob­lem on Windows 11. The com­pany has done well to at­tempt to ad­dress UI con­sis­tency, though there are still glar­ing is­sues in ar­eas like the File Explorer. But what frus­trates me most is the in­con­sis­tent use of its own na­tive Windows UI frame­work in in-box apps and the OS shell.

Outlook is the built-in mail client on Windows 11, and it’s gen­uinely the worst in­cluded OS email client I’ve ever used. It’s a web­site that’s slow to open, un­re­li­able at send­ing no­ti­fi­ca­tions, and eats up a chunk of mem­ory when in use. There’s noth­ing op­ti­mized or de­light­ful about the Outlook app on Windows 11.

Microsoft also just announced that it's bringing back the agenda view to the calendar flyout on the Taskbar, but it looks like that feature is built using web tech instead of Windows 11's native UI stack. That's frankly unacceptable, but this is the sort of thing Microsoft does on a frequent basis these days.

Unfortunately for Microsoft, its com­peti­tors have been qui­etly cap­i­tal­iz­ing on Windows’ down­fall in the last year or so. Google has been work­ing be­hind closed doors on Android PCs, which are ex­pected to de­but next year as a vi­able al­ter­na­tive to Windows in the low-end to mid-range PC space.

This is an area where Windows has woefully struggled in recent years. Windows 11 is just too big, bloated, and unoptimized to run well on low-end hardware, to the point where many schools and enterprises are switching to Chrome OS or even the iPad because Windows just sucks on these devices.

I’m still blown away by how quick Chrome OS is at both up­dat­ing the sys­tem and fac­tory re­set­ting the sys­tem. Installing a sys­tem up­date on Chrome OS is as quick as restart­ing an app, tak­ing less than a few sec­onds in most cases. On Windows 11, in­stalling an up­date can take any­where from a few min­utes to hours, de­pend­ing on how big the up­date is.

With Android PCs, Windows might fi­nally have a real chal­lenger in this low-end space. If Lenovo, Dell, HP, and the other top-name OEMs are on board to build Android PCs, I re­ally don’t see how Windows 11 in its cur­rent state will be able to com­pete. The Android sys­tem is just bet­ter op­ti­mized for these low-end de­vices, and Windows needs a real ar­chi­tec­tural slim down even to stand a chance.

It’s not just Google com­ing for Microsoft’s lunch ei­ther; Valve is in­ter­ested in tak­ing some of that sweet gamer mar­ket share from Windows. It’s made its in­ten­tions very clear this year: SteamOS is the fu­ture of PC gam­ing, and it wants as many Windows users to make the switch as pos­si­ble.

This could­n’t have come at a worse time for Microsoft, given the back­lash and frus­tra­tion from users about Windows 11. Gamers are all but ready to aban­don ship, and Valve is of­fer­ing up a vi­able al­ter­na­tive on a plate. The Steam Machine is go­ing to light a fire un­der Windows PC gam­ing.

Then you've got Apple, which is always slowly pecking away at Windows market share with the Mac. Since Apple Silicon, the Mac has only gained market share, and its latest laptops are some of the best out there. These days, the only reason not to buy a Mac is if you need a laptop with a touch screen or 5G, or just don't like macOS.

Now, Apple is ru­mored to be build­ing a cheap MacBook that will ship some­time next year. This could be po­ten­tially dev­as­tat­ing for Windows, be­cause for a lot of peo­ple, the only rea­son they don’t own a Mac is that they’re too ex­pen­sive. If Apple can ship a new MacBook for $600, that’s go­ing to be hard to say no to over any Windows lap­top in the same price range.

I don’t want this ar­ti­cle to be all doom and gloom, and it should­n’t be, be­cause for all of Windows’ faults, Microsoft has done some good things with the plat­form this year.

It fi­nally com­mit­ted to re­fo­cus­ing on small but im­por­tant de­tails and ex­pe­ri­ences in the OS. The com­pany un­der­stands that Windows 11 cur­rently feels in­com­plete in a lot of ar­eas, likely be­cause it is, and is ad­dress­ing those key com­plaints. Things like Dark Mode are be­ing more con­sis­tently ap­plied across the OS now, which is an im­prove­ment I’ve been wait­ing a decade for.

The com­pany is also adding back things like smooth an­i­ma­tions when hov­er­ing over open app icons on the Taskbar, or the Agenda view in the Calendar fly­out on the Taskbar (albeit with web tech). It’s also in­tro­duced new fea­tures like the share drag tray, which makes shar­ing files su­per easy.

The new Start menu is also a sig­nif­i­cant im­prove­ment over the old one, with more icons on show, the abil­ity to turn off Recommended ads and re­cent files, and the abil­ity to show your apps list on the main home page.

Microsoft also in­tro­duced a num­ber of im­prove­ments to the Windows BSOD and re­cov­ery op­tions screen, which makes re­cov­er­ing a Windows sys­tem that has been taken of­fline due to a faulty up­date or dri­ver much more straight­for­ward and stream­lined. It’s a lot harder to take a PC of­fline to­day than it was a year ago.

For gamers, Windows 11 is bet­ter than ever. The Xbox app is be­ing po­si­tioned as a hub for all of gam­ing on Windows, and is now ca­pa­ble of re­plac­ing the desk­top in­ter­face for when you just want to nav­i­gate the sys­tem with a con­troller. The com­pany has also promised even more op­ti­miza­tions to come in the fol­low­ing year.

While there is a lot to com­plain about, there’s also quite a bit to like about Windows 11 this year. I just wish there were more good than bad.

Ultimately, I think it’s very clear that some­thing needs to change. The pub­lic has de­cided that Windows 11 is a bad op­er­at­ing sys­tem, and Microsoft does need to ad­dress this.

If I were in charge, the very first thing I would do is throw out the Continuous Innovation strategy. There's simply no need to ship new features on a monthly cadence; users don't want it, and Microsoft would have an easier time developing and testing new features thoroughly without it.

Instead, I would in­tro­duce quar­terly fea­ture drops, with big new fea­tures com­ing once a year timed with the an­nual ver­sion up­date. Microsoft can ship smaller qual­ity of life im­prove­ments, fea­tures, and up­dates every three months, and any big user ex­pe­ri­ence changes or im­prove­ments once a year. Security up­dates can re­main monthly.

This would al­low Microsoft more time to test fea­tures as they are de­vel­oped be­fore ship­ping them, which would ide­ally im­prove the over­all sta­bil­ity of the sys­tem. I’d also scrap the CFR sys­tem and ship new fea­tures to every­one as the up­dates are re­leased.

I’d also love to see Microsoft tone things down when it comes to AI. Windows 11 should be AI ca­pa­ble, of course, but I re­ally don’t think it needs to be shoved into every UI sur­face pos­si­ble. Notepad does­n’t need an AI but­ton, for good­ness sake. AI is best when it’s in­vis­i­ble, not when it’s shoved in your face at every turn.

Given the cur­rent rep­u­ta­tion of Windows 11, if I were in charge of Windows, I’d cer­tainly be think­ing about piv­ot­ing to Windows 12 in an at­tempt to give Windows a clean slate and a fresh start. As long as the com­pany does­n’t mar­ket it as an AI-first OS, piv­ot­ing to Windows 12 would be noth­ing but a good thing for Microsoft, es­pe­cially if it’s a free up­date for every­one that does­n’t bump sys­tem re­quire­ments.

That does­n’t mean Windows 12 should have no AI fea­tures. The fact of the mat­ter is, AI is here to stay, and I’d be very in­ter­ested to see what a desk­top UX can be like if it’s built from scratch with AI in mind. But it needs to be op­tional, and it can­not be the sole rea­son for Windows 12 to ex­ist. AI should com­ple­ment the plat­form, not be­come the plat­form.


...

Read the original on www.windowscentral.com »

10 199 shares, 12 trendiness

France targets Australia-style social media ban for children next year

France in­tends to fol­low Australia and ban so­cial me­dia plat­forms for chil­dren from the start of the 2026 aca­d­e­mic year.

A draft bill pre­vent­ing un­der-15s from us­ing so­cial me­dia will be sub­mit­ted for le­gal checks and is ex­pected to be de­bated in par­lia­ment early in the new year.

The French pres­i­dent, Emmanuel Macron, has made it clear in re­cent weeks that he wants France to swiftly fol­low Australia’s world-first ban on so­cial me­dia plat­forms for un­der-16s, which came into force in December. It in­cludes Facebook, Snapchat, TikTok and YouTube.

Le Monde and France Info re­ported on Wednesday that a draft bill was now com­plete and con­tained two mea­sures: a ban on so­cial me­dia for un­der-15s and a ban on mo­bile phones in high schools, where 15- to 18-year-olds study. Phones have al­ready been banned in pri­mary and mid­dle schools.

The bill will be sub­mit­ted to France’s Conseil d’É­tat for le­gal re­view in the com­ing days. Education unions will also look at the pro­posed high-school ban on phones.

The gov­ern­ment wants the so­cial me­dia ban to come into force from September 2026.

Le Monde reported the text of the draft bill cited the risks of "excessive screen use by teenagers", including the dangers of being exposed to inappropriate social media content, online bullying, and altered sleep patterns. The bill states the need to "protect future generations" from dangers that threaten their ability to thrive and live together in a society with shared values.

Earlier this month, Macron confirmed at a public debate in Saint Malo that he wanted a social media ban for young teenagers. He said there was "consensus being shaped" on the issue after Australia introduced its ban. "The more screen time there is, the more school achievement drops … the more screen time there is, the more mental health problems go up," he said.

He used the analogy of a teenager getting into a Formula One racing car before they had learned to drive. "If a child is in a Formula One car and they turn on the engine, I don't want them to win the race, I just want them to get out of the car. I want them to learn the highway code first, and to ensure the car works, and to teach them to drive in a different car."

Several other countries are considering social media bans for under-15s after Australia's ban, including Denmark, whose government hopes to introduce a ban in 2026, and Norway. Malaysia is also planning a social media ban for under-16s from 2026. In the UK, the Labour government has not ruled out a ban, saying "nothing is off the table" but that any ban must be based on "robust evidence".

Anne Le Hénanff, the French minister in charge of digital development and artificial intelligence, told Le Parisien this month that the social media ban for under-15s was a government priority, and that the bill would be short and "compatible with European law", namely the EU's Digital Services Act (DSA), regulation intended to combat hateful speech, misinformation and disinformation.

The so­cial me­dia ban is part of Macron’s at­tempt to shape his legacy as he en­ters his dif­fi­cult fi­nal year as pres­i­dent with a di­vided par­lia­ment.

On 23 December, last-minute leg­is­la­tion was passed to keep the gov­ern­ment in busi­ness into January af­ter par­lia­ment failed to agree a full bud­get for 2026. Attempts to agree a bud­get will re­sume next month.

A French parliamentary inquiry into TikTok's psychological effects concluded in September that the platform was like "a slow poison" to children. The co-head of the inquiry, the centrist lawmaker Laure Miller, told France Info that TikTok was "an ocean of harmful content" that was very visible to children through algorithms that kept them in a bubble. TikTok responded that it was being unfairly scapegoated for "industry-wide and societal challenges".

The French parliament report recommended more broadly that children under 15 in France should be banned entirely from using social media, and those between 15 and 18 should face a night-time "digital curfew", meaning social media would be made unavailable to them between 10pm and 8am.

The in­quiry was set up af­ter a 2024 French law­suit against TikTok by seven fam­i­lies who ac­cused it of ex­pos­ing their chil­dren to con­tent that was push­ing them to­wards end­ing their lives.

...

Read the original on www.theguardian.com »
