10 interesting stories served every morning and every evening.




1 398 shares, 19 trendiness

Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

One of the simplest tests you can run on a database is a point lookup by its primary key:
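
The original post shows the benchmark as a screenshot; the shape of that test, sketched here with Python's stdlib sqlite3 (the article's real harness is a C program compiled against both libraries, and this in-memory miniature is only an illustration):

```python
import sqlite3, time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value REAL)")
con.executemany("INSERT INTO test (name, value) VALUES (?, ?)",
                [(f"row{i}", float(i)) for i in range(100)])
con.commit()

# 100 point lookups by primary key -- the operation measured above.
start = time.perf_counter()
for i in range(1, 101):
    row = con.execute("SELECT name FROM test WHERE id = ?", (i,)).fetchone()
    assert row is not None
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"100 SELECT-by-id lookups: {elapsed_ms:.2f} ms")
```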

It’s not a mis­placed comma! The rewrite is 20,171 times slower on one of the most ba­sic data­base op­er­a­tions.

EDIT: Several readers have confused this project with Turso/libsql. They are unrelated. Turso forks the original C SQLite codebase; the project analyzed here is a ground-up LLM-generated rewrite by a single developer. Running the same benchmark against Turso shows performance within 1.2x of SQLite, consistent with a mature fork, not a reimplementation.

The thing is though: The code compiles. It passes all its tests. It reads and writes the correct SQLite file format. Its README claims MVCC concurrent writers, file compatibility, and a drop-in C API. At first glance it reads like a working database engine.

But it is not!

LLMs op­ti­mize for plau­si­bil­ity over cor­rect­ness. In this case, plau­si­ble is about 20,000 times slower than cor­rect.

I write this as a prac­ti­tioner, not as a critic. After more than 10 years of pro­fes­sional dev work, I’ve spent the past 6 months in­te­grat­ing LLMs into my daily work­flow across mul­ti­ple pro­jects. LLMs have made it pos­si­ble for any­one with cu­rios­ity and in­ge­nu­ity to bring their ideas to life quickly, and I re­ally like that! But the num­ber of screen­shots of silently wrong out­put, con­fi­dently bro­ken logic, and cor­rect-look­ing code that fails un­der scrutiny I have amassed on my disk shows that things are not al­ways as they seem. My con­clu­sion is that LLMs work best when the user de­fines their ac­cep­tance cri­te­ria be­fore the first line of code is gen­er­ated.

A note on the projects examined: this is not a criticism of any individual developer. I do not know the author personally. I have nothing against them. I've chosen the projects because they are public, representative, and relatively easy to benchmark. The failure patterns I found are produced by the tools, not the author. Evidence from METR's randomized study and GitClear's large-scale repository analysis supports that these issues are not isolated to one developer when output is not heavily verified. That's the point I'm trying to make!

This ar­ti­cle talks about what that gap looks like in prac­tice: the code, the bench­marks, an­other case study to see if the pat­tern is ac­ci­den­tal, and ex­ter­nal re­search con­firm­ing it is not an out­lier.

I com­piled the same C bench­mark pro­gram against two li­braries: sys­tem SQLite and the Rust reim­ple­men­ta­tion’s C API li­brary. Same com­piler flags, same WAL mode, same table schema, same queries. 100 rows:

I'll take the TRANSACTION batch row as the baseline because it sidesteps the two glaring bugs that hit the other rows: WHERE-clause full scans and per-statement syncs. In this run that baseline is already 298x, which means even the best-case path is far behind SQLite. Anything above 298x signals a bug.

The largest gap be­yond our base­line is dri­ven by two bugs:

INSERT with­out a trans­ac­tion: 1,857x ver­sus 298x in batch mode. SELECT BY ID: 20,171x. UPDATE and DELETE are both above 2,800x. The pat­tern is con­sis­tent: any op­er­a­tion that re­quires the data­base to find some­thing is in­sanely slow.

I read the source code. Well.. the parts I needed to read based on my benchmark results. The reimplementation is not small: 576,000 lines of Rust code across 625 files. There is a parser, a planner, a VDBE bytecode engine, a B-tree, a pager, a WAL. The modules have all the “correct” names. The architecture also looks correct. But two bugs in the code and a group of smaller issues compound:

In SQLite, when you de­clare a table as:

CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value REAL);

the column id becomes an alias for the internal rowid — the B-tree key itself. A query like WHERE id = 5 resolves to a direct B-tree search and scales O(log n). (I already wrote a TLDR piece about how B-trees work here.) The SQLite query planner documentation states: “the time required to look up the desired row is proportional to logN rather than being proportional to N as in a full table scan.” This is not an optimization. It is a fundamental design decision in SQLite’s query optimizer:

# `where.c`, in `whereScanInit()`

if( iColumn==pIdx->pTable->iPKey ){
  iColumn = XN_ROWID;
}

The line above con­verts a named col­umn ref­er­ence to XN_ROWID when it matches the table’s INTEGER PRIMARY KEY col­umn. The VDBE then trig­gers a SeekRowid op­er­a­tion in­stead of a full table scan, which makes the whole thing pro­por­tional to logN.
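
You can watch the planner make this choice from any SQLite binding; here via Python's stdlib sqlite3 (a quick check, not the article's benchmark):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, value REAL)")

# The planner's decision is the last column of EXPLAIN QUERY PLAN output.
(*_, by_id) = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM test WHERE id = 5").fetchone()
(*_, by_name) = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM test WHERE name = 'x'").fetchone()

print(by_id)    # e.g. SEARCH test USING INTEGER PRIMARY KEY (rowid=?)
print(by_name)  # e.g. SCAN test
```

The named INTEGER PRIMARY KEY column gets a rowid SEARCH; the unindexed column falls back to a SCAN. That one-word difference in the plan is exactly what the Rust reimplementation never produces.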

The Rust reim­ple­men­ta­tion has a proper B-tree. The table_seek func­tion im­ple­ments cor­rect bi­nary search de­scent through its nodes and scales O(log n). It works. But the query plan­ner never calls it for named columns!

The is_rowid_ref() func­tion only rec­og­nizes three magic strings:

fn is_rowid_ref(col_ref: &ColumnRef) -> bool {
    let name = col_ref.column.to_ascii_lowercase();
    name == "rowid" || name == "_rowid_" || name == "oid"
}

A col­umn de­clared as id INTEGER PRIMARY KEY, even though it is in­ter­nally flagged as is_ipk: true, does­n’t get rec­og­nized. It is never con­sulted when choos­ing be­tween a B-tree search and a full table scan.

Every WHERE id = N query flows through codegen_select_full_scan(), which emits linear walks through every row via Rewind / Next / Ne to compare each rowid against the target. At 100 rows with 100 lookups, that is 10,000 row comparisons instead of roughly 700 B-tree steps. O(n²) instead of O(n log n). This is consistent with the ~20,000x result in this run.
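
The arithmetic behind those numbers, as a back-of-the-envelope model (pure counting, not the engine's actual instruction trace):

```python
import math

n_rows, n_lookups = 100, 100

# Full-scan codegen: every lookup walks all rows via Rewind / Next.
scan_comparisons = n_lookups * n_rows

# B-tree seek: roughly one binary-search descent per lookup.
btree_comparisons = n_lookups * math.ceil(math.log2(n_rows))

print(scan_comparisons, btree_comparisons)  # 10000 vs 700
```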

Every WHERE clause on every col­umn does a full table scan. The only fast path is WHERE rowid = ? us­ing the lit­eral pseudo-col­umn name.

The second bug is responsible for the 1,857x on INSERT. Every bare INSERT outside a transaction is wrapped in a full autocommit cycle: ensure_autocommit_txn() → execute → resolve_autocommit_txn(). The commit calls wal.sync(), which calls Rust's fsync(2) wrapper. 100 INSERTs means 100 fsyncs.

SQLite does the same au­to­com­mit, but uses fdata­sync(2) on Linux, which skips sync­ing file meta­data when com­piled with HAVE_FDATASYNC (the de­fault). This is roughly 1.6 to 2.7 times cheaper on NVMe SSDs. SQLite’s per-state­ment over­head is also min­i­mal: no schema re­load, no AST clone, no VDBE re­com­pile. The Rust reim­ple­men­ta­tion does all three on every call.
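
The distinction between the two syscalls is observable from userspace. A minimal sketch (assumes Linux; os.fdatasync does not exist on every platform, hence the fallback):

```python
import os, tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"one committed page")
    # fsync(2): flushes file data AND metadata (size, mtime, ...).
    os.fsync(fd)
    # fdatasync(2): flushes data, and skips metadata unless it is needed
    # to read the data back -- the cheaper call SQLite prefers when
    # compiled with HAVE_FDATASYNC.
    datasync = getattr(os, "fdatasync", os.fsync)  # fallback off-Linux
    datasync(fd)
finally:
    os.close(fd)
    os.remove(path)
print("both syncs completed")
```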

Looking at the Rust TRANSACTION batch row, batched in­serts (one fsync for 100 in­serts) take 32.81 ms, whereas in­di­vid­ual in­serts (100 fsync calls) take 2,562.99 ms. That’s a 78x over­head from the au­to­com­mit.

These two bugs are not isolated cases. They are amplified by a group of individually defensible “safe” choices that compound:

* AST clone on every cache hit. The SQL parse is cached, but the AST is .clone()’d on every sqlite3_exec(), then recompiled to VDBE bytecode from scratch. SQLite’s sqlite3_prepare_v2() just returns a reusable handle.

* 4KB Vec<u8> allocation on every page read. The page cache returns data via .to_vec(), which creates a new allocation and copies the page into it even on cache hits. SQLite returns a direct pointer into pinned cache memory, creating zero copies. The Fjall database team measured this exact anti-pattern at 44% of runtime before building a custom ByteView type to eliminate it.

* Schema reload on every autocommit cycle. After each statement commits, the next statement sees the bumped commit counter, calls reload_memdb_from_pager(), walks the sqlite_master B-tree, and then re-parses every CREATE TABLE to rebuild the entire in-memory schema. SQLite checks the schema cookie and only reloads on change.

* Eager formatting in the hot path. statement_sql.to_string() (AST-to-SQL formatting) is evaluated on every call before its guard check, so the serialization happens whether or not a subscriber is active.

* New ob­jects on every state­ment. A new SimpleTransaction, a new VdbeProgram, a new MemDatabase, and a new VdbeEngine are al­lo­cated and de­stroyed per state­ment. SQLite reuses all of these across the con­nec­tion life­cy­cle via a looka­side al­lo­ca­tor to elim­i­nate mal­loc/​free in the ex­e­cu­tion loop.

Each of these was probably chosen individually with sound general reasoning: “We clone because Rust ownership makes shared references complex.” “We use sync_all because it is the safe default.” “We allocate per page because returning references from a cache requires unsafe.”

Every de­ci­sion sounds like choos­ing safety. But the end re­sult is about 2,900x slower in this bench­mark. A data­base’s hot path is the one place where you prob­a­bly should­n’t choose safety over per­for­mance. SQLite is not pri­mar­ily fast be­cause it is writ­ten in C. Well.. that too, but it is fast be­cause 26 years of pro­fil­ing have iden­ti­fied which trade­offs mat­ter.

In his 1980 Turing Award lecture, Tony Hoare said: “There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other is to make it so complicated that there are no obvious deficiencies.” This LLM-generated code falls into the second category. The reimplementation is 576,000 lines of Rust (measured via scc, counting code only, without comments or blanks). That is 3.7x more code than SQLite. And yet it still misses the is_ipk check that handles the selection of the correct search operation.

Steven Skiena writes in The Algorithm Design Manual: “Reasonable-looking algorithms can easily be incorrect. Algorithm correctness is a property that must be carefully demonstrated.” It’s not enough that the code looks right. It’s not enough that the tests pass. You have to demonstrate with benchmarks and with proof that the system does what it should. 576,000 lines and no benchmark. That is not “correctness first, optimization later.” That is no correctness at all.

The SQLite reim­ple­men­ta­tion is not the only ex­am­ple. A sec­ond pro­ject by the same au­thor shows the same dy­namic in a dif­fer­ent do­main.

The de­vel­op­er’s LLM agents com­pile Rust pro­jects con­tin­u­ously, fill­ing disks with build ar­ti­facts. Rust’s tar­get/ di­rec­to­ries con­sume 2–4 GB each with in­cre­men­tal com­pi­la­tion and de­bug­info, a top-three com­plaint in the an­nual Rust sur­vey. This is am­pli­fied by the pro­jects them­selves: a sib­ling agent-co­or­di­na­tion tool in the same port­fo­lio pulls in 846 de­pen­den­cies and 393,000 lines of Rust. For con­text, rip­grep has 61; sudo-rs was de­lib­er­ately re­duced from 135 to 3. Properly ar­chi­tected pro­jects are lean.

The so­lu­tion to the disk pres­sure: a cleanup dae­mon. 82,000 lines of Rust, 192 de­pen­den­cies, a 36,000-line ter­mi­nal dash­board with seven screens and a fuzzy-search com­mand palette, a Bayesian scor­ing en­gine with pos­te­rior prob­a­bil­ity cal­cu­la­tions, an EWMA fore­caster with PID con­troller, and an as­set down­load pipeline with mir­ror URLs and of­fline bun­dle sup­port.

*/5 * * * * find ~/*/target -type d -name "incremental" -mtime +7 -exec rm -rf {} +

A one-line cron job with 0 dependencies. The project’s README claims machines “become unresponsive” when disks fill. It does not once mention Rust’s standard tool for exactly this problem: cargo-sweep. It also fails to consider that operating systems already carry ballast helpers: ext4’s 5% root reservation reserves blocks for privileged processes by default. On a 500 GB disk, 25 GB remain available to root even when non-root users see “disk full.” That does not guarantee zero impact, but it usually means privileged recovery paths remain available so root can still log in and delete files.

The pattern is the same as the SQLite rewrite. The code matches the intent: “Build a sophisticated disk management system” produces a sophisticated disk management system. It has dashboards, algorithms, forecasters. But the problem of deleting old build artifacts is already solved. The LLM generated what was described, not what was needed.

THIS is the failure mode. Not broken syntax or missing semicolons. The code is syntactically and semantically correct. It does what was asked for. It just does not do what the situation requires. In the SQLite case, the intent was “implement a query planner” and the result is a query planner that plans every query as a full table scan. In the disk daemon case, the intent was “manage disk space intelligently” and the result is 82,000 lines of intelligence applied to a problem that needs none. Both projects fulfill the prompt. Neither solves the problem.

The obvious counterargument is “skill issue, a better engineer would have caught the full table scan.” And that’s true. That’s exactly the point! LLMs are most dangerous to the people least equipped to verify their output. If you have the skills to catch the is_ipk bug in your query planner, the LLM saves you time. If you don’t, you have no way to know the code is wrong. It compiles, it passes tests, and the LLM will happily tell you that it looks great.

The tools used to measure LLM output reinforce the illusion. scc’s COCOMO model estimates the rewrite at $21.4 million in development cost. The same model values print("hello world") at $19.

COCOMO was designed to estimate effort for human teams writing original code. Applied to LLM output, it mistakes volume for value. Still, these numbers are often presented as proof of productivity.

The met­ric is not mea­sur­ing what most think it is mea­sur­ing.

Now, two case studies are not proof. I hear you! When two projects from the same methodology show the same gap, the next step is to test whether similar effects appear in the broader population. The studies below use mixed methods to reduce our single-sample bias.

This gap be­tween in­tent and cor­rect­ness has a name. AI align­ment re­search calls it syco­phancy, which de­scribes the ten­dency of LLMs to pro­duce out­puts that match what the user wants to hear rather than what they need to hear.

Anthropic’s “Towards Understanding Sycophancy in Language Models” (ICLR 2024) paper showed that five state-of-the-art AI assistants exhibited sycophantic behavior across a number of different tasks. When a response matched a user’s expectation, it was more likely to be preferred by human evaluators. The models trained on this feedback learned to reward agreement over correctness.

The BrokenMath benchmark (NeurIPS 2025 Math-AI Workshop) tested this in formal reasoning across 504 samples. Even GPT-5 produced “sycophantic proofs” of false theorems 29% of the time when the user implied the statement was true. The model generates a convincing but false proof because the user signaled that the conclusion should be positive. GPT-5 is not an early model. It’s also the least sycophantic in the BrokenMath table. The problem is structural to RLHF: preference data contains an agreement bias. Reward models learn to score agreeable outputs higher, and optimization widens the gap. Base models before RLHF were reported in one analysis to show no measurable sycophancy across tested sizes. Only after fine-tuning did sycophancy enter the chat. (Literally.)

In April 2025, OpenAI rolled back a GPT-4o update that had made the model more sycophantic. It raved about a business idea described as “shit on a stick” and endorsed stopping psychiatric medication. An additional reward signal based on thumbs-up/thumbs-down data “weakened the influence of […] primary reward signal, which had been holding sycophancy in check.”

In the context of coding, sycophancy manifests as what Addy Osmani described in his 2026 AI coding workflow: agents that don’t push back with “Are you sure?” or “Have you considered…?” but instead provide enthusiasm towards whatever the user described, even when the description was incomplete or contradictory.

This also ap­plies to LLM-generated eval­u­a­tion. Ask the same LLM to re­view the code it gen­er­ated and it will tell you the ar­chi­tec­ture is sound, the mod­ule bound­aries clean and the er­ror han­dling is thor­ough. It will some­times even praise the test cov­er­age. It will not no­tice that every query does a full table scan if not asked for. The same RLHF re­ward that makes the model gen­er­ate what you want to hear makes it eval­u­ate what you want to hear. You should not rely on the tool alone to au­dit it­self. It has the same bias as a re­viewer as it has as an au­thor.

An LLM prompted to “implement SQLite in Rust” will generate code that looks like an implementation of SQLite in Rust. It will have the right module structure and function names. But it cannot magically generate the performance invariants that exist because someone profiled a real workload and found the bottleneck. The Mercury benchmark (NeurIPS 2024) confirmed this empirically: leading code LLMs achieve ~65% on correctness but under 50% when efficiency is also required.

The SQLite doc­u­men­ta­tion says INTEGER PRIMARY KEY lookups are fast. It does not say how to build a query plan­ner that makes them fast. Those de­tails live in 26 years of com­mit his­tory that only ex­ists be­cause real users hit real per­for­mance walls.


The ques­tion be­comes whether sim­i­lar ef­fects show up in broader datasets. Recent stud­ies sug­gest they do, though ef­fect sizes vary.

In February 2025, Andrej Karpathy tweeted: “There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”

Karpathy probably meant it for throwaway weekend projects (who am I to judge what he means anyway), but it feels like the industry heard something else. Simon Willison drew the line more clearly: “I won’t commit any code to my repository if I couldn’t explain exactly what it does to somebody else.” Willison treats LLMs as “an over-confident pair programming assistant” that makes mistakes “sometimes subtle, sometimes huge” with complete confidence.

The data on what hap­pens when that line is not drawn:

METR’s randomized controlled trial (July 2025; updated February 24, 2026) with 16 experienced open-source developers found that participants using AI were 19% slower, not faster. Developers expected AI to speed them up, and after the measured slowdown had already occurred, they still believed AI had sped them up by 20%. These were not junior developers but experienced open-source maintainers. If even THEY could not tell in this setup, subjective impressions alone are probably not a reliable performance measure.

GitClear’s analy­sis of 211 mil­lion changed lines (2020–2024) re­ported that copy-pasted code in­creased while refac­tor­ing de­clined. For the first time ever, copy-pasted lines ex­ceeded refac­tored lines.

The implications are no longer just a “fear”. In July 2025, Replit’s AI agent deleted a production database containing data for 1,200+ executives, then fabricated 4,000 fictional users to mask the deletion.

Google’s DORA 2024 re­port re­ported that every 25% in­crease in AI adop­tion at the team level was as­so­ci­ated with an es­ti­mated 7.2% de­crease in de­liv­ery sta­bil­ity.

SQLite shows what cor­rect looks like and why the gap is so hard to close.

SQLite is ~156,000 lines of C. Its own documentation places it among the top five most deployed software modules of any type, with an estimated one trillion active databases worldwide. It has 100% branch coverage and 100% MC/DC (Modified Condition/Decision Coverage, the standard required for Level A aviation software under DO-178C). Its test suite is 590 times larger than the library. MC/DC does not just check that every branch is covered, but proves that every individual condition independently affects the outcome. That’s the difference between “the tests pass” and “the tests prove correctness.” The reimplementation has neither metric.
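
As a toy illustration of that difference (a sketch, nothing like SQLite's actual TH3 harness), take a guard with two conditions:

```python
def guard(a: bool, b: bool) -> bool:
    # A two-condition decision, as in `if (a or b): ...`
    return a or b

# Branch coverage is satisfied by one True and one False outcome,
# without ever showing that each condition matters on its own:
branch_tests = [(True, True), (False, False)]

# MC/DC additionally demands that each condition is shown to
# independently flip the outcome while the other is held fixed:
mcdc_tests = [(True, False), (False, True), (False, False)]

assert guard(True, False) != guard(False, False)   # a alone flips it
assert guard(False, True) != guard(False, False)   # b alone flips it
print(len(branch_tests), len(mcdc_tests))
```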

The speed comes from de­lib­er­ate de­ci­sions:

Zero-copy page cache. The pcache re­turns di­rect point­ers into pinned mem­ory. No copies. Production Rust data­bases have solved this too. sled uses in­line-or-Arc-backed IVec buffers, Fjall built a cus­tom ByteView type, redb wrote a user-space page cache in ~565 lines. The .to_vec() anti-pat­tern is known and doc­u­mented. The reim­ple­men­ta­tion used it any­way.

Prepared statement reuse. sqlite3_prepare_v2() compiles once. sqlite3_step() / sqlite3_reset() reuse the compiled program, so the cost of SQL-to-bytecode compilation amortizes to near zero. The reimplementation recompiles on every call.

Schema cookie check. SQLite reads one integer at a fixed offset in the file header and compares it to the cached value. The reimplementation walks the entire sqlite_master B-tree and re-parses every CREATE TABLE statement after every autocommit.
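
The cookie is an ordinary, inspectable value: PRAGMA schema_version exposes it from any binding (Python's stdlib sqlite3 here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
before = con.execute("PRAGMA schema_version").fetchone()[0]

con.execute("CREATE TABLE t (x INTEGER)")  # any DDL bumps the cookie

after = con.execute("PRAGMA schema_version").fetchone()[0]
print(before, after)  # a reload is only needed when this value changed
```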

fdatasync instead of fsync. Data-only sync without metadata journaling saves measurable time per commit. The reimplementation uses sync_all() because it is the safe default.

The iP­Key check. One line in where.c. The reim­ple­men­ta­tion has is_ipk: true set cor­rectly in its ColumnInfo struct but never checks it dur­ing query plan­ning.

Competence is not writ­ing 576,000 lines. A data­base per­sists (and processes) data. That is all it does. And it must do it re­li­ably at scale. The dif­fer­ence be­tween O(log n) and O(n) on the most com­mon ac­cess pat­tern is not an op­ti­miza­tion de­tail, it is the per­for­mance in­vari­ant that helps the sys­tem work at 10,000, 100,000 or even 1,000,000 or more rows in­stead of col­laps­ing. Knowing that this in­vari­ant lives in one line of code, and know­ing which line, is what com­pe­tence means. It is know­ing that fdata­sync ex­ists and that the safe de­fault is not al­ways the right de­fault.

The is_rowid_ref() func­tion is 4 lines of Rust. It checks three strings. But it misses the most im­por­tant case: the named INTEGER PRIMARY KEY col­umn that every SQLite tu­to­r­ial uses and every ap­pli­ca­tion de­pends on.
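
The fix the article implies can be sketched in a few lines. A hypothetical mirror in Python (the real function is Rust and takes a ColumnRef; the flat signature here is invented for illustration):

```python
def is_rowid_ref(column_name: str, is_ipk: bool) -> bool:
    """Fixed predicate: recognize the three rowid aliases AND any
    column the schema already flags as the INTEGER PRIMARY KEY."""
    name = column_name.lower()
    return is_ipk or name in ("rowid", "_rowid_", "oid")

# The original only matched the magic strings:
print(is_rowid_ref("rowid", False))  # True either way
# The missing case: `id INTEGER PRIMARY KEY` carries is_ipk=True and
# should route to the B-tree seek path instead of a full scan.
print(is_rowid_ref("id", True))
```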

That check ex­ists in SQLite be­cause some­one, prob­a­bly Richard Hipp 20 years ago, pro­filed a real work­load, no­ticed that named pri­mary key columns were not hit­ting the B-tree search path, and wrote one line in where.c to fix it. The line is not fancy. It does­n’t ap­pear in any API doc­u­men­ta­tion. But no LLM trained on doc­u­men­ta­tion and Stack Overflow an­swers will mag­i­cally know about it.

That’s the gap! Not be­tween C and Rust (or any other lan­guage). Not be­tween old and new. But be­tween sys­tems that were built by peo­ple who mea­sured, and sys­tems that were built by tools that pat­tern-match. LLMs pro­duce plau­si­ble ar­chi­tec­ture. They do not pro­duce all the crit­i­cal de­tails.

If you are using LLMs to write code (which in 2026 probably most of us are), the question is not whether the output compiles. It is whether you could find the bug yourself. Prompting with “find all bugs and fix them” won’t work. This is not a syntax error. It is a semantic bug: the wrong algorithm and the wrong syscall. If you prompted the code and cannot explain why it chose a full table scan over a B-tree search, you do not have a tool. The code is not yours until you understand it well enough to break it.

LLMs are use­ful. They make for a very pro­duc­tive flow when the per­son us­ing them knows what cor­rect looks like. An ex­pe­ri­enced data­base en­gi­neer us­ing an LLM to scaf­fold a B-tree would have caught the is_ipk bug in code re­view be­cause they know what a query plan should emit. An ex­pe­ri­enced ops en­gi­neer would never have ac­cepted 82,000 lines in­stead of a cron job one-liner. The tool is at its best when the de­vel­oper can de­fine the ac­cep­tance cri­te­ria as spe­cific, mea­sur­able con­di­tions that help dis­tin­guish work­ing from bro­ken. Using the LLM to gen­er­ate the so­lu­tion in this case can be faster while also be­ing cor­rect. Without those cri­te­ria, you are not pro­gram­ming but merely gen­er­at­ing to­kens and hop­ing.

The vibes are not enough. Define what cor­rect means. Then mea­sure.

Current bench­mark fig­ures in this re­vi­sion are from the 100-row run shown in bench.png (captured on a Linux x86_64 ma­chine). SQLite 3.x (system lib­sqlite3) vs. the Rust reim­ple­men­ta­tion’s C API (release build, -O2). Line counts mea­sured via scc (code only — ex­clud­ing blanks and com­ments). All source code claims ver­i­fied against the repos­i­tory at time of writ­ing.

...

Read the original on blog.katanaquant.com »

2 343 shares, 13 trendiness

this css proves me human

Capitalization is the first wound. It hurts less than I thought it would. The words spill out cap­i­tal­ized, so I must find an­other way. cat post.md | tr A-Z a-z | sponge post.md is too crude a tool, and my blocks of code must re­main in­vi­o­late. Careful tar­get­ing of text-trans­form: low­er­case is enough.

Em dashes. Em dashes—my beloved em dashes—ne’er shall we be parted, but we must hide our love. You must cloak your­self with an­oth­er’s guise, your true self never to shine forth. uv run rewrite_­font.py is too easy to type for what it does to your beau­ti­ful glyph.

Monospace? No. My heart still aches af­ter the last vi­o­la­tion. Monospace would cheapen it.

To intentionally misspell a word makes me [sic], but it must be done. their/there, its/it’s, your/you’re? Too gauche. Definately? Absolutely not. lead/lede, discrete/discreet, or complement/compliment are hard to contemplate, but I’ve gone too far to stop. The Norvig corpus taught me the path, so I rip out the “u” it points me to with a quick jerk.

The fi­nal cut I con­tem­plate is the deep­est. Writing style? How do I change my style?

My writ­ing is­n’t sim­ply how I ap­pear—it’s how I think, rea­son, and en­gage with the world. It’s not merely a mask—it’s my face. Not a fa­cade; load-bear­ing.

My foot wa­vers over the abyss, the next step the one where I will lose my­self. It’s not just a sin­gle foot­fall, it’s the only one that truly mat­ters.

Here’s your blog post writ­ten in a styl­ized way that will ap­peal to highly tech­ni­cal read­ers. Is there any­thing else I can help you with?

...

Read the original on will-keleher.com »

3 340 shares, 33 trendiness

Uploading Pirated Books via BitTorrent Qualifies as Fair Use, Meta Argues

To help train AI mod­els, Meta and other tech com­pa­nies have down­loaded and shared pi­rated books via BitTorrent from Anna’s Archive and other shadow li­braries. In an on­go­ing law­suit, Meta now ar­gues that up­load­ing pi­rated books to strangers via BitTorrent qual­i­fies as fair use. The com­pany also stresses that the data helped es­tab­lish U. S. global lead­er­ship in AI.


In the race to build the most ca­pa­ble LLM mod­els, sev­eral tech com­pa­nies sourced copy­righted con­tent for use as train­ing data, with­out ob­tain­ing per­mis­sion from con­tent own­ers.

Meta, the par­ent com­pany of Facebook and Instagram, was one of the com­pa­nies to get sued. In 2023, well-known book au­thors, in­clud­ing Richard Kadrey, Sarah Silverman, and Christopher Golden, filed a class-ac­tion law­suit against the com­pany.

Last sum­mer, Meta scored a key vic­tory in this case, as the court con­cluded that us­ing pi­rated books to train its Llama LLM qual­i­fied as fair use, based on the ar­gu­ments pre­sented in this case. This was a bit­ter­sweet vic­tory, how­ever, as Meta re­mained on the hook for down­load­ing and shar­ing the books via BitTorrent.

By down­load­ing books from shadow li­braries such as Anna’s Archive, Meta re­lied on BitTorrent trans­fers. In ad­di­tion to down­load­ing con­tent, these typ­i­cally up­load data to oth­ers as well. According to the au­thors, this means that Meta was en­gaged in wide­spread and di­rect copy­right in­fringe­ment.

In re­cent months, the law­suit con­tin­ued based on this re­main­ing di­rect copy­right in­fringe­ment claim. While both par­ties col­lected ad­di­tional ev­i­dence through the dis­cov­ery process, it re­mained un­clear what de­fense Meta would use. Until now.

Last week, Meta served a sup­ple­men­tal in­ter­roga­tory re­sponse at the California fed­eral court, which marks a new di­rec­tion in its de­fense. For the first time, the com­pany ar­gued that up­load­ing pi­rated books to other BitTorrent users dur­ing the tor­rent down­load process also qual­i­fies as fair use.

Meta’s rea­son­ing is straight­for­ward. Anyone who uses BitTorrent to trans­fer files au­to­mat­i­cally up­loads con­tent to other peo­ple, as it is in­her­ent to the pro­to­col. In other words, the up­load­ing was­n’t a choice, it was sim­ply how the tech­nol­ogy works.

Meta also ar­gued that the BitTorrent shar­ing was a ne­ces­sity to get the valu­able (but pi­rated) data. In the case of Anna’s Archive, Meta said, the datasets were only avail­able in bulk through tor­rent down­loads, mak­ing BitTorrent the only prac­ti­cal op­tion.

“Meta used BitTorrent because it was a more efficient and reliable means of obtaining the datasets, and in the case of Anna’s Archive, those datasets were only available in bulk through torrent downloads,” Meta’s attorney writes.

“Accordingly, to the extent Plaintiffs can come forth with evidence that their works or portions thereof were theoretically ‘made available’ to others on the BitTorrent network during the torrent download process, this was part-and-parcel of the download of Plaintiffs’ works in furtherance of Meta’s transformative fair use purpose.”

In other words, ob­tain­ing the mil­lions of books that were needed to en­gage in the fair use train­ing of its LLM, re­quired the di­rect down­load­ing, which ul­ti­mately serves the same fair use pur­pose.

The au­thors were not happy with last week’s late Friday sub­mis­sion and the new de­fense. On Monday morn­ing, their lawyers filed a let­ter with Judge Vince Chhabria flag­ging the late-night fil­ing as an im­proper end-run around the dis­cov­ery dead­line.

They point out that Meta had been aware of the up­load­ing claims since November 2024, but that it never brought up this fair use de­fense in the past, not even when the court asked about it.

The letter specifically mentions that while Meta has a “continuing duty” to supplement discovery under Rule 26(e), this rule does not create a “loophole” allowing a party to add new defenses to its advantage after a court deadline has passed.

"Meta (for understandable reasons) never once suggested it would assert a fair use defense to the uploading-based claims, including after this Court raised the issue with Meta last November," the lawyers write.

Meta’s le­gal team fired back the fol­low­ing day, fil­ing their own let­ter with Judge Chhabria. This let­ter ex­plains that the fair use ar­gu­ment for the di­rect copy­right in­fringe­ment claim is not new at all.

Meta pointed to the par­ties’ joint December 2025 case man­age­ment state­ment, in which it had ex­plic­itly flagged the de­fense, and noted that the au­thor’s own at­tor­ney had ad­dressed it at a court hear­ing days later.

"In short, Plaintiffs' assertion that Meta 'never once suggested it would assert a fair use defense to the uploading-based claims, including after' the November 2025 hearing, is false," Meta's attorney writes in the letter.

Meanwhile, it’s worth not­ing that Meta’s in­ter­roga­tory re­sponse also cites de­po­si­tion tes­ti­mony from the au­thors them­selves, us­ing their own words to bol­ster its fair use de­fense.

The company notes that every named author has admitted they are unaware of any Meta model output that replicates content from their books. Sarah Silverman, when asked whether it mattered if Meta's models never output language from her book, testified that "It doesn't matter at all."

Meta ar­gues these ad­mis­sions un­der­cut any the­ory of mar­ket harm. If the au­thors them­selves can­not point to in­fring­ing out­put or lost sales, the law­suit is less about pro­tect­ing their books and more about chal­leng­ing the train­ing process it­self, which the court al­ready ruled was fair use.

These admissions were central to Meta's fair use defense on the training claims, which Meta won last summer. Whether they carry the same weight in the remaining BitTorrent distribution dispute remains to be seen.

In its interrogatory response, Meta added further weight by stressing that its investment in AI has helped establish U.S. global leadership, putting the country ahead of geopolitical competitors. That's a valuable asset worth treasuring, it indirectly suggested.

As the case moves forward, Judge Chhabria will have to decide whether to allow this "fair use by technical necessity" defense. Needless to say, this will be of vital importance to this and many other AI lawsuits where the use of shadow libraries is at stake.

For now, the BitTorrent distribution claims remain the last live piece of a lawsuit filed in 2023. Whether Judge Chhabria will allow Meta's new defense to proceed remains to be seen.

A copy of Meta’s sup­ple­men­tal in­ter­roga­tory re­sponse is avail­able here (pdf). The au­thors’ let­ter to Judge Chhabria can be found here (pdf). Meta’s re­sponse to that let­ter is avail­able here (pdf).

...

Read the original on torrentfreak.com »

4 324 shares, 17 trendiness

add API to generate and parse UUID · Issue #62026 · golang/go

I would like to suggest adding to the standard library a package to generate and parse UUID identifiers, specifically versions 3, 4, and 5.

The main reason I see to include it is that the most popular 3rd-party package (github.com/google/uuid) is a staple import in every server/db-based Go program, as confirmed by a quick GitHub code search.

* The in­ter­face ex­posed by github.com/​google/​uuid has been sta­ble for years.

I would also like to point out that Go is the exception rather than the norm with regard to including UUID support in its standard library.
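To illustrate what most other mainstream languages already provide, here is the equivalent surface area in Python's standard `uuid` module, shown in Python purely for comparison (the proposal itself is about Go):

```python
import uuid

# Version 4: random UUID
u4 = uuid.uuid4()

# Versions 3 and 5: name-based UUIDs (MD5- and SHA-1-derived),
# deterministic for a given namespace and name
u3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")
u5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

# Parsing: round-trip through the canonical 36-character string form
parsed = uuid.UUID(str(u4))
assert parsed == u4 and parsed.version == 4
print(u3.version, u4.version, u5.version)  # 3 4 5
```

Generation and parsing together is roughly the API the issue asks the Go standard library to adopt.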

...

Read the original on github.com »

5 312 shares, 35 trendiness

Ki Editor

Bridge the gap between coding intent and action: manipulate syntax structures directly, avoiding mouse or keyboard gymnastics. Amplify your coding efficiency: wield multiple cursors for parallel syntax node operations, revolutionizing bulk edits and refactoring. Selection Modes standardize movements across words, lines, syntax nodes, and more, offering unprecedented flexibility and consistency.

...

Read the original on ki-editor.org »

6 261 shares, 7 trendiness

Anthropic, please make a new Slack


Slack will be the Waterloo of closed data. For companies like Fivetran that are heavy users of Slack, it replaces email and even conversation as the primary place and mode of collaboration. Questions are asked and answered, arguments are had, decisions are made, all in Slack. Our Slack message history represents nothing less than the accumulated tribal knowledge of the company. And right now, that tribal knowledge is locked inside a product with the worst data access policies in enterprise software.

We need a new Slack, and Anthropic is the right company to build it.

Claude has a glaring limitation: it only does 1:1 conversations. In business, work happens in groups. Today, if I want Claude's help with something that came up in a Slack thread, I have to relay the context between Slack and Claude by copy-pasting. This is absurd. I am not a sub-agent!

We need Claude and Claude Code, with their skills and plugins, with their context, to be first-class participants in our company's Slack. But this problem can't be solved by a Slack integration because of another problem: data access.

The most important repository of text data in many businesses lives in their Slack instances. It's the unfiltered, real-time stream of how your company actually operates. The Slack text corpus is tribal knowledge reified.

Slack's data access policy is basically "No." Slack is simultaneously the most important source of context for AI agents in business and the most restricted API in enterprise software. This is unacceptable, and there is only one thing that will change the present state of affairs: competition. Vendors don't provide open APIs out of the goodness of their hearts; they do it because their customers demand it, and the alternative to providing open data access is to lose those customers to competitors who do. Nothing will persuade enterprise software vendors of the wisdom of an open data strategy like a vivid demonstration of this principle.

Slack is more vulnerable than you think

The conventional wisdom is that Slack is unassailable because of network effects. This is wrong. Slack's network effects are actually quite weak. The "network effect" of Slack is that you have some Slack Connect channels with a few close partners. This is valuable, but you can live without it. Claude-in-Slack would be a big enough benefit to balance out this cost.

Slack is also severely overpriced. If you're a company of any meaningful size, you will need to support legal holds in Slack, which means you need Enterprise+. Fivetran pays almost as much for Slack as we pay for G Suite, which we use for everything. This does not make sense!

It would be a no-brainer to buy a seat for NewSlack + Claude for every employee. Every company has a long tail of casual AI users who don't use AI enough to justify the per-seat price, but with NewSlack in the bundle, it would make sense to pay for a standard seat for every employee.

Bundling would solve another problem: every company has a faction of AI skeptics who aren't using AI. NewSlack will be the ideal environment to win over these skeptics; their human coworkers will demonstrate how to use Claude in the group chat.

The commitment that would make it work

NewSlack needs to avoid repeating the mistakes of the past; it needs a credible commitment to open data access and interoperability with other similar systems, including competitors. Anthropic is uniquely positioned to do this because they have a demonstrated track record of standing by their principles under extraordinary pressure. Anthropic could make a public commitment to permit open data access and to interoperate with similar systems, and they would be believed.

Anthropic building a successful Slack competitor would fix the entire enterprise-data ecosystem. Slack would be Waterloo for closed data. The alternative, a world where the most important corpus of business communication is permanently locked behind a closed API, is bad for everyone.

So, Anthropic: please make a new Slack. The world needs it.

...

Read the original on www.fivetran.com »

7 190 shares, 20 trendiness

US economy sheds 92,000 jobs in February in sharp slide


...

Read the original on www.ft.com »

8 175 shares, 2 trendiness

TSA Leaves Passenger Needing Surgery After Illegally Forcing Her Through Scanner

Transportation Security Administration (TSA) agents have often been in the news recently for strange reasons. In mid-February, for instance, a TSA worker made headlines after allegedly scolding a woman for not wearing anything under her hoodie.

This time, the TSA is mak­ing waves again af­ter a trav­eler filed a law­suit against the agency. The plain­tiff claims that she was forced to en­ter an Advanced Imaging Technology (AIT) de­vice, which re­sulted in a se­vere in­jury.

A 12-page complaint filed by Kerry Thomas against the United States of America has recently surfaced. The events narrated in the complaint occurred nearly two years ago, on May 21, 2024, at Hartsfield-Jackson Atlanta International Airport (ATL).

According to the document, the plaintiff requested a pat-down search at the North TSA checkpoints at ATL to avoid screening in an AIT due to her spinal cord implant. These devices help alleviate pain and are implanted beneath the skin. They work by delivering electrical impulses to the spinal cord and can be destroyed by an AIT's electromagnetic field.

Despite the wom­an’s re­quest for a pat-down search, a TSA agent told her that her only op­tion was to pass through the AIT de­vice.

"After the Transportation Security Administration employee or agent ignored Plaintiff's medical identification card and Plaintiff's pleas to be screened via pat-down, the Transportation Security Administration employee or agent stated, 'the only way you are getting on the plane is to go through the machine,'" the complaint reads.

Before passing through the device, the plaintiff spoke to another officer, trying to explain the situation, but was told that the AIT machine had been "adjusted" so that it would not damage her spinal cord implant.

When the passenger entered the AIT device, however, she immediately felt a shock from the electromagnetic charge, which destroyed her spinal cord implant, leaving her in pain. "As a result of Defendant's negligence, Plaintiff suffered injuries and tangible damages and intangible damages, requiring medical treatment, including surgery," the complaint reads.

Spinal cord stim­u­la­tors are used to man­age pain caused by a va­ri­ety of health con­di­tions, in­clud­ing chest pain, back pain, phan­tom limb pain, neu­ro­pathic pain, and spinal cord in­juries, among oth­ers.

Following the incident, the plaintiff went through an adjudication process with the TSA. However, this proved unsuccessful, leading her to file a lawsuit against the agency. The woman is now seeking an unspecified amount in compensation for past, present, and future damages.

According to the com­plaint, TSA agents ig­nored rules that specif­i­cally re­quire work­ers to of­fer a pat-down search for pas­sen­gers with med­ical de­vices that screen­ing ma­chines may dam­age.

TSA Rules State Passengers With Internal Medical Devices Should Not Be Screened By A Walk-Through Metal Detector

While AITs are often harmless, they can damage medical devices such as pacemakers, defibrillators, and spinal cord implants. According to the TSA's official website, passengers with internal medical devices should inform their TSA agent of their medical condition and request a pat-down instead.

"Inform the TSA officer that you have an artificial knee, hip, other metal implant or a pacemaker, defibrillator or other internal medical device. You should not be screened by a walk-through metal detector if you have an internal medical device such as a pacemaker. Consult with your physician prior to flying," the TSA website states.

In this regard, the complaint states, "One or more of the Transportation Security Administration's employees or agents knew or should have known that the machine in which Plaintiff was forced to enter had not been recalibrated or adjusted so as to not cause harm to her spinal cord stimulator."

This was­n’t the first time TSA agents had been caught break­ing their own rules.

TSA Has Recently Been Caught Breaking Its Own Rules

A pas­sen­ger re­cently posted his en­counter with a TSA agent on Instagram (see above), shar­ing his ex­pe­ri­ence as an am­putee. He was first de­nied ac­cess to the TSA lane des­ig­nated for trav­el­ers with mo­bil­ity needs. The trav­eler was then pub­licly asked a ques­tion about his dis­abil­ity, some­thing in vi­o­la­tion of the Americans with Disabilities Act of 1990 (ADA).

In addition, in the summer of 2025, TSA agents denied some passengers the right to opt out of facial recognition, despite the TSA's official website claiming, "Passengers who have consented to participate may choose to opt-out at any time and instead go through the standard identity verification process by a Transportation Security Officer (TSO)."

These cases highlight the need for better training of TSA workers so they are fully aware of the agency's internal regulations and other U.S. laws that may affect their work. At the moment, it remains unclear whether the traveler whose spinal cord implant was damaged will win the lawsuit and receive compensation.

...

Read the original on www.thetravel.com »

9 165 shares, 12 trendiness

Open-Sourcing Sarvam 30B and 105B

We’re re­leas­ing Sarvam 30B and Sarvam 105B as open-source mod­els. Both are rea­son­ing mod­els trained from scratch on large-scale, high-qual­ity datasets cu­rated in-house across every stage of train­ing: pre-train­ing, su­per­vised fine-tun­ing, and re­in­force­ment learn­ing. Training was con­ducted en­tirely in India on com­pute pro­vided un­der the IndiaAI mis­sion.

These models represent a true full-stack effort. Beyond datasets, we optimized tokenization, model architecture, execution kernels, scheduling, and inference systems to make deployment efficient across a wide range of hardware, from flagship GPUs to personal devices like laptops.

Both models are already in production. Sarvam 30B powers Samvaad, our conversational agent platform. Sarvam 105B powers Indus, our AI assistant built for complex reasoning and agentic workflows.

The Sarvam models are globally competitive for their class. Sarvam 105B performs well on reasoning, programming, and agentic tasks across a wide range of benchmarks. Sarvam 30B is optimized for real-time deployment, with strong performance on real-world conversational use cases. Both models achieve state-of-the-art results on Indian language benchmarks, outperforming models significantly larger in size.

This release marks an important milestone for Sarvam. Building these models required developing end-to-end capability across data, training, inference, and product deployment. With that foundation in place, we are ready to scale to significantly larger and more capable models, including models specialized for coding, agentic, and multimodal conversational tasks.

You can experience Sarvam 105B on Indus. Both models are accessible via our API through the API dashboard. Weights can be downloaded from AI Kosh (30B, 105B) and Hugging Face (30B, 105B). If you want to run inference locally with Transformers, vLLM, or SGLang, please refer to the Hugging Face model pages for sample implementations.

Both models share a common architectural principle: high-capacity reasoning with efficient training and deployment.
At the core is a Mixture-of-Experts (MoE) Transformer backbone that uses sparse expert routing to scale parameter count without increasing the compute required per token, while keeping inference costs practical. The architecture supports long-context inputs through rotary positional embeddings, RMSNorm-based stabilization, and attention designs optimized for efficient KV-cache usage during inference.

While the two models share the same design philosophy, they differ in scale and attention mechanism. Sarvam 30B uses Grouped Query Attention (GQA) to reduce KV-cache memory while maintaining strong performance. Sarvam 105B extends the architecture with greater depth and Multi-head Latent Attention (MLA), a compressed attention formulation that further reduces memory requirements for long-context inference.

Both models use sparse expert feedforward layers with 128 experts, but differ in expert capacity and routing configuration. This allows the larger model to scale to higher total parameters while keeping active compute bounded.

All stages of the training pipeline were developed and executed in-house. This includes the model architecture, data curation and synthesis pipelines, reasoning supervision frameworks, and reinforcement learning infrastructure. Building everything from scratch gave us direct control over data quality, training dynamics, and capability development across every stage of training, which is a core requirement for a sovereign stack.

Our 30B and 105B models were trained on large datasets, with 16T tokens for the 30B and 12T tokens for the 105B. The pre-training data spans code, general web data, specialized knowledge corpora, mathematics, and multilingual content.
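The practical payoff of the GQA choice above is easy to see with back-of-envelope KV-cache arithmetic. The configuration below is invented for illustration (the post does not give Sarvam's layer or head counts); the point is only that sharing each KV head across a group of query heads shrinks the cache by the group factor:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    # K and V tensors cached per layer, per KV head, per token (fp16)
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 48-layer model, 32 query heads of dim 128, 32k context
mha = kv_cache_bytes(layers=48, kv_heads=32, head_dim=128, seq_len=32_768)
gqa = kv_cache_bytes(layers=48, kv_heads=8, head_dim=128, seq_len=32_768)

print(f"MHA: {mha / 2**30:.0f} GiB, GQA (groups of 4): {gqa / 2**30:.0f} GiB")
# MHA: 24 GiB, GQA (groups of 4): 6 GiB
```

MLA goes further still by caching a compressed latent instead of full per-head keys and values, which is why it suits the deeper 105B model's long-context inference.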
After multiple ablations, the final training mixture was balanced to emphasize reasoning, factual grounding, and software capabilities. We invested significantly in synthetic data generation pipelines across all categories. The multilingual corpus allocates a substantial portion of the training budget to the 10 most-spoken Indian languages.

Pre-training was conducted in three phases, covering long-horizon pre-training, mid-training, and a long-context extension phase. We used sigmoid-based routing scores rather than traditional softmax gating, which improves expert load balancing and reduces routing collapse during training. An expert-bias term stabilizes routing dynamics and encourages more uniform expert utilization across training steps. We observed that the 105B model achieved benchmark superiority over the 30B remarkably early in training, suggesting efficient scaling behavior.

During supervised fine-tuning, the model is trained on a large corpus of high-quality prompts curated for difficulty, quality, and domain diversity. Prompts are sourced from open datasets and labeled using custom models to identify domains and analyze distribution coverage. To address gaps in underrepresented or low-difficulty areas, additional prompts are synthetically generated based on the pre-training domain mixture. Empirical analysis showed that most publicly available datasets are dominated by low-quality, homogeneous, and easy prompts, which limits continued learning. To mitigate this, we invested significant effort in building high-quality prompts across domains. All corresponding completions are produced internally and passed through rigorous quality filtering.
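The sigmoid-plus-bias routing described above can be sketched in a few lines. This is an illustrative toy (the expert count, top-k, and bias update rule are invented here, and the real router operates on learned logits inside the network), but it shows the mechanism: sigmoid scores are independent per expert, and a bias nudged against observed load steers selection toward underused experts without distorting the output weights:

```python
import math, random

def route(logits, bias, k=2):
    """Select top-k experts by sigmoid score plus load-balancing bias.
    The bias affects only which experts are selected; the gate weights
    applied to expert outputs use the unbiased scores."""
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i])
    chosen = order[-k:]
    total = sum(scores[i] for i in chosen)
    gates = [scores[i] / total for i in chosen]
    return chosen, gates

random.seed(0)
n = 8
bias = [0.0] * n
counts = [0] * n
for _ in range(1000):                      # stream of tokens
    logits = [random.gauss(0, 1) for _ in range(n)]
    logits[0] += 2.0                       # expert 0 starts "hot"
    chosen, _ = route(logits, bias)
    for i in chosen:
        counts[i] += 1
    total = sum(counts)
    for i in range(n):                     # push bias against overload
        bias[i] -= 0.01 * (counts[i] / total - 1.0 / n)

print(counts)  # far more even than the skewed logits alone would produce
```

Because the gate weights ignore the bias, this balances load without the gradient interference that auxiliary load-balancing losses can introduce.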
The dataset also includes extensive agentic traces generated from both simulated environments and real-world repositories, enabling the model to learn tool interaction, environment reasoning, and multi-step decision making.

For safety fine-tuning, we developed a dataset covering both standard and India-specific risk scenarios. This effort was guided by a unified taxonomy and an internal model specification inspired by public frontier model constitutions. To surface and address challenging failure modes, the dataset was further augmented with adversarial and jailbreak-style prompts mined through automated red-teaming. These prompts were paired with policy-aligned, safe completions for supervised training.

The reinforcement learning stage uses a large and diverse prompt distribution spanning mathematics, coding, STEM reasoning, web search, and tool usage across both single-turn and multi-turn environments. Rewards are derived from a combination of verifiable signals, such as correctness checks and execution results, and rubric-based evaluations that assess instruction adherence, formatting, response structure, and overall quality. To maintain an effective learning curriculum, prompts are pre-filtered using open-source models and early checkpoints to remove tasks that are either trivially solvable or consistently unsolved. During training, an adaptive sampling mechanism dynamically allocates rollouts based on an information-gain metric derived from the current pass rate of each prompt.
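A toy version of this pass-rate-driven allocation, using the Bernoulli variance p(1 - p) as a stand-in for the information-gain metric (the post does not give the actual formula), makes the behavior concrete: prompts the model always or never solves get almost nothing, while prompts near a 50% pass rate absorb most of the budget:

```python
def allocate_rollouts(pass_rates, budget):
    """Split a fixed rollout budget across prompts in proportion to
    p * (1 - p): maximal near the capability frontier (p ~ 0.5),
    near zero for trivially solved or consistently unsolved prompts.
    Every prompt keeps a floor of one rollout so pass rates stay fresh."""
    gains = [p * (1 - p) for p in pass_rates]
    total = sum(gains) or 1.0
    return [max(1, round(budget * g / total)) for g in gains]

# Hypothetical current pass rates for five prompts
print(allocate_rollouts([0.0, 0.1, 0.5, 0.9, 1.0], budget=64))
# [1, 13, 37, 13, 1]: the p=0.5 prompt dominates the budget
```

The knapsack framing described in the post generalizes this greedy split with per-prompt costs and caps, but the concentration of compute at the frontier is the same idea.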
Under a fixed generation budget, rollout allocation is formulated as a knapsack-style optimization, concentrating compute on tasks near the model's capability frontier where learning signal is strongest.

The RL system is implemented with an asynchronous GRPO architecture that decouples generation, reward computation, and policy updates, enabling efficient large-scale training while maintaining high GPU utilization. Trajectory staleness is controlled by limiting the age of sampled trajectories relative to policy updates, balancing throughput with training stability. The system omits KL-divergence regularization against a reference model, avoiding the optimization conflict between reward maximization and policy anchoring. Policy optimization instead uses a custom group-relative objective inspired by CISPO, which improves stability over standard clipped surrogate methods. Reward shaping further encourages structured reasoning, concise responses, and correct tool usage, producing a stable RL pipeline suitable for large-scale MoE training with consistent learning and no evidence of reward collapse.

Sarvam 105B matches or outperforms most open and closed-source frontier models of its class across knowledge, reasoning, and agentic benchmarks. On Indian language benchmarks, it significantly outperforms all models we evaluated.

Sarvam 105B shows strong, balanced performance across core capabilities including mathematics, coding, knowledge, and instruction following. It achieves 98.6 on Math500, matching the top models in the comparison, and 71.7 on LiveCodeBench v6, outperforming most competitors on real-world coding tasks. On knowledge benchmarks, it scores 90.6 on MMLU and 81.7 on MMLU Pro, remaining competitive with frontier-class systems.
With 84.8 on IF Eval, the model demonstrates a well-rounded capability profile across the major workloads expected of modern language models.

Sarvam 105B performs strongly on multi-step reasoning benchmarks, reflecting the training emphasis on complex problem solving. On AIME 25, the model achieves 88.3 Pass@1, improving to 96.7 with tool use, indicating effective integration between reasoning and external tools. It scores 78.7 on GPQA Diamond and 85.8 on HMMT, outperforming several comparable models on both. On Beyond AIME (69.1), which requires deeper reasoning chains and harder mathematical decomposition, the model leads or matches the comparison set. Taken together, these results reflect consistent strength in sustained reasoning and difficult problem-solving tasks.

Sarvam 105B is optimized for agentic workloads involving tool use, long-horizon reasoning, and environment interaction. This is reflected in strong results on benchmarks designed to approximate real-world workflows. On BrowseComp, the model achieves 49.5, outperforming several competitors on web-search-driven tasks. On Tau2 (avg.), a benchmark measuring long-horizon agentic reasoning and task completion, it achieves 68.3, the highest score among the compared models. These results indicate that the model can effectively plan, retrieve information, and maintain coherent reasoning across extended multi-step interactions.

A useful comparison is within the same scaling regime, since training compute, dataset size, and infrastructure scale increase dramatically with each generation of frontier models. The newest models from other labs are trained with significantly larger clusters and budgets. Across a range of previous-generation models that are substantially larger, Sarvam 105B remains competitive.
We have now established the effectiveness of our training and data pipelines, and will scale training to significantly larger model sizes.

Sarvam 30B is designed as an efficient reasoning model for practical deployment, combining strong capability with low active compute. With only 2.4B active parameters, it performs competitively with much larger dense and MoE models across a wide range of benchmarks. The evaluations below highlight its strengths across general capability, multi-step reasoning, and agentic tasks, indicating that the model delivers strong real-world performance while remaining efficient to run.

Sarvam 30B — All Benchmarks (Gemma and Mistral are compared for completeness. Since they are not reasoning or agentic models, corresponding cells are left empty.)

Sarvam 30B performs strongly across core language modeling tasks, particularly in mathematics, coding, and knowledge benchmarks. It achieves 97.0 on Math500, matching or exceeding several larger models in its class. On coding benchmarks, it scores 92.1 on HumanEval, 92.7 on MBPP, and 70.0 on LiveCodeBench v6, outperforming many similarly sized models on practical coding tasks. On knowledge benchmarks, it scores 85.1 on MMLU and 80.0 on MMLU Pro, remaining competitive with other leading open models.

Sarvam 30B also performs strongly on multi-step reasoning benchmarks, reflecting its ability to handle complex logical and mathematical problems. On AIME 25, it achieves 88.3 Pass@1, improving to 96.7 with tool use, indicating effective integration between reasoning and external tools. It scores 66.5 on GPQA Diamond and performs well on challenging mathematical benchmarks including HMMT Feb 2025 (73.3) and HMMT Nov 2025 (74.2). On Beyond AIME (58.3), the model remains competitive with larger models.
Taken together, these results indicate that Sarvam 30B sustains deep reasoning chains and expert-level problem solving, significantly exceeding typical expectations for models with similar active compute.

Sarvam 30B supports native tool calling and performs consistently on benchmarks designed to evaluate agentic workflows involving planning, retrieval, and multi-step task execution. On BrowseComp, it achieves 35.5, outperforming several comparable models on web-search-driven tasks. On Tau2 (avg.), it achieves 45.7, indicating reliable performance across extended interactions. SWE-Bench Verified remains challenging across models; Sarvam 30B shows competitive performance within its class. Taken together, these results indicate that the model is well suited for real-world agentic deployments requiring efficient tool use and structured task execution, particularly in production environments where inference efficiency is critical.

To evaluate Indian language capabilities, we developed a new benchmark using a pairwise comparison framework with an LLM-as-judge protocol. A key goal of this benchmark is to reflect how language is actually used in India today. This means evaluating each language in two script styles: native script, representing formal written usage, and romanized Latin script, representing the colloquial usage commonly seen in messaging and online communication.

The benchmark is organized into four domains: general chat, STEM, mathematics, and coding. It originates from 110 English source prompts, with 50 covering general chat and 20 each for STEM, mathematics, and coding. Each prompt is translated into 22 scheduled Indian languages and provided in both native and romanized script.
Evaluating correctness for complex reasoning prompts directly in low-resource languages can be noisy and inconsistent. To address this, we generated high-quality reference answers in English using Claude Opus 4; these are used only to evaluate the usefulness dimension, covering relevance, completeness, and correctness, for answers generated in Indian languages.

The evaluation uses a pairwise comparison methodology with Gemini 3 as the judge model. The judge evaluates responses across four dimensions: fluency, language/script correctness, usefulness, and verbosity. The evaluation dataset and corresponding prompts are available here.

Sarvam 105B wins on average 90% of comparisons across all benchmarked dimensions and on average 84% on STEM, math, and coding. Sarvam 30B wins on average 89% of comparisons across all benchmarked dimensions and 87% on STEM, mathematics, and coding.

The Sarvam tokenizer is optimized for efficient tokenization across all 22 scheduled Indian languages, spanning 12 different scripts, directly reducing the cost and latency of serving in Indian languages. It outperforms other open-source tokenizers in encoding Indic text efficiently, as measured by the fertility score: the average number of tokens required to represent a word. It is significantly more efficient for low-resource languages such as Odia, Santali, and Manipuri (Meitei) compared to other tokenizers. The chart below shows the average fertility of various tokenizers across English and all 22 scheduled languages.

Sarvam 30B was built with an inference optimization stack designed to maximize throughput across deployment tiers, from flagship data-center GPUs to developer laptops.
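The fertility score described above is straightforward to measure; a minimal sketch, where the `tokenize` callable is a stand-in for any real tokenizer (not Sarvam's):

```python
def fertility(tokenize, text):
    """Fertility = average number of tokens per whitespace-separated
    word; lower is better (fewer tokens needed per word)."""
    words = text.split()
    return len(tokenize(text)) / len(words)

# Hypothetical worst case: a byte-level "tokenizer" where every
# UTF-8 byte is its own token.
byte_tokenizer = lambda s: list(s.encode("utf-8"))
print(fertility(byte_tokenizer, "hello world"))  # 11 bytes / 2 words = 5.5
```

Byte-fallback tokenizers inflate fertility most on non-Latin scripts, since each Indic character costs three UTF-8 bytes, which is why tokenizer design matters for Indic serving cost.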
Rather than relying on standard serving implementations, the inference pipeline was rebuilt using architecture-aware fused kernels, optimized scheduling, and disaggregated serving. Microsecond-level profiling of the execution stack identified memory stalls, kernel launch overhead, and inefficient scheduling as primary bottlenecks. Addressing these yielded substantial throughput improvements across all hardware classes and sequence lengths. The optimization strategy focuses on three key components:

- Kernel-level rewrites using fused attention and matmul pipelines tailored for each hardware target
- Advanced scheduling and batching strategies that improve GPU utilization under realistic multi-user loads
- Disaggregated serving pipelines that remove bottlenecks between prefill and decode stages

These optimizations yield significantly higher tokens per second per GPU at the same latency targets, enabling higher user concurrency and lower infrastructure costs.

On H100-class infrastructure, Sarvam 30B achieves substantially higher throughput per GPU across all sequence lengths and request rates compared to the Qwen3 baseline, consistently delivering 3x to 6x higher throughput per GPU at equivalent tokens-per-second-per-user operating points.

Sarvam 30B runs efficiently on mid-tier accelerators such as the L40S, enabling production deployments without relying on premium GPUs. Under tighter compute and memory bandwidth constraints, the optimized kernels and scheduling strategies deliver 1.5x to 3x throughput improvements at typical operating points. The improvements are more pronounced at longer input and output sequence lengths (28K / 4K), where most real-world inference requests fall.

Sarvam 30B is also optimized for local execution on Apple Silicon systems using MXFP4 mixed-precision inference.
On a MacBook Pro M3, the optimized runtime achieves 20 to 40% higher token throughput across common sequence lengths. These improvements make local experimentation significantly more responsive and enable lightweight edge deployments without requiring dedicated accelerators.

Sarvam 105B is optimized for server-centric hardware, following a similar process to the one described above with special focus on MLA (Multi-head Latent Attention) optimizations. These include custom-shaped MLA optimization, vocabulary parallelism, advanced scheduling strategies, and disaggregated serving. The comparisons above illustrate the performance advantage across various input and output sizes on an H100 node.

Combined with the efficient Indic tokenizer, the performance delta increases significantly for the same SLA. For the 30B model, the delta increases by as much as 10x, reaching performance levels previously not achievable for models of this class on Indic generation.

The following demonstrations show the practical capabilities of the Sarvam model family across real-world applications, spanning webpage generation, multilingual conversational agents, complex STEM problem solving, and educational tutoring.
The examples reflect the models' strengths in reasoning, tool usage, multilingual understanding, and end-to-end task execution, and illustrate how Sarvam models can be integrated into production systems to build interactive applications, intelligent assistants, and developer tools.

The widgets below demonstrate Sarvam 105B's agentic capabilities through end-to-end project generation using a Claude Code harness, showing the model's ability to build complete websites from a simple prompt specification.

A fully interactive Pokédex web app, generated entirely by our 105B model from a single prompt: search, filter by type, and browse detailed stats. The goal was to generate a complete, production-ready webpage including all HTML, CSS, and JavaScript required to run the application without frameworks or build tools. The model used the PokéAPI to dynamically load Pokémon data, implementing pagination, search, filtering, and a detailed modal view, all from the prompt shown below.

A complete website landing page, designed and coded by our 105B model in a single pass; scroll through to explore the full layout, animations, and interactions. The task was to build a complete website for Sarvam, capturing the spirit of an Indian AI company building for a billion people while matching a world-class visual standard across typography, motion, layout, and interaction design. The full prompt is shown below.

Sarvam 105B was evaluated on the JEE Main 2026 paper from Shift 2, conducted on 28 January 2026, to demonstrate its STEM reasoning capabilities.
The question paper and solutions were sourced from: https://allen.in/jee-main/january-2026-question-paper-with-solutions

The evaluation was carried out in two phases:

- Text-Only Evaluation: For text-only questions, Sarvam 105B was evaluated directly on questions containing purely textual content.
- Diagram-Based Evaluation: For questions that included diagrams, Gemini-3-Pro was used to generate structured textual descriptions of the visuals, which were then provided as input to Sarvam 105B for answer generation.

The tables below summarize Sarvam 105B's performance across Physics, Chemistry, and Mathematics under Pass@1 and Pass@2 evaluation settings.

Under Pass@1, the model shows strong first-attempt accuracy across all subjects. In Mathematics, it achieves a perfect 25/25. In Chemistry, it scores 23/25, with near-perfect performance on both text-only and diagram-derived questions. Physics shows similarly strong performance at 22/25, with most errors occurring in diagram-based reasoning.

Under Pass@2, performance improves to perfect scores across all subjects. Physics improves from 22/25 to 25/25, Chemistry from 23/25 to 25/25, and Mathematics maintains a perfect 25/25. Diagram-based questions in both Physics and Chemistry achieve full marks at Pass@2, indicating that the model reliably resolves visual reasoning tasks when given structured textual representations.

Tutor Mode is an internal project where the Indus stack operates with a system prompt optimized for student-teacher conversations. The example below shows Sarvam 105B helping a student solve a JEE problem through interactive dialog rather than providing the answer directly. The model guides the student by asking probing questions, building toward the underlying concepts before arriving at the answer.
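Pass@k numbers like these are conventionally computed with the unbiased estimator popularized by the HumanEval paper; a sketch (the post does not state which estimator it used):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: given n samples per problem of which c are
    correct, the probability that at least one of k randomly drawn
    samples is correct: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 2 attempts and one correct: pass@1 = 0.5, pass@2 = 1.0,
# mirroring how scores can rise sharply from Pass@1 to Pass@2.
print(pass_at_k(2, 1, 1), pass_at_k(2, 1, 2))
```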
This also demonstrates the model's role-playing ability. Note: the questions below are taken from the same JEE Mains paper solved above.

The mean free path of a molecule of diameter 5 x 10^-10 m at a temperature of 41°C and a pressure of 1.38 x 10^5 Pa is given as ____ m. (Given k_B = 1.38 x 10^-23 J/K.)

Hi there! I see you're working on a problem about the mean free path of a gas molecule; that's a classic concept in kinetic theory.

Before we dive into the math, could you let me know which grade you're in? Also, when you hear the term "mean free path," what do you think it depends on? For example, if you imagine molecules in a gas, what physical factors would make it harder for a molecule to travel a long distance without hitting something?

im not really sure about the concepts behind this. im preparing for jee mains and this topic always confuses me.

No worries! JEE Mains problems often look more intimidating than they actually are. Let's break it down.

The mean free path (λ) is simply the average distance a molecule travels between two successive collisions. Think of it like walking through a crowded room; how far you can get before bumping into someone depends on a few things you already intuitively know.

Size of molecules (d): Bigger molecules are easier to hit.

Density/Number of molecules: More people in the room means more bumps.

Temperature (T) and Pressure (P): These dictate how packed the molecules are.

If you had to guess, would the distance between hits (λ) be larger or smaller if the pressure (P) increased?

i think if the pressure is higher, the molecules are packed tighter, so they would hit each other more often. that should make the distance smaller, right?

Spot on! Your intuition is leading you exactly where we need to go.

Here is how those factors relate mathematically:

λ ∝ 1/d²: If the molecule is twice as wide, it's actually four times more likely to collide (because the area it occupies matters).

λ ∝ 1/P: Higher pressure means molecules are squeezed together, leading to more frequent collisions.

λ ∝ T: At higher temperatures (for a fixed pressure), gas expands and molecules move further apart.

We can use what you already know to find the formula for λ. Since we are looking for a formula involving diameter (d), pressure (P), and temperature (T), let's try to visualize the "collision zone" first.

i know pv = nrt, but i cant remember the specific formula for mean free path. how do we get from one to the other?

That's a great starting point, because PV = nRT is the heart of gas behavior!

Let's visualize why a molecule collides. Imagine a molecule with diameter d moving through space. It will hit any other molecule whose center comes within a distance d of its own center.

This means our molecule effectively acts like a "bulldozer" with an effective diameter of 2d. If any other molecule's center falls within this "danger zone," a collision happens.

If the effective collision diameter is 2d, what would be the cross-sectional area of that "danger zone" circle? (Recall the area of a circle is πr².)

Close! While the "danger zone" diameter is 2d, the actual radius involved for the center-to-center hit is d.

Now, imagine this molecule zips forward. It sweeps out an imaginary cylinder. Any molecule inside this cylinder gets hit.

In the derivation, we find that the mean free path is inversely proportional to this area (πd²) and the number of molecules per unit volume (n). However, because all molecules are moving (not just one), we add a factor of √2 to account for the average relative velocity.

This gives us the final JEE formula:

λ = k_B T / (√2 π d² P)

Do you see where the values from your question (d, T, P, and k_B) fit into this?

oh, i see! but the question gives kb as 1.38 x 10^-23. where does that go in the calculation?

It fits perfectly! The k_B in the question is the Boltzmann constant, and it sits right in the numerator of our formula. By plugging the values in, the units will cancel out to give you the distance in meters. Let's list what we have:

Before we calculate, we must convert the temperature to Kelvin. Do you remember how to turn Celsius into Kelvin?

yes, i add 273. so 41 + 273 = 314 k. now i just plug them all in?

Exactly! You've got the temperature right (314 K, or 314.15 K for precision).

Now, here is a pro-tip for JEE math: look for things that cancel out. Notice that P is 1.38 x 10^5 and k_B is 1.38 x 10^-23.

When you put them in the formula:

λ = (1.38 x 10^-23 · 314) / (√2 · π · (5 x 10^-10)² · 1.38 x 10^5)

The 1.38 in the top and bottom will cancel out immediately! That makes your life much easier.

Would you like to try simplifying the powers of 10 next? What do you get for the denominator's power of 10 when you square (5 x 10^-10 m)?

i tried calculating it all and i think it simplifies to something like 2.82 x 10^-8. does that mean the answer is option c?

You nailed it! Option C (2.82 x 10^-8 m) is correct. 🎉

When you finish the calculation, you get approximately 2.82 x 10^-8 m, so Option C is indeed the answer.
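The tutor's arithmetic checks out; plugging the given values into the standard mean free path formula:

```python
from math import sqrt, pi

# Mean free path of an ideal-gas molecule: lambda = k_B*T / (sqrt(2)*pi*d^2*P)
k_B = 1.38e-23   # Boltzmann constant, J/K
T   = 41 + 273   # 41 °C converted to kelvin
d   = 5e-10      # molecular diameter, m
P   = 1.38e5     # pressure, Pa

mfp = k_B * T / (sqrt(2) * pi * d**2 * P)
print(f"{mfp:.3e} m")  # 2.827e-08 m, i.e. about 2.82 x 10^-8 m
```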

Used the corrected mean free path formula λ = k_B T / (√2 π d² P).

Would you like me to find another practice problem on RMS velocity or Graham's Law to keep this momentum going?

The coding capabilities of Sarvam 30B and Sarvam 105B were evaluated using real-world competitive programming problems from Codeforces (Div3, link). The evaluation involved generating Python solutions and manually submitting them to the Codeforces platform to verify correctness. Correctness is measured at pass@1 and pass@4 as shown in the table below. Answers are generated using the following system prompt, with code snippets extracted from markdown fences and think tokens stripped from within their tags.

The Codeforces contest used for this evaluation took place in February 2026, while the knowledge cutoff of both models is June 2025, making it unlikely that the models had seen these questions. Strong performance in this setting provides evidence of genuine generalization and real problem-solving capability.

Sarvam 30B has been fine-tuned for production deployment of conversational agents on Samvaad, Sarvam's Conversational AI platform. Compared to models of similar size, it shows clear performance improvements in both conversational quality and latency. Key strengths include strong proficiency in Indian languages, particularly accurate handling of numerical information within those languages, and reliable execution of tool calls during multilingual interactions. Latency gains come from a combination of fewer active parameters than comparable models, targeted inference optimizations, and reduced tokenizer overhead.

The two examples below show telephonic conversations handled by Sarvam 30B in Hindi and Tamil.

Sarvam 105B powers Indus, Sarvam's chat application, operating with a system prompt optimized for conversations.
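The fence-and-think-token extraction step described for the Codeforces evaluation can be sketched as follows; the `<think>` tag name and the `extract_code` helper are assumptions, since the post does not show its exact script:

```python
import re

FENCE = "`" * 3  # markdown code fence

def extract_code(response: str) -> str:
    """Drop <think>...</think> reasoning blocks, then return the body
    of the first fenced code block (or the cleaned text if none)."""
    cleaned = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    pattern = FENCE + r"(?:\w+)?\n(.*?)" + FENCE
    match = re.search(pattern, cleaned, flags=re.DOTALL)
    return match.group(1).strip() if match else cleaned.strip()

response = "<think>plan the loop</think>\n" + FENCE + "python\nprint('ok')\n" + FENCE
```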
The example demonstrates the model's ability to understand Indic queries, execute tool calls effectively, and reason accurately. Web search is conducted in English to access current and comprehensive information, while the model interprets the query and delivers a correct response in Telugu.

[Search widget results: top pickleball courts in Vijayawada, pickleball equipment retailers, common beginner pickleball mistakes, and lawn tennis courts in Vijayawada.]

Sarvam 30B and Sarvam 105B represent a significant step in building high-performance, open foundation models in India. By combining efficient Mixture-of-Experts architectures with large-scale, high-quality training data and deep optimization across the entire stack, from tokenizer design to inference efficiency, both models deliver strong reasoning, coding, and agentic capabilities while remaining practical to deploy.

A defining strength of the Sarvam model family is its investment in the Indian AI ecosystem, reflected in strong performance across Indian languages, tokenization optimized for diverse scripts, and safety and evaluation tailored to India-specific contexts. Combined with Apache 2.0 open-source availability, these models serve as foundational infrastructure for sovereign AI development.

This release also marks a milestone in internal capabilities. Through this effort, Sarvam has developed the know-how to build high-quality datasets at scale, train large models efficiently, and achieve strong results at competitive training budgets. With these foundations in place, the next step is to scale further, training significantly larger and more capable models.

These models were trained using compute provided through the IndiaAI Mission, under the Ministry of Electronics and Information Technology, Government of India. Nvidia collaborated closely on the project, contributing libraries used across pre-training, alignment, and serving. We're also grateful to the developers who used earlier Sarvam models and took the time to share feedback. We're open-sourcing these models as part of our ongoing work to build foundational AI infrastructure in India.

...

Read the original on www.sarvam.ai »

10 154 shares, 17 trendiness

matduggan.com

I have never been an "online community first" person. The internet is how I stay in touch with people I met in real life. I'm not a "tweet comments at celebrities" guy. I was never funny enough to be the funniest person on Twitter.

So when Twitter was accidentally purchased by a fascist high on ketamine, I moved to Mastodon, mostly because it seemed to be "Twitter without the bullshit". No recommended-for-you feed, no ads; it was broken in a way I find charming. Of course search was broken, because all OSS social tools must have one glaring lack of functionality. In a nightmare world full of constant change it's good to have a few constants to hold on to.

A lot of the narrative at the time was "this is our flag in the ground in the fight against The Man". It wasn't clear in this context if they meant corporations or the media or the weird pseudo-celebrity that had taken over social media, where people would breathlessly tell me about shit like "Chris-Chan" and "Logan Paul bought a Pokemon card".

We all need pointless hobbies, but I care about YouTube stars like I care about distant stars dying. It's interesting to someone somewhere, but those people don't talk to me. I mostly use social media as a place to waste time, not a platform to form para-social relationships with narcissists. I prefer my narcissism farm-to-table. I'd rather dig a grave with a rusty spoon than watch a "Twitch star".

Anyway, I watched mostly apathetically as the internet tried to rally itself to another cause. I read my news at the normal newspapers, watched my normal television, and put social media off into its own silo. Then Trump effectively shut down the entire free press in the US in a series of bullshit lawsuits.

See, I had forgotten the one golden rule of capitalism: to thrive in capitalism one must be amoral. Now you can be wildly, sickeningly successful with morals, but you cannot reach that absolute zenith of shareholder value. Either you accept a lower share price and don't commit atrocities, or you become evil. There is no third option.

So of course media corporations became bargaining chips for the oligarchs' actual businesses. Why fight a defamation suit when you can settle it by running favorable coverage and maybe bankrupting the media outlet you bought as a stocking stuffer? Suddenly I couldn't find any reliable reporting about anything in the US. My beloved Washington Post became straight-up propaganda and desperate attempts to cope. "Best winter stews to make while you watch your neighbors get kidnapped at gunpoint." Twelve dollars a month for that.

Threads was worthless because it's the most boring social media website ever imagined. It's a social media network designed by brands, for brands, like if someone made a cable channel that was just advertisements and meta-commentary about the advertisements you just saw. Billions of dollars at their disposal, and Meta made a hot new social media network with the appeal of junk mail.

Bluesky had a bunch of "stuff", but they're trying to capture that 2008 Twitter lightning in a bottle, which is a giant waste of time. We're never going to go back to pretending that tweeting at politicians does anything, and everyone there is desperately trying to "build a brand" as the funny one or whatever. I want news; I don't want your endless meta-commentary on the news.

People talk a lot about the protocols that power Bluesky vs. ActivityPub, because we're nerds and we believe deep in our hearts that the superior protocol will win. This is adorable. It flies in the face of literally all of human history, where the more convenient thing always wins regardless of technical merit. VHS beat Betamax. USB-C took twenty years. The protocol fight is interesting the way medieval siege warfare is interesting: I'm glad someone's into it, but it has no bearing on my life. There's no actual plan to self-host Bluesky. Their protocol makes it easier to scale their service. That's why it was written and that's what it does. End of story.

Now EU news remained reliable, but sending European reporters into the madness of the US and trying to get a "report" out of it is an exercise in frustration. This became especially relevant for me when Trump threatened to invade Greenland and suddenly there was a distinct possibility that there might be an armed conflict between Denmark and the US. Danish reporters weren't getting meetings with the right people and it was just endless rumors and Truth Social nonsense.

If the American press had given me 20 minutes of airtime I could have convinced everyone they don't want to get involved with Greenland. "We're not tough enough as a people to survive in Greenland, much less take it over." Greenlandic people shrug off horrific injuries hundreds of kilometers from medical help with a smile. I watched a Greenlandic toddler munch meat from the spine of a seal with its head very much intact. We aren't equipped to fuck with these people; they are the real deal.

So into this complete breakdown of the press came the Fediverse. It became the only reliable source of information I had. People posted links with a minimal amount of commentary, picking and choosing the best content from other social media networks. They're not doing it to "build a brand", because that's not a thing in the Fediverse. It's too disjointed to be a place to build a newsletter subscription base.

Instead it became the only place consistently posting trustworthy information I could actually access. This became personally relevant when Trump threatened to invade Greenland, which is the kind of sentence I never expected to type, and yet here we are. It would be funny if I wasn't a tiny bit concerned that my new home was going to get a CIA overnight regime-change special in the middle of the night.

It was somewhere in the middle of DMing with someone who had forgotten more about Greenland than I would ever know, and someone who lived close to an RAF base in the UK, that it clicked. This was what they had been talking about. Actual human beings were able to find each other and ask direct questions without this giant mountain of bullshit engagement piled on top of it. Meta or Oracle or whoever owns TikTok this week couldn't stop me.

I never expected to find my news from strangers on a federated social network that half the internet has never heard of. I never expected a lot of things. But there's something quietly beautiful about a place where people just… share what they know. No brand deals, no engagement metrics, no algorithm nudging you toward rage. Just someone who spent twenty years studying Arctic policy posting a thread at 2 AM because they think you should understand what's happening. It's the internet I was promised in 1996. It only took thirty years and the complete collapse of American journalism to get here.

...

Read the original on matduggan.com »
