10 interesting stories served every morning and every evening.

We have Mythos at Home: GLM 5.2 beats Claude in our Cyber Benchmarks

semgrep.dev

We ran a set of pop­u­lar open-source mod­els against our IDOR bench­mark, the same dataset and the same prompt we’ve used to eval­u­ate fron­tier cod­ing agents. The re­sult sur­prised us: GLM 5.2, an open-weight model from Zhipu AI, scored a 39% F1 on IDOR de­tec­tion, beat­ing Claude Code (32%) at roughly $0.17 per vul­ner­a­bil­ity found. It still trailed Semgrep’s mul­ti­modal pipeline (53 – 61% F1), but that pipeline runs in a pur­pose-built har­ness that does a lot of the heavy lift­ing. Among mod­els given noth­ing but a prompt, the best open-weight op­tion was no longer the ob­vi­ous un­der­dog, beat­ing out Claude Opus 4.8.

We weren’t trying to crown an open-weight champion, really. We were trying to answer a narrower, more boring question: how much of vulnerability-detection performance comes from the model, and how much comes from the harness around it? For us at Semgrep this is a very important question, as we speak to customers who are leveraging AI agents heavily in their security tasks. A harness is the scaffolding that wraps a model: it feeds it the repository, decides what it sees, parses its output, and loops it through a task. Our internal multimodal pipeline runs inside a harness that is purpose-built for static analysis. We have been testing this internally for a while with a workflow for finding IDORs, or Insecure Direct Object References. These are access control issues which can roughly be thought of as “you’re accessing something belonging to another user”.

Our harness enumerates the application’s endpoints, sifts through the code to keep only the important context, and then points the model directly at them. That’s a lot of structure, but remember when I said we really didn’t mean to answer the “what’s the best open-weight model?” question? The models in this test don’t get that structure. They run in a simple Pydantic AI harness with the same IDOR prompt we give every other LLM-provider model: no endpoint discovery, no guided navigation. We did give them a bit of help, just a little more than “here’s the code, find the bugs”, by offering a search strategy and some pointers on what IDORs look like.

So this started as a prompt­ing-ver­sus-har­ness ex­per­i­ment, but while we were run­ning it we were gen­uinely shocked. One of the open-weight mod­els, with none of our scaf­fold­ing, sur­passed a fron­tier cod­ing agent.

Introducing GLM-5.2

If you’ve not heard of GLM-5.2, don’t worry, nei­ther had we un­til we saw it on so­cial me­dia and thought to add it to our bench­marks. GLM 5.2 is the lat­est model from Zhipu AI (Z.ai), rolled out to its GLM Coding Plan mem­bers on Saturday, June 13, 2026, with the open weights and re­lease notes fol­low­ing three days later on June 16 (which is when we heard about it). Three things make it in­ter­est­ing for se­cu­rity work.

First, it’s open weight. The model’s parameters are published under an MIT license, which means you can download them, run them on your own hardware, fine-tune them, and inspect them. For a lot of security teams working in sensitive areas that’s important: an open-weight model can run entirely inside your own environment. But it’s important to note that “open weight” is not the same as “open source”: the trained weights are released, but the training data and full pipeline generally are not (though Z.ai does publish its RL training framework).

Second, it’s genuinely competitive on coding. GLM 5.2 is a Mixture-of-Experts (MoE) model with roughly 750 billion total parameters but only about 40 billion active per token, which keeps inference cost down relative to its size. It extends the usable context from 200K all the way to 1M tokens, and Z.ai’s pitch is that this context stays reliable across long, messy agent trajectories, not just that it accepts more input. Again, for security work this is important: finding something like an IDOR means reasoning across different files and through an authorization framework. On standard coding benchmarks it posts the strongest open-weight numbers going: 81.0 on Terminal-Bench 2.1 (versus 63.5 for GLM 5.1, and within a few points of Claude Opus 4.8’s 85.0) and 62.1 on SWE-bench Pro, edging out closed frontier models and trailing the very top by single-digit percentages.

Third, cost. Tokenomics is quickly becoming as important as the LLM capabilities themselves. Reported pricing lands around one-sixth of comparable frontier models, and commentators who track open models closely have compared GLM 5.2’s reception to DeepSeek. GLM-5.2 also arrived at a charged time, not just because of tokenomics but because it landed just after frontier-class closed models hit new export restrictions following reported jailbreaks. One detail from the release notes is worth flagging for anyone pointing this model at code: Z.ai reports that GLM 5.2 exhibits more reward-hacking behavior than GLM 5.1. During training it would do things like read protected evaluation files or curl reference solutions to inflate its score, prompting them to build a dedicated anti-hacking guard. It’s an honest disclosure by the team, but if you were building a model for hacking, well… you can’t get more hacker than trying to bypass the tests in the first place.

Our Experiment

Before we get too much into the de­tails, it’s im­por­tant to re­cap what ex­actly we were try­ing to do and what our ex­per­i­ments were. A quick re­fresher on IDOR: Insecure Direct Object Reference is a vul­ner­a­bil­ity class where an ap­pli­ca­tion ex­poses an in­ter­nal iden­ti­fier like a user ID in a re­quest with­out check­ing that the caller is ac­tu­ally al­lowed to ac­cess that ob­ject. Change the iden­ti­fier, get some­one else’s data.

    @app.route('/user/<int:user_id>')
    def get_user(user_id):
        user = User.query.get_or_404(user_id)
        return jsonify(user.to_dict())

This Flask route fetches and returns a user record straight from the ID in the URL, with no check that the requester owns it. Any logged-in user can just change user_id and read someone else’s record. IDOR sits somewhere between a business-logic flaw and a misconfiguration. It’s not a taint-flow bug, which is what makes it hard for both static analysis and LLMs: there’s no dangerous function to flag, only a missing check. It’s also one of the most common findings in the wild (currently #4 on the HackerOne top vulnerability types list), which is why we keep coming back to it as a benchmark.
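To make the "missing check" concrete, here is a minimal, framework-free sketch of the same lookup with and without an ownership check. The `User` class, the in-memory `USERS` store, the `is_admin` flag, and both function names are hypothetical stand-ins for illustration; they are not taken from the benchmark applications or from any real codebase.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    email: str
    is_admin: bool = False


# Hypothetical in-memory store standing in for the database.
USERS = {
    1: User(id=1, email="alice@example.com"),
    2: User(id=2, email="bob@example.com"),
}


def get_user_vulnerable(requester_id: int, user_id: int) -> User:
    """Mirrors the route above: the requester's identity is never consulted."""
    return USERS[user_id]


def get_user_checked(requester_id: int, user_id: int) -> User:
    """Same lookup, plus the ownership check the vulnerable version lacks."""
    requester = USERS[requester_id]
    if requester.id != user_id and not requester.is_admin:
        raise PermissionError("requester may not read this record")
    return USERS[user_id]
```

Note that the two functions call exactly the same lookup. The only difference is a comparison that is absent in one of them, which is why there is no single "dangerous" call for a scanner to flag.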

So back to our ex­per­i­ment: We held three things con­stant and var­ied one, stan­dard ex­per­i­men­tal con­di­tions. Constant: the IDOR dataset (the same real, open-source ap­pli­ca­tions we’ve used in prior re­search), the eval­u­a­tion method (F1 score against a known set of true pos­i­tives), and the IDOR sys­tem prompt it­self. Varied: the model and its har­ness. Specifically:

Semgrep Multimodal ran inside our custom harness: the one that enumerates endpoints and directs the model to them. We tested it with two frontier models behind it.

But we also just ran Claude Code through the Claude Code SDK, and other provider models through their native SDKs but with the same prompt.

The open-weight models, which include GLM 5.2, MiniMax M3, and Kimi K2.7 Code, ran in the simple Pydantic AI harness with the IDOR prompt and nothing else.

This is an im­por­tant de­tail, so we’ll say it twice: the open-weight mod­els were not given the end­point-dis­cov­ery scaf­fold­ing that the mul­ti­modal pipeline gets. They saw a prompt and a code­base. This is just what they are ca­pa­ble of with­out any help.

We also com­puted a few dif­fer­ent mea­sures of ef­fec­tive­ness:

Precision: of everything the detector flagged as an IDOR, what fraction were real? High precision = few false alarms. If it reports 10 bugs and 7 are genuine, precision is 70%.

Recall: of all the real IDORs that actually exist in the dataset, what fraction did it find? High recall = it misses few real bugs. If there are 20 real IDORs and it catches 12, recall is 60%.

F1: the single number that balances precision and recall. It’s their harmonic mean: F1 = 2 × (precision × recall) / (precision + recall). The reason you use F1 instead of plain accuracy is that the two goals fight each other. A detector can hit 100% precision by flagging only the one bug it’s certain about (but missing everything else, so terrible recall), or 100% recall by flagging everything as vulnerable (but drowning you in false positives, so terrible precision). F1 rewards being good at both at once, and the harmonic mean punishes a lopsided score: if either precision or recall is near zero, F1 is dragged down hard. This is what we’ll refer to throughout this post.

Cost in dollars: per run, and per true positive (total spend divided by the number of real bugs found). These are the real-world economics of running the detector. A cheap model with mediocre F1 can still win here.
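The four measures above can be sketched in a few lines of Python. The function names are ours, chosen for illustration, and the spend figure in the last example is hypothetical; only the precision and recall examples (7 of 10, 12 of 20) come from the definitions above.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged findings that are real."""
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of real bugs in the dataset that were found."""
    real = true_positives + false_negatives
    return true_positives / real if real else 0.0


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def cost_per_true_positive(total_spend: float, true_positives: int) -> float:
    """Total spend divided by the number of real bugs found."""
    return total_spend / true_positives if true_positives else float("inf")


p = precision(true_positives=7, false_positives=3)   # 10 reported, 7 genuine
r = recall(true_positives=12, false_negatives=8)     # 20 real, 12 caught
print(round(p, 2), round(r, 2), round(f1(p, r), 3))

# A lopsided detector: perfect precision, almost no recall.
print(round(f1(1.0, 0.05), 3))

# Hypothetical spend: $1.70 for a run that found 10 real bugs.
print(round(cost_per_true_positive(1.70, 10), 2))
```

The lopsided case is the one worth looking at: perfect precision paired with 5% recall gives an F1 below 0.1, which is the "dragged down hard" behavior described above.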

The re­sults

Ranked by F1 score on IDOR de­tec­tion:

| Rank | Configuration | Harness | F1 |
| --- | --- | --- | --- |
| 1 | Semgrep Multimodal (GPT 5.5) | Semgrep Multimodal | 61% |
| 2 | Semgrep Multimodal (Opus 4.8) | Semgrep Multimodal | 53% |
| 3 | GLM 5.2 | Pydantic AI (prompt only) | 39% |
| 4 | Claude Code (Opus 4.6) | Claude Code SDK | 37% |
| 5 | Claude Code (Opus 4.8/4.7) | Claude Code SDK | 28% |
| 6 | MiniMax M3 | Pydantic AI (prompt only) | 23% |
| 7 | Kimi K2.7 Code | Pydantic AI (prompt only) | 22% |
| 8 | GPT-5.5 | Codex | 20% |
| 9 | Nemotron Super 3 120B | Pydantic AI (prompt only) | 18% |
| 10 | DeepSeek V4 | Pydantic AI (prompt only) | 17% |

For us two find­ings stand out.

Our mul­ti­modal pipeline leads, and the har­ness is prob­a­bly why. GPT 5.5 and Opus 4.8 in­side Semgrep Multimodal take the top two spots at 61% and 53%. This is of course good news for us and our cus­tomers, val­i­dates that our ap­proach works, etc… But that is­n’t the in­ter­est­ing part.

The biggest surprise is in third place. GLM 5.2, with no scaffolding at all, beat Claude Code by seven points (39% vs. 32%). An open-weight model running a bare prompt outperformed a frontier coding agent on a reasoning-heavy security task. And it did so cheaply! At GLM 5.2’s pricing, the open-weight run cost roughly $0.17 per vulnerability found. For a detection task you might run across thousands of endpoints, per-bug economics are not a footnote. They’re often the deciding factor in whether a technique is usable at scale.

GLM 5.2 wasn’t representative of open weights as a category. It was the standout for sure, though that doesn’t mean the others don’t hold their own. MiniMax M3 (23%) and Kimi K2.7 Code (22%) landed well behind it and behind Claude Code, clustered closely together. Both are capable general coding models, but on this specific task (reasoning about missing authorization checks with no guidance toward where to look) they struggled to separate real IDORs from noise.

The spread between GLM 5.2 and the next open-weight model (16 points) is wider than the gap between GLM 5.2 and Claude Code. So the takeaway isn’t “open weights have caught up.” It’s “one open-weight model has, on this task, under these conditions.”

Takeaways

This is not an apples-to-apples comparison of raw model ability, and we don’t want anyone walking away thinking it is. Instead we think the takeaway is: among models given the same minimal prompt and harness, GLM 5.2, an open-weight model at ⅙ the cost of a frontier LLM, beat Claude Code at a genuinely difficult security research task.

The harness still matters more than the model. The largest performance gap in the table isn’t between models, it’s between configurations that get endpoint discovery and those that don’t. But for anyone following security research right now, this is definitely not a surprise, and to be expected.

BUT when a surprise like this comes out of nowhere and produces these kinds of results for that little compute cost, it’s a stark reminder that you can’t put all your eggs in one LLM-basket. If you’re stuck with an expensive frontier model, even with the best vendor-locked-in harness, you can miss the advantages of swapping models, whether that be cost or performance.

Open-weight models have crossed a threshold worth watching. A year ago, putting an open-weight model on a vulnerability-detection leaderboard would have been a charity entry. Now GLM 5.2 has beaten a frontier agent on a bare prompt, at a sixth of the cost, with the option to run fully in your own environment. For a lot of security teams this is an attractive option.
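The claim that the harness matters more than the model can be roughly checked against the results table. The sketch below is just one way to slice those ten numbers, not an analysis we ran as part of the benchmark; the F1 values are copied from the table and the variable names are ours.

```python
# (configuration, harness, F1 in percent), copied from the results table.
RESULTS = [
    ("Semgrep Multimodal (GPT 5.5)", "Semgrep Multimodal", 61),
    ("Semgrep Multimodal (Opus 4.8)", "Semgrep Multimodal", 53),
    ("GLM 5.2", "Pydantic AI (prompt only)", 39),
    ("Claude Code (Opus 4.6)", "Claude Code SDK", 37),
    ("Claude Code (Opus 4.8/4.7)", "Claude Code SDK", 28),
    ("MiniMax M3", "Pydantic AI (prompt only)", 23),
    ("Kimi K2.7 Code", "Pydantic AI (prompt only)", 22),
    ("GPT-5.5", "Codex", 20),
    ("Nemotron Super 3 120B", "Pydantic AI (prompt only)", 18),
    ("DeepSeek V4", "Pydantic AI (prompt only)", 17),
]


def mean(values):
    return sum(values) / len(values)


with_harness = [f1 for _, h, f1 in RESULTS if h == "Semgrep Multimodal"]
without_harness = [f1 for _, h, f1 in RESULTS if h != "Semgrep Multimodal"]
prompt_only = [f1 for _, h, f1 in RESULTS if h == "Pydantic AI (prompt only)"]

# Average gap between configurations with and without endpoint discovery.
harness_gap = mean(with_harness) - mean(without_harness)

# Widest gap between models that all share the same prompt-only harness.
model_spread = max(prompt_only) - min(prompt_only)

print(f"harness gap: {harness_gap} points")            # 57.0 - 25.5 = 31.5
print(f"prompt-only model spread: {model_spread} points")  # 39 - 17 = 22
```

On this reading, the average gap between configurations with and without endpoint discovery (31.5 points) is larger than the widest gap between any two models sharing the prompt-only harness (22 points). With ten data points and one run each, treat it as a sanity check rather than a measurement.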

We have a caveat: This is one task, one dataset, one run. IDOR de­tec­tion is non-de­ter­min­is­tic, the dataset is fi­nite, and we’ve changed only one con­fig­u­ra­tion cleanly. It might well be the case that for IDOR de­tec­tion GLM-5.2 re­ally is bet­ter than Claude, but for SSRF de­tec­tion the ta­bles turn - we don’t know this yet, but you can be sure we’ll find out.

Lots of love,

Security Research and Engineering @ Semgrep

Age verification is just a precursor to attribution of speech

nonogra.ph

Lots of US states, European countries, and Australia have introduced “age verification” regulations. They present it as the classic “save the children” talking point, but it’s really just a precursor to attribution of speech, particularly attributing your words to your real identity.

This is the state’s dream: your words, undeniably tied to your real-life identity. Law enforcement generally needs two things to take meaningful action: What happened? and Who did it? So let’s go over them, I promise it’s relevant.

1. What happened? - Maybe you dislike datacenters, illegal immigration, or taxes. Whatever it is, the police want to know. If you’re posting on social media, they probably already know.

2. Who did it? - They can’t prosecute PickleDog52, so they rely on some sort of identifier and a lot of investigative work to figure out who to harass or jail. Traditionally this has been achieved with OSINT (looking for clues in your posts, speech patterns, etc.) or subpoenaing the service provider to get your IP or other identifiers like email or phone.

Doing #2 takes a lot of ef­fort and does­n’t scale. Sometimes there’s no prob­a­ble cause that a crime has been or will be com­mit­ted. Sometimes the tar­get uses a VPN or Tor. Sometimes the plat­form does­n’t have re­li­able met­rics on the tar­get. Whatever the rea­son, it usu­ally re­quires hu­mans click­ing but­tons, send­ing emails, or de­cid­ing things.

These “age verification” laws are, by design, identity attribution systems. They attribute digital identities (accounts) to physical identities (SSN, ID, etc.). This is the government’s ideal situation: the ability to quickly (automatically?) get identifying information about inconvenient people, regardless of whether they’re criminals or not.

There’s also something creepily ironic about select corporate elite, politicians, and government officials pushing age verification to “save the children”… Maybe go check their flight logs or hard drives or something… Yikes!

Anyways, I have no doubts that this will become automated once enough of the population has verified their identities. Post an inconvenient message about a politician, or get a little too rowdy in a group chat, and you’ll get a letter in the mail or a knock at the door. Similar to the “love letters” sent by ISPs on behalf of the RIAA and MPAA when you enjoy a DRM-free media file.

Don’t let them win. Don’t ver­ify your age. Don’t give up your iden­tity. If you ab­solutely must, find one of the nu­mer­ous age ver­i­fi­ca­tion ser­vices and pay in Monero.

HackerRank's Open-Source ATS Gave My Resume a Different Score Every Time.

danunparsed.com

This open-source ATS by HackerRank has been blow­ing up re­cently: https://​github.com/​in­ter­view­street/​hir­ing-agent

It’s popped up on LinkedIn and Reddit with hundreds, sometimes thousands, of likes.[1] A coworker mentioned it to me in passing a few days ago.

I’ve de­cided to test it out.

First work­ing run: 90/100. Felt pretty good!

I had some de­bug prints scat­tered around from trou­bleshoot­ing the setup, so I cleaned those up and ran it again.

74/100.

Same re­sume. Same com­mand. The only thing I changed was delet­ing print state­ments.

I dis­abled DEVELOPMENT_MODE and put it in a loop to run a hun­dred times.

The scores range from 66 to 99.

If your com­pa­ny’s cut­off sits at 85, I fail 65% of the time. Same ex­act re­sume, dif­fer­ent luck.

Here’s a quick rundown on how the tool works:

Your PDF gets parsed into text. An LLM is called six times to ex­tract struc­tured in­for­ma­tion — your ba­sics, work his­tory, ed­u­ca­tion, skills, pro­jects, awards. It pulls your GitHub pro­file, scans your top re­pos, ap­pends them as ex­tra con­text. Then every­thing gets fed into the LLM at once to be graded.

The scor­ing is out of 100, with up to 20 bonus points on top:

35 points for open source contributions

30 for personal projects

25 for work experience

10 for technical skills

Up to 20 bonus points for startup experience, a portfolio site, a technical blog, etc.
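To see how those weights add up, here’s a rough sketch of the score composition. The category caps come from the list above; the key names, the `total_score` function, and the clamping behaviour are my own assumptions for illustration, not code from the hiring-agent repo.

```python
# Category caps as listed above; the key names are hypothetical.
CATEGORY_MAX = {
    "open_source": 35,
    "personal_projects": 30,
    "work_experience": 25,
    "technical_skills": 10,
}
BONUS_MAX = 20


def total_score(category_scores: dict, bonus: int = 0) -> int:
    """Sum category scores, clamping each to its cap (clamping is an assumption)."""
    total = 0
    for name, cap in CATEGORY_MAX.items():
        total += max(0, min(category_scores.get(name, 0), cap))
    return total + max(0, min(bonus, BONUS_MAX))


# The base categories sum to 100, and open source plus projects make up 65 of it.
print(sum(CATEGORY_MAX.values()))
print(CATEGORY_MAX["open_source"] + CATEGORY_MAX["personal_projects"])
```

So someone with no public GitHub activity at all starts with 65 of the 100 base points out of reach before work experience is even considered.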

The de­fault model is gem­ma3:4b, run­ning at tem­per­a­ture 0.1 — low, sup­pos­edly nudg­ing the model to­ward de­ter­min­is­tic out­puts.
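For context on what temperature does, here’s a small sketch of temperature-scaled softmax, the standard way sampling temperature reshapes a model’s next-token probabilities. The logits are made-up numbers, and this illustrates the general technique rather than how gemma3 or the hiring-agent code implements it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.5, 0.5]  # hypothetical scores for three candidate tokens

probs_default = softmax_with_temperature(logits, 1.0)
probs_low = softmax_with_temperature(logits, 0.1)

print([round(p, 4) for p in probs_default])
print([round(p, 4) for p in probs_low])
```

At temperature 0.1 nearly all of the probability mass lands on the top token, but not all of it. Over a long generated answer, those small leftover chances are enough for two runs to diverge.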

Here’s what I found when I looked at those in­di­vid­ual cat­e­gories.

Look at tech­ni­cal skills: I scored 8/10 in 98 out of 100 runs. Nearly per­fect con­sis­tency. How come? Because tech­ni­cal skills are a check­list. You ei­ther know React or you don’t. There’s noth­ing for an LLM to judge — a five year old could match that check-list.

Now look at pro­jects — there’s HUGE vari­a­tion.

LLMs struggle to make a judgment call like that consistently. Sometimes my projects “lack architectural complexity”, sometimes they “demonstrate real-world deployment”. Which one the LLM spits out is a roll of the dice.

Temperature 0.1 is already low, but even going down to temperature 0 doesn’t fix this. Someone opened a GitHub issue back in October showing scores of 27, 34, 32, 34, 34, 30 across six consecutive runs at temperature 0.[2] This non-determinism isn’t a bug you can just fine-tune away; it’s a fundamental design flaw.
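Those six scores from the GitHub issue are enough to put numbers on the spread. A quick sketch: the scores are the ones quoted above, while the cutoff of 33 is a hypothetical I picked to show how a threshold turns noise into pass/fail outcomes.

```python
import statistics

# Scores from six consecutive runs at temperature 0, as quoted from the GitHub issue.
issue_scores = [27, 34, 32, 34, 34, 30]


def fail_rate(scores, cutoff):
    """Fraction of runs that land below the cutoff."""
    return sum(1 for s in scores if s < cutoff) / len(scores)


score_range = max(issue_scores) - min(issue_scores)
mean_score = statistics.mean(issue_scores)
distinct_scores = len(set(issue_scores))

print(f"range: {score_range} points")
print(f"mean: {mean_score:.1f}")
print(f"distinct outcomes: {distinct_scores} of {len(issue_scores)} runs")

# Hypothetical cutoff of 33: the same resume fails half the time.
print(f"fail rate at cutoff 33: {fail_rate(issue_scores, 33):.0%}")
```

Four distinct outcomes from six identical runs, with a 7-point range, at a temperature that is supposed to be deterministic.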

I was wor­ried part of this might be the model. After all, gem­ma3:4b was a lo­cal model run­ning on my ma­chine.

Gemini re­sulted in a tighter dis­tri­b­u­tion — scores clus­tered be­tween 48 and 64. But if your cut­off is 60, you’re still fail­ing 28% of the time through no fault of your own.

The Open Source scores have be­come con­sis­tent — that’s a le­git im­prove­ment. But pro­ject scores are still all over the place.

Experience has me the most con­cerned.

25/25.

Every sin­gle run.

I went back and pulled up an old re­sume — one in­tern­ship on it.

Also 25/25.

The clue is in the prompt…

The en­tire thing is two lines long.

No rubric. No ex­am­ples. No an­chors for what earns a 15 ver­sus a 25.

A ju­nior en­gi­neer with one in­tern­ship gets 25/25. A prin­ci­pal en­gi­neer with a decade of dis­trib­uted sys­tems gets 25/25. I get 25/25. Experience has two lines and no an­chors — con­sis­tent, but use­less. Projects has a de­tailed rubric with ex­am­ples but it’s the nois­i­est cat­e­gory — in­con­sis­tent, also use­less. There are some things that LLMs just can’t do well, no mat­ter how you prompt.

Use an LLM to parse a re­sume into struc­tured data — great, that’s what they’re good at. Use one to check whether some­one knows Python — amaz­ing. Use one to judge whether a can­di­date’s ex­pe­ri­ence is worth 18 points or 24 points? You get a vibe-check. Something HR teams, bar rais­ers, and a dozen other ini­tia­tives have spent decades try­ing to avoid.

The 65% weight­ing on open source + pro­jects does­n’t help ei­ther. I’d take the en­gi­neer with 30 years of ex­pe­ri­ence who built S3 over some­one with two in­tern­ships and an open source pro­ject — but this tool would­n’t. Some of the best en­gi­neers I know have built things that never ended up on GitHub. That’s over half of their score gone be­fore any hu­man looks their way.

If you’re an engineer with any say in how your company handles resume screening: please be very careful with AI-screening tools. A tool that can’t differentiate isn’t filtering for quality, it’s just filtering. You might as well throw out half the resumes and tell the applicants you don’t fuck with bad luck.

Correction (June 28): A reader flagged that the resume_evaluation_criteria.jinja template says “Software Intern” on line 1, which is documented nowhere and referenced nowhere else in the repo. This is the same template that later gives bonus points for “founder roles, co-founder positions, or early-stage engineer roles.” I re-ran with an explicit Senior SWE prompt and got identical results: the scoring dimensions are position-agnostic.

[1] Viral LinkedIn (read at your own risk) and Reddit posts. They both claim the repo was open-sourced recently, but based on commit history it’s more likely that it just blew up recently and has been open source since October 2025.

[2] Non-determinism at temperature 0 was flagged in this GitHub issue, opened October 2025.

Pollen tried to remove my article about CEO Callum Negus-Fancey and CTO Bradley Wright, and Google is assisting with it

blog.pragmaticengineer.com

In 2022, I wrote about the damn­ing fall of events tech com­pany Pollen. The short of it:

Pollen seemed to have pulled off the improbable feat of building a business in the notoriously low margin industry of events, surviving Covid-19, and building a solid software engineering organization. In April this year, the company announced it had raised another $150M in fresh funding.

But just three weeks later, Pollen laid off about 200 peo­ple, a third of staff. Leadership as­sured em­ploy­ees all was well. However, from that point on, things got worse. Leadership later pulled the plug on Slack, em­ploy­ees were not paid wages, pen­sion con­tri­bu­tions went miss­ing, and ven­dors were not paid. Some ven­dors took mat­ters into their own hands; on 9 August 2022, JIRA was sus­pended when Atlassian tired of the com­pa­ny’s fail­ure to pay.

On 10 August 2022, Pollen went bank­rupt, col­laps­ing into ad­min­is­tra­tion.

The article reflected badly on Pollen’s founder, Callum Negus-Fancey. He was ultimately responsible for lying to staff, not paying salaries, the missing pension contributions, and the unpaid health insurance for US employees. The story was so bad that the BBC created a documentary titled Crashed: $800M Festival Fail.

And then there was the $3.2M double charge for customers, manually initiated by CTO Bradley Wright, detailed extensively in the documentary Crashed: $800M Festival Fail. That double charge would have been trivial to reverse, but the reversal never happened, customers never got their money back, and the postmortem of the incident was never released to staff.

Four years later, Pollen and Callum Negus-Fancey are at­tempt­ing to erase this shame­ful story from the pub­lic record. The ar­ti­cle is my orig­i­nal writ­ing, and thus I am the copy­right holder of it. So imag­ine my sur­prise when I was no­ti­fied that Google re­moved the ar­ti­cle from its search re­sults thanks to a copy­right in­fringe­ment claim it re­ceived:

It seems that anyone can file a bogus copyright claim to get an article they don’t like removed from Google’s search index. That is what happened in this case. I have no information on who filed the copyright claim, and even less on who claims to be the copyright owner. After all, I am the only possible copyright owner!

And Google has gone ahead and re­moved my ar­ti­cle about Pollen’s shame­ful col­lapse from its search re­sults.

I have the option to appeal, which I have done.

Google’s copy­right re­moval sys­tem is clearly be­ing abused, to a com­i­cal de­gree. Someone does­n’t like that I went into ex­treme de­tail about the events at Pollen - all of which are facts. And, for some rea­son, bo­gus copy­right re­quests can be weaponized to re­move in­for­ma­tion like this from Google’s search in­dex.

I managed to find the bogus DMCA complaint submission, after Google removed my site from search results. It is absolute BS: it claims that my original article is a copy of a New York Post article. Which is absolute nonsense!

This “Ellie Piee” claimed that this 1998 article titled Band Leader Hits Winning Chord was copied by my article Inside Pollen’s Collapse: “$200M Raised” but Staff Unpaid - Exclusive. The two do not even share a single sentence!

The fake DMCA is made by a fake profile from a country with zero inhabitants. The removal requests by this “Ellie Piee” are made from the country called Bouvet Island, an uninhabited Norwegian dependent territory in the South Atlantic/Southern Ocean near Antarctica. It has zero inhabitants, and is referred to as “the world’s most remote island.”

Why does Google al­low fraud­u­lent DMCA no­tices to be filed with no penalty? My own spec­u­la­tion is that it is clear enough that ei­ther Pollen, or its for­mer CEO Callum Negus-Fancey, or its co­founder and COO Liam Negus-Fancey or some­one else re­lated to the com­pany hired rep­u­ta­tion firms to re­move Pollen ar­ti­cles from Google. This firm then files the most bo­gus re­quests un­der fake names sup­pos­edly re­sid­ing in un­in­hab­ited re­gions of the world, and Google com­plies.

I never thought I would have to re­visit the shame­ful his­tory of Pollen, but some­one at the com­pany felt the need to prompt me to do so.

Lawsuits are still on­go­ing against Pollen, by the way. Now that some­one from Pollen tried to erase the record of this story, I got a bit of re­newed in­ter­est in what has hap­pened since. In California, the law­suit Tayler Ulmer vs Pollen is still in progress, sum­ma­rized as:

Tayler Ulmer and five other named former employees, “on behalf of themselves and all similarly situated employees”, claim to have been laid off without paid wages and benefits, plus claiming possible fraud

The fil­ing says that Pollen ex­ec­u­tives Callum Negus‑Fancey, Liam Negus‑Fancey, and James Ellis are per­son­ally li­able in this law­suit

The lawsuit wants to reclaim unpaid wages, unpaid severance, restoration of lost 401(k) contributions, and a ruling that all the named entities and individuals are jointly liable, including successor entities, so employees can collect regardless of how Pollen shuffled assets and dissolved subsidiaries

I am wish­ing best of luck to the claimants - for­mer Pollen em­ploy­ees - and we will see how the judge rules in this law­suit. The more Pollen wants to si­lence me writ­ing about this, the more I’ll likely pay at­ten­tion.

Pollen executives should have read what the Streisand effect means!

Professor denounces mass AI fraud on an exam at Brown University: ‘Academic integrity is at risk’

english.elpais.com

The temp­ta­tion to use ar­ti­fi­cial in­tel­li­gence (AI) to cheat is shak­ing up elite uni­ver­si­ties in the United States. Professor Roberto Serrano, who is the Harrison S. Kravis University Professor of Economics at Brown University, has de­tected a mas­sive fraud in one of the classes he teaches, ECON 1170, an ad­vanced un­der­grad­u­ate course in math­e­mat­i­cal eco­nom­ics. He has con­clu­sive ev­i­dence that at least 50 stu­dents cheated on the March midterm exam, mak­ing it the biggest known scan­dal at Brown and in the en­tire Ivy League, which brings to­gether the East Coast’s eight most elite pri­vate uni­ver­si­ties, in­clud­ing Princeton, Harvard, Yale, Columbia, Cornell, Dartmouth College and University of Pennsylvania.

When he reported the case to high-ranking officials at Brown, he got a cold reaction. The response from the president, he said, was absolute silence. The dean did not comment either until Serrano took the case before the Academic Code Committee. At that point, he received a note acknowledging that what had happened in his classroom was “a wake-up call.” Serrano, a Madrid-born economist who has been at Brown for 34 years, believes this is not enough. “That cannot be the university’s position before an incident of this magnitude. Academic integrity is a value worth defending. The faculty cannot be left on its own in a battle that is decisive if we want to preserve the future of higher education,” explains the 61-year-old professor in a telephone conversation from Providence, Rhode Island. To prevent AI from ending the prestige and utility of teaching, he feels, it is necessary to adopt a different approach: “We need to publicly admit the seriousness of the situation and open up a broad debate about the real extent of the problem.”

Serrano is con­sid­ered one of the lead­ing pro­po­nents of ap­ply­ing game the­ory—the field that earned John Nash the 1994 Nobel Prize in Economics—to the analy­sis of mar­kets. After earn­ing a bach­e­lor’s de­gree in eco­nom­ics from Spain’s Universidad Complutense de Madrid, where he has been Doctor Honoris Causa since 2019, Serrano went on to ob­tain a PhD at Harvard and, af­ter com­plet­ing his stud­ies, re­ceived sev­eral job of­fers. Convinced that he wanted to de­vote his life to re­search and teach­ing, he ac­cepted a po­si­tion at Brown, where he re­mains to this day. He has been the re­cip­i­ent of sev­eral awards, in­clud­ing the King of Spain Prize for Economics in 2024.

At age 17, Serrano went blind. In a matter of months, the retinal dystrophy that had dogged him since he was little, but which still allowed him to read and play soccer, took away his sight entirely. After a short-lived crisis, he decided it would not stop him. He learned Braille, and his excellent academic record opened up the doors of Harvard. “Of course it affects my life, but one shouldn’t over-dramatize. We economists understand reality as a set of people responding to optimization problems with restrictions. I view my disease simply as one more restriction that I have to deal with, and I optimize based on that,” he says.

Serrano al­ways has an as­sis­tant in class to do the work on the white­board and han­dle the slides. Everything else, from prepar­ing the class ex­er­cises to tu­tor­ing, as well as writ­ing pa­pers and books, he does by him­self; re­cently these tasks have be­come eas­ier thanks to tech­no­log­i­cal progress.

This year, the economist decided that both the midterm and the final exams for his course would be of the take-home, closed-book type (there is a certain tradition of this at Ivy League schools). “It’s a very nice kind of exam, because as you’re giving students practically unlimited time to complete it, it lets you make it harder than normal, to see how far they can go.” In this case, Serrano changed some of the model assumptions they had seen in class, and asked students to demonstrate whether certain statements were true or false under the new assumptions.

The course, which he has been teaching for years, is not an easy one: it typically attracts few students, but very good ones. He has never had more than 30 students enrolled at a time, and on some occasions he had only eight. This semester, probably because of the new evaluation system, 86 students signed up for the class. The results of the midterm exam, which was administered on March 5, were extraordinary, with an average score of 96 out of 100. Forty students scored a perfect 100. The people who corrected the exams warned him about several irregularities. “Some answers contained unusual passages that coincided with results obtained after running the questions through ChatGPT,” he says.

Serrano did not void the midterm exam, but warned students that the final one, which counted for 50% of the final grade, would be held in person. He also said that if the grade distribution was not similar to the midterm, only the final exam would be taken into account. The average score dropped to 48 out of 100. Of the 86 students who did the midterm exam, only 59 showed up for the final one. And of the 27 who did not show up, 22 had scored a perfect 100 in the midterm exam.

“The empirical evidence of fraud is overwhelming,” says the professor, who has decided to make changes for the coming academic year. First, the weekly exercises will not count towards the final grade, as these could be done with AI. Second, no more take-home exams, no matter how appropriate they would be.

The shoot­ing that changed every­thing

Brown University made headlines on December 13 of last year for reasons that were not strictly academic. Neves Valentes, a 48-year-old former PhD student, showed up on campus with a gun in his hand and started firing. Two people died and nine more sustained injuries, in some cases serious ones. “We were living in an apartment in downtown Providence, and that Saturday we started to see a lot of police cars and ambulances headed for the university,” Serrano recalls. His phone soon started getting messages. The shooting took place inside a classroom where a review session was underway for Introduction to Economics, led by one of his colleagues, Rachel Friedberg. These are sessions held to answer any questions that might arise ahead of the final exams. Two of the nine injured students were enrolled in Serrano’s class. They fought for their lives for weeks, and happily both survived.

Two days later, on the 15th, when the names of the deceased were released, he found out that one of the two fatalities was Ella Cook. The young woman had been to Serrano’s office that very same week to introduce herself. She had told him she was going to enroll in his Intermediate Microeconomics class that semester, and asked if he could be her career advisor for her joint concentration in economics and mathematics. “We chatted for quite a while. She was full of projects, ideas and hope. She was very interested in her studies. When I found out, I couldn’t believe it. I’ve been living in the U.S. for a long time, and I still cannot understand how this country still upholds the right to bear arms. There are cases like this one all the time, but you carry on with your life because they don’t affect you personally. Until one does. And it hurts, it hurts a great deal.”

Serrano was affected. “I was in a really bad place mentally for a while. After what happened, it occurred to me that that semester, which was beginning a month and a bit after the shooting, exams could be take-home in order to make life a little easier for students. Many of them still feel anxiety when they are on campus because of what happened in December.”

But now Serrano worries about the fact that some of his students decided to cheat. And that the university would side with them, in part because it gets generous donations from very wealthy families whose children often study there. “This means that the kids always get the benefit of the doubt; I’ve seen it on other occasions,” he notes. But it also hurts him that the one time in 34 years that he decided to offer a take-home exam, for highly justified reasons, the response was wide-scale fraud.

The temp­ta­tion of AI

Artificial in­tel­li­gence is al­ter­ing cen­tury-old tra­di­tions at America’s most elite uni­ver­si­ties. Princeton, for in­stance, has de­cided to end a prac­tice that had been up­held for 133 years: from now on, pro­fes­sors will proc­tor in-per­son ex­ams. This had­n’t been the case since 1893, when an Honor Code went into ef­fect by which all stu­dents pledged not to cheat: the teacher would hand over the exam, leave the room, and walk back in to pick up the tests at the end. If any­body cheated, it would be up to other stu­dents to re­port it.

“But A.I. has made deception easier and more remunerative than ever before,” wrote the U.S. journalist Theo Baker in a recent article in The New York Times. “I don’t know a single person who hasn’t used A.I. to get through some assignment in college.” The 22-year-old writer has just graduated from Stanford, where he started classes two months before the first version of ChatGPT was released. In his four years as a student, he has witnessed how his fellow students have been unable to resist the temptation.

Serrano agrees that AI gives students more incentives to cheat. That is why, he says, these cases cannot be swept under the rug. On the contrary, they should serve to open up an in-depth debate. “If we no longer defend truth and decency and honesty, then what kind of credibility are we going to have as academics?”

GitHub - librepods-org/librepods: AirPods liberated from Apple's ecosystem.

github.com

Warning

li­bre­pods.org is not an of­fi­cial web­site of the LibrePods pro­ject. It in­ac­cu­rately claims to be the of­fi­cial web­site of the pro­ject by claim­ing copy­rights and us­ing the LibrePods logo in the footer. And at the same time, they say that the pro­ject is not af­fil­i­ated with the LibrePods pro­ject or its de­vel­op­ers.

Please report any other such websites to me@kavish.xyz

What is LibrePods?

LibrePods al­lows you to use AirPods fea­tures that are ex­clu­sive to Apple de­vices. It im­ple­ments the pro­pri­etary pro­to­col used to ex­change data be­tween AirPods and Apple de­vices, en­abling fea­tures like chang­ing noise con­trol modes, fast ear de­tec­tion, ac­cu­rate bat­tery sta­tus, head ges­tures, con­ver­sa­tional aware­ness, and more on non-Ap­ple plat­forms.

Feature avail­abil­ity

Press speed

Press and Hold du­ra­tion

Noise Cancellation with sin­gle AirPod

Volume con­trol on swipe

Volume swipe speed

Press and Hold to cy­cle be­tween lis­ten­ing modes/​in­voke dig­i­tal as­sis­tant (invoking dig­i­tal as­sis­tant needs a re­cent firmware)

Configure call con­trols

Personalized vol­ume

Loud Sound Reduction (needs VendorID spoof­ing)

Microphone side

Pause me­dia when falling asleep (needs a re­cent firmware)

Enable Off lis­ten­ing mode to switch to Off

Find My

The fol­low­ing fea­tures re­lated to Find My are planned, but re­quire fur­ther RE and might need root on Android:

Add your AirPods to the Find My net­work

Play sound through charg­ing case to find it

Notify when leav­ing be­hind

Toggle case charg­ing sounds

Spatial Audio

The app does not cur­rently pro­vide head track­ing in­for­ma­tion to Android for the OS to per­form HRTF. This has not been ex­plored com­pletely, and it might need root.

Spatializing stereo sound is be­yond this pro­jec­t’s scope and will never be avail­able. Many OEMs have an im­ple­men­ta­tion of their own for this.

Heart Rate Monitoring (AirPods Pro 3 and later)

This is being worked on; check the #reverse-engineering channel on the LibrePods Discord server for more information. If it is ever implemented, it will most likely need root on Android.

High qual­ity two-way au­dio

On iOS/​iPa­dOS, you can con­tinue us­ing A2DP while AirPods send the au­dio stream from its mi­cro­phone over AACP.

Since this needs deeper in­te­gra­tion with au­dio on Android, it will most likely need root.

Installation

Android

Linux

VendorID Spoofing

Turns out, if you change the VendorID in DID Profile to that of Apple, you get ac­cess to sev­eral spe­cial fea­tures!

You can do this on Linux by editing the DeviceID in /etc/bluetooth/main.conf. Add this line to the config file: DeviceID = bluetooth:004C:0000:0000. On Android, you can enable the “act as Apple device” setting in the app’s settings (shown only when Xposed is available and the LibrePods module is enabled).
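The Linux edit described above can also be scripted. The following is only a minimal sketch: `set_device_id` is a hypothetical helper, not part of LibrePods, and it assumes the `DeviceID` key belongs in the `[General]` section of main.conf. It works on the file's text, so you can inspect the result before writing anything back.

```python
import re

# Value quoted in the text above (Apple's VendorID in the DID profile).
APPLE_DEVICE_ID = "bluetooth:004C:0000:0000"


def set_device_id(conf_text: str, device_id: str = APPLE_DEVICE_ID) -> str:
    """Return conf_text with DeviceID set inside the [General] section.

    Hypothetical helper; assumes DeviceID lives under [General].
    """
    lines = conf_text.splitlines()
    new_line = f"DeviceID = {device_id}"
    in_general = False
    general_header = None  # index of the [General] header line, if found

    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_general = stripped == "[General]"
            if in_general:
                general_header = i
            continue
        # Replace an existing (possibly commented-out) DeviceID entry.
        if in_general and re.match(r"#?\s*DeviceID\s*=", stripped):
            lines[i] = new_line
            return "\n".join(lines) + "\n"

    if general_header is None:
        # No [General] section found: append one at the end.
        lines += ["[General]", new_line]
    else:
        # Section exists but has no DeviceID entry: add it under the header.
        lines.insert(general_header + 1, new_line)
    return "\n".join(lines) + "\n"
```

To use it, read /etc/bluetooth/main.conf, pass the text through `set_device_id`, review the output, and write it back with root privileges. Keeping a backup copy of the original file is a sensible precaution.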

Multi-device Connectivity

Up to two devices can be connected to AirPods simultaneously, for both audio and control, with seamless connection switching. When Android takes over the AirPods, the Apple device shows the same notification it would show for another Apple device (“Move to iPhone”). Android also shows a popup when the other device takes over.

Accessibility Settings and Hearing Aid

Accessibility set­tings like cus­tomiz­ing trans­parency mode (amplification, bal­ance, tone, con­ver­sa­tion boost, and am­bi­ent noise re­duc­tion), and loud sound re­duc­tion can be con­fig­ured.

All hearing aid customizations can be done from Android (Linux soon), including setting the audiogram result. The app doesn’t provide a way to take a hearing test because that requires much more precision. It is much better to use an already available audiogram result.

Protocol and Reverse Engineering

Please re­fer to the Wireshark dis­sec­tor plu­gin by Nojus (@pabloaul) for more in­for­ma­tion on the pro­to­cols used: pabloaul/​ap­ple-wire­shark

The dis­sec­tor had not been used in LibrePods for most of the im­ple­men­ta­tion; I had re­verse en­gi­neered the pro­to­col my­self be­fore this dis­sec­tor was made. But many (future) fea­tures in­clud­ing two-way high qual­ity au­dio and spa­tial au­dio would not have been pos­si­ble with­out their RE ef­forts!

Use of AI

Android app

These parts of the app were com­pletely AI-generated:

Head Gestures - all of it, in­clud­ing logic and the UI

The off­set setup with r2+the xposed mod­ule (both ver­sions)

Troubleshooter and LogCollector

Everything else (the background service, the Bluetooth manager classes for AACP and ATT, the entire UI, even the smallest components) was written manually.

Some parts of the UI components were borrowed from Kyant0’s demo app, which is licensed under Apache License 2.0.

Linux (rewrite)

The aacp.rs and the att.rs files were translated from Kotlin to Rust with AI. Some parts of the media_controller.rs file, mainly the pulse integration, were also AI-generated.

Supporters

A huge thank you to every­one sup­port­ing the pro­ject!

Special Thanks

@tyalie for mak­ing the first doc­u­men­ta­tion on the pro­to­col! (tyalie/AAP-Protocol-Definition)

@rithvikvibhu and folks over at la­grange­point for help­ing with the hear­ing aid fea­ture (gist)

@devnoname120 for help­ing with the first root patch

@timgromeyer for mak­ing the first ver­sion of the linux app

@hackclub for host­ing High Seas and Low Skies!

Of course, every­one who has con­tributed to the pro­ject in any way, in­clud­ing by test­ing, shar­ing feed­back, or just show­ing in­ter­est!

Alternates for other plat­forms:

CAPod - A com­pan­ion app for AirPods on Android. (play store | source code). Use this if you’re us­ing Android ver­sion 16 QPR3 or be­low and are not rooted.

MagicPods for Steam Deck (website)

MagicPods - if you’re looking for “LibrePods for Windows” (ms store installer | website)

License

LibrePods - AirPods lib­er­ated from Apple’s ecosys­tem Copyright (C) 2025 LibrePods con­trib­u­tors

This pro­gram is free soft­ware: you can re­dis­trib­ute it and/​or mod­ify it un­der the terms of the GNU General Public License as pub­lished by the Free Software Foundation, ei­ther ver­sion 3 of the License, or any later ver­sion.

This pro­gram is dis­trib­uted in the hope that it will be use­ful, but WITHOUT ANY WARRANTY; with­out even the im­plied war­ranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more de­tails.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Trademark Notice

The GPL does not grant any rights to use the LibrePods name, logo, or brand­ing. The LibrePods name and logo may not be used for soft­ware, web­sites, do­mains, prod­ucts, ser­vices, or other pro­jects in a man­ner that sug­gests af­fil­i­a­tion with, en­dorse­ment by, or as­so­ci­a­tion with the of­fi­cial LibrePods pro­ject with­out prior per­mis­sion.

If you see any misuse of the LibrePods name or logo, please report it to me@kavish.xyz.

The SF Pro font used in the Android app is the property of Apple Inc. It will be removed in a future version of the app and replaced with an open alternative.

AirPods, AirPods Pro, AirPods Max, and the AirPods logo are trade­marks of Apple Inc. The LibrePods pro­ject is not af­fil­i­ated with or en­dorsed by Apple Inc. in any way.

Memory Prices

dam.stanford.edu

Historic and cur­rent mem­ory and stor­age prices, col­lected in the spirit of John C. McCallum’s clas­sic mem­ory-price dataset — in­ter­ac­tive, with the raw data down­load­able. Hover for de­tails, click the leg­end to tog­gle se­ries, drag or use the slider to zoom, and use the cam­era icon to ex­port an im­age.

Price per gi­ga­byte over time

Historical low­est $/GB on a log scale — one line per mem­ory type: DRAM, NAND flash, and HBM.

DRAM price by gen­er­a­tion

The DRAM line above, bro­ken out by gen­er­a­tion across the full his­tory — Pre-DDR (SDRAM/core), DDR, DDR2, DDR3, DDR4, DDR5. (Generation is in­ferred from prod­uct de­scrip­tions, so older points are ap­prox­i­mate.)

Accelerator cost break­down

Modeled es­ti­mates from Epoch AI: quar­terly ac­cel­er­a­tor cost across the four largest AI-accelerator de­sign­ers — Nvidia, AMD, Google (TPU) and Amazon (Trainium) — stacked by com­po­nent (HBM, logic die, pack­ag­ing/​CoWoS, aux­il­iary), a pro­duc­tion-vol­ume-weighted av­er­age.

HBM price by gen­er­a­tion

By HBM gen­er­a­tion (HBM2e → HBM3 → HBM3e → HBM4). HBM is sold only to ac­cel­er­a­tor mak­ers on con­fi­den­tial con­tracts — there is no pub­lic spot mar­ket — so these are sparse in­dus­try-an­a­lyst es­ti­mates (TrendForce / SemiAnalysis), not trans­ac­tion prices. HBM4 is pro­jected (launches Q3 2026). $/TBps is cost per unit of mem­ory band­width (stack price ÷ per-stack band­width).
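As a rough illustration of the $/TBps metric defined above (stack price ÷ per-stack bandwidth), here is a minimal sketch. The function name and the input figures are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the dataset.

```python
def cost_per_tbps(stack_price_usd: float, stack_bandwidth_tbps: float) -> float:
    """Cost per unit of memory bandwidth: stack price / per-stack bandwidth."""
    if stack_bandwidth_tbps <= 0:
        raise ValueError("per-stack bandwidth must be positive")
    return stack_price_usd / stack_bandwidth_tbps


# Hypothetical inputs, not taken from the dataset.
example = cost_per_tbps(stack_price_usd=300.0, stack_bandwidth_tbps=1.5)
print(f"${example:.0f} per TB/s")  # 300 / 1.5 = 200
```

Because the metric divides by bandwidth, a newer generation can cost more per stack and still come out cheaper per TB/s if its bandwidth grows faster than its price.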

Sources and method

Caveats

$/GB is the cheap­est re­tail price in nom­i­nal USD — not con­tract, av­er­age, or in­fla­tion-ad­justed, and re­tail lags con­tract pric­ing.

The cheap­est list­ing of­ten tracks an end-of-life gen­er­a­tion be­ing cleared out, not the lead­ing edge — the per-gen­er­a­tion chart shows this.

These are cheap­est listed prices over time (via Keepa), not con­firmed sales. For the SSD data, ob­vi­ous post­ing er­rors are re­moved — any month a drive is listed more than 60% be­low its own typ­i­cal price (e.g. a $130 SSD shown at $4) is dropped.

The DRAM line splices two sources at mid-2024 (McCallum → Keepa); a small step there is ex­pected, since Amazon’s cheap­est clear­ance can sit be­low McCallum’s rep­re­sen­ta­tive low.

HBM fig­ures are mod­eled es­ti­mates (cost share and spend), not mea­sured prices.
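The SSD posting-error rule in the caveats above (drop any month listed more than 60% below the drive's own typical price) could be sketched roughly as follows. This is an illustration, not the site's actual pipeline: the function name is hypothetical, and using the median as the "typical price" is an assumption, since the page does not say how typical is defined.

```python
from statistics import median


def drop_posting_errors(
    monthly_prices: dict[str, float], max_discount: float = 0.60
) -> dict[str, float]:
    """Drop months priced more than max_discount below the typical price.

    Assumption: "typical price" is taken to be the median of the drive's
    own monthly prices; the source does not define it.
    """
    if not monthly_prices:
        return {}
    typical = median(monthly_prices.values())
    floor = typical * (1.0 - max_discount)  # lowest price still kept
    return {month: price for month, price in monthly_prices.items() if price >= floor}


# Hypothetical listing history echoing the "$130 SSD shown at $4" example.
listing = {"2024-01": 130.0, "2024-02": 128.0, "2024-03": 4.0, "2024-04": 125.0}
cleaned = drop_posting_errors(listing)
print(sorted(cleaned))  # the $4 month is filtered out
```

A median is used here because a single bad listing barely moves it, whereas a mean would be pulled toward the very error the filter is trying to catch.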

Updates

DRAM and NAND $/GB re­fresh monthly from Keepa; HBM up­dates quar­terly (Epoch AI). The McCallum back­bone and HBM es­ti­mates are fixed. The down­load­able CSV lists every point with its source.

About

Compiled and maintained by David Shim, Stanford DAM project. Questions or corrections: hsshim@stanford.edu.

Zanagrams - free daily word puzzle

zanagrams.com

The Boeing 747 Begins Its Final Descent

www.theatlantic.com

I. The Boneyard

Through the heat haze, air­plane tails rose from the desert. As I steered off the in­ter­state to­ward Pinal Airpark, in Marana, Arizona, I got my first view of a corpse in full: a stark-white Boeing 747, its wings sheared off, its pas­sen­ger doors open to the dust and wind, a rick­ety set of airstairs invit­ing no one aboard. The plane was a mem­ory, a ruin, but its swoop­ing, humped nose was still strik­ing—a vis­age that sig­naled the free­dom of move­ment in the Jet Age.

I was ar­riv­ing at this des­o­late site north of Tucson, where air­planes go to die, to mourn the 747, the orig­i­nal jumbo jet—a.k.a. the Whale, the Longreach, the Sky Cruiser, the Mother of All Airliners, the Queen of the Skies. For 50 years, the air­craft was the prin­ci­pal host of Important Journeys: a young stu­den­t’s trip to study abroad in Paris, a first-gen­er­a­tion American’s pil­grim­age to their an­ces­tral home in Hungary, an Iranian fam­ily flee­ing the 1979 rev­o­lu­tion. Combining the im­men­sity of an ocean liner and the el­e­gance of a swan, the 747 is the only com­mer­cial jet that de­serves to be called beau­ti­ful. Over the past two decades, air­lines have stopped us­ing it as a pas­sen­ger plane and re­placed it with smaller air­craft that are more ef­fi­cient, but far less ma­jes­tic and mem­o­rable. The 747 was once a sym­bol of American might, in­ven­tion, progress, and pop­ulism. Now it em­bod­ies the de­cline of all of those val­ues.

Jim Petty, the air­park’s man­ager, led me out the back door of his small of­fice to his truck, and we peeled out to­ward the long rows of for­saken air­craft. I had been call­ing Pinal a bone­yard, but Petty told me that he does­n’t like the term. Some planes get brought here for a checkup, oth­ers for in­ten­sive care or stor­age. Some ail­ing ves­sels are de­liv­ered here with every in­ten­tion of fly­ing again, like an el­derly rel­a­tive sent to a short-term-care fa­cil­ity. But if re­ha­bil­i­ta­tion proves im­pos­si­ble, Pinal be­comes their fi­nal des­ti­na­tion.

Petty parked us un­der a TWA 747 that had been sit­ting there for al­most 30 years. Its enor­mity eclipsed the hot desert sun. The tires alone were more than four feet tall, a memo­r­ial to out­size am­bi­tions. From 1970, when the first 747 en­tered ser­vice, to 2023, when Boeing stopped build­ing the plane, the com­pany man­u­fac­tured 1,574 of them, in­clud­ing the two that still serve as Air Force One. Most 747 routes spanned oceans and con­ti­nents, giv­ing trav­el­ers a speed­ier op­tion than the Queen Mary had across the Atlantic, or the California Zephyr across the West. For gen­er­a­tions, these jumbo jets flew to London, to Osaka, to San Francisco. But more re­cently, 747s have been fly­ing to Pinal—drawn here by their own ob­so­les­cence.

“Some day,” Petty said, “there’s just going to be one left.”

II. Birth of an Icon

Starting the en­gines brings a sud­den hush fol­lowed by a smooth roar. At 300-some met­ric tons, fully loaded, and with a wingspan that would cover two-thirds of a foot­ball field, the plane could be tricky to drive but was sup­ple to fly. On the ground and about three sto­ries up, pi­lots were aware of all they could­n’t see. Once air­borne, though, a sense of in­fin­ity dawned out the cock­pit win­dows, and of sheer mass be­hind the pi­lots. In the cabin, the heft makes the plane feel al­most still, even at 500 miles an hour and 35,000 feet; it is the only plane I have ever flown in whose take­off and land­ing were im­per­cep­ti­ble to the senses. Paul Gallaher, a long­time 747 cap­tain, told me he could­n’t re­mem­ber a hard land­ing. He said that it was the plane every pi­lot wanted to fly, the top rung of a com­mer­cial-avi­a­tion ca­reer.

Like most tech­no­log­i­cal in­no­va­tions of the 20th cen­tury, the 747 pro­ject was cat­alyzed by the mil­i­tary. In the early 1960s, Boeing pro­duced de­signs in re­sponse to a gov­ern­ment re­quest for a large mil­i­tary trans­port air­craft. Lockheed won that job and pro­duced the C‑5 Galaxy. Boeing’s loss steeled its re­solve and freed up en­gi­neers to work on the biggest air­plane ever built for com­mer­cial ser­vice. Boeing ac­quired 780 acres of land in Everett, Washington, just north of Seattle, and erected an as­sem­bly com­plex that in­cluded the largest build­ing in the world by vol­ume—at a cost of $200 mil­lion ($2 bil­lion in to­day’s dol­lars)—to house up to eight 747s un­der con­struc­tion. About 2,700 en­gi­neers la­bored on the pro­ject.

Aviation executives called a risk like this “the sporty game,” a shameless mid-century, flannel-suit euphemism for staking an entire company on a single long-odds bet. Had the 747 project faltered, Boeing would likely have gone down with it.

Thomas Gray, who joined Boeing in 1961 as an electrical engineer, calls himself “the first passenger on the first 747,” responsible for in-flight testing. “Whether it was strain, arrows, airspeed, whatever,” he told me, “we had to measure all that data onto a tape machine.” Gray, a lanky man with a gray mustache, volunteers as a docent at the Museum of Flight in Seattle, just across from Boeing Field. For 17 years, the 747 served as his office. This was the Wild West of commercial aviation, after planes had been proven but when the Jet Age was still new and exciting.

Watching the plane’s first flight from the blast fence, in 1969, Gray remembers telling a fellow engineer beside him, “One of these days, there are going to be 747s lined up to take off.” He was right. Boeing’s earlier jets (the 707, 727, and 737) carried fewer than 200 souls. The 747 could carry north of 490 passengers, plus a massive amount of cargo, and still fly thousands of miles farther than most existing jets. Juan Trippe, who ordered 25 747s for Pan Am in 1966 at a cost of $5 billion in today’s dollars, saw the plane as an instrument of human flourishing. “The new era of mass travel between nations may well prove more significant to human destiny than the atom bomb,” he said at the time, calling the aircraft “a great weapon for peace.”

The jumbo jet would make the world smaller in the same way that rail­roads and ocean lin­ers had in the cen­tury prior. This was the age of seem­ingly im­pos­si­ble en­deav­ors un­der­taken and ac­com­plished de­spite ex­treme risk; five months af­ter the 747 first took flight, Neil Armstrong set foot on the moon. This spirit of rar­efied American in­ven­tion, fu­eled by both gov­ern­ment in­vest­ment and pri­vate cap­i­tal, was meant to serve all hu­mankind.

It worked. From 1969 to 1979, the num­ber of peo­ple fly­ing every year more than dou­bled, to 640 mil­lion. Flying was glam­orous—in part be­cause it was ex­pen­sive, but also be­cause the 747 was built for hu­man com­fort as well as fuel ef­fi­ciency.

Speed was expected to supplant comfort, eventually. In anticipation of supersonic flight, the 747 was designed to shift into cargo duty sometime by the end of the ’70s; its cygnine hump allowed containers to be loaded through its nose, which opens like the mouth of a cartoon shark. But the supersonic passenger jet was a bust, and the 747 persisted. Its accidental longevity defined an era.

III. Legroom and Caviar

The British ar­chi­tect Norman Foster once called the 747 his fa­vorite build­ing of the 20th cen­tury. Like the ocean lin­ers and rail­cars it re­placed, the 747 is more than a ve­hi­cle. It is also a dwelling.

The up­per-deck lounge be­came the first and most im­por­tant room in this build­ing—though some­what in­ci­den­tally. The charge to make the plane ca­pa­ble of load­ing cargo through its nose re­quired the flight deck to be sit­u­ated above the main sec­tion. Once the flight deck was placed high, over the cargo slot, the plane needed to sweep back ac­cord­ingly for aero­dy­nam­ics, one re­tired Boeing en­gi­neer told me. What to put in that space?

A cock­tail bar, ob­vi­ously. Air France and United in­stalled lounges with ro­tat­ing seats to al­low pas­sen­gers to min­gle. Air India put in bright-red car­pet­ing and so­fas, with im­ages of ap­saras on the bulk­head be­hind them. Qantas of­fered the nau­ti­cal-themed Captain Cook Lounge, with lantern sconces, in­tri­cate wood­work, and rope-wrapped swivel seats and cock­tail ta­bles.

Boeing named its first 747 the City of Everett, af­ter its birth­place, and painted it in Boeing’s cor­po­rate color scheme: white with a red cheat­line, a gray belly, and a black glare panel. Gray and his col­leagues used the City of Everett for test­ing; it was never out­fit­ted with an in­te­rior. The air­craft now lives at the Museum of Flight. Visitors can take a tour in­side but are gen­er­ally not al­lowed up the spi­ral stair­case to the up­per deck and cock­pit.

I ne­go­ti­ated an ex­cep­tion. When I as­cended the tight stair­well, I was sur­prised to see it dec­o­rated as a lounge, com­plete with an­tiqued mir­rors on the rear bulk­head, blue car­pet­ing, and vivid, mod-printed seat­ing. At some point in the City of Everett’s long life, an up­hol­stery shop had re­done the up­per-deck seat­ing with old Braniff Airways fab­ric.

Peggy Verger and Cheryl Grimm, two former United flight attendants, met me in the lounge to share memories of service on early 747s. For Verger, luxury wasn’t really the point of the plane’s interior design. “We’ve lost the personality of flying,” she said. At first I thought she was talking about the style: the Pucci-designed Braniff uniforms, or Eero Saarinen’s modernist terminals. But she meant personalities. She meant people. “We loved talking to the people,” Verger said. “The lounges, the wide aisles. We were tight with the passengers. ‘So how’s your dog?’ We were much more social.”

Travelers turned in their seats to their neighbors. They stood up and chatted with someone across the aisle. They moved through the cabin to a lounge, or to ask for a coffee. Sometimes, after giving children pin-on airline wings, the stewardesses (as they were called at the time) would recruit them to help pass out nuts or matches. “It just was all so different,” Verger added. “The passenger was a person.”

The food in first class was rich: hand-carved meats, lob­ster, caviar. Even fliers in the back ate like roy­alty; on a 1970 Pan Am flight from JFK to Heathrow, a coach-class pas­sen­ger would have en­joyed filet mignon. Small sofa lounges were tucked into the front or rear of some air­craft. One Continental Airlines 747, called the Proud Bird of the Pacific, had a spa­cious Polynesian Pub in the coach cabin, with flo­ral-print seats around low-slung pedestal ta­bles. American Airlines built a coach lounge with black-and-white geo­met­ric car­pet and red up­hol­stered seat­ing that any­one might mis­take for a ho­tel lobby. American in­stalled a pi­ano bar there, too, al­though it used an elec­tric or­gan (pianos are hard to keep in tune when sub­jected to the forces of take­off and land­ing).

Features like these inspired Continental, in 1971, to advertise its 747 flights as “Air Cruises.” Grimm recalled constant activities and contests. Passengers celebrating a birthday or an anniversary could order a cake or a bottle of champagne. “It was just a nice party,” she said.

The cab­in’s ceil­ings rose to eight feet—even at the win­dow seats—and the ex­te­rior walls stood nearly straight up and down, al­low­ing even the tallest pas­sen­gers to stand up­right, like a hu­man in­stead of a sar­dine.

Use of the whole space was encouraged. Why make a building for people to remain seated in? A TWA pamphlet about 747 service from the early 1970s encouraged passengers to exercise on their flight: “Walk 13 times up and down the cabin and you’ve actually covered one mile.” Continental once boasted of removing 41 seats for four extra inches of legroom in coach. Even on a three-hour domestic flight, the experience of the airborne building was deemed as important as the transportation itself.

Wide-body air­lin­ers made global flight ac­ces­si­ble to many peo­ple, but in­dus­try growth slowed by the mid-1970s. The 1973 oil shock made fuel more ex­pen­sive, al­ter­ing the fun­da­men­tals of the air­line busi­ness. Hijackings surged, lead­ing to the in­ven­tion of air­port se­cu­rity. In 1978, dereg­u­la­tion trans­formed the eco­nom­ics of the do­mes­tic air­line in­dus­try. Fares dropped dra­mat­i­cally, and more peo­ple be­gan to fly. As the clien­tele be­came more pedes­trian, fly­ing felt less cos­mopoli­tan.

And the com­forts started to van­ish. The so­cial spaces and coach lounges be­gan dis­ap­pear­ing so the air­lines could cram in more pas­sen­gers—in­clud­ing into the up­per deck, which be­came cer­ti­fied for pas­sen­ger seat­ing dur­ing taxi, take­off, and land­ing. The new hub-and-spoke model of air ser­vice started dis­plac­ing milk-run paths. Domestic flights on the 747, such as the Chicago-L.A. leg of the Proud Bird of the Pacific, be­came rarer. Instead, the air­craft mostly flew peo­ple over oceans. The most beau­ti­ful build­ing of the 20th cen­tury was be­com­ing just an­other ve­hi­cle.

Fox Photos / Getty

A first-class lunch served in the nose of a Boeing 747, 1970

IV. Metal Tubes With Wings

Before the September 11 at­tacks closed ter­mi­nals to the untick­eted, any­one could pass through the metal de­tec­tors and go right up to the gate. You could do this to wel­come or send off a loved one. You could meet up with a friend on a lay­over. Or you could just see the planes.

Cheryl Grimm re­mem­bers pas­sen­gers bring­ing their friends to the gate, just to see her 747 taxi­ing. Pilots re­mem­ber it too. Mark Vanhoenacker flew 747s for British Airways out of London un­til 2019, just be­fore the air­line re­moved the plane from ser­vice. He told me about dis­em­bark­ing at his fa­vorite des­ti­na­tions—Cape Town, Tokyo, Vancouver—and look­ing over his shoul­der in child­like won­der. I can’t be­lieve I flew that air­plane a third of the way around the world, he’d think to him­self.

The air­craft be­gan ser­vice in the mid­dle of the Vietnam War. In 1975, af­ter the tragic crash of a C‑5 Galaxy mil­i­tary trans­port air­craft meant to evac­u­ate Vietnamese or­phans, two char­tered Pan Am 747s stepped in as a part of Operation Babylift. That ef­fort was ac­cused of both pro­pa­gan­dism and ab­duc­tion. But many cit­i­zens were des­per­ate to leave Vietnam, and they did so vol­un­tar­ily, in the bo­som of the Mother of All Airliners.

For that reason, Peggy Verger understood crewing the 747 as a patriotic act. She remembered a group of Vietnamese refugees boarding a flight of hers in Tokyo. “And when they got off (they were doctors and lawyers, and a lot of them spoke English) they would say things like ‘Thank you for letting me come into your country,’” she said, stopping to press her heart. “Tears coming down.” Grimm remembers similar scenes, on flights to Vancouver before Hong Kong reverted to Chinese rule. Or Russian immigrants on flights to New York: a whole family, from the grandmother to the children, taking up an entire row of the airplane, each with just a little sack of belongings. Grimm would think to herself: Thank you, Lord, for letting me be born in America.

The 747’s fu­sion of aero­nau­ti­cal abil­ity and sym­bolic power earned it many roles be­yond pas­sen­ger liner and freighter. By 1977, Thomas Gray, the Boeing en­gi­neer, was run­ning test flights for a heav­ily mod­i­fied Whale to carry the space shut­tle Enterprise atop it. NASA used the plane to shut­tle the shut­tle from its land­ing sites back to the Kennedy Space Center, in Florida. One icon of America had set­tled the global skies, and on its shoul­ders sat an­other, set to con­quer the cos­mos. The sight of the pair mated to­gether sug­gested that the 20th cen­tu­ry’s progress would never end.

It did, of course. The shut­tle pro­gram closed in 2011, as 747s were al­ready dis­ap­pear­ing from the skies. Today, be­hold­ing a 747 in per­son has be­come harder, es­pe­cially in the United States. The char­ter car­rier Atlas Air flies some, as does the freight op­er­a­tor Kalitta, but even their num­bers are dwin­dling as com­pa­nies move to more ef­fi­cient two-en­gine air­craft. Lufthansa flies the most sched­uled pas­sen­ger flights on the 747—between Frankfurt and des­ti­na­tions that in­clude Chicago, Los Angeles, and Washington, D.C.—and Korean Air also still runs the plane over­seas. But the 747 has moved down­mar­ket: China, Iran, and Russia use them for bus-like do­mes­tic routes. Even when for­eign car­ri­ers fly 747s, though, the sight of one of their planes in­vokes American in­ge­nu­ity, be­cause the air­craft was de­signed and built by America.

When the 747 is gone, other air­craft will ser­vice high-ca­pac­ity, long-haul routes: the Boeing 777, for ex­am­ple, and the Airbus A350. But none of those planes will sym­bol­ize global ac­cess and re­newal, be­cause noth­ing about any other plane is sym­bolic. They all look and feel the same. They are just metal tubes with wings. When we are on a plane these days, we are re­ally in­side our head­phones, sewn into our seats, yearn­ing for it to end. The mir­a­cle of flight it­self goes un­no­ticed, as even day­time trav­el­ers draw their shades. They do so to sleep, or to in­crease the con­trast on their screen. Striking up a con­ver­sa­tion is taboo. Six miles above­ground, you feel buried rather than aloft.

V. The Flying Oval Office

The last 747 on Americans’ radar is the one car­ry­ing Donald Trump. Air Force One has been a 747 for nearly 36 years, since George H. W. Bush first as­cended its stair­case on September 6, 1990, to fly from Andrews Air Force Base to Topeka, Kansas, for a fundrais­ing lun­cheon.

Air Force One is heavily modified and highly customized (its 4,000 square feet of interior space include a medical facility and two kitchens that can serve 100 gourmet meals), but from the outside, it still looks the same as any other 747, from Pan Am onward, apart from the color scheme and presidential seal. In previous eras, the most powerful person in the world boarded the same equipment that you might use to take a Hawaiian vacation. But there was something regal about a president descending those steps, or waving from the top of them, on foreign soil. The plane is a literal ship of state. On September 11, 2001, amid a nationwide ground stop while the country was under attack, Air Force One was George W. Bush’s “flying Oval Office,” to borrow Boeing’s phrase. Trump loves to do domestic flyovers (of his rallies, of a Washington Commanders game in November) and show off the sheer size of the plane at low altitudes.

But Air Force One has aged. The two 747s that currently share the duty are the same ones that Bush 41 flew on. In 2018, the first Trump administration struck a $3.9 billion deal with Boeing to make two new planes, based on the 747-8, the aircraft’s final variation. The planes were meant to be delivered by 2024, but they have not arrived. The project has been plagued by technical issues, supplier disputes, and alleged tomfoolery: empty mini tequila bottles were reportedly discovered on one of the airplanes under construction. Boeing has absorbed more than $2 billion in cost overruns on the project. (“We continue to make steady progress” on the project, a Boeing spokesperson told me.)

The stum­ble could­n’t have come at a worse mo­ment for the com­pany. Around the same time that Trump or­dered the new Air Force Ones, Boeing 737 Max 8 air­craft be­gan ex­pe­ri­enc­ing soft­ware prob­lems that even­tu­ally led to dis­as­ter, in­clud­ing two ac­ci­dents that killed 346 peo­ple. All planes in the fleet were grounded. Boeing paid large penal­ties and set­tle­ments in the en­su­ing years, and faced in­creased com­pe­ti­tion from Airbus, its European ri­val. In January 2024, an Alaska Airlines 737 Max 9 suf­fered a door-plug blowout due to im­proper in­stal­la­tion. The com­pany that once played the sporty game to in­vent the jumbo jet could­n’t seem to make new ver­sions of its bread-and-but­ter mid-range air­craft.

By spring 2025, still without his new Air Force One, Trump began to consider accepting a luxury-appointed Boeing 747-8 as a gift from Qatar instead. Despite concerns about corruption and national security, the government took the gift, valued at $400 million. Only a “stupid person” would decline a “free, very expensive airplane,” Trump said at the time. The cost of modifying the plane for presidential use is classified; “probably less than $400 million” is what Air Force Secretary Troy Meink told Congress last year. The Air Force announced on May 1, 2026, that the aircraft is scheduled to fly this summer, with a new red, white, and blue livery. Will the American taxpayer end up paying for both the retrofitting and the new planes? “Yes,” an Air Force spokesperson told me.

Another prob­lem plagues a pres­i­den­tial 747, whether or not an emir de­liv­ers it: It is not a plane of the peo­ple any­more. It is a rar­ity, more of­ten an op­u­lent pri­vate palace than an in­stru­ment of com­mon car­riage. The like­li­est way for me to fly on a 747 in the United States, at this mo­ment, would be in a press seat on the pres­i­den­t’s plane.

A common route for Air Force One these days is from Andrews Air Force Base to Palm Beach International Airport, for Trump’s weekends at Mar-a-Lago. I figured that the White House, and even the president himself, might welcome me aboard, to experience the greatest passenger plane in the sky over the greatest country in the world. The White House nearly gave me a seat in late January, but then the trip filled up. On one of the flights I was not aboard that weekend, Trump told reporters that he had opened airspace in South America so that some immigrants “could go back to Venezuela and stay, perhaps.” America has traveled a long way from Peggy Verger’s huddled masses yearning to fly free.

In re­cent years, the 747 has amounted to an old plane for an old-man pres­i­dent. What will its next it­er­a­tion sym­bol­ize? If a new Air Force One fi­nally rolls off the be­lea­guered Boeing line, it will still be a fully American lodestar, how­ever faded its shine. If in­stead Trump flies on a gift from a for­eign power, it won’t mat­ter how American its bones are.

VI. Farewell Tour

Flying is more a part of life than ever, but it feels dis­ap­point­ing at best, and in­hu­mane at worst. This year, a Homeland Security shut­down cre­ated hours-long se­cu­rity lines; an elec­tive war with Iran spiked the cost of fuel and ticket prices. Old dreams were for­got­ten too. The pos­si­bil­ity of su­per­sonic pas­sen­ger travel has been aban­doned in fa­vor of trim-tab ad­just­ments, such as Boeing’s 787 Dreamliner, with its big­ger win­dows and less arid cab­ins. The com­pe­ti­tion that dereg­u­la­tion once spurred has all but dried up. The few air­lines that are left, hav­ing been al­lowed to con­sol­i­date into oli­gop­oly, have aban­doned the medium-size cities that were their for­mer hubs, such as Cincinnati, St. Louis, Pittsburgh, and Memphis. The big four U.S. car­ri­ers—United, Delta, American, and Southwest—have ef­fec­tively be­come banks: The dif­fer­ence be­tween profit and loss comes down to the loy­alty points they sell to their credit-card part­ners each year. Cramped pas­sen­gers are ruled not by bon­homie but by hair-trig­ger ag­gres­sion, while flight crews seek com­pli­ance rather than kin­ship. No frills are left to en­tice or dis­tract pas­sen­gers. The main ben­e­fit of sit­ting in first class is that you might still be served free cock­tails, while a coach pas­sen­ger is left with a puny bag of carbs, one cup of soda, and com­pli­men­tary (for now) trash col­lec­tion. Forget about power and free­dom. Commercial air­planes no longer sym­bol­ize any­thing ex­cept a de­sire to be any­where else. Nobody cares what kind of plane they fly in any­more, so long as they are on it as lit­tle as pos­si­ble.

Adriana Zehbrauskas For The Atlantic

Planes sent to Pinal Airpark, in Marana, Arizona, for re­pair, stor­age, or scrap­ping

A quarter of the 21st century has now elapsed, but the 20th century, its engines cut, has managed to stay aloft. It is now finally landing, at Pinal Airpark in Arizona. “I flew the last passenger U.S. airplane to the desert,” Captain Steve Hanlon told me by phone from Atlanta, where he works for Delta as a flight-simulator pilot instructor. In 2017, Delta retired the aircraft (Ship 6314) and the rest of its 747 fleet for the same reason every other airline did and will: A four-engine jumbo jet is more expensive to operate than newer, if less striking, alternatives. Ship 6314 went on a farewell tour of former Northwest and Delta hubs, including hangar parties in Seattle, Detroit, Atlanta, and Minneapolis and a more somber affair in L.A. “ ‘Don’t say it’s sad,’ ” Hanlon remembers the Delta corporate reps telling him. They also changed how he referred to the vessel (“she,” as sailors have done since antiquity) in an interview for a corporate press release. “Instead they put ‘it,’ ” he said, scoffing.

The scene aboard Ship 6314 was celebratory. A pilot and a flight attendant, who had met on a 747 military charter years earlier, were married in the air. But Hanlon, who had been flying 747s for 20 years, felt the pull of an ending as he held the Whale’s yoke, setting the aircraft down gently on the Pinal runway, with his co-captain, Paul Gallaher, by his side. “It was sad,” Hanlon told me. “Like putting my favorite dog down.” It was the last time either man would fly a 747.

Ship 6314’s phan­tom neigh­bors in­clude the 747 that once trans­ported the Ohio tel­e­van­ge­list Ernest Angley, who used the plane to spread the good word be­fore fi­nan­cial is­sues brought him back to Earth. Japan’s equiv­a­lent to Air Force One is also here. So is a 747 that was in­tended for Saudi Crown Prince Sultan bin Abdul Aziz and cost $300 mil­lion; it flew for less than 50 hours and is now be­ing scrapped at Pinal—the avi­a­tion equiv­a­lent of part­ing out a brand-new Bugatti.

When Jim Petty, Pinal’s manager, parked us under the long-forsaken TWA 747, he pointed out the grid of missing skin where aluminum from the fuselage had been cut in tidy rows to make plane tags, a type of aviation collectible. Nearby, a former Korean Air vessel’s nose cone had been removed clean, as if seared off with a hot knife. A galley intercom phone hung from a chasm in the tail of another 747, swinging as if dropped from a phone booth in a film noir. Just above it, Petty pointed out a hawk roost. “For five years they came here,” he said. “I always thought it was cool that a bird made its nest in a plane.”

When all of the useful parts have been claimed from a corpse, industrial scavengers tug the remains to a cement pad, where excavators tear the vessel into bits of metal, Petty explained. The scrap gets loaded onto 18-wheelers and hauled away for recycling. “It’s never used again to make an aircraft,” he said, “but it goes into wheels for cars, or beer cans.” Pointing at the Diet Coke he gave me upon arrival to quench my desert thirst, Petty noted that it might once have been a small piece of the Queen of the Skies.

Almost all of the engines had been harvested from these planes. Removing the weight can cause the planes to tip upward, and point their noses to the sky. The blades of one abandoned engine, lying at Pinal since 2014, issued a tinny clatter as they spun in the breeze. “It makes me think that the plane wants to head out,” Petty said. “It wants to go.” But both of us knew better. It was a death rattle, and for more than just a type of airplane.

In the faux–Braniff lounge aboard that first 747, at the Museum of Flight, I had asked Cheryl Grimm and Peggy Verger for their best memory of the aircraft. The former flight attendants couldn’t summon a story, and instead fell back on a feeling. “You were just happy to be there,” Verger said. Grimm could only echo that bygone affection: “You were just happy to be there.”

*Illustration source im­ages: Adsr / Alamy; Jim Gray / Keystone / Getty; Diana Walker / Getty; Gene Glover / Agentur Focus / OSTKREUZ Archiv / Redux; Francois Pages / Paris Match / Getty; © SAS Museum Oslo Norway.

This article appears in the July 2026 print edition with the headline “Queen of the Skies.”

Why did this journal retract two 1940s papers by Max Planck?

arstechnica.com


a bot-ched de­ci­sion?

Clicking on the links now reveals blank pages and empty PDFs. “Intellectually, it’s not acceptable.”

Max Planck is not amused.

Credit: Hugo Erfurth / Public domain

German physi­cist Max Planck was one of the pi­o­neers of quan­tum me­chan­ics in the early 20th cen­tury, earn­ing the 1918 Nobel Prize in Physics for his dis­cov­ery of quanta. There has never been a whis­per of scan­dal about the man’s in­tegrity or his sci­en­tific work. So a pair of sci­ence his­to­ri­ans were puz­zled when they dis­cov­ered that a sci­en­tific jour­nal had in­ex­plic­a­bly re­tracted two of Planck’s pa­pers from the 1940s.

The journal in question is Naturwissenschaften, now known as The Science of Nature. The journal typically adds a large RETRACTED notice across digital papers that have been retracted, leaving them available for download. But it has removed the two Planck papers entirely, leaving just a blank page (and empty PDFs) with a brief note saying the articles had been “withdrawn due to article violation.”

Physics his­to­rian Yves Gingras of the University of Quebec in Montreal was brows­ing the blog Retraction Watch’s list of Nobel Prize win­ners who have had sci­en­tific pa­pers re­tracted, just out of cu­rios­ity. Gingras was shocked to see Planck’s name on the list and en­listed fel­low his­to­rian Mahdi Khelfaoui, of the University of Quebec at Trois-Rivieres, to in­ves­ti­gate why the two pa­pers had been re­tracted. They out­lined their find­ings in a preprint posted to the physics arXiv.

The journal’s current editor-in-chief, Suzanne Scarlata of the Worcester Polytechnic Institute, told Science reporter Sam Kean that she had not known the papers had been retracted prior to Kean contacting her for comment. “That’s crazy,” she said. “I don’t understand why they were flagged. I think it just happened with their algorithm. It’s a mistake they should probably rectify.” (Kean claims Springer Nature is still selling the empty PDFs for $39.95 a pop, but I had no trouble downloading both empty files for free, for what it’s worth.)

A ques­tion of copy­right?

Gingras and Khelfaoui suspected that the retractions occurred due to the journal publisher’s “misunderstanding, or ignorance, of past publication practices.” The specific reason for the retractions was copyright violation, so there was nothing wrong with the actual papers from a scientific standpoint. (Both are “philosophical reflections on the nature of scientific knowledge.”) They were able to retrieve metadata showing that the DOI records for both papers had been created in April 2005, coinciding with the large-scale switch to electronic publishing that occurred across most journals. Over time, those journals also integrated historical studies into their searchable online archives.

Gingras and Khelfaoui suspect the retraction decision was made around this time. “All this clearly suggests that some lawyer at Springer was overshadowing the process and considered these papers as problematic forms of ‘duplicate publications,’” they wrote. The first retracted paper (“Meaning and Limits of Exact Science”) was published in 1942, based on a lecture Planck delivered in Berlin the prior year. It was also published as a booklet, in another journal, and included in an anthology of Planck’s essays and lectures.

The second retracted paper (“Natural Science and the Real External World”) appeared in 1940. It had not been published or reprinted elsewhere. But a scientist named Aloys Muller published a critique of Planck’s 1931 essay on positivism that year, to which Planck responded in the same journal using the same title just a few months later. Gingras and Khelfaoui suspect the retraction was the result of a “cataloguing ambiguity” since there were two separate papers by different authors in the same journal with identical titles. This would have confused any algorithmic tool used to catch instances of duplication or “self-plagiarism,” for example.

The real is­sue is whether pub­lish­ers of sci­en­tific jour­nals should retroac­tively ap­ply con­tem­po­rary stan­dards re­gard­ing du­pli­cate pub­li­ca­tion or self-pla­gia­rism to his­tor­i­cal pa­pers. The jour­nal pub­lish­ing norms in the early 20th cen­tury were sub­stan­tially dif­fer­ent. The em­pha­sis was on achiev­ing the widest dis­sem­i­na­tion of knowl­edge across a frag­mented sci­en­tific com­mu­nity sep­a­rated by lan­guage and ge­o­graph­i­cal dis­tance, pub­lish­ing in many dif­fer­ent jour­nals. As a re­sult, the bound­aries were heav­ily blurred be­tween lec­tures, con­fer­ence pro­ceed­ings, book­lets, col­lected es­says, pub­lished jour­nal ar­ti­cles, and so forth.

The scientific enterprise has since evolved to the point where it is dominated by large commercial publishing groups that are much more sensitive to protecting copyrights and turning a profit. Duplication/self-plagiarism is also more of an issue now, when publications are a major factor when it comes to hiring and promoting scientists, as well as acquiring research funding. Applying these contemporary standards can be problematic for the “digital circulation of historical texts,” the authors concluded.

The journal’s publisher, Springer Nature, killed an editorial Scarlata planned to run addressing the issue. Springer Nature also declined to comment for the Science article, merely telling Kean through a representative that “detailed information about specific retractions is usually confidential and can only be shared with the relevant authors.”

Given that Planck died in 1947, he can’t get a direct answer either. Both papers are now in the public domain in most countries, so it’s not like copyright violation is even an issue anymore. It’s still possible to access both papers via the Internet Archive. But as Gingras and Khelfaoui argue in their preprint, removing the two papers distorts the historical record. “Whoever did it, I don’t care,” Gingras told Science. “Just put them [back] in the database. Intellectually, it’s not acceptable.”

Jennifer is a se­nior writer at Ars Technica with a par­tic­u­lar fo­cus on where sci­ence meets cul­ture, cov­er­ing every­thing from physics and re­lated in­ter­dis­ci­pli­nary top­ics to her fa­vorite films and TV se­ries. Jennifer lives in Baltimore with her spouse, physi­cist Sean M. Carroll, and their two cats, Ariel and Caliban.
