10 interesting stories served every morning and every evening.




1 974 shares, 38 trendiness

A Tale of Two Bills: Lawful Access Returns With Changes to Warrantless Access But Dangerous Backdoor Surveillance Risks Remain

The decades-long bat­tle over law­ful ac­cess en­tered a new phase yes­ter­day with the in­tro­duc­tion of Bill C-22, the Lawful Access Act. This bill fol­lows the at­tempt last spring to bury law­ful ac­cess pro­vi­sions in Bill C-2, a bor­der mea­sures bill that was the new gov­ern­men­t’s first piece of sub­stan­tive leg­is­la­tion. The law­ful ac­cess el­e­ments of the bill faced an im­me­di­ate back­lash given the in­clu­sion of un­prece­dented rules per­mit­ting wide­spread war­rant­less ac­cess to per­sonal in­for­ma­tion. Those rules were on very shaky con­sti­tu­tional ground and the gov­ern­ment ul­ti­mately de­cided to hit the re­set but­ton on law­ful ac­cess by pro­ceed­ing with the bor­der mea­sures in a dif­fer­ent bill.

Lawful access never dies, however. Bill C-22 covers the two main aspects of lawful access: law enforcement access to personal information held by communication service providers such as ISPs and wireless providers, and the development of surveillance and monitoring capabilities within Canadian networks. In fact, the bill is separated into two parts, with the first half dealing with "timely access to data and information" and the second establishing the Supporting Authorized Access to Information Act (SAAIA).

I anticipate providing extensive coverage of the bill on both this blog and my podcast. My initial take is that the access to data and information piece of the bill is much improved. The earlier Bill C-2 iteration of a new information demand power was astonishing in its breadth (covering far more than just communications providers by targeting anyone who provides a service in Canada, including physicians and lawyers) and in its demands for warrantless disclosure of personal information, in direct contradiction of recent Supreme Court of Canada jurisprudence.

The government has scrapped that approach by shifting to a new "confirmation of service" demand power. This would allow law enforcement to demand that telecom providers (not any service provider) confirm whether they provide service to a particular person. Other subscriber information would be subject to a new production order reviewed and approved by a judge. This would address the longstanding police complaint that they may do considerable work seeking information about a subscriber at a provider only to learn that the person isn't a customer, forcing them to start over with someone else.

The new rules contain other orders and provisions on voluntary disclosure, challenging the requests, exigent circumstances, and foreign orders for the same information. I plan to unpack these rules in the coming weeks. For example, there are concerns about the thresholds that the production orders envision, namely the low "reasonable grounds to suspect" standard. However, the main takeaway here is that the government has significantly limited the scope of warrantless information demand powers, now focusing solely on telecommunications providers and whether they provide service to a particular individual. Access to more personal information will require oversight. That's a major concession and highlights how Bill C-2 was too broad, dangerous from a privacy perspective, and unlikely to pass constitutional muster.

If that is the good news, the bad news is very bad. The SAAIA, which establishes new requirements for communications providers to actively work with law enforcement on their surveillance and monitoring capabilities, is largely unchanged from Bill C-2. In fact, there are elements involving data retention that are even worse. The government will point to increased oversight — ministerial orders must now be approved by the Intelligence Commissioner — but the concerns regarding surveillance capabilities, security vulnerabilities, secrecy, and cross-border data sharing remain.

The SAAIA has huge implications for network providers, as it envisions providing law enforcement with direct access to provider networks to test capabilities for data access and interception. The bill introduces a new term — "electronic service provider" — that is presumably designed to extend beyond telecom and Internet providers by scoping in Internet platforms (Google, Meta, etc.). Those international services are now key players in electronic communications (think Gmail or WhatsApp), though some may be beyond this form of regulation (e.g., Signal, provided you don't inadvertently add people to chat groups).

The de­f­i­n­i­tion of an ESP is:

a per­son that, in­di­vid­u­ally or as part of a group, pro­vides an elec­tronic ser­vice, in­clud­ing for the pur­pose of en­abling com­mu­ni­ca­tions, and that
(a) pro­vides the ser­vice to per­sons in Canada; or

(b) carries on all or part of its business activities in Canada.

An "electronic service," in turn, is defined as "a service, or a feature of a service, that involves the creation, recording, storage, processing, transmission, reception, emission or making available of information in electronic, digital or any other intangible form by an electronic, digital, magnetic, optical, biometric, acoustic or other technological means, or a combination of any such means."

All electronic service providers are subject to obligations to "provide all reasonable assistance, in any prescribed time and manner, to permit the assessment or testing of any device, equipment or other thing that may enable an authorized person to access information." Moreover, all are required to keep such requests secret.

But beyond the basic obligations, the government will identify "core providers" who will be subject to additional regulations. These may include:

(a) the de­vel­op­ment, im­ple­men­ta­tion, as­sess­ment, test­ing and main­te­nance of op­er­a­tional and tech­ni­cal ca­pa­bil­i­ties, in­clud­ing ca­pa­bil­i­ties re­lated to ex­tract­ing and or­ga­niz­ing in­for­ma­tion that is au­tho­rized to be ac­cessed and to pro­vid­ing ac­cess to such in­for­ma­tion to au­tho­rized per­sons;

(b) the in­stal­la­tion, use, op­er­a­tion, man­age­ment, as­sess­ment, test­ing and main­te­nance of any de­vice, equip­ment or other thing that may en­able an au­tho­rized per­son to ac­cess in­for­ma­tion;

(c) notices to be given to the Minister or other per­sons, in­clud­ing with re­spect to any ca­pa­bil­ity re­ferred to in para­graph (a) and any de­vice, equip­ment or other thing re­ferred to in para­graph (b); and

(d) the retention of categories of metadata — including transmission data, as defined in section 487.011 of the Criminal Code — for reasonable periods of time not exceeding one year.

Note that the re­ten­tion of meta­data found in (d) is new. It was not in Bill C-2, so this bill ac­tu­ally ex­pands the scope of oblig­a­tions. The new bill con­tains some lim­its on data re­ten­tion:

(4) Paragraph (2)(d) does not authorize the making of regulations that require core providers to retain information that would reveal

(a) the con­tent — that is to say the sub­stance, mean­ing or pur­pose — of in­for­ma­tion trans­mit­ted in the course of an elec­tronic ser­vice;

(b) a per­son’s web brows­ing his­tory; or

(c) a per­son’s so­cial me­dia ac­tiv­i­ties.

The bill also re­tains an ex­cep­tion for sys­temic vul­ner­a­bil­i­ties, which states:

A core provider is not re­quired to com­ply with a pro­vi­sion of a reg­u­la­tion made un­der sub­sec­tion (2), with re­spect to an elec­tronic ser­vice, if com­pli­ance with that pro­vi­sion would re­quire the provider to in­tro­duce a sys­temic vul­ner­a­bil­ity re­lated to that ser­vice or pre­vent the provider from rec­ti­fy­ing such a vul­ner­a­bil­ity.

There remain concerns that this exception is insufficient and that there are real risks that networks may be made less secure by virtue of these rules, with the changes kept secret from the public. Moreover, as Kate Robertson of the Citizen Lab has discussed (including on the Law Bytes podcast), many of these rules appear geared toward global information sharing, including compliance with the Second Additional Protocol to the Budapest Convention (2AP) and the CLOUD Act.

There is much to unpack with this section, including the ability to challenge orders, the secrecy associated with the system, oversight, and costs. I plan to cover these as well, but for the moment it is sufficient to conclude that Bill C-22's SAAIA envisions a significant change to how government agencies interact with Canadian communications networks and network providers, raising enormous privacy and civil liberties concerns. The government may have taken warrantless access to subscriber information off the table, but there remain serious privacy concerns associated with its lawful access plans.

...

Read the original on www.michaelgeist.ca »

2 738 shares, 172 trendiness

Palestinian boy, 12, describes how Israeli forces killed his family in car

"They pulled me out instead and began jumping on my back," he said. "Then they took me to a corner and questioned me about who had been in the car. I told them it was my mother and father. They accused me of lying and started beating me."

...

Read the original on www.bbc.com »

3 611 shares, 57 trendiness

The price of accountability: corruption erodes social trust more in democracies than in autocracies

While corruption exists in both democracies and autocracies, its social consequences may differ fundamentally across regime types. Democratic norms of equality and impartiality make trust highly sensitive to institutional failure. We theorize two mechanisms—normative amplification and representative contagion—by which corruption erodes trust more in democracies. In democracies, corruption violates core fairness norms and implicates the citizenry that elected corrupt officials. In autocracies, corruption is expected and elites are seen as separate from ordinary citizens. To test this theory, we perform multilevel analysis of data from 62 countries combining individual-level survey responses with country-level democratic quality indicators. We first demonstrate that perceiving corruption predicts lower generalized trust almost universally across individuals. We then show this individual-level psychological mechanism is considerably stronger in democracies than in autocracies, even controlling for inequality and country-level corruption. These findings reveal an asymmetric vulnerability: the accountability structures that make democracies function also make their social capital fragile. This has important implications for understanding democratic resilience, as corruption threatens the social trust necessary for democratic cooperation differently across regime types.

Democracy may be uniquely sensitive to certain threats. Recent scholarship on democratic backsliding reveals how democracies can erode from within when norms decay and institutions weaken (Levitsky and Ziblatt, 2018). In this article, we identify a specific sensitivity: in democracies, social capital appears to be particularly responsive to corruption. We theorize that this sensitivity arises from democracy's foundational commitments to equality and impartiality. These commitments may create two psychological mechanisms that amplify corruption's impact on social trust. First, normative amplification: in democracies, where universalism is the professed ideal, corruption may signal a breach of the social contract. Citizens may infer that if the institutions meant to embody fairness are compromised, the wider society is untrustworthy. In autocracies, by contrast, where particularism is expected, corruption confirms business as usual rather than signaling societal rot. Second, representative contagion: in democracies, corrupt officials are viewed as emanating "from the people" through elections, potentially implicating the citizenry itself. In autocracies, predatory elites are seen as a distinct class, quarantining interpersonal trust from elite malfeasance. If these mechanisms operate as theorized, then the individual-level psychological process linking corruption perceptions to social distrust should be regime-dependent—strong in democracies, weak in autocracies.

A study by You (2018) provides suggestive evidence for our thesis. Using country-level data on social trust and corruption, and studying democracies and autocracies separately, he demonstrated that more corruption is strongly associated with weaker social trust among democracies—but not among autocracies.
This striking pattern is consistent with our theory. However, as the finding was obtained at the aggregate level, it leaves open whether it reflects genuine differences in how individuals psychologically process corruption, or whether it is an artifact of other phenomena.

The present paper aims to provide individual-level evidence for how trust among people in democracies may be especially sensitive to corruption. After replicating You's country-level findings in more recent data from 62 countries—covering the full spectrum from autocracies like Russia and Iran to stable liberal democracies like New Zealand and the Netherlands—we use multilevel modeling to test whether a corresponding individual-level pattern exists. We find that individuals' perceptions of corruption are associated with lower generalized trust in democracies, while this same individual-level association is substantially weaker or absent in autocracies. These findings suggest an asymmetry in how corruption relates to social trust across regime types. While democracies foster high social trust through their institutions, they may simultaneously make that social capital more vulnerable to perceptions of institutional failure. This may be the price of accountability: the very norms that make democracies function—equality, representation, transparency—may also ensure that institutional failures resonate in citizens' social worldviews.
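The regime-split aggregate comparison described above can be sketched in a few lines of Python. The country values below are invented toy numbers purely for illustration; the point is the grouped-correlation logic, not the estimates.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# (is_democracy, country-level perceived corruption, share trusting) -- toy data
countries = [
    (True, 3.0, 0.60), (True, 5.0, 0.45), (True, 7.0, 0.30), (True, 9.0, 0.18),
    (False, 6.0, 0.25), (False, 8.0, 0.27), (False, 9.0, 0.24), (False, 7.0, 0.26),
]

# Compute the corruption-trust correlation separately within each regime group.
r_by_regime = {}
for regime in (True, False):
    corr = [c for d, c, _ in countries if d == regime]
    trust = [t for d, _, t in countries if d == regime]
    r_by_regime["democracy" if regime else "autocracy"] = pearson(corr, trust)
# With these toy numbers, the democracy correlation is strongly negative
# while the autocracy correlation is weak -- the H1 pattern.
```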

We com­bine in­di­vid­ual-level data from the most re­cent wave (2017–2022) of the World Values Survey (WVS; Haerpfer et al., 2022) with coun­try-level in­di­ca­tors (averaged across the same pe­riod) of de­mo­c­ra­tic qual­ity from the Varieties of Democracy (V-Dem) pro­ject (Coppedge et al., 2025; Pemstein et al., 2025). Our analy­sis in­cludes 62 coun­tries for which we have com­plete data on all vari­ables of in­ter­est. We use WVS Wave 7 (2017–2022) be­cause it con­tains the cor­rup­tion per­cep­tion mod­ule re­quired for our analy­sis. Although a Joint EVS/WVS dataset ex­ists with 92 coun­tries, the European Values Survey does not in­clude the cor­rup­tion per­cep­tion items, mak­ing it un­suit­able for our pur­poses. Our 62 coun­tries there­fore rep­re­sent the full set of coun­tries with com­plete data on per­ceived cor­rup­tion, gen­er­al­ized trust, and de­mo­c­ra­tic qual­ity in­di­ca­tors. The WVS pro­vides our key in­di­vid­ual-level mea­sures. Generalized trust is mea­sured by the stan­dard ques­tion: Generally speak­ing, would you say that most peo­ple can be trusted or that you need to be very care­ful in deal­ing with peo­ple?” Responses are coded 1 for most peo­ple can be trusted” and 0 oth­er­wise. Perceived cor­rup­tion is mea­sured by ask­ing re­spon­dents how wide­spread they be­lieve cor­rup­tion to be among pub­lic of­fi­cials, on a scale from 1 (there is no cor­rup­tion in my coun­try) to 10 (there is abun­dant cor­rup­tion in my coun­try). 
While corruption perceptions may not perfectly align with objective corruption levels (Charron, 2016), perceptions are presumably what directly affect individual trust judgments.

We include standard individual-level controls: age (five categories: 18–29, 30–39, 40–49, 50–59, 60+), gender (male/female), education (three levels based on ISCED categories: low, medium, high), household income (three levels based on the WVS 10-point scale: low [1–3], medium [4–7], high [8–10]), and employment status (three categories: employed [full-time, part-time, or self-employed], not in labor force [retired, homemaker, or student], and unemployed/other).

From V-Dem, we use two measures of democratic quality: the Regimes of the World (RoW) classification and the Liberal Democracy Index. The RoW (Lührmann et al., 2018) is a categorical measure distinguishing closed autocracies (no multiparty elections), electoral autocracies (multiparty elections that are not free and fair), electoral democracies (free elections but limited liberal protections), and liberal democracies (free elections with strong liberal protections). Following our theoretical framework—which emphasizes that the mechanisms of normative amplification and representative contagion require genuine electoral accountability—we create a binary classification: democracies (electoral and liberal democracies, RoW = 2–3) versus autocracies (closed and electoral autocracies, RoW = 0–1).
Electoral autocracies are classified as autocracies because, despite having multiparty elections, these elections lack the competitive integrity necessary for the representative contagion mechanism to operate.

In contrast to the categorical RoW measure, the Liberal Democracy Index is a continuous measure, which captures both electoral and liberal dimensions of democracy, including the quality of elections, checks on executive power, equality before the law, and individual liberties. This index ranges from 0 (least democratic) to 1 (most democratic). We use the Liberal Democracy Index rather than the Electoral Democracy Index (also known as Polyarchy) because our theoretical mechanisms—normative amplification and representative contagion—depend on features beyond electoral procedures. The liberal component of the Liberal Democracy Index captures the rule of law, checks on executive power, and equality before the law, which are central to our argument that corruption in democracies violates norms of impartiality. In robustness analyses, we also test whether results hold using the Electoral Democracy Index as an alternative moderator.

Our research design tests three progressively refined hypotheses outlined above. To test H1, we calculate country-level aggregates of perceived corruption and generalized trust, then examine whether their correlation differs when calculated separately among democracies and autocracies as defined by the RoW categorization. H1 predicts a strong negative correlation between perceived corruption and generalized trust among democracies but a weaker correlation among autocracies. The above dichotomous analysis matches the original approach of You (2018).
As a continuous alternative, we also examine the interaction between perceived corruption and the Liberal Democracy Index in a multiple regression analysis of country-level generalized trust. H1 predicts a negative interaction, representing a stronger negative effect of corruption in more democratic countries.

To test whether the aggregate pattern reflects genuine individual-level mechanisms (H2) and whether these mechanisms vary by regime type (H3), we estimate a random-intercept, random-slope multilevel logistic regression model. This approach models the hierarchical structure of the data, with individuals nested within countries. Standard errors appropriately reflect uncertainty at both levels.

At Level 1 (individual), generalized trust is modeled as a function of perceived corruption, controlling for demographic characteristics (age, gender, education, income, and employment status). At Level 2 (country), we allow both the intercept and the slope for perceived corruption to vary across countries. Crucially, we include a cross-level interaction between perceived corruption and the Liberal Democracy Index (treated as a continuous variable), which directly tests H3: whether the individual-level corruption-trust relationship varies with democratic quality.
In other words, the cross-level interaction estimates whether the psychological mechanism linking corruption perceptions to trust operates differently depending on institutional context. Formally, the model can be expressed as follows:

logit(p_ij) = γ_00 + γ_10 C_ij + β X_ij + γ_01 D_j + γ_11 (C_ij × D_j) + u_0j + u_1j C_ij,

where p_ij is the probability of expressing trust for individual i in country j; C_ij is perceived corruption (grand-mean centered); X_ij is a vector of demographic controls; D_j is the Liberal Democracy Index (grand-mean centered); γ_11 is the cross-level interaction coefficient testing H3; and u_0j, u_1j are country-level random effects assumed to follow a bivariate normal distribution.

For computational efficiency with large sample sizes (>85,000 individuals), we use an aggregated binomial approach. Observations are grouped by country, corruption level, and demographic categories, and trust incidence is modeled using a binomial distribution. This yields estimates identical to individual-level analysis but with substantially improved computational performance. Both perceived corruption and the Liberal Democracy Index are grand-mean centered to facilitate interpretation of main effects.

We also conduct robustness checks including: (1) adding competing cross-level moderators to test whether these factors can account for the democracy moderation; (2) testing press freedom, the Electoral Democracy Index, and state resilience (Travaglino et al., 2025) as alternative moderators in separate models (as their high correlations with liberal democracy, r = 0.90 and 0.78 respectively, preclude simultaneous estimation); and (3) leave-one-out analyses to ensure no single country drives the results.

For competing moderators, we include economic inequality (Gini coefficient from SWIID; Solt, 2020), political polarization (from V-Dem), and measures of digital information access.
We include both social media use as a self-reported news source (country-level mean from the WVS item on frequency of obtaining political information from social media) and internet penetration (percentage of population using the internet; World Bank, 2024).

If our theory is correct, we should observe a negative main association between corruption perceptions and trust at the individual level (H2) and a negative cross-level interaction, indicating that the corruption-trust relationship is stronger (more negative) in more democratic countries (H3).
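The fixed-effects part of this specification can be illustrated numerically. The sketch below uses the corruption slope (−0.12), liberal-democracy main effect (0.79), and cross-level interaction (−0.16) reported in the results; the intercept and the centered predictor values are assumed placeholders, and random effects and demographic controls are omitted.

```python
import math

# Fixed effects: the slope, democracy, and interaction coefficients are the
# paper's reported estimates; the intercept is an assumed placeholder.
G00 = -1.0   # intercept (assumed)
G10 = -0.12  # main slope of perceived corruption (grand-mean centered)
G01 = 0.79   # main effect of the Liberal Democracy Index (grand-mean centered)
G11 = -0.16  # cross-level interaction: corruption x democracy

def trust_probability(corruption_c, democracy_c):
    """Predicted probability of expressing trust, ignoring random effects."""
    logit = (G00 + G10 * corruption_c + G01 * democracy_c
             + G11 * corruption_c * democracy_c)
    return 1.0 / (1.0 + math.exp(-logit))

# Moving from low to high perceived corruption (centered -3 to +3) costs more
# trust in a more democratic country (centered +0.3) than in a less
# democratic one (centered -0.3) -- the cross-level interaction at work.
high_dem_drop = trust_probability(-3, 0.3) - trust_probability(3, 0.3)
low_dem_drop = trust_probability(-3, -0.3) - trust_probability(3, -0.3)
```

Because the interaction coefficient is negative, the corruption slope steepens as the democracy index rises, which is exactly the pattern H3 predicts.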

Table 1 presents the 62 countries, ordered by the Liberal Democracy Index, with their results for generalized trust and perceived corruption. Figure 1 tests H1 by showing how country-level generalized trust varies with perceived corruption, separately for democracies and autocracies. In support of H1, the pattern strikingly differs between regime types. Among democracies, there is a strong negative relationship: countries with higher perceived corruption have substantially lower generalized trust. Among autocracies, this relationship is considerably weaker—replicating You's (2018) finding in more recent data and with a theory-driven operationalization of regime type based on the Regimes of the World classification. The alternative analysis using the continuous Liberal Democracy Index as a moderator of the effect of perceived corruption on generalized trust confirms this pattern: the country-level interaction between perceived corruption and liberal democracy is negative (B = −12.07, 95% CI [−22.36, −1.77], p = 0.022).

Figure 1. The country-level association between corruption and trust differs between regime types. Scatter plot of country-level perceived corruption (x-axis) and generalized trust (y-axis). Each point represents a country, with shape and color indicating regime type based on the Regimes of the World classification. Regression lines are shown separately for democracies (electoral and liberal democracies) and autocracies (closed and electoral autocracies). The negative relationship between corruption and trust is strong among democracies and much weaker among autocracies.

To test H2 and H3 we now turn to the multilevel analysis, which models how individuals' corruption perceptions are associated with their trust while allowing this relationship to vary across countries.
In support of H2, the main association between perceived corruption and generalized trust is negative (B = −0.12, 95% CI [−0.14, −0.11], p < 0.001). The main effect of liberal democracy is positive (B = 0.79, 95% CI [0.01, 1.56], p = 0.047), indicating that at average levels of perceived corruption, individuals in more democratic countries exhibit higher generalized trust.

In support of H3, the cross-level interaction between perceived corruption and liberal democracy is negative (B = −0.16, 95% CI [−0.22, −0.10], p < 0.001). Figure 2 illustrates this pattern by plotting country-specific corruption-trust slopes against democratic quality. The slopes are extracted from a model without the cross-level interaction, showing the empirical variation that the interaction term captures. The blue line represents the predicted slope from the main model's interaction term. Corruption slopes tend to be strongly negative in countries with high democratic quality while being close to zero in countries with low democratic quality.

Figure 2. Liberal democracy and the individual-level corruption-trust relationship. Each point represents one country. The x-axis shows the V-Dem Liberal Democracy Index; the y-axis shows the country-specific corruption slope extracted from a random-slopes model without the cross-level interaction. More negative values indicate stronger negative associations between perceived corruption and trust. The blue line shows the predicted relationship from the main model's cross-level interaction. The strong negative slope demonstrates that as democratic quality increases, the individual-level psychological mechanism linking corruption perceptions to social distrust becomes substantially stronger.

To translate these results into substantive terms, we calculated predicted probabilities of expressing trust at different levels of corruption perception and democracy, illustrated in Figure 3. For a highly democratic country (90th percentile of the Liberal Democracy Index, the solid line in Figure 3), moving from low perceived corruption (4 on the 1–10 scale, which is the 10th percentile of observed values) to high perceived corruption (10 on the scale, 90th percentile) is associated with a decrease in the probability of trusting others from approximately 34 to 14%. The same change in corruption perception is associated with a much smaller decrease, from approximately 17 to 11%, in a highly autocratic country (10th percentile of the Liberal Democracy Index, the dashed line in Figure 3).

Figure 3. Predicted probability of trusting others by perceived corruption and democratic quality. Lines show model-predicted probabilities at the 10th (low democracy), 50th (medium democracy), and 90th (high democracy) percentiles of the Liberal Democracy Index. The steeper slope for high-democracy countries illustrates the stronger corruption-trust link in democratic contexts.

We conducted several robustness checks. First, we added economic inequality (Gini coefficient) as a competing cross-level moderator. The democracy × corruption interaction remains essentially unchanged in this model (B = −0.16, 95% CI [−0.22, −0.10], p < 0.001), and similarly when adding political polarization (B = −0.16, 95% CI [−0.22, −0.09], p < 0.001) or digital information access (B = −0.18, 95% CI [−0.26, −0.11], p < 0.001) as competing moderators. Second, we tested whether alternative country-level characteristics could serve as moderators.
Press freedom showed the same moderating effect as liberal democracy (B = −0.16, 95% CI [−0.22, −0.10], p < 0.001), as did the Electoral Democracy Index (B = −0.16, 95% CI [−0.22, −0.10], p < 0.001), while state resilience was a much weaker moderator (B = −0.02, 95% CI [−0.03, −0.01], p < 0.001), suggesting that democratic institutions rather than state capacity drive the moderation.

Third, we conducted leave-one-out analyses, re-estimating the model 62 times, each time excluding one country. The cross-level interaction coefficient ranged from −0.17 to −0.15 across these analyses, with no single country driving the results.
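The leave-one-out check can be sketched as a simple loop: re-estimate with each country excluded and inspect the range of the resulting coefficients. The data and statistic below are toy stand-ins (an OLS slope rather than the full multilevel interaction coefficient), purely to show the procedure.

```python
def ols_slope(points):
    """Ordinary least-squares slope of y on x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den

# Toy country-level data: (democracy index, corruption-trust slope).
data = [(0.1, -0.02), (0.3, -0.05), (0.5, -0.08), (0.7, -0.11), (0.9, -0.15)]

# Re-estimate once per country, each time leaving that country out.
loo_estimates = [ols_slope(data[:i] + data[i + 1:]) for i in range(len(data))]
coef_range = (min(loo_estimates), max(loo_estimates))
# A narrow, uniformly negative range indicates that no single country
# drives the result.
```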

This article provides systematic individual-level evidence that the corruption-trust association differs across regime types. Previous research documented this pattern at the aggregate level (You, 2018) but could not determine whether it reflected genuine differences in individual-level psychological processes or merely compositional effects. By measuring country-specific individual-level coefficients and showing they vary systematically with democratic quality, we provide evidence consistent with the view that regime type shapes how corruption perceptions relate to social trust. Our findings suggest an asymmetry in how corruption relates to trust across regime types. In autocracies, the individual-level association between corruption perceptions and social distrust appears weak or absent, which may help explain how some autocratic regimes combine high corruption with relatively high generalized trust (Figure 1). In democracies, this association is substantially stronger. This pattern suggests that government quality in democracies may matter not merely for administrative efficiency or economic performance, but potentially for the social conditions that support democratic governance. Institutional integrity may affect the social trust that facilitates democratic cooperation—from voluntary tax compliance to electoral participation to civic engagement.

These findings speak to contemporary debates about democratic backsliding and resilience. They suggest that corruption scandals in established democracies should perhaps not be viewed merely as criminal justice matters or administrative failures, but as potential threats to social cohesion.
This may help explain a puzzling feature of contemporary politics: why relatively minor corruption scandals can generate significant political crises in established democracies, while autocracies weather far more egregious corruption with limited social consequence. The difference may lie not in the severity of the corruption per se, but in how institutional frameworks shape how citizens interpret and respond to corruption.

Our results also have potential implications for anti-corruption efforts. Standard approaches focus on technical reforms: strengthening audit institutions, improving procurement transparency, raising civil servant salaries, and enhancing criminal enforcement. While these measures may reduce corruption levels, our findings suggest they may be insufficient to address the social consequences of corruption in democracies. If corruption perceptions are indeed associated with reduced social trust in democratic contexts, then anti-corruption strategies in democracies may need to be accompanied by efforts to rebuild and maintain social trust: swift, visible accountability when corruption is discovered; symbolic reaffirmation of democratic equality norms; and frank public discourse about how corruption relates to democratic values.

These findings also suggest that government communication about transparency and anti-corruption efforts may matter for social trust. Because perceptions of corruption, and not merely actual corruption, appear to drive the trust erosion we document, proactive communication about institutional integrity may be valuable. Democratic governments could invest in publicizing accountability measures, successful prosecutions of corrupt officials, and ongoing institutional reforms.
Such communication campaigns would not substitute for substantive anti-corruption work but could complement it by ensuring that citizens are aware of their government's commitment to impartiality. This may be especially important in democracies, where our findings suggest that trust is particularly sensitive to perceived corruption.

More broadly, these findings contribute to understanding potential micro-foundations of regime stability. While much scholarship focuses on how institutions shape elite behavior, our results suggest that institutions may also shape mass psychology in ways relevant for regime dynamics. Democratic institutions may create citizens whose social trust is more responsive to perceived institutional failure than citizens in autocracies.

We acknowledge several limitations of this study. First, and most importantly, we theorize but do not directly test the specific mechanisms we propose: normative amplification and representative contagion. Our data show that the corruption-trust association varies by regime type, but we cannot observe the psychological processes that produce this variation. Future experimental research could directly manipulate normative frames (e.g., presenting corruption as violating equality norms versus as typical elite behavior) and representative connection (e.g., emphasizing that officials were elected by citizens versus appointed by elites) to test whether these factors moderate how corruption information affects trust. Survey research could also measure perceived norm violation and representative identification as mediators.
Until such studies are conducted, our mechanistic account remains theoretical.

Second, our cross-sectional individual-level data cannot establish the causal direction from corruption perceptions to trust. The association we observe is consistent with corruption perceptions reducing trust, but reverse causality is also plausible: individuals with generally low trust may be more inclined to perceive corruption. For evidence supporting the corruption-to-trust direction, we rely on the experimental literature (Rothstein and Eek, 2009; Martinangeli et al., 2024), which demonstrates that exposure to information about corruption causally reduces generalized trust. Our contribution is to show that this association varies systematically by regime type, but we cannot rule out that regime type also moderates reverse-causal processes.

Third, our cross-sectional design cannot capture within-country change over time. Longitudinal analysis tracking how the within-country corruption-trust relationship shifts after prolonged democratization or backsliding would provide stronger evidence for our theoretical account.

Fourth, we focus on liberal democracy as the primary moderator, and do not examine the separate roles of different institutional features such as judicial independence, press freedom, and electoral integrity. These features are highly correlated in our data, making it difficult to isolate their independent contributions.

Eriksson K and Vartanova I (2026) The price of accountability: corruption erodes social trust more in democracies than in autocracies. Front. Polit. Sci. 8:1779810. doi: 10.3389/fpos.2026.1779810

...

Read the original on www.frontiersin.org »

4 545 shares, 66 trendiness

MoD sources warn Palantir’s role at heart of government is a threat to UK’s security

‘It beggars belief’: MoD sources warn Palantir’s role at heart of government is a threat to the UK’s security. Experts say that claims UK data remains under government ownership miss the point that the company has the capability to build its own detailed picture of the British population, and even infer state secrets.

Report by Charlie Young and Carole Cadwalladr. UK Prime Minister Keir Starmer and Louis Mosley, head of Palantir Technologies UK. Graphic: the Nerve

Palantir, the US AI surveillance and security firm with hundreds of millions of pounds in UK government contracts, poses a national security threat to the UK, according to two anonymous high-level sources working with the Ministry of Defence. The insiders, who are senior systems engineers with knowledge of the Palantir software systems the MoD is using, have come forward to speak after the Nerve published an investigation in January that revealed Palantir had at least £670m worth of contracts across the UK government, including £15m with the UK nuclear weapons agency. In that investigation, data and security experts claimed that the contracts with the firm, owned by Peter Thiel, are a critical risk to Britain’s national security. At the time, an MoD spokesman told the Nerve that “all data remains sovereign and under the ownership of the MoD”. However, the MoD insiders, who have detailed knowledge of the underlying technology, say such statements are “ignorant” and/or misleading. It’s believed to be the first time individuals currently working with the ministry have spoken out about the national security risks Palantir poses. They are doing so because they believe that these are matters of the highest public interest and that parliament needs to act.
The first, a senior systems engineer with the MoD who has decades of experience across the defence industry, told the Nerve: “Ministers clearly have a lack of understanding of Palantir’s technology. The statements with respect to sovereign data appear to be missing the point entirely. [They’re] missing the realities of data scraping, of aggregation, and the fact that Palantir is building its own rich picture of our nation that they can use for their own ends.

“Allowing a single entity, foreign or domestic, to have such far-reaching, pervasive access is inherently dangerous. How our national cybersecurity centre has allowed this beggars belief.”

At the heart of the claims is that while the underlying data may remain under the MoD’s control, any insights derived from that data do not. The implications of this, the insiders say, are far-reaching, especially because of the vast quantity of personal and other data the company has access to across UK government departments. One source said: “Palantir does not need to own the data or even have stewardship. They can extract, transform and exploit the metadata to build their own rich picture.”

A second source, who has a background in intelligence, said Palantir probably has “a complete profile on the whole UK population. They have visibility into wildly different focus areas, yet their data is all condensed into one foreign supplier’s control/visibility. At the very least I’d call that a security risk.”

Further, the sources claim that Palantir can see far more information than the government realises.
Palantir can aggregate data from across different government datasets to generate top secret information, the Nerve has been told. One source described a hypothetical example where Palantir could combine three pieces of unclassified information to determine the location of a nuclear submarine. They said: “A parcel is sent out by a defence supplier with a Nato part number, an address and an arrival date. Even if the label is a QR code and isn’t human-readable, the data it contains would allow Palantir to know that a nuclear submarine would be in Diego Garcia on 4 April. Those three bits of information — the part number, Diego Garcia and 4 April — are, individually, completely unclassified. Together, they are secret.”

Duncan McCann, head of tech and data at the Good Law Project, called the information “potentially explosive”. What the revelations show, he says, is that the UK “has given a private company such detailed access to our national security data that they can themselves infer things that they just aren’t supposed to know. For whose benefit is this?”

The MoD did not respond to the Nerve’s press inquiries. However, in January, defence minister Luke Pollard told the House of Commons: “All data used and developed in Palantir’s software deployed across the Ministry of Defence will remain under the ownership of the MoD. We have clear contractual controls in place to ensure this as well as control over the data system that Palantir software sits upon.”

The senior systems engineer said this statement was beside the point. He said: “Whether or not the UK technically owns the data is almost irrelevant.
That’s like reading a secret love letter and saying the secrets in it are safe, just because you’ve promised never to copy it word for word or take it out of the room.

“When you have that mosaic built from UK sovereign defence, health, roads, power networks, power stations, and our major industrial bases, you have a detailed understanding of virtually every aspect of the sovereign United Kingdom. For an adversary, or even a nation with whom we have a special relationship, that picture is worth more than all the fine art on Earth.”

A spokesman for Palantir said: “These entirely false claims have no grounding in fact and no serious media outlet would report them.”

Concerns about Palantir’s central role in the UK’s critical infrastructure have heightened since January, when President Trump threatened Greenland, a territory of Denmark, a Nato ally. Palantir is also being used in America to profile and target immigrants for removal by Immigration and Customs Enforcement officers. Palantir’s work with the US Department of Government Efficiency (Doge) helped synthesise tax and Homeland Security data to give Palantir access to the US’s first searchable citizen database, sparking cybersecurity concerns. It also underpins the AI systems being used by the US military in Venezuela, Gaza and the current operations in Iran.
Martin Wrigley, a Liberal Democrat MP and member of the science, innovation and technology Commons select committee, said: “The UK needs sovereign capabilities in sovereign hands, and we need to remove companies closely associated with foreign political organisations that are a risk.”

Jim Killock, executive director of the Open Rights Group, said: “If the US has detailed insights across everything that the MoD does, then in the event of us being recalcitrant about helping the US bomb some country, they can remind us — subtly or unsubtly — what they might do in retaliation.

“The Ministry of Defence or the prime minister must have some inkling of the risks, but now we find ourselves hitched to an erratic, dangerous, megalomaniac power in denial of its own limits. If Palantir knows everything, it just gives them huge extra leverage.”

In contrast to the MoD, the Swiss army rejected Palantir’s technology, despite numerous pitches that included an approach by the head of Palantir UK, Louis Mosley. A key concern of the army, according to an official report seen by Swiss outlet Republik, was “the possibility that sensitive data could be accessed by the US government and intelligence services”.

The Nerve has identified a previous case in which Palantir claimed proprietary rights to data insights after its contract was cancelled. In the early 2010s, the New York Police Department contracted Palantir to help find high-profile targets using data scraping and analysis. In 2017, it cancelled the contract, but Palantir claimed its platforms — Gotham and Foundry, the same systems used inside the UK government — created a unique ecosystem that sat on top of NYPD data.
That meant any analysis derived from those platforms was, they claimed, Palantir’s intellectual property. As Buzzfeed reported at the time: “The emerging dispute is not over the data that the NYPD has fed into Palantir’s software, but over the analysis that the software has produced — all the insights.” This suggests Palantir could generate insights from UK government data that in turn could be deemed Palantir’s intellectual property.

According to McCann, a hallmark of almost every Palantir engagement is “an attempt at secrecy, obfuscation — a real difficulty in finding out what is actually going on. We really don’t have the legal architecture to defend against this kind of big tech. The regulators are asleep at the wheel.”

The Nerve is a fearless, independent media title launched by five former Guardian/Observer journalists: investigative journalist Carole Cadwalladr, editors Sarah Donaldson, Jane Ferguson and Imogen Carter and creative director Lynsey Irvine. We cover culture, politics and tech, brought to you in twice-weekly newsletters on Tuesdays and Fridays (sign up ). We rely on funding from our community, so please also consider joining us as a paying member. You can read more about our mission

...

Read the original on www.thenerve.news »

5 462 shares, 23 trendiness

How I write software with LLMs


Lately I’ve gotten heavily back into making stuff, and it’s mostly because of LLMs. I thought that I liked programming, but it turned out that what I liked was making things, and programming was just one way to do that. Since LLMs have become good at programming, I’ve been using them to make stuff nonstop, and it’s very exciting that we’re at the beginning of yet another entirely unexplored frontier.

There’s a lot of de­bate about LLMs at the mo­ment, but a few friends have asked me about my spe­cific work­flow, so I de­cided to write it up in de­tail, in the hopes that it helps them (and you) make things more eas­ily, quickly, and with higher qual­ity than be­fore.

I’ve also in­cluded a real (annotated) cod­ing ses­sion at the end. You can go there di­rectly if you want to skip the work­flow de­tails.

For the first time ever, around the re­lease of Codex 5.2 (which feels like a cen­tury ago) and, more re­cently, Opus 4.6, I was sur­prised to dis­cover that I can now write soft­ware with LLMs with a very low de­fect rate, prob­a­bly sig­nif­i­cantly lower than if I had hand-writ­ten the code, with­out los­ing the ben­e­fit of know­ing how the en­tire sys­tem works. Before that, code would quickly de­volve into un­main­tain­abil­ity af­ter two or three days of pro­gram­ming, but now I’ve been work­ing on a few pro­jects for weeks non-stop, grow­ing to tens of thou­sands of use­ful lines of code, with each change be­ing as re­li­able as the first one.

I also no­ticed that my en­gi­neer­ing skills haven’t be­come use­less, they’ve just shifted: I no longer need to know how to write code cor­rectly at all, but it’s now mas­sively more im­por­tant to un­der­stand how to ar­chi­tect a sys­tem cor­rectly, and how to make the right choices to make some­thing us­able.

On pro­jects where I have no un­der­stand­ing of the un­der­ly­ing tech­nol­ogy (e.g. mo­bile apps), the code still quickly be­comes a mess of bad choices. However, on pro­jects where I know the tech­nolo­gies used well (e.g. back­end apps, though not nec­es­sar­ily in Python), this has­n’t hap­pened yet, even at tens of thou­sands of SLoC. Most of that must be be­cause the mod­els are get­ting bet­ter, but I think that a lot of it is also be­cause I’ve im­proved my way of work­ing with the mod­els.

One thing I’ve no­ticed is that dif­fer­ent peo­ple get wildly dif­fer­ent re­sults with LLMs, so I sus­pect there’s some el­e­ment of how you’re talk­ing to them that af­fects the re­sults. Because of that, I’m go­ing to drill very far down into the weeds in this ar­ti­cle, go­ing as far as post­ing ac­tual ses­sions, so you can see all the de­tails of how I de­velop.

Another point that should be men­tioned is that I don’t know how mod­els will evolve in the fu­ture, but I’ve no­ticed a trend: In the early days of LLMs (not so much with GPT-2, as that was very lim­ited, but with davinci on­wards), I had to re­view every line of code and make sure that it was cor­rect. With later gen­er­a­tions of LLMs, that went up to the level of the func­tion, so I did­n’t have to check the code, but did have to check that func­tions were cor­rect. Now, this is mostly at the level of general ar­chi­tec­ture”, and there may be a time (next year) when not even that is nec­es­sary. For now, though, you still need a hu­man with good cod­ing skills.

What I’ve built this way

I’ve built quite a few things re­cently, and I want to list some of them here be­cause a com­mon crit­i­cism of LLMs is that peo­ple only use them for toy scripts. These pro­jects range from se­ri­ous daily dri­vers to art pro­jects, but they’re all real, main­tained pro­jects that I use every day:

The largest thing I’ve built lately is an al­ter­na­tive to OpenClaw that fo­cuses on se­cu­rity. I’ve wanted an LLM per­sonal as­sis­tant for years, and I fi­nally got one with this. Here, most peo­ple say but you can’t make LLMs se­cure!”, which is mis­un­der­stand­ing that se­cu­rity is all about trade­offs, and that what my agent tries to do is max­i­mize se­cu­rity for a given amount of us­abil­ity. I think it suc­ceeds very well, I’ve been us­ing it for a while now and re­ally like the fact that I can rea­son ex­actly about what it can and can’t do.

It man­ages my cal­en­dar and in­tel­li­gently makes de­ci­sions about my avail­abil­ity or any clashes, does re­search for me, ex­tends it­self by writ­ing code, re­minds me of all the things I used to for­get and man­ages chores au­tonomously, etc. Assistants are some­thing that you can’t re­ally ex­plain the ben­e­fit of, be­cause they don’t have one killer fea­ture, but they al­le­vi­ate a thou­sand small pa­per cuts, pa­per cuts which are dif­fer­ent for each per­son. So, try­ing to ex­plain to some­one what’s so good about hav­ing an as­sis­tant ends up get­ting a re­ac­tion of but I don’t need any of the things you need” and misses the point that every­one needs dif­fer­ent things, and an agent with ac­cess to tools and the abil­ity to make in­tel­li­gent de­ci­sions to solve prob­lems is a great help for any­one.

I’m plan­ning to write this up in more de­tail soon, as there were some very in­ter­est­ing chal­lenges when de­sign­ing it, and I like the way I solved them.

Maybe my nam­ing re­cently has­n’t been stel­lar, but this is a small pen­dant that records voice notes, tran­scribes them, and op­tion­ally POSTs them to a web­hook of your choice. I have it send the voice notes to my LLM, and it feels great to just take the thing out of my pocket at any time, press a but­ton, and record a thought or ask a ques­tion into it, and know that the an­swer or todo will be there next time I check my as­sis­tan­t’s mes­sages.

It’s a sim­ple thing, but the use­ful­ness comes not so much from what it does, but from the way it does it. It’s al­ways avail­able, al­ways re­li­able, and with zero fric­tion to use.

I’m plan­ning to write some­thing about this too, but this one is more of an art piece: It’s a tick­ing wall clock that ticks sec­onds ir­reg­u­larly, but is al­ways ac­cu­rate to the minute (with its time get­ting synced over the in­ter­net). It has var­i­ous modes, one mode has vari­able tick tim­ing, from 500 ms to 1500 ms, which is de­light­fully in­fu­ri­at­ing. Another mode ticks im­per­cep­ti­bly more quickly than a sec­ond, but then pauses for a sec­ond ran­domly, mak­ing the un­sus­pect­ing ob­server ques­tion their san­ity. Another one races to :59 at dou­ble speed and then waits there for thirty sec­onds, and the last one is sim­ply a nor­mal clock, be­cause all the ir­reg­u­lar tick­ing dri­ves me crazy.
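For the variable-tick mode, one way to stay accurate to the minute despite irregular ticks is to draw random intervals and then rescale them so each minute sums to exactly 60 seconds. This is only a guess at the approach, since the post doesn't show the clock's firmware; a sketch in Python:

```python
import random

def jittered_tick_schedule(low=0.5, high=1.5, ticks=60):
    """Draw `ticks` random intervals in [low, high] seconds, then rescale
    them so a full minute still sums to exactly 60 s. Purely illustrative:
    the real clock's code is not shown in the post."""
    raw = [random.uniform(low, high) for _ in range(ticks)]
    scale = 60.0 / sum(raw)
    return [d * scale for d in raw]
```

Each individual tick lands somewhere between roughly half a second and a second and a half, but the minute hand never drifts, which matches the "delightfully infuriating" effect described above.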

Pine Town is a whim­si­cal in­fi­nite mul­ti­player can­vas of a meadow, where you get your own lit­tle plot of land to draw on. Most peo­ple draw… ques­tion­able con­tent, but once in a while an adult will visit and draw some­thing nice. Some draw­ings are real gems, and it’s gen­er­ally fun scrolling around to see what peo­ple have made.

I’ve made all these pro­jects with LLMs, and have never even read most of their code, but I’m still in­ti­mately fa­mil­iar with each pro­jec­t’s ar­chi­tec­ture and in­ner work­ings. This is how:

For the har­ness, I use OpenCode. I re­ally like its fea­tures, but ob­vi­ously there are many choices for this, and I’ve had a good ex­pe­ri­ence with Pi as well, but what­ever har­ness you use, it needs to let you:

Use mul­ti­ple mod­els from dif­fer­ent com­pa­nies.

Most first-party har­nesses (Claude Code, Codex CLI, Gemini CLI) will fail this, as com­pa­nies only want you to use their mod­els, but this is nec­es­sary.

Define cus­tom agents that can au­tonomously call each other.

There are var­i­ous other nice-to-haves, such as ses­sion sup­port, work­tree man­age­ment, etc, that you might want to have de­pend­ing on your pro­ject and tech stack, but those are up to you. I’ll ex­plain the two re­quire­ments above, and why they’re nec­es­sary.

You can con­sider a spe­cific model (e.g. Claude Opus) as a per­son. Sure, you can start again with a clean con­text, but the model will mostly have the same opin­ions/​strengths/​weak­nesses as it did be­fore, and it’s very likely to agree with it­self. This means that it’s fairly use­less to ask a model to re­view the code it just wrote, as it tends to mostly agree with it­self, but it also means that get­ting a dif­fer­ent model to re­view the code will lead to a big im­prove­ment. Essentially, you’re get­ting a re­view from a sec­ond set of eyes.

Different mod­els will have dif­fer­ent strengths and weak­nesses here. For ex­am­ple (and this is very spe­cific to to­day’s mod­els), I find Codex 5.4 pretty nit­picky and pedan­tic. This is­n’t some­thing I want when I want to get code writ­ten, but it def­i­nitely is some­thing I want for a re­view. The de­ci­sions Opus 4.6 makes cor­re­late quite well with the de­ci­sions I would have made, and Gemini 3 Flash (yes, Flash!) has even been very good at com­ing up with so­lu­tions that other mod­els did­n’t see.

Everyone has a dif­fer­ent opin­ion on what model suits them for which job, and mod­els tend to al­ter­nate (e.g. I used Codex as my main model back in November, switch­ing back to Opus later). To get the best re­sults, you need a mix of all of them.

Agents that call each other

The work­flow I use con­sists of dif­fer­ent agents, and if the har­ness does­n’t have the abil­ity to let agents talk to each other, you’ll be do­ing a lot of an­noy­ing fer­ry­ing of in­for­ma­tion be­tween LLMs. You prob­a­bly want to cut down on that, so this is a very use­ful fea­ture.

My work­flow con­sists of an ar­chi­tect, a de­vel­oper, and one to three re­view­ers, de­pend­ing on the im­por­tance of the pro­ject. These agents are con­fig­ured as OpenCode agents (basically skill files, files with in­struc­tions for how I want each agent to be­have).

I use mul­ti­ple agents (rather than us­ing just one for every­thing) for three rea­sons:

It al­lows me to use an ex­pen­sive model (Opus) for plan­ning and gen­er­at­ing a de­tailed plan but a cheap one (Sonnet) for the ac­tual writ­ing of the code.

That just saves on to­kens ver­sus hav­ing Opus do the whole thing.

It lets me use dif­fer­ent mod­els to re­view the code, which does in­crease qual­ity be­cause dif­fer­ent mod­els will catch dif­fer­ent things when re­view­ing.

It lets me use agents with different capabilities (e.g. one might have read-only access to the code, whereas another might have write access).

I don’t think us­ing two agents with the same model and the same ca­pa­bil­i­ties makes much sense, as I think it’s some­thing like one per­son pre­tend­ing to wear dif­fer­ent hats, but I haven’t stud­ied this ex­ten­sively.

I also tend to write the skill files by hand, as I find it doesn’t really help to ask the LLM to write a skill. It would be like asking someone to write up instructions on how to be a great engineer, then giving them their own instructions and saying “here’s how to be a great engineer, now be one”. It obviously won’t really make them better, so I try to write the instructions myself.

If you want to try this your­self, you can down­load my agent files.

The ar­chi­tect (Claude Opus 4.6, cur­rently) is the only agent I in­ter­act with. This needs to be a very strong model, typ­i­cally the strongest model I have ac­cess to. This step does­n’t con­sume too many to­kens, as it’s mostly chat, but you want this to be very well-rea­soned.

I’ll tell the LLM my main goal (which will be a very spe­cific fea­ture or bug­fix e.g. I want to add re­tries with ex­po­nen­tial back­off to Stavrobot so that it can retry if the LLM provider is down”), and talk to it un­til I’m sure it un­der­stands what I want. This step takes the most time, some­times even up to half an hour of back-and-forth un­til we fi­nal­ize all the goals, lim­i­ta­tions, and trade­offs of the ap­proach, and agree on what the end ar­chi­tec­ture should look like. It re­sults in a rea­son­ably low-level plan, with a level of de­tail of in­di­vid­ual files and func­tions. For ex­am­ple, tasks might be I’ll add ex­po­nen­tial back­off to these three code­paths of these two com­po­nents in this file, as no other com­po­nent talks to the LLM provider”.
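To make the backoff example concrete, here is a minimal retry-with-exponential-backoff sketch in Python. It is hypothetical: Stavrobot's actual code isn't shown in the post, and the function name, attempt limits, and jitter scheme here are all assumptions.

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base=1.0, cap=30.0):
    """Call fn(); on failure, sleep base * 2**attempt seconds (capped at
    `cap`, plus random jitter) and retry, re-raising after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: let the caller see the failure
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay / 2))
```

The jitter spreads retries out so that many clients recovering from the same provider outage don't all hammer it again at the same instant.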

I know that some peo­ple in this step pre­fer to have the LLM write out the plan to a file, and then they add their feed­back to that file in­stead of talk­ing to the LLM. This is a mat­ter of per­sonal pref­er­ence, as I can see both ap­proaches work­ing equally well, so feel free to do the re­views that way if it suits you more. Personally, I pre­fer chat­ting to the LLM.

To clar­ify, in this step I’m not just prompt­ing, I’m shap­ing the plan with the help of the LLM. I still have to cor­rect the LLM a lot, ei­ther be­cause it’s wrong or sim­ply be­cause it’s not do­ing things the way I’d do them, and that’s a big part of my con­tri­bu­tion, as well as the part I get joy from. This di­rec­tion is what lets me call pro­jects mine, be­cause some­one else us­ing the same LLM would have come up with a dif­fer­ent thing.

When I’m sat­is­fied that we’ve ironed out all the kinks (the LLM is very help­ful at this, ask­ing ques­tions for what it does­n’t know yet and giv­ing me op­tions), I can fi­nally ap­prove the plan. I’ve asked the ar­chi­tect to not start any­thing un­til I ac­tu­ally say the word approved”, as a few mod­els tend to be overea­ger and go off to start the im­ple­men­ta­tion when they feel like they un­der­stood, whereas I want to make sure I’m con­fi­dent it un­der­stood.

Then, the ar­chi­tect will split the work into tasks, and write each task out into a plan file, usu­ally in more de­tail (and at a lower level) than our chat, and call the de­vel­oper to start work. This gives the de­vel­oper con­crete di­rec­tion, and min­i­mizes the high-level choices the de­vel­oper can make, as the choices have al­ready been made for it.

The de­vel­oper can be a weaker, more to­ken-ef­fi­cient model (I use Sonnet 4.6). The plan should­n’t give it much lee­way into what it can do, and its job is strictly to im­ple­ment the changes in the plan. When it’s done, it calls the re­view­ers to re­view its work.

Each re­viewer will in­de­pen­dently look at the plan and diff of the fea­ture that was just im­ple­mented, and cri­tique it. For this step, I will al­ways use at least Codex, some­times I’ll add Gemini, and on im­por­tant pro­jects I’ll add Opus as well.

This feed­back goes back to the de­vel­oper, which ei­ther in­te­grates it, if the re­view­ers agree, or it es­ca­lates to the ar­chi­tect when the re­view­ers dis­agree. I’ve found that Opus is very good at choos­ing the right feed­back to im­ple­ment, some­times ig­nor­ing feed­back be­cause it’s too pedan­tic (i.e. hard to im­ple­ment and un­likely to be a prob­lem in prac­tice). Obviously, when I use ob­jec­tive as­sess­ments like very good”, I re­ally mean I agree with it a lot”.
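The loop just described, where the developer implements, the reviewers critique independently, and the architect arbitrates on disagreement, could be sketched roughly as follows. This is a hypothetical illustration with made-up callables, not OpenCode's actual API:

```python
def run_task(plan, develop, reviewers, arbitrate):
    """develop(...) turns a plan or feedback into a diff; each reviewer
    returns (approved, note); arbitrate(...) picks which feedback to
    act on when the reviewers disagree with each other."""
    diff = develop(plan)
    while True:
        reviews = [review(diff) for review in reviewers]
        if all(ok for ok, _ in reviews):
            return diff  # every reviewer approved
        notes = [note for ok, note in reviews if not ok]
        if len({ok for ok, _ in reviews}) > 1:  # split verdict: escalate
            notes = arbitrate(notes)
        diff = develop(notes)  # developer integrates the chosen feedback
```

The key design choice this captures is that the developer never decides which conflicting feedback wins; that judgment stays with the (stronger) architect model.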

This way of working means that I still know every choice that was made above the function level, and can use that knowledge in subsequent runs. I often notice the LLM recommend things that might be good in another codebase, but either won’t work or are suboptimal in my codebase, which shows that the LLM has some blind spots when researching the code. I will often say “no, you should do this using Y”, at which point the LLM realizes that Y actually exists in the code and is a better way than the one it recommended.

On the flip side, when I’m not familiar enough with the technology to be on top of the architecture, I tend not to catch bad decisions that the LLM makes. This leads to the LLM building more and more on top of those bad decisions, eventually getting into a state where it can’t untangle the mess. You know this is happening when you keep telling the LLM the code doesn’t work, and it says “I know why! Let me fix it” and keeps breaking things more and more.

That’s a real fail­ure mode that has hap­pened to me too many times now, which is why I ended up with this work­flow. For this rea­son, I try to un­der­stand things as much as I can while plan­ning, even if I’m un­fa­mil­iar with the spe­cific tech­nol­ogy. If I man­age to steer the LLM well, it saves a lot of trou­ble later on.

Here’s an annotated transcript from a real session where I add email support to Stavrobot. I’ve trimmed the tool calls and verbose bits, but the conversation and decision-making process are exactly as they happened.

I start by telling the LLM what I’d like to im­ple­ment, at a very high level. Sometimes I’ll give it more de­tail, es­pe­cially if I al­ready have an idea of how I want the im­ple­men­ta­tion done.

Me: I’d like to add email support to this bot. Let’s think through how we would do this.

The bot reads the code and finds all the relevant bits, and asks some questions.

LLM: Good overview acquired. Here’s what I see and the questions we need to resolve:

## Current pattern

All channels follow the same flow: inbound webhook/listener → enqueueMessage(message, source, sender, attachments) → LLM processes → LLM calls send_*_message tool to reply. Channels are either in-process (Telegram webhooks, WhatsApp via Baileys) or separate containers (Signal bridge). Email would be a new channel following this same pattern.

## Key design questions

1. Inbound — how does email arrive?
   - IMAP polling (simplest — poll a mailbox on a timer, works with any provider)
   - Webhook from a transactional email service (SendGrid/Mailgun inbound parse — requires that service)
   - Run an SMTP server (most complex, requires DNS/MX setup)
2. Outbound — how do we send?
   - SMTP client (works with any provider — Gmail, Fastmail, self-hosted)
   - Transactional email API (SendGrid/Mailgun API)
3. What’s your actual use case? This affects several decisions:
   - Full bidirectional conversation (reply to emails, maintain threads)?
   - One-way notifications from the bot?
   - Processing incoming emails (e.g., forwarded emails, digests)?
4. Architecture — separate container or in-process? Signal uses a bridge container because signal-cli is a heavy Java process. Email libraries are lightweight, so in-process (like Telegram) is viable. But the AGENTS.md rule about not adding runtimes to the app container doesn’t apply here since it’s just TypeScript/Node.
5. Email-specific concerns:
   - How to handle HTML emails? Strip to plain text? Pass a markdown conversion to the LLM?
   - Thread tracking — match replies to conversations by In-Reply-To/References headers, or treat each email

...

Read the original on www.stavros.io »

6 395 shares, 15 trendiness

Nasdaq's Shame

Disclaimer: The following article is for informational purposes only and represents the opinion of the author. It should not be construed as investment or any other kind of advice. All information is based on publicly available data believed to be accurate as of the date of publication, but the author makes no representations or warranties as to its completeness or accuracy. Readers should conduct their own research and due diligence, and should not rely on anonymous goofballs on the internet for anything. Thank you for your attention to this matter.

“When buying and selling are controlled by legislation, the first things to be bought and sold are legislators.” - P. J. O’Rourke

I’m gen­er­ally a fan of in­dex in­vest­ing for most peo­ple. Historically, in­dex­ing was a bril­liant, low-cost way for in­vestors to free-ride on the price dis­cov­ery done by ac­tive man­agers (longs and the oc­ca­sional evil lad­der-at­tack­ing short seller). The mar­ket did the hard work of fig­ur­ing out what a com­pany was worth, and the in­dex fund sim­ply bought a rep­re­sen­ta­tive slice of the mar­ket. The in­dex was the tail; price dis­cov­ery was the dog.

Today, the tail is vi­o­lently wag­ging the dog. And the dog has the shits. Trillions of dol­lars are blindly slosh­ing around in pas­sive funds, and in­dex in­clu­sion dic­tates mar­ket struc­ture rather than re­flect­ing it.

Nasdaq recently circulated a “Nasdaq-100 Index Consultation” (located here). They are officially seeking feedback from investors on proposed updates to their index methodology. But let’s be real: this “consultation” is Nasdaq-speak for letting us know what they are going to force feed us, like a baggy foie gras duck. A thinly-veiled blueprint for how to forcefully transfer wealth from the retirement accounts of passive retail investors directly into the pockets of corporate insiders and early investors.

Source: Google (not a shitco) Gemini Nano Banana 2

Why the sud­den urge to rewrite the rule­book? Because Elon Musk’s SpaceX is gear­ing up for an IPO with a re­ported tar­get val­u­a­tion of around $1.75 tril­lion. To win this lu­cra­tive list­ing over the NYSE, they ap­pear to be shame­lessly bend­ing the knee to a spe­cific de­mand by SpaceX (per Reuters) for near-im­me­di­ate in­dex in­clu­sion. This rule change will also give Nasdaq a leg up over the NYSE on sub­se­quent large IPOs (think OpenAI, or Anthropic as ex­am­ples).

Let’s look at the ac­tual pro­posed rules in this con­sul­ta­tion doc­u­ment, be­cause it is an ab­solute mas­ter­class in struc­tural mar­ket ma­nip­u­la­tion.

Currently, large companies newly listed via IPO aren’t added to the index in a timely manner. Nasdaq wants to “fix” this supposed problem that just mysteriously cropped up the moment SpaceX announced its IPO plans, with a “Fast Entry” rule. Under this rule, any newly listed company whose entire market capitalization ranks within the top 40 current constituents gets announced with at least five days’ notice and added to the index after just fifteen trading days.

Crucially, the com­pany will be en­tirely ex­empt from the stan­dard sea­son­ing and liq­uid­ity re­quire­ments.

Here is the smoking gavage gun. In the exact same document, Nasdaq proposes a new approach for including and weighting “low-float” securities, defined as those below 20% free float.

For context, a saner index like the S&P 500 is strictly “free-float adjusted,” meaning it only weights a company based on the shares actually available for the public to trade. Nasdaq, on the other hand, uses a methodology that factors in all shares, including locked-up insider shares that aren’t even for sale—and their proposed “fix” for this low-float problem is where the math gets truly diabolical.

Under this new pro­posal, each low-float se­cu­ri­ty’s weight will be me­chan­i­cally ad­justed to five times its free float per­cent­age, capped at 100%.

Source: Nasdaq

Let’s see how this ac­tu­ally works.

Assume SpaceX IPOs at a $1.75 tril­lion val­u­a­tion. To keep things sim­ple, as­sume the com­pany floats 5% of the shares to the pub­lic ($87.5 bil­lion worth of trad­able stock). The other 95% is sub­ject to a stan­dard lock-up pe­riod and can­not be traded.

Under Nasdaq’s proposed 5x multiplier rule, the company’s index weight would be calculated at 25% of its total market cap (5% float x 5), or about $438 billion, even though only 5% is available for trading. A 15% float would yield a 75% weighting, roughly $1.3 trillion.
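As a sanity check on those figures, the proposed weighting rule reduces to one line of arithmetic. This is a sketch of my reading of the consultation; the $1.75 trillion valuation is the article's assumption:

```python
# Weight factor under the proposed rule as described above:
# five times the free-float percentage, capped at 100%.

def index_weight_dollars(market_cap: float, float_pct: float) -> float:
    """Dollar index weight assigned to a low-float constituent."""
    inclusion_factor = min(5 * float_pct, 1.0)  # 5x multiplier, capped
    return market_cap * inclusion_factor

CAP = 1.75e12  # assumed SpaceX IPO valuation

print(index_weight_dollars(CAP, 0.05) / 1e9)   # 5% float: 25% weight, about $437.5B
print(index_weight_dollars(CAP, 0.15) / 1e12)  # 15% float: 75% weight, about $1.31T
print(index_weight_dollars(CAP, 0.25) / 1e12)  # 25% float: multiplier capped at full cap
```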

So, on Day 15, pas­sive Nasdaq-100 ETFs (like the QQQ) and mu­tual funds are math­e­mat­i­cally forced to buy al­lo­ca­tions of this stock as if it were, in the 5% ex­am­ple, a $438 bil­lion com­pany.

And here is the real kicker: those fig­ures as­sume the stock stays flat from the IPO price. It won’t. Active traders and hedge funds will ag­gres­sively front-run this guar­an­teed, price-in­sen­si­tive bid. If the stock ral­lies 40% in those first two weeks, pas­sive funds aren’t buy­ing based on the IPO price—they are buy­ing based on the Day 15 mar­ket price. They will be forced to blindly al­lo­cate cap­i­tal at what­ever in­sane price it trades up to.

The in­dex is ap­ply­ing a phan­tom, mega-dol­lar weight­ing to a re­stricted, tightly-held float. Tens of bil­lions of dol­lars of price-in­sen­si­tive, pas­sive cap­i­tal are legally man­dated to ag­gres­sively bid for the stock over a mat­ter of days. You are ef­fec­tively forc­ing a fire­hose of mega-cap in­dex cap­i­tal through a gar­den hose of ac­tual liq­uid­ity. It is a recipe for a mas­sive, ar­ti­fi­cial sup­ply-and-de­mand squeeze.

Some may argue, “But the lock-up period is 180 days! The initial index squeeze happens on Day 15. The market will have months to find real price discovery before the insiders can actually sell!”

Yes, the ini­tial Day 15 squeeze will in­evitably cool off. But by waiv­ing the sea­son­ing pe­riod and jam­ming this low-float be­he­moth into the in­dex im­me­di­ately, you have com­pletely cor­rupted the base­line. You’ve forced pas­sive in­dex­ers to buy at the ab­solute top of an en­gi­neered liq­uid­ity squeeze. You have es­tab­lished a ma­nip­u­lated, ar­ti­fi­cially el­e­vated price floor fu­eled by forced buy­ing.

For the next five months, the stock will be based on a highly dis­torted mar­ket struc­ture, dri­ven in part by con­tin­ued pas­sive in­flows (barring a mar­ket melt­down). And here is where the math gets truly sadis­tic.

Nasdaq’s pro­posed rules ex­plic­itly state that float fig­ures are only up­dated dur­ing sched­uled quar­terly re­bal­ances. And what hap­pens when a com­pa­ny’s float goes above 20%? The 5x mul­ti­plier is dropped, and the com­pany is up­graded to a full, 100% in­dex weight­ing.

So, if you are His Nibs, how do you play this? You time your lock-up ex­pi­ra­tion right be­fore a quar­terly in­dex re­bal­ance.

The moment the lock-up expires, the tradable float jumps from 5% or 15% (in our examples) to 100%. At the very next rebalance, Nasdaq updates the math, and the passive funds are mechanically forced to increase their allocation from that artificial 25-75% (again, in our examples) weighting up to the full 100%.
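The same arithmetic quantifies the rebalance jump: the gap between the multiplier-adjusted weight and the full weighting is the extra index weight passive trackers must chase once the float crosses 20%. Note this measures index weight in dollars, not actual fund flows, which scale with each fund's assets. Same hypothetical figures as before:

```python
# Extra index weight added at the rebalance when the 5x multiplier is
# dropped. Assumed $1.75T valuation; illustrative only.

def rebalance_jump(market_cap: float, pre_expiry_float: float) -> float:
    """Index weight added when the constituent moves to full weighting."""
    old_weight = market_cap * min(5 * pre_expiry_float, 1.0)
    new_weight = market_cap  # full weighting once float exceeds 20%
    return new_weight - old_weight

CAP = 1.75e12
print(rebalance_jump(CAP, 0.05) / 1e12)  # 25% -> 100%: about $1.31T more
print(rebalance_jump(CAP, 0.15) / 1e12)  # 75% -> 100%: about $0.44T more
```

The smaller the initial float, the larger this forced step-up, which is the article's point about timing the lock-up expiry.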

They are legally man­dated to ag­gres­sively buy bil­lions of dol­lars more of the stock the ex­act mo­ment the in­sid­ers are able to flood the mar­ket with their un­locked shares. Can you feel your baggy liver turn­ing to foie gras yet?

I’m not al­leg­ing any par­tic­u­lar wrong­do­ing by SpaceX and Nasdaq here. Why would­n’t SpaceX use its (legal) lever­age to push Nasdaq around? And Nasdaq is per­fectly en­ti­tled to feed what’s left of its cred­i­bil­ity into the Cathiewoodchipper.

I don’t think the com­pany is fo­cused on (legally) ex­tort­ing Nasdaq for early in­dex in­clu­sion be­cause of the ini­tial 15 days. I be­lieve they are fo­cused on the lockup ex­piry and en­sur­ing that it co­in­cides with a mas­sive forced bid. The smaller the ini­tial IPO float, the more pow­er­ful the re­bal­ance will be.

To hit the December 18, 2026 quar­terly re­bal­ance date for the Nasdaq 100, SpaceX needs to IPO around mid-June. You will no doubt be shocked to learn that, in­deed, SpaceX is tar­get­ing, ahem, mid-June for its IPO.

Source: New York Post, ref­er­enc­ing The Financial Times

To seemingly throw prospective baggies off the scent, “people familiar with the matter” have apparently leaked the purported reason for a mid-June IPO: The alignment of Jupiter and Venus.

Source: New York Post

Can’t make this shit up. Jupiter and Venus. F*ck me.

I would not at all be sur­prised to see the com­pany is­sue ad­di­tional shares into the forced in­dex buy­ing, po­ten­tially com­pet­ing with its own in­vestors to soak up that de­mand.

Nasdaq has promised to publish a “summary” of comments that it received during the comment period. Perhaps Nasdaq will, in its final determination, conclude that the integrity of its indexes is paramount, and not implement this change. Perhaps the SEC will step in and attempt to block it, to protect investors, right?

“If you’re playing a poker game and you look around the table and can’t tell who the sucker is, it’s you.” - Paul Newman

I wrote this ar­ti­cle my­self, and it ex­presses my own per­sonal opin­ions only. I am not re­ceiv­ing com­pen­sa­tion for it. I have no busi­ness re­la­tion­ship with any com­pany whose stock is men­tioned in this ar­ti­cle. It is not a rec­om­men­da­tion to buy, sell, short, hold, or avoid any se­cu­rity. It should not be re­lied upon for any pur­pose other than en­ter­tain­ment. Numbers and analy­sis pre­sented have not been proof-read or in­de­pen­dently ver­i­fied. Assume I am a goof­ball. I’ve been called worse by bet­ter than you. Despite my best ef­forts I make mis­takes. I do get it wrong some­times.

...

Read the original on keubiko.substack.com »

7 369 shares, 43 trendiness

US Job Market Visualizer

This is a re­search tool that vi­su­al­izes 342 oc­cu­pa­tions from the Bureau of Labor Statistics Occupational Outlook Handbook, cov­er­ing 143M jobs across the US econ­omy. Each rec­tan­gle’s area is pro­por­tional to to­tal em­ploy­ment. Color shows the se­lected met­ric — tog­gle be­tween BLS pro­jected growth out­look, me­dian pay, ed­u­ca­tion re­quire­ments, and AI ex­po­sure. Click any tile to view its full BLS page. This is not a re­port, a pa­per, or a se­ri­ous eco­nomic pub­li­ca­tion — it is a de­vel­op­ment tool for ex­plor­ing BLS data vi­su­ally.

LLM-powered coloring: The source code includes scrapers, parsers, and a pipeline for writing custom LLM prompts to score and color occupations by any criteria. You write a prompt, the LLM scores each occupation, and the treemap colors accordingly. The “Digital AI Exposure” option is one example — it estimates how much current AI (which is primarily digital) will reshape each occupation. But you could write a different prompt for any question — e.g. exposure to humanoid robotics, offshoring risk, climate impact — and re-run the pipeline to get a different coloring.
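Mechanically, that pipeline is: prompt in, JSON score out, color assigned. Here is a hypothetical, stubbed sketch; `call_llm`, the color ramp, and every name in it are illustrative, not the site's actual code:

```python
import json

def call_llm(prompt: str, occupation_text: str) -> str:
    # stub standing in for a real model call; a real pipeline would send
    # the scoring prompt plus the BLS occupation description
    return '{"exposure": 9, "rationale": "Fully digital work product."}'

def score_occupation(prompt: str, occupation_text: str) -> dict:
    result = json.loads(call_llm(prompt, occupation_text))  # expects bare JSON
    if not 0 <= result["exposure"] <= 10:
        raise ValueError("score out of range")
    return result

def exposure_color(score: float) -> str:
    """Map a 0-10 score onto a simple green-to-red hex ramp."""
    r = int(255 * score / 10)
    return f"#{r:02x}{255 - r:02x}00"

s = score_occupation("...scoring prompt...", "Software Developers")
print(s["exposure"], exposure_color(s["exposure"]))
```

Swapping the prompt (offshoring risk, robotics exposure) changes only the first argument; the plumbing stays the same.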

You are an ex­pert an­a­lyst eval­u­at­ing how ex­posed dif­fer­ent oc­cu­pa­tions are to AI. You will be given a de­tailed de­scrip­tion of an oc­cu­pa­tion from the Bureau of Labor Statistics.

Rate the oc­cu­pa­tion’s over­all AI Exposure on a scale from 0 to 10.

AI Exposure mea­sures: how much will AI re­shape this oc­cu­pa­tion? Consider both di­rect ef­fects (AI au­tomat­ing tasks cur­rently done by hu­mans) and in­di­rect ef­fects (AI mak­ing each worker so pro­duc­tive that fewer are needed).

A key sig­nal is whether the job’s work prod­uct is fun­da­men­tally dig­i­tal. If the job can be done en­tirely from a home of­fice on a com­puter — writ­ing, cod­ing, an­a­lyz­ing, com­mu­ni­cat­ing — then AI ex­po­sure is in­her­ently high (7+), be­cause AI ca­pa­bil­i­ties in dig­i­tal do­mains are ad­vanc­ing rapidly. Even if to­day’s AI can’t han­dle every as­pect of such a job, the tra­jec­tory is steep and the ceil­ing is very high. Conversely, jobs re­quir­ing phys­i­cal pres­ence, man­ual skill, or real-time hu­man in­ter­ac­tion in the phys­i­cal world have a nat­ural bar­rier to AI ex­po­sure.

Use these an­chors to cal­i­brate your score:

- 0–1: Minimal ex­po­sure. The work is al­most en­tirely phys­i­cal, hands-on, or re­quires real-time hu­man pres­ence in un­pre­dictable en­vi­ron­ments. AI has es­sen­tially no im­pact on daily work. Examples: roofer, land­scaper, com­mer­cial diver.

- 2–3: Low ex­po­sure. Mostly phys­i­cal or in­ter­per­sonal work. AI might help with mi­nor pe­riph­eral tasks (scheduling, pa­per­work) but does­n’t touch the core job. Examples: elec­tri­cian, plumber, fire­fighter, den­tal hy­gien­ist.

- 4–5: Moderate ex­po­sure. A mix of phys­i­cal/​in­ter­per­sonal work and knowl­edge work. AI can mean­ing­fully as­sist with the in­for­ma­tion-pro­cess­ing parts but a sub­stan­tial share of the job still re­quires hu­man pres­ence. Examples: reg­is­tered nurse, po­lice of­fi­cer, vet­eri­nar­ian.

- 6–7: High ex­po­sure. Predominantly knowl­edge work with some need for hu­man judg­ment, re­la­tion­ships, or phys­i­cal pres­ence. AI tools are al­ready use­ful and work­ers us­ing AI may be sub­stan­tially more pro­duc­tive. Examples: teacher, man­ager, ac­coun­tant, jour­nal­ist.

- 8–9: Very high ex­po­sure. The job is al­most en­tirely done on a com­puter. All core tasks — writ­ing, cod­ing, an­a­lyz­ing, de­sign­ing, com­mu­ni­cat­ing — are in do­mains where AI is rapidly im­prov­ing. The oc­cu­pa­tion faces ma­jor re­struc­tur­ing. Examples: soft­ware de­vel­oper, graphic de­signer, trans­la­tor, data an­a­lyst, para­le­gal, copy­writer.

- 10: Maximum ex­po­sure. Routine in­for­ma­tion pro­cess­ing, fully dig­i­tal, with no phys­i­cal com­po­nent. AI can al­ready do most of it to­day. Examples: data en­try clerk, tele­mar­keter.

Respond with ONLY a JSON ob­ject in this ex­act for­mat, no other text:

{"exposure": <0-10>, "rationale": "<2-3 sentences explaining the key factors>"}

Caveat on Digital AI Exposure scores: These are rough LLM es­ti­mates, not rig­or­ous pre­dic­tions. A high score does not pre­dict the job will dis­ap­pear. Software de­vel­op­ers score 9/10 be­cause AI is trans­form­ing their work — but de­mand for soft­ware could eas­ily grow as each de­vel­oper be­comes more pro­duc­tive. The score does not ac­count for de­mand elas­tic­ity, la­tent de­mand, reg­u­la­tory bar­ri­ers, or so­cial pref­er­ences for hu­man work­ers. Many high-ex­po­sure jobs will be re­shaped, not re­placed.

...

Read the original on karpathy.ai »

8 315 shares, 12 trendiness

LLMs can be absolutely exhausting

Some days I get in bed after a tortuous 4-5 hour session working with Claude or Codex wondering what the heck happened. It’s easy to blame the model - there are so many options to choose from:

They’re dumb­ing down the model to save money.

It’s not un­com­mon for me to come back to the prob­lem the next day, my own con­text win­dow cleared from rest, and find a fast and ful­fill­ing path for­ward with the help of the LLM. What’s go­ing on?

This one seems pretty obvious. If I am becoming mentally fatigued, I will write worse prompts, and because of that, the AI will do a worse job. Here’s an example of what happens when I’m really tired: Kick off a somewhat meaty prompt (after 30% of context was used to align with the AI on the problem), realize right after submitting that I missed some key context, interrupt the LLM, provide the context, and then have it proceed. Without a doubt, interrupting Claude Code or “steering” in Codex leads to worse outcomes.

Some of the work I’m doing right now requires parsing some large files. There are bugs in that parsing logic that I’m trying to work through with the LLM. The problem is, every tweak requires re-parsing and it’s a slow process. I liken it to a slot machine that takes 10 minutes to spin. To add insult to injury, some of these tasks take quite a bit of context to get rolling on a new experiment, and by the end of the parsing job, the LLM is 2% away from compaction. That then leads to either a very dumb AI or an AI that is pretending to know what’s going on with the recent experiment once it’s complete.

If I reach the point where I am not get­ting joy out of writ­ing a great prompt, then it’s time to throw in the towel. That has to be the first sig­nal. If I’m half-ass­ing it, be­ing short, in­ter­rupt­ing, and get­ting frus­trated, then time to take a break.

There’s some metacog­ni­tion that needs to take place here. Am I be­ing less de­scrip­tive be­cause I haven’t ac­tu­ally thought through this prob­lem and I’m hop­ing the AI will just fill the gaps? That can be a very se­duc­tive trap to fall into. AIs are get­ting quite good at fill­ing in un­de­fined re­quire­ments, some­thing that I re­mem­ber hav­ing to do as a soft­ware en­gi­neer my­self, but they’re not good enough yet.

There are times I write a prompt with so much clarity about my desired end-state that I’m already celebrating the end result when I submit the prompt, because I know the AI is going to CRUSH IT. That’s the feeling I need to look for in every prompt. If it’s more a feeling of uncertainty or impatience, it’s just not going to pan out.

In the case of my pars­ing prob­lem I men­tioned above, it was too slow and the feed­back loop was painful. I want my slot ma­chine to take sec­onds/​min­utes to spin, not 15/20/30 min­utes. In these cases, I’ve started to spin up a new ses­sion with the LLM, lay out my prob­lem with feed­back loop speed, ex­press my de­sire to get to a sub 5-minute loop, give it an ex­am­ple of a fail­ure case, and ask it to re­pro­duce that fail­ure case as quickly as pos­si­ble. This is start­ing to sound fa­mil­iar … TDD any­one?

I was al­ways the scrappy en­gi­neer. Sure I wrote tests, but I was never one to stop and cre­ate elab­o­rate test cases or in­te­gra­tion tests for be­spoke prob­lems. That was too time con­sum­ing, and also, I was get­ting paid even if my feed­back loop was­n’t per­fect.

It’s been quite the journey to fight that feeling that writing elaborate tests is time-consuming when working with AI. If you give an LLM clear success criteria: “Reproduce this specific failure case and make sure the clock time is less than 5 minutes to do it. Feel free to experiment with ways to optimize the code path or omit certain pieces that are unnecessary to reproduce” - the AI will not only reproduce the problem (maybe slowly the first time), but it will create levers for a faster feedback cycle. With that fast feedback cycle, it will consume less context and be SMARTER. This can seriously save hours of debugging time.
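The pattern is easiest to see with a toy example. The parser and its bug below are invented for illustration; the point is the shape of the ask: distill the slow failure into a fixture that fails in milliseconds.

```python
# Toy stand-in for a slow parsing pipeline: the bug (choking on a
# record with no value) is invented, but the repro pattern is the one
# described above -- a minimal input that triggers the exact failure.

def parse_record(line: str) -> dict:
    key, sep, value = line.partition("=")
    if not sep or not value:
        raise ValueError(f"malformed record: {line!r}")
    return {key: value}

def test_reproduces_failure_fast() -> bool:
    bad_line = "timeout_ms"  # distilled from the huge file that took 10 minutes
    try:
        parse_record(bad_line)
    except ValueError:
        return True  # bug reproduced instantly; now iterate on the fix
    return False

assert test_reproduces_failure_fast()
print("failure reproduced in milliseconds")
```

Once a repro like this exists, each spin of the slot machine takes seconds, and the LLM burns far less context per attempt.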

When I am exhausted from working with an LLM - it might actually be a “skill issue”. I need to recognize when I’m tired and entering the doom-loop psychosis. Cognitive outsourcing of requirements is seductive, but it’s a trap. If I’m not enjoying the act of writing the perfect prompt and absolutely confident I will return to a result I’m 95% happy with, I need to either take a break or ponder if I’ve really thought through the problem. If things are moving slow and it feels as though context is filling up too quickly - I need to make that the problem to solve. Find a path, with the help of the LLM, to iterate faster and use up less context.

You can sub­scribe to my blog here or via RSS feed.

...

Read the original on tomjohnell.com »

9 301 shares, 36 trendiness

Why I Love FreeBSD

When I first laid eyes on the FreeBSD Handbook, back in 2002, I could­n’t be­lieve what I was see­ing. Six years of Linux, a re­la­tion­ship I’ve writ­ten about else­where, across var­i­ous dis­tri­b­u­tions, had trained me to hunt for doc­u­men­ta­tion in frag­ments: of­ten in­com­plete, of­ten out­dated, some­times al­ready stale af­ter barely a year. Here was an op­er­at­ing sys­tem that came with a com­plete, ac­cu­rate, up-to-date (as much as pos­si­ble), de­tailed man­ual. I was al­ready a con­vinced be­liever in Open Source, but I found my­self rea­son­ing in very prac­ti­cal terms: if the team be­hind this OS puts this much care into its doc­u­men­ta­tion, imag­ine how solid the sys­tem it­self must be. And so I de­cided to give it a try. I had a Sony Vaio with no room for a dual boot. I synced every­thing to a desk­top ma­chine with more space, took a breath, and made a de­ci­sion: I’d in­stall FreeBSD on that lap­top and re­in­stall Linux when the ex­per­i­ment was over.

At the time I had no idea that this ex­per­i­ment would shape the way I de­sign and run sys­tems for the next twenty years.

I re­al­ized al­most im­me­di­ately that GNU/Linux and FreeBSD were so sim­i­lar they were com­pletely dif­fer­ent.

The Unix inspiration was the same, but everything worked differently - and the impression was that FreeBSD was distinctly more mature, less chaotic, more focused. A magnificent cathedral - a form then widely criticized in the circles I moved in - but one that had certain undeniable virtues. Back then I compiled the entire system from source, and I noticed right away that performance was better on that hardware than Linux had ever been. Not only that: Linux would overheat and produce unpredictable results - errors, sudden shutdowns, fans screaming even after compilation finished. My Linux friends continued to insist it was a “hardware problem”, but FreeBSD handled the load far more gracefully. I could read my email in mutt while compiling, something that was practically impossible on Linux, which would slow to a crawl. The fans would settle within seconds of the load ending, and the system felt genuinely more responsive. I never experienced a crash. I was running KDE on all my systems at the time, and the experience on FreeBSD was noticeably superior - more consistent and steady performance, none of the micro-freezes I’d come to accept on Linux, greater overall stability. The one drawback: I compiled everything, including KDE. I was a university student and couldn’t leave my laptop in another room - the risk of an “incident” involving one of my flatmates was too real - so I kept it within arm’s reach, night after night, fans spinning as KDE and all its applications compiled. At some point I figured out exactly how long the KDE build took, and started using it as a clock: fans running meant it was before four in the morning. Fans silent meant I’d made it past.

The Handbook taught me an enormous amount - more than many of my university courses - including things that had nothing to do with FreeBSD specifically. It taught me the right approach: understand first, act second. The more I read, the more I wanted a printed copy to keep at my desk. So I convinced my parents that I needed a laser printer for “university work”. And the first thing I printed, of course, was the Handbook. That Handbook still contains relevant information today. There have been significant changes over the past twenty-four years, but the foundations are still the same. Many tools still work exactly as they did. Features have been added, but the originals still operate on the same principles. Evolution, not revolution. And when you’re building something meant to last, that is - in my view - exactly the right philosophy. Change is good. Innovation is good. On my own machines I’ve broken and rebuilt things thousands of times. But production environments must be stable and predictable. That, still today, is one of the qualities I value most in every BSD.

Over the years, FreeBSD has served me well. At a cer­tain point it stepped down as my pri­mary desk­top - partly be­cause I switched to Mac, partly be­cause of un­sup­ported hard­ware - but it never stopped be­ing one of my first choices for servers and any se­ri­ous work­load. As I of­ten say: I only have one work­sta­tion, and I use it to ac­cess hun­dreds of servers. It’s far eas­ier to re­place a work­sta­tion - I can re­con­fig­ure every­thing in a cou­ple of hours - than to deal with a pro­duc­tion server gone side­ways, with anx­ious clients wait­ing or op­er­a­tions ground to a halt.

FreeBSD has never chased innovation for its own sake. It has never chased hype at the expense of its core purpose. Its motto is “The Power to Serve” - and to do that effectively, efficiently, securely. That is what FreeBSD has been for me.

I love FreeBSD be­cause it has served me for decades with­out sur­prises. I love FreeBSD be­cause it in­no­vates while mak­ing sure my 2009 servers keep run­ning cor­rectly, re­quir­ing only small ad­just­ments at each ma­jor up­date rather than a com­plete over­haul.

I love FreeBSD be­cause it does­n’t re­name my net­work in­ter­faces af­ter a re­boot or an up­grade.

And be­cause its jails - around since 2000 - are an ef­fec­tive, ef­fi­cient, se­cure, sim­ple, and fully na­tive mech­a­nism: you can man­age every­thing with­out in­stalling a sin­gle ex­ter­nal pack­age. I love FreeBSD be­cause ZFS is na­tive, and with it I get na­tive boot en­vi­ron­ments, which means safe, re­versible up­grades. Or, if you’re run­ning UFS, you change a sin­gle char­ac­ter in fstab and the en­tire filesys­tem be­comes read-only - cleanly, with no kludges. I love FreeBSD be­cause bhyve is an ef­fi­cient, light­weight, re­li­able hy­per­vi­sor. I love it for its per­for­mance, for its fea­tures, for every­thing it has given me.

But I love FreeBSD also - and above all - for its com­mu­nity. Around the BSDs, in gen­eral, you find peo­ple dri­ven by gen­uine pas­sion, cu­rios­ity, and com­pe­tence. Over the past twenty years the tech world has at­tracted many peo­ple who ap­pear to be in­ter­ested in tech­nol­ogy. In re­al­ity, they are of­ten just look­ing for some­thing to mon­e­tize quickly, even at the cost of de­stroy­ing it. In the BSD com­mu­nity, that is far less com­mon. At con­fer­ences I’ve had the chance to meet de­vel­op­ers in per­son - to un­der­stand their spirit, their skill, and yes, their pas­sion. Not just in the vol­un­teers who con­tribute for the joy of it, but in those funded by the Foundation as well. And then there are the en­gi­neers from com­pa­nies that rely heav­ily on FreeBSD - Netflix among them - and they bring the same qual­ity: that en­gage­ment, that en­thu­si­asm, that tells you FreeBSD is­n’t a job for them. It’s a plea­sure. Which is one of the rea­sons why every time I at­tend a BSD con­fer­ence, I come home even more in love with the pro­ject: the vibe of the com­mu­nity, the ded­i­ca­tion of the de­vel­op­ers, the pres­ence of a Foundation that is strong and ef­fec­tive with­out be­ing dom­i­neer­ing or self-im­por­tant - which, com­pared to the foun­da­tions of other ma­jor Open Source pro­jects, makes it gen­uinely re­mark­able. Faces that have been part of this pro­ject for over twenty years, and still light up the mo­ment they find their friends and start talk­ing about what they’ve been work­ing on. That pos­i­tiv­ity is con­ta­gious - and it flows di­rectly into the code, the pro­ject, the vi­sion for what comes next. Because that’s the heart of it. FreeBSD has al­ways been an op­er­at­ing sys­tem writ­ten by hu­mans, for hu­mans: built to serve and to be use­ful, with a con­sis­tency, doc­u­men­ta­tion, prag­ma­tism, and crafts­man­ship that most other pro­jects - par­tic­u­larly main­stream Linux dis­tri­b­u­tions - sim­ply don’t have. 
The Foundation wants to hear from or­di­nary users. It ac­tively pro­motes the kind of en­gage­ment that brings more peo­ple to FreeBSD. Not be­cause big tech com­pa­nies are push­ing to cre­ate de­pen­dency, but be­cause it be­lieves in the pro­ject.

So thank you, FreeBSD, for helping me stay passionate for so many years, for keeping my projects running, for keeping my clients' servers up and my data safe. Thank you, FreeBSD, for never wasting time chasing the trend of the moment, and instead focusing on doing things right. Thank you, FreeBSD, for all the extraordinary people - from across the entire BSD community - you've brought into my life. Friends, not colleagues. Real people. The genuine kind. And when the people running something still believe in it - truly believe in it, after all these years - and the project keeps succeeding, that tells you there is real substance underneath. In the code. In the people. In the community.

FreeBSD doesn't want to be "the best and greatest". It wants to serve.

...

Read the original on it-notes.dragas.net »

10 280 shares, 34 trendiness

My Journey to a reliable and enjoyable locally hosted voice assistant

I have been watching HomeAssistant's progress with Assist for some time. We previously used Google Home via Nest Minis, and have switched to fully local Assist, backed by local first + llama.cpp (previously Ollama). In this post I will share the steps I took to get where I am today, the decisions I made, and why they were the best for my use case specifically.

Here are links to additional improvements posted about in this thread.

I have tested a wide variety of hardware, from a 3050 to a 3090. Most modern discrete GPUs can be used for local Assist effectively; what hardware is required just depends on your expectations of capability and speed.

I am running HomeAssistant on my UnRaid NAS; its specs are not really important, as it has nothing to do with HA Voice.

* 1 Pixel 7a used as a satellite/hub with View Assist

* Beelink MiniPC with USB4 (the exact model isn't important as long as it has USB4)

The table below shows the GPUs that I have tested with this setup. Response time will vary based on the model used.

The table below shows the models I have tested using this setup with various features and their performance.

All models below are good for basic tool calling. Advanced features are listed with the model's quality at reliably reproducing the desired behavior.

(1) Handles commands like "Turn on the fan and off the lights"

(2) Understands when it is in a particular area and does not ask "which light?" when there is only one light in the area, but does correctly ask when there are multiple of the device type in the given area.

(3) Is able to parse misheard commands (ex: "turn on the pan") and reliably execute the intended command

(4) Is able to reliably ignore unwanted input, without wrongly discarding misheard text that was actually an intended command.

llama.cpp is recommended for optimal performance; see my reply below for details.
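As a sketch of what serving a model with llama.cpp can look like (the model path, port, context size, and layer count here are placeholders for your own hardware, not the exact settings used above):

```shell
# Expose a GGUF model over llama.cpp's OpenAI-compatible HTTP API,
# which Home Assistant's LLM integration can then be pointed at.
# -ngl 99 offloads all layers to the GPU; -c sets the context window.
llama-server -m /models/my-model-Q8_0.gguf \
  --host 0.0.0.0 --port 8080 -ngl 99 -c 8192
```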

The following are the Speech to Text options that I have tested:

* LLM Conversation: provides enhancements to the base conversation for a better default experience talking with Assist

My point in posting this is not to suggest that what I have done is the "right way" or even something others should replicate. But I learned a lot throughout this process, and I figured it would be worth sharing so others could get a better idea of what to expect, pitfalls, etc.

Throughout the last year or two we have noticed that Google Assistant through these Nest Minis has gotten progressively dumber/worse while also not bringing any new features. This was generally fine, as the WAF (wife acceptance factor) was still much higher than not having voice, but it became increasingly annoying as we were met with more and more "Sorry, I can't help with that" or "I don't know the answer to that, but according to XYZ source, here is the answer". It generally worked, but not reliably, and it was often a fuss to get answers to arbitrary questions.

Then there is the usual privacy concern of having online microphones throughout your home, and the annoyance that every time AWS or something else went down, you couldn't use voice to control the lights in the house.

I started by playing with one of Ollama's included models. Every few weeks I would connect Ollama to HA, spin up Assist, and try to use it. Every time I was disappointed and surprised by its lack of abilities, and most of the time basic tool calls would not work. I do believe HA has made things better, but I think the biggest issue was my own understanding.

The models you see on Ollama's site are not even close to exhaustive in terms of what can be run. Worse yet, the default :4b models, for example, often use aggressive quantization (Q4_K), which can cause a lot of problems. Once I learned that HuggingFace can be used to find GGUF models at higher-precision quantizations, Assist immediately performed much better, with no problems with tool calling.
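For reference, Ollama can pull GGUF quantizations directly from Hugging Face by tag, which is one way to fetch a higher-precision build (the repository name below is a placeholder, not a specific recommendation):

```shell
# Pull a Q8_0 quantization of a GGUF repo straight from Hugging Face
ollama pull hf.co/<user>/<model>-GGUF:Q8_0
```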

After getting to the point where the fundamental basics were possible, I ordered a Voice Preview Edition to use for testing so I could get a better idea of the end-to-end experience. It took me some time to get things working well. Originally I had WiFi reception issues where the ping was very inconsistent on the VPE (despite it being next to the router), which led to stuttery speech output with a lot of mid-word pauses. After adjusting Piper to use streaming and creating a new dedicated IoT network, the performance has been much better.

Controlling devices is great, and Ollama's ability to adjust devices when the local processing missed a command was helpful. But to replace our speakers, Assist had to be capable of the following things:

* Ability to ask about a specific business to get opening/closing times

* Ability to do general knowledge lookup to answer arbitrary questions

* Ability to play music, with search abilities, entirely by voice

At first I was under the impression these would have to be built out separately, but I eventually found the brilliant llm-intents integration, which provides a number of these services to Assist (and by extension, Ollama). After setting these up, the results were mediocre.

For those that want to see it, here is my prompt.

This is when I learned that the prompt will make or break your voice experience. The default HA prompt won't get you very far, as LLMs need a lot of guidance to know what to do and when.

I generally improved my prompt by taking my current prompt and putting it into ChatGPT along with a description of the LLM's current behavior and its desired behavior. Then back-and-forth attempts until I consistently got the desired result. After a few cycles of this, I started to get a feel for how to make these improvements myself.

I started by trying to get weather working; the first challenge was getting the LLM to even call the weather service. I have found that having dedicated # sections for each important service, along with a bulleted list of details/instructions, works best.

Then I needed to make the weather response formatted in a desirable way, without extra information. At first, the response would include extra commentary such as "sounds like a nice summery day!" or other things that detracted from its conciseness. Once this was solved, giving a specific example of the output worked best to get the exact response format that was desired.

For places and search, the problem was much the same: it did not want to call the tool and instead insisted that it did not know the user's location or the answer to specific questions. This mostly just needed some specific instructions to always call the given tool when certain types of questions were asked, and that has worked well.
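To illustrate the dedicated-sections approach, here is the shape such a prompt fragment can take (the wording and tool names are hypothetical, not the actual prompt linked earlier):

```
# Weather
- When asked about current or upcoming weather, ALWAYS call the weather tool.
- Respond concisely, with no extra commentary. Example:
  "Currently 18 degrees and cloudy. High of 22, low of 9. Rain expected tonight."

# Places and search
- When asked about a business, its hours, or general knowledge, ALWAYS call the search tool.
- Never say you do not know the user's location; call the tool instead.
```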

The final problem I had to solve was emojis: most responses would end with a smiley face or something similar, which does not read well through TTS. This took a lot of sections in the prompt, but overall it has been completely removed without adverse effects.
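A programmatic fallback - not what was done above, where the prompt alone sufficed - would be to strip emoji from each response before handing it to TTS. A minimal Python sketch, assuming a simple range-based heuristic:

```python
import re

# Heuristic ranges covering common emoji blocks; not exhaustive.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF"  # symbols, pictographs, emoticons
    "\u2600-\u27BF"           # misc symbols and dingbats
    "\uFE0F]"                 # variation selector-16
)

def strip_emojis(text: str) -> str:
    """Remove emoji before a response is handed to TTS."""
    return EMOJI_RE.sub("", text).strip()

print(strip_emojis("The lights are on! 😊"))  # -> The lights are on!
```

The prompt-only approach is still preferable when it works, since it avoids an extra post-processing hop in the pipeline.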

NOTE: I am not sure if a recent Home Assistant or Music Assistant update improved things, but the LLM is now able to naturally search and play music without the automation. I am leaving this section in as an example, as I still believe automations can be a good way to solve some problems when there is no easy way to give the LLM access to a certain feature.

The most desirable outcome would certainly be for every function to be executed perfectly by the LLM without intervention, but at least in my case, with the model I am using, that is not true. There are cases, though, where that really is not a bad thing.

In my case, music was one of those cases. I believe this is an area where improvements are currently being made, but for me the automatic case was not working well. I started by getting Music Assistant set up. I found various LLM blueprints to create a script that allows the LLM to start playing music automatically, but they did not work well for me.

That is when I realized the power of the sentence automation trigger and the beauty of Music Assistant. I created an automation that triggers on "Play {music}". The automation has a map of assist_satellite to media_player, so it will play music on the correct media player based on which satellite makes the request. It then passes {music} (which can be a song, album, artist, whatever) to Music Assistant's play service, which performs the searching and starts playing.
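A minimal sketch of that automation, assuming hypothetical entity and device IDs (the trigger sentence and the satellite-to-player map follow the description above):

```yaml
automation:
  - alias: "Voice: play music via Music Assistant"
    trigger:
      - platform: conversation
        command: "Play {music}"   # {music} captures a song, album, artist, whatever
    action:
      - variables:
          # Map each Assist satellite (by device id) to the media player
          # in the same room. These ids are placeholders for your own devices.
          player_map:
            abc123deviceid: media_player.living_room
            def456deviceid: media_player.office
      - service: music_assistant.play_media
        target:
          entity_id: "{{ player_map.get(trigger.device_id, 'media_player.living_room') }}"
        data:
          media_id: "{{ trigger.slots.music }}"
```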

The next problem to solve was the wake word. For WAF, the default included options weren't going to work; after some back and forth we decided on "Hey Robot". I used this repo to train a custom microWakeWord, which is usable on the VPE and Satellite1. Training only took ~30 minutes to run on my GPU and the results have been quite good. There are some false positives, but overall the rate is similar to the Google Homes that have been replaced, and with the ability to automate muting we can likely work around that problem until the training/options become better.

I definitely would not recommend this for the average Home Assistant user; IMO a lot of patience and research is needed to understand particular problems and work towards a solution, and I imagine we will run into more problems as we continue to use these. I am certainly not done, but that is the beauty of this solution - most aspects of it can be tuned.

The goal has been met, though: overall we have a more enjoyable voice assistant that runs locally without privacy concerns, and our core tasks are handled reliably.

Let me know what you think! I am happy to answer any questions.

...

Read the original on community.home-assistant.io »
