10 interesting stories served every morning and every evening.

S&P 500 rejects SpaceX, also blocking entry for OpenAI and Anthropic

arstechnica.com

Such rule changes would have ac­com­mo­dated SpaceX’s plan to only of­fer ap­prox­i­mately 3 per­cent of its IPO shares to pub­lic in­vestors, and the fact that SpaceX is cur­rently un­prof­itable with a grow­ing debt load that has reached $29 bil­lion be­cause of its spend­ing spree on AI in­fra­struc­ture.

But in its final decision, S&P Dow Jones Indices stated that “no changes will be made to the eligibility criteria including financial viability screens, seasoning period, or minimum IWF.” Even after the standard yearlong wait, SpaceX, Anthropic, and OpenAI may struggle to deliver the consistent profitability necessary to qualify for the S&P 500.

Money rules and ex­cep­tions

Swift en­try into the S&P 500 would have trig­gered $14 bil­lion of pas­sive fund buy­ing for SpaceX, ac­cord­ing to Bloomberg Intelligence. The in­vest­ment re­search arm of Bloomberg also es­ti­mated that OpenAI could have gained more than $8 bil­lion, and Anthropic could have net­ted $4.6 bil­lion from sim­i­lar pas­sive buy­ing sprees trig­gered by their S&P 500 en­tries.

This is be­cause $7.5 tril­lion in pas­sively man­aged funds—pop­u­lar among both in­di­vid­ual in­vestors and in­sti­tu­tional in­vestors—fol­low the S&P 500 by pur­chas­ing shares of com­pa­nies ac­cord­ing to their pro­por­tional rep­re­sen­ta­tion in the S&P 500 in­dex. For ex­am­ple, the Vanguard and Fidelity bro­ker­age gi­ants both of­fer pas­sive in­vest­ment funds that track the S&P 500 com­po­si­tion.

However, S&P Dow Jones Indices did carve out “one concession” by changing the investable weight factor rules for “lower-profile benchmarks” such as the S&P Total Market Index and Dow Jones US Total Stock Market Index, according to Quartz. That could allow an IPO faster entry into those indexes.

By con­trast, the Nasdaq stock ex­change changed its rules to al­low SpaceX to en­ter the Nasdaq-100 Index within 15 trad­ing days as op­posed to the usual three months. Similarly, the FTSE Russell in­dex provider de­cided to give SpaceX and other fol­low-on com­pa­nies ac­cel­er­ated en­try to the Russell Top 500 Index af­ter the close of the fifth trad­ing day fol­low­ing an IPO.

The denial of accelerated S&P 500 entry for SpaceX comes just days after Morningstar analysts described SpaceX as having been “significantly overvalued” in the lead-up to its IPO. The investment research firm valued SpaceX at $780 billion (less than half of SpaceX’s $1.75 trillion IPO goal), primarily based on the strengths of SpaceX’s Starlink satellite service and rocket launch business.

This story was up­dated on June 6, 2026 to more clearly de­scribe the pro­posed rule changes that would have ap­plied to all MegaCap com­pa­nies.

How LLMs Actually Work

www.0xkato.xyz

Monday, June 01, 2026 (26 min read)

This post is a walk­through of how LLMs work. Modern LLMs are mostly built by stack­ing trans­former blocks over and over, so un­der­stand­ing the trans­former ma­chin­ery gets you most of the way there.

I’ll cover the core mech­a­nisms in­side mod­ern trans­former-based LLMs, with­out all that sticky math stuff. Don’t get me wrong, you should learn the math, but this can serve as an in­tro­duc­tion.

Most mod­ern LLMs share the same trans­former-fam­ily skele­ton. The dif­fer­ences come from what each one was trained on, the scale and con­fig­u­ra­tion choices, and the post-train­ing done on top. By the end, you should be able to read many mod­ern LLM pa­pers or model cards and know which piece of the ar­chi­tec­ture each sec­tion is talk­ing about.

Here’s the path:

Tokens, how a string of text be­comes a se­quence of in­te­gers

Embeddings, how those in­te­gers get mean­ing

Positional en­cod­ing, how the model knows what or­der the to­kens came in

Attention, how to­kens share in­for­ma­tion with each other

Multi-head at­ten­tion, how the model tracks many kinds of re­la­tion­ships at once

The feed-for­ward net­work, where a large share of the mod­el’s stored struc­ture lives

The resid­ual stream and layer nor­mal­iza­tion, what makes deep stacks train­able

Predicting the next to­ken, what the model ac­tu­ally out­puts and how the gen­er­a­tion loop works

Architecture vs trained weights, what’s broadly shared across mod­ern LLMs, and what’s dif­fer­ent

Tiny ex­plain­ers ap­pear through­out so any­one can fol­low along, re­gard­less of back­ground.

Tokenization

Models don’t read text directly. They read integer IDs, so something has to convert your prompt into a sequence of those integers.

That con­ver­sion step is called to­k­eniza­tion. A to­k­enizer takes a string and pro­duces a se­quence of in­te­gers, where each in­te­ger points to an en­try in a fixed vo­cab­u­lary. Modern LLM vo­cab­u­lar­ies usu­ally con­tain tens of thou­sands to a few hun­dred thou­sand en­tries.

Tiny ex­plainer: to­ken ID A to­ken ID is the in­te­ger the model uses for one vo­cab­u­lary en­try. The model works with the num­ber, not the writ­ten word it­self.

Tokens aren’t usually whole words. They’re usually subword pieces. The word “tokenization” might split into [“token”, “ization”]. The word “running” might split into [“run”, “ning”]. The reason is efficiency. Whole-word vocabularies are too big and don’t generalize to new words. Character-level vocabularies are too small and force the model to learn even the simplest patterns from scratch. Subword tokenization sits in the middle. The most common pieces become single tokens, and rare or novel words get composed from smaller pieces.
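To make the splitting concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary, the IDs, and the function names are all made up for illustration, and real tokenizers (BPE, SentencePiece) learn their pieces from data and use different matching rules.

```python
# Hypothetical toy vocabulary: a few subword pieces plus single letters as a fallback.
PIECES = ["token", "ization", "run", "ning"] + list("abcdefghijklmnopqrstuvwxyz")
TOKEN_TO_ID = {piece: idx for idx, piece in enumerate(PIECES)}


def tokenize(text):
    """Greedy longest-match split of `text` into known vocabulary pieces."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocabulary entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in TOKEN_TO_ID:
                pieces.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return pieces


def encode(text):
    """Text in, integers out: map each piece to its (made-up) vocabulary ID."""
    return [TOKEN_TO_ID[piece] for piece in tokenize(text)]


print(tokenize("tokenization"), encode("tokenization"))
print(tokenize("running"), encode("running"))
```

A word with no multi-letter piece in this toy vocabulary, such as "strawberry", falls back to one token per letter, which is the "composed from smaller pieces" case.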

Tiny ex­plainer: vo­cab­u­lary The vo­cab­u­lary is the to­k­eniz­er’s fixed list of pieces. Each piece has an ID, and the model can only di­rectly re­ceive IDs from that list.

The trade-off shows up in places people don’t expect. The classic example: ask an LLM how many R’s are in “strawberry.” LLMs used to get it wrong. That’s not the model failing at counting. It’s the model not operating on letters directly, only on token IDs that stand for chunks of a word a human would split letter by letter.

Different model fam­i­lies use dif­fer­ent to­k­eniz­ers. GPT mod­els use Byte Pair Encoding vari­ants. SentencePiece is com­mon in LLaMA-style mod­els. The choice mat­ters for com­pute (fewer to­kens means less work) and for things like mul­ti­lin­gual cov­er­age, but the ba­sic shape is the same. Text in, in­te­gers out.

Now that the prompt is a se­quence of in­te­gers, the next step is to give those in­te­gers mean­ing.

Embeddings

A to­ken ID like 1024 is just a row in­dex. It does­n’t mean any­thing by it­self. The thing that gives it mean­ing is a gi­ant table called the em­bed­ding ma­trix.

Every model has one. It has one row per en­try in the vo­cab­u­lary, and each row is a long vec­tor of num­bers. The length of each row is the mod­el’s hid­den size. In many 7B-class mod­els, that means 4,096 num­bers per to­ken. Larger mod­els usu­ally use wider vec­tors.

Tiny ex­plainer: vec­tor A vec­tor is a list of num­bers. In a trans­former, each to­ken be­comes a vec­tor so the model can do math with it.

When the tokenizer hands the model an integer, the model looks up that row and uses the vector instead. That vector is the token’s embedding. It’s the model’s representation of what that token “means,” learned during training.
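The lookup itself is just row indexing. Here is a small sketch, assuming a tiny made-up vocabulary and hidden size, with random numbers standing in for the weights a real model would learn:

```python
import random

VOCAB_SIZE = 2048  # hypothetical; real vocabularies are far larger
HIDDEN_SIZE = 8    # hypothetical; many 7B-class models use 4,096

rng = random.Random(0)
# One row per vocabulary entry. Random values stand in for learned weights.
embedding_matrix = [
    [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
    for _ in range(VOCAB_SIZE)
]


def embed(token_ids):
    """Replace each token ID with its row of the embedding matrix."""
    return [embedding_matrix[token_id] for token_id in token_ids]


vectors = embed([1024, 7, 1024])
print(len(vectors), len(vectors[0]))
```

The same ID always returns the same row, no matter where it appears in the sequence.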

Tiny ex­plainer: em­bed­ding ma­trix The em­bed­ding ma­trix is a lookup table. Token ID in, learned vec­tor out.

The interesting property of these embeddings is that semantically similar tokens end up with similar vectors. The vector for “king” is close in space to the vector for “queen,” and the vector for “Paris” is close to “France.” None of this is hard-coded. It emerges from training on enough text, and the model learns these positions because they let it predict text well.

You can do arith­metic on em­bed­dings and it some­times works. The fa­mous ex­am­ple is king − man + woman ≈ queen. The geom­e­try of em­bed­ding space car­ries real se­man­tic struc­ture, even though no­body told the model to build it that way.
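A toy version of that arithmetic, using hand-picked two-dimensional vectors rather than anything learned. The numbers are chosen so the analogy works exactly; real embeddings are much higher-dimensional and only approximately this tidy.

```python
import math

# Hand-picked 2-D vectors (axis 0: "royalty", axis 1: "gender"). Purely illustrative.
EMBEDDINGS = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(vector, exclude=()):
    """Return the word whose embedding points most nearly the same way."""
    candidates = (word for word in EMBEDDINGS if word not in exclude)
    return max(candidates, key=lambda word: cosine(vector, EMBEDDINGS[word]))


# king - man + woman, computed component by component
target = [
    k - m + w
    for k, m, w in zip(EMBEDDINGS["king"], EMBEDDINGS["man"], EMBEDDINGS["woman"])
]
print(nearest(target, exclude=("king", "man", "woman")))
```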

Worth being clear on: at this stage every token has been replaced by its embedding, but the embedding alone says nothing about where the token sits in the sequence. The vector for “dog” is the same vector whether “dog” is the first word in your prompt or the fifth. That’s a problem.

That’s the gap po­si­tional en­cod­ing fills.

Positional en­cod­ing

Plain self-attention doesn’t have a built-in representation of word order. Without some positional signal, it has no direct way to know that “dog” came before “bites” instead of after it.

Word or­der changes mean­ing. So the model needs an­other piece. It needs a way to in­ject the po­si­tion of each to­ken into the math.

Tiny ex­plainer: po­si­tional en­cod­ing Positional en­cod­ing is how the model gets or­der in­for­ma­tion. It tells the model where each to­ken sits in the se­quence.

The original transformer paper (Vaswani et al. 2017) did this by giving each position its own pattern of numbers and adding it directly to each token’s embedding before any other processing. Position 1 had one pattern, position 5 had a different pattern, position 100 had another. The patterns came from sine and cosine waves at different frequencies. Now the embedding for “dog” at position 1 was different from the embedding for “dog” at position 5, just because the position pattern added to it was different.
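For readers who want to see the pattern, here is a sketch of the sinusoidal scheme from that paper: sine on even indices, cosine on odd ones, with wavelengths that grow along the vector. The function names and the tiny 4-number embedding are illustrative choices, not anything from a particular codebase.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Position pattern in the style of Vaswani et al. 2017 (d_model assumed even)."""
    encoding = []
    for i in range(0, d_model, 2):
        # Each pair of slots uses a slower wave than the pair before it.
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


def add_position(embedding, position):
    """Add the position pattern to a token embedding, element by element."""
    pattern = sinusoidal_encoding(position, len(embedding))
    return [e + p for e, p in zip(embedding, pattern)]


dog = [0.2, -0.1, 0.7, 0.4]  # made-up embedding for "dog"
print(add_position(dog, 1))
print(add_position(dog, 5))
```

The same "dog" vector comes out different at positions 1 and 5, which is all the model needs to tell them apart.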

That worked, and sinusoidal encodings were chosen partly in the hope that they would extrapolate beyond the sequence lengths seen during training. But additive position schemes still had two problems that became important as models scaled up.

First, the em­bed­ding had to carry both mean­ing and po­si­tion in the same set of num­bers. There’s only so much you can pack in.

Second, learned ab­solute po­si­tion em­bed­dings in par­tic­u­lar don’t gen­er­al­ize cleanly. If you trained on prompts up to 2,048 to­kens long, the model never saw po­si­tion 5,000 dur­ing train­ing, and the em­bed­ding for that po­si­tion was not learned in the same way.

Modern mod­els mostly use a dif­fer­ent scheme called Rotary Position Embeddings (RoPE), in­tro­duced by Su et al. in 2021 and now used in LLaMA, Mistral, Gemma, Qwen, and most other open-weight fam­i­lies. The in­tu­ition: in­stead of adding po­si­tion info to each to­ken’s vec­tor, RoPE ro­tates the vec­tor by an an­gle that de­pends on its po­si­tion. A to­ken at po­si­tion 1 gets a small turn, a to­ken at po­si­tion 100 gets a big­ger turn. When two to­kens are later com­pared dur­ing at­ten­tion, what mat­ters is the dif­fer­ence be­tween their ro­ta­tions, which en­codes how far apart they are.
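A small sketch of the rotation idea, assuming the common formulation in which consecutive pairs of numbers are rotated and each pair uses a slower frequency. The vectors and the function names are made up. The point to check is that the query-key score depends only on how far apart the two positions are.

```python
import math


def rope(vector, position, base=10000.0):
    """Rotate consecutive pairs of `vector` by angles that grow with position."""
    d = len(vector)
    rotated = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # later pairs rotate more slowly
        x, y = vector[i], vector[i + 1]
        rotated.append(x * math.cos(theta) - y * math.sin(theta))
        rotated.append(x * math.sin(theta) + y * math.cos(theta))
    return rotated


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


query = [0.3, -1.2, 0.8, 0.5]  # made-up query vector
key = [1.0, 0.4, -0.7, 0.2]    # made-up key vector

# Same offset (2 positions apart), very different absolute positions.
score_early = dot(rope(query, 3), rope(key, 1))
score_late = dot(rope(query, 103), rope(key, 101))
# Different offset (1 position apart).
score_closer = dot(rope(query, 3), rope(key, 2))
print(score_early, score_late, score_closer)
```

The first two scores agree up to floating-point noise, while the third differs, because only the relative distance survives the dot product.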

Tiny ex­plainer: RoPE RoPE stands for Rotary Position Embeddings. Instead of adding a po­si­tion vec­tor, it ro­tates to­ken vec­tors so rel­a­tive dis­tance shows up dur­ing at­ten­tion.

The prac­ti­cal ad­van­tages are real. RoPE en­codes rel­a­tive po­si­tion nat­u­rally (which is closer to what at­ten­tion ac­tu­ally wants). It gen­er­al­izes bet­ter to longer con­texts. And it does­n’t add new pa­ra­me­ters to the model.

Even with good positional encoding, modern LLMs have a documented “lost in the middle” problem (Liu et al. 2023). They use information at the start and end of long prompts more reliably than information buried in the middle. That’s why prompt engineering tips like “put important context first” or “repeat key info at the end” actually help. The model isn’t using every part of your prompt equally well.

With to­ken mean­ing and po­si­tion both en­coded, the next ques­tion is how do to­kens ac­tu­ally ex­change in­for­ma­tion?

Attention

This is the mech­a­nism that gave the ar­chi­tec­ture its name. Attention.

Inside every trans­former layer, at­ten­tion does one thing. It lets each to­ken look at the other to­kens it is al­lowed to see and de­cide which ones mat­ter for what comes next.

It does this by giv­ing each to­ken three roles at once. Each to­ken gets trans­formed into three new vec­tors, called Query, Key, and Value (Q, K, V).

Tiny explainer: Q, K, V Query means “what am I looking for,” Key means “what do I match with,” and Value is the information that gets copied when the match is strong.

The Query asks, “what am I looking for from other tokens?”

The Key says, “this is what I offer to tokens looking at me.”

The Value carries, “this is what gets passed along when a match happens.”

The same to­ken plays all three roles at the same time. The Q, K, V trans­for­ma­tions are learned ma­tri­ces, so the model fig­ures out dur­ing train­ing what each to­ken should look for and what it should of­fer.

Matching hap­pens through a sim­i­lar­ity score. Each to­ken’s Query is com­pared against the Key of each to­ken it is al­lowed to see, us­ing a scaled dot prod­uct. Intuitively, this mea­sures how much the two vec­tors line up. The scal­ing keeps the num­bers sta­ble be­fore soft­max.

Tiny ex­plainer: dot prod­uct A dot prod­uct is a sim­ple way to score how aligned two vec­tors are. Higher align­ment means a stronger match.

The match scores then get turned into weights us­ing soft­max. Softmax takes any set of num­bers and turns them into a prob­a­bil­ity-like dis­tri­b­u­tion that sums to 1. Tokens with higher match scores get higher weights, and the weights are then used to take a weighted av­er­age of the value vec­tors.
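Putting the last few paragraphs together, here is a sketch of a single attention step for one query: scaled dot products, softmax, then a weighted average of the values. The vectors are made up, and the learned Q, K, V projections are skipped so the arithmetic stays visible.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over the visible tokens."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    width = len(values[0])
    output = [
        sum(weight * value[i] for weight, value in zip(weights, values))
        for i in range(width)
    ]
    return weights, output


# Made-up vectors: the first key lines up with the query, the others do not.
keys = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, output = attend([4.0, 0.0], keys, values)
print(weights, output)
```

Because the first key matches the query best, its value dominates the output.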

Tiny ex­plainer: soft­max Softmax turns raw scores into weights that add up to 1. Big scores get big weights, small scores get small weights.

An example. Consider the sentence “The cat that I saw yesterday was sleeping.” When the model processes “was,” it needs to figure out what’s doing the sleeping. The Query vector for “was” gets compared against the Key vectors of the tokens it is allowed to see. The dot product with “cat” is high, because the model has learned that verbs like “was” need a subject and that subjects like “cat” produce Key vectors that line up well. The dot product with “yesterday” is low. Softmax turns those scores into weights: “cat” gets a high weight, “yesterday” gets a low one. The model then takes a weighted sum of the corresponding value vectors, so the value for “cat” dominates the result. The new representation of “was” is now mostly shaped by the value of “cat.” That’s how a token several positions back becomes the referent.

There’s a con­straint spe­cific to GPT-style lan­guage mod­els, which is that they gen­er­ate text left to right. A to­ken at po­si­tion 5 is only al­lowed to at­tend to po­si­tions 1 through 5. It can­not at­tend to to­kens at po­si­tions 6, 7, 8, be­cause those haven’t been gen­er­ated yet. This is called causal mask­ing. The im­ple­men­ta­tion is sim­ple: fu­ture to­kens get match scores so low they end up with ef­fec­tively zero weight af­ter soft­max.
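A sketch of that masking step, using negative infinity as the "so low" score (a common choice, though implementations vary). The score matrix is made up, with one row per query token and one column per key token.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]  # exp(-inf) is exactly 0.0
    total = sum(exps)
    return [value / total for value in exps]


def causal_weights(score_matrix):
    """Mask every future position, then apply softmax row by row."""
    weights = []
    for row_index, row in enumerate(score_matrix):
        masked_row = [
            score if col_index <= row_index else -math.inf
            for col_index, score in enumerate(row)
        ]
        weights.append(softmax(masked_row))
    return weights


# Made-up raw match scores for a three-token prompt.
scores = [
    [0.5, 2.0, 1.0],
    [1.5, 0.2, 3.0],
    [0.1, 0.4, 0.7],
]
for row in causal_weights(scores):
    print(row)
```

The first token can only attend to itself, so its row is all weight on position one, even though its raw score for the second token was higher.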

Tiny ex­plainer: causal mask­ing Causal mask­ing hides fu­ture to­kens. It keeps a de­coder-only lan­guage model from look­ing ahead while pre­dict­ing the next to­ken.

One of the most interesting findings in interpretability research is about specialized attention heads called induction heads, found by Anthropic in 2022. These heads learn to spot patterns of the form “A B … A” in the prompt and predict that B comes next. When the model sees “A” the second time, the induction head looks back to where “A” appeared before, sees what came after, and copies that. They’re one of the clearest known mechanisms behind in-context learning, the ability of an LLM to pick up a pattern from your prompt and continue it.
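The behavior (not the mechanism) can be mimicked with a plain lookup. This sketch is ordinary Python, not attention. It only shows what "look back, see what came after, copy it" means, and the function name and example tokens are invented.

```python
def induction_guess(tokens):
    """Guess the next token by copying what followed the last token's previous occurrence."""
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the final token itself.
    for index in range(len(tokens) - 2, -1, -1):
        if tokens[index] == current:
            return tokens[index + 1]
    return None  # the current token has not appeared before


prompt = ["Mr", "Dursley", "went", "home", ".", "Mr"]
print(induction_guess(prompt))
```

A real induction head arrives at the same kind of guess through learned attention patterns rather than an explicit search.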

Tiny ex­plainer: in­duc­tion head An in­duc­tion head is an at­ten­tion head that no­tices re­peated pat­terns in the prompt and helps con­tinue them.

Attention has one big cost. In full at­ten­tion, each to­ken com­pares against all the to­kens it is al­lowed to see, so dou­bling the prompt length roughly quadru­ples the work. This is why long prompts are ex­pen­sive to run, and why a lot of re­cent re­search is about mak­ing at­ten­tion more ef­fi­cient (FlashAttention, sparse at­ten­tion, lin­ear at­ten­tion).
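The "roughly quadruples" claim is easy to check by counting query-key comparisons under causal masking, where token i compares against i tokens (itself included). This counts comparisons only, as a rough proxy for work. It ignores everything else a layer does.

```python
def causal_comparisons(num_tokens):
    """Query-key comparisons when token i may see tokens 1..i: 1 + 2 + ... + n."""
    return num_tokens * (num_tokens + 1) // 2


short_prompt = causal_comparisons(1024)
long_prompt = causal_comparisons(2048)
print(short_prompt, long_prompt, long_prompt / short_prompt)
```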

But one at­ten­tion head only gives the model one learned view of those re­la­tion­ships.

Multi-head at­ten­tion

A sin­gle at­ten­tion pass gives the model one way of de­cid­ing which to­kens mat­ter to which other to­kens. That’s not enough. Language has many re­la­tion­ships hap­pen­ing at the same time. Subject and verb agree­ment. Pronouns and the names they re­fer to. Long-range ref­er­ences be­tween sen­tences. Word or­der and lo­cal phrases.

Multi-head at­ten­tion solves this by run­ning at­ten­tion many times in par­al­lel, with each par­al­lel pass op­er­at­ing in its own smaller space. Each par­al­lel pass is called a head.

Tiny ex­plainer: at­ten­tion head An at­ten­tion head is one in­de­pen­dent at­ten­tion pass with its own learned pro­jec­tions.

Here’s the part that’s often described wrong, including in plenty of tutorials: each head doesn’t get a literal slice of the original token vector. Each head has its own learned projection matrices that map the full token vector down to its own smaller Q, K, and V vectors. So if a model has 4,096 numbers per token and 32 heads, each head usually works in a 128-dimensional space, but those 128 numbers are a learned projection of the full 4,096, not a fixed slice. “Different views” of the same token, not different chunks of it.

Each head runs its at­ten­tion pass in­de­pen­dently. Then the out­puts of all the heads get con­cate­nated and passed through a fi­nal lin­ear layer that mixes them back into one full-size vec­tor. The model learns that fi­nal mix­ing too.
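Here is a compact sketch of that flow with toy sizes (8 numbers per token, 2 heads of 4 dimensions each). Random matrices stand in for learned weights, and all names are invented. Note that every head projects the full 8-number vector, which is the point made above.

```python
import math
import random

HIDDEN, NUM_HEADS = 8, 2
HEAD_DIM = HIDDEN // NUM_HEADS  # same arithmetic as 4,096 / 32 = 128
rng = random.Random(0)


def random_matrix(rows, cols):
    """Random stand-in for a learned weight matrix."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def project(vector, matrix):
    """Multiply a row vector by a matrix that has len(vector) rows."""
    return [
        sum(v * matrix[r][c] for r, v in enumerate(vector))
        for c in range(len(matrix[0]))
    ]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Each head owns full-width (HIDDEN x HEAD_DIM) projections for Q, K, and V.
heads = [
    {name: random_matrix(HIDDEN, HEAD_DIM) for name in ("q", "k", "v")}
    for _ in range(NUM_HEADS)
]
output_projection = random_matrix(NUM_HEADS * HEAD_DIM, HIDDEN)


def multi_head_attention(tokens):
    outputs = []
    for position, token in enumerate(tokens):
        visible = tokens[: position + 1]  # causal: no looking ahead
        concatenated = []
        for head in heads:
            query = project(token, head["q"])
            keys = [project(t, head["k"]) for t in visible]
            values = [project(t, head["v"]) for t in visible]
            scores = [
                sum(q * k for q, k in zip(query, key)) / math.sqrt(HEAD_DIM)
                for key in keys
            ]
            weights = softmax(scores)
            concatenated += [
                sum(w * v[i] for w, v in zip(weights, values))
                for i in range(HEAD_DIM)
            ]
        # The final linear layer mixes the heads back into one full-size vector.
        outputs.append(project(concatenated, output_projection))
    return outputs


tokens = [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)] for _ in range(3)]
mixed = multi_head_attention(tokens)
print(len(mixed), len(mixed[0]))
```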

What makes this in­ter­est­ing is that dif­fer­ent heads of­ten end up par­tially spe­cial­ized. The model is never told what each head should do. Specialization emerges nat­u­rally dur­ing train­ing. Researchers have found heads that track gram­mar (linking verbs to their ob­jects, ar­ti­cles to their nouns), heads that fig­ure out which pro­noun refers to which name, heads that track po­si­tional pat­terns, in­duc­tion heads, and many more. A sin­gle trans­former layer might have 32 heads. A mod­ern fron­tier model has dozens of lay­ers. So a typ­i­cal LLM has thou­sands of at­ten­tion heads in to­tal, each adding its own learned view.

There’s a prac­ti­cal cost con­cern that drove a re­cent ar­chi­tec­tural change. Each head needs to keep its Key and Value vec­tors in mem­ory for all the to­kens al­ready gen­er­ated, so that when a new to­ken gets gen­er­ated the model does­n’t have to re­com­pute every­thing from scratch. This is called the KV cache, and it’s the main mem­ory cost of run­ning an LLM at long con­text lengths.

Tiny ex­plainer: KV cache The KV cache stores old Key and Value vec­tors dur­ing gen­er­a­tion. It saves the model from re­com­put­ing the whole prompt every time it adds a to­ken.

Modern de­coder-only LLMs mostly use a vari­ant called Grouped-Query Attention (GQA). Instead of every head hav­ing its own keys and val­ues, groups of heads share the same key and value heads. LLaMA-2 70B has 64 query heads but only 8 key/​value heads. Mistral 7B has 32 query heads and 8 key/​value heads. The re­sult is nearly the same ac­cu­racy as full multi-head at­ten­tion but with much less mem­ory pres­sure and in­fer­ence cost.
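A back-of-the-envelope sketch of what that saves. The 64 versus 8 head counts come from the text above. The 80 layers, 128-dimensional heads, 4,096-token context, and 16-bit values are assumptions added here (they match the published LLaMA-2 70B configuration as far as I know), and the figure is for a single sequence.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Approximate KV-cache size for one sequence."""
    # The leading 2 counts one Key vector plus one Value vector
    # per KV head, per layer, per token.
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_value


# Assumed configuration (not stated in the article): 80 layers, head_dim 128.
LAYERS, HEAD_DIM, CONTEXT = 80, 128, 4096

full_mha = kv_cache_bytes(LAYERS, 64, HEAD_DIM, CONTEXT)  # every query head keeps its own K/V
gqa = kv_cache_bytes(LAYERS, 8, HEAD_DIM, CONTEXT)        # 8 shared key/value heads

print(f"full multi-head: {full_mha / 2**30:.2f} GiB")
print(f"grouped-query:   {gqa / 2**30:.2f} GiB")
```

Under these assumptions the cache shrinks by the ratio of head counts, 64 / 8 = 8x.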

Tiny ex­plainer: GQA Grouped-Query Attention lets mul­ti­ple query heads share fewer key/​value heads. That cuts KV-cache mem­ory while keep­ing many query views.

Feed-forward net­work

After at­ten­tion fin­ishes mix­ing in­for­ma­tion be­tween to­kens, every layer has a sec­ond step that no­body talks about as much. The feed-for­ward net­work.

GrapheneOS user reported to authorities for using GrapheneOS

discuss.grapheneos.org

GrapheneOS Discussion Forum

Google will pay SpaceX $920M per month for compute

techcrunch.com

SpaceX has lined up an­other com­pute deal ahead of its his­toric IPO, this time with Google. The com­pany an­nounced the deal in a reg­u­la­tory fil­ing on Friday.

Under the terms of the deal, Google will pay SpaceX $920 million per month from October 2026 through June 2029 for access to “approximately 110,000 NVIDIA GPUs, CPUs, memory, and other related components.”

The deal is sim­i­lar in length and scope to the one SpaceX an­nounced with Anthropic in late May. As part of that deal, Anthropic agreed to pay SpaceX $1.25 bil­lion per month through 2029 to rent all the avail­able com­pute from its Colossus 1 data cen­ter near Memphis, Tennessee, that xAI — now part of SpaceX — orig­i­nally built for its own ar­ti­fi­cial in­tel­li­gence ef­forts.

Google’s deal appears to cover roughly half the amount of compute that Anthropic has access to at Colossus 1. SpaceX didn’t say which specific data center Google would be using. CEO Elon Musk has previously suggested his company would reserve the Colossus 2 data center for xAI.

Anthropic was sig­nif­i­cantly lim­ited in its com­pute ca­pac­ity prior to its deal with SpaceX, rais­ing us­age lim­its on the same day the deal was an­nounced. Google is in a very dif­fer­ent po­si­tion, with some es­ti­mates nam­ing it as the world’s largest sin­gle owner of AI com­pute.

In a statement, a Google representative described the deal as a result of unexpected demand for its recently launched AI products. “Google Cloud and SpaceX are long-time partners,” Google said. “This is a short-term, timely agreement to ensure we have bridge capacity to meet surging customer demand for our agent platform, Gemini Enterprise, which has been even higher than we expected.”

But its parent company Alphabet is on a spending spree. Alphabet has already committed to more than $180 billion in capital expenditures this year and has said it expects that to “significantly increase” in 2027. To help with that, Alphabet recently announced an $80 billion equity sale.

Also like the Anthropic deal, the agreement with Google includes a cancellation clause. Both SpaceX and Google have the option to terminate the agreement with 90 days’ notice after December 31, 2026. Google’s access to the data center will ramp up through September “at a reduced fee,” according to the filing.

“If we fail to deliver access to the committed amount of GPUs by September 30, 2026, then following a one-month grace period, Google may immediately terminate the agreement or accept the number of GPUs provided” with a reduction in the monthly fees, it reads.

SpaceX announced the deal just one week before the company’s stock is expected to start trading on the Nasdaq exchange. Paperwork filed with the Securities and Exchange Commission shows the company is aiming to raise around $75 billion at a valuation of around $1.75 trillion, making it the largest IPO in history.

Google is a long­time in­vestor in SpaceX. Its stake in Musk’s com­pany is ex­pected to be worth more than $100 bil­lion af­ter the IPO. The com­pa­nies are also re­port­edly in talks to try to build or­bital data cen­ters — a ma­jor com­po­nent of SpaceX’s fu­ture plans post-IPO.

Sean O’Kane is a re­porter who has spent a decade cov­er­ing the rapidly-evolv­ing busi­ness and tech­nol­ogy of the trans­porta­tion in­dus­try, in­clud­ing Tesla and the many star­tups chas­ing Elon Musk. Most re­cently, he was a re­porter at Bloomberg News where he helped break sto­ries about some of the most no­to­ri­ous EV SPAC flops. He pre­vi­ously worked at The Verge, where he also cov­ered con­sumer tech­nol­ogy, hosted many short- and long-form videos, per­formed prod­uct and ed­i­to­r­ial pho­tog­ra­phy, and once nearly passed out in a Red Bull Air Race plane.

You can con­tact or ver­ify out­reach from Sean by email­ing sean.okane@techcrunch.com or via en­crypted mes­sage at okane.01 on Signal.

Sigma 45mm f/2.8 Lens Repair & Analysis | Salvaged Circuitry

salvagedcircuitry.com

Sigma 45mm f/​2.8 Lens Repair & Analysis

[05.12.24]

I have a camera gear collection problem and as part of my personal 12-step plan, I restrict myself from purchasing functioning lenses. This sounds illogical, and it frankly is, but it’s very hard for me to resist heavily discounted lenses. To keep my hat, I tend to only bid on lenses that are less than 1/4 of the going used sale price and have little to no mechanical damage. In this case I’ve been eyeing the recently produced Sigma I-series lenses that feature mostly aluminum construction. A broken 45mm f/2.8 lens popped up on eBay in January for a song and a dance, and I simply could not resist.

The auction was listed by an eBay seller that tends to have regular inventory of broken modern camera gear, which is great from a repair perspective. Occasionally the seller will tear down equipment and sell parts, leaving me a bit uneasy about the internal state of the listed items for sale, but I took a chance and went with it.

Arrival

The lens came well packaged and on initial inspection, featured zero mechanical flaws. No scratches on the barrel or lens elements whatsoever. To properly inspect the outer lens elements, I use my oil-free air compressor to thoroughly blow off any debris from the lens, then clean both the front and rear elements with a Kimwipe and lens cleaning solution. Eyeglass cleaner from the drug store is adequate for most external lenses. Isopropyl alcohol is another good alternative, but don’t use it on plastic lenses.

This is bro­ken???

I mounted the lens to my Lumix S5 and with seemingly too much force, it clicked into place. The camera booted up fine and even displayed a live image, but no electronic controls worked whatsoever. None of the dials or switches on the lens responded to user input. The control dials on the camera did not register movement. Clearly, there was something electrically wrong with the lens. The control PCB is usually found on the rear of the lens nearest to the rear lens contact block. Opening up the rear of the lens would also be a good time to investigate the very stiff lens mount.

Tools

The bar­ri­ers to en­try on this re­pair are low. Most of these tools are pretty stan­dard and generic at this point. The biggest ex­pense be­sides the lens is fil­tered air, but even a com­pressed air duster can suf­fice. Note: Since most of the cam­era in­dus­try de­sign folks are cen­tered in Japan, the JIS screw is stan­dard. Using a Phillips will work but it tends to wear down the heads on the JIS screws faster. Here are some of my go-to tools:

Kimwipes / lint-free lens clean­ing wipes

Spray Isopropyl Alcohol (ipa)

Eye glass cleaner

Microfiber cloth

Nitrile gloves

Highly fil­tered shop air / oil free com­pres­sor

Tape

Sharpie

Scalpel

Plastic Spudger

Magnifier / op­tic

JIS x 2.5mm / Phillips #00 screwdriver

JIS x 3.0mm / Phillips #0 screwdriver

Disassembly

For disassembly, I orient the lens with the aperture mark facing me and the table front edge. The rear plastic beauty spacer around the rear element is removed first along with (3) black machine screws. Next, the two nickel-plated screws that fasten the side of the plastic lens block terminal interface to the metal lens mount are removed. The screws are placed on double-sided tape in an orientation that matches the lens orientation. This makes future reassembly substantially easier.

Next or­der of busi­ness, the lens mount bay­o­net and shims. The ori­en­ta­tion and or­der of these shims mat­ter so they de­serve their own bit of tape. This lens had trou­ble mount­ing to a cam­era body so I thor­oughly in­spected the shims, bay­o­net mount back and the lens body for im­per­fec­tions and sur­face con­t­a­m­i­na­tion. I cleaned all sur­faces with ipa and moved on. Note: be ex­tra care­ful when han­dling the lens con­tact block flex ca­ble.

At this point in the disassembly, the lens contact block can be freely removed from the control PCB. The contact block for L-mount features 10 terminals that connect to the control PCB via a flexible polyimide cable. This flex cable has a tendency to tear easily, especially if not handled carefully. Before continuing further with the teardown, take a multimeter and check the continuity of each trace. If there are visible tears in the flex cable, repair that first before continuing to diagnose any problems. I have another guide on how to make your own flex cables here. The flex cable metered out to be flawless, so I proceeded with the teardown.

The rear CNC machined aluminum shell of the lens is up next. Two grounding straps are fastened to the rear shell using nickel-plated machine screws. The straps sit at around the 2 o’clock and 7 o’clock positions. A push-in switch flex connector sits at around the 11 o’clock position and can be wiggled out using a pair of tweezers. There are (4) black oxide self-tapping screws that mate the shell to the center plastic lens module. The rear shell can now be safely lifted up and set aside, with the aperture mark positioned toward the table edge.

The control PCB and the remainder of the flex cables are now easily accessible. Three black self-tapping screws mate the PCB to the plastic lens module at the 2 o’clock, 7 o’clock, and 10 o’clock positions, respectively. Once the flex cables are wiggled loose, the control PCB can be freed from the lens body and scrutinized more closely.

PCB Analysis

The C-shaped PCB looks right at home with the dozens of other lens con­trol PCBs I’ve an­a­lyzed. There’s a main mi­cro­con­troller, DC-DC con­troller, mo­tor con­troller, crys­tal os­cil­la­tor, and a slew of pas­sives.

The re­verse side is adorned with FPC (Flexible Printed Circuit) con­nec­tors, test points and an 8-pin SPI flash pack­age di­rectly be­low the main mi­cro­con­troller.

Inspecting any un­known PCB for faults can be quite in­tim­i­dat­ing, but I find it eas­i­est to start with trac­ing the in­put power lines first. Where is this board sup­posed to re­ceive power from? Where do the V+ and Gnd traces first be­gin on the PCB? What is the first com­po­nent that re­ceives power on the board? PCBs can be a very com­pli­cated mess of lay­ers and jumped traces, so feel free to write down sim­pli­fied schematic notes on scrap pa­per to keep a sim­ple PCB power or­der-of-op­er­a­tions.

To start, trace the in­put power from the lens ter­mi­nal block. The thicker flex PCB traces are most def­i­nitely V+ and Gnd. Follow those traces on the PCB and use the con­ti­nu­ity mode of your mul­ti­me­ter to find out where the power traces on the PCB lead to. In this case, trac­ing the in­put power is tricky be­cause the large traces as­so­ci­ated with the flex ca­ble are hid­den be­neath the FPC con­nec­tor and passed through the PCB to the op­po­site side through vias. The power traces were then fed to a small square black chip, a sta­ple in small scale elec­tron­ics: the DC-DC con­verter.

A telltale sign of a DC-DC controller is an adjacent, properly chunky tan, beige, or black block that dwarfs the power controller itself. The photo below shows a circled inductor marked 2R2, or 2.2 µH (microhenries). Semiconductor manufacturers universally recommend placing the inductor this close to the converter in an effort to reduce radiated emissions and noise.
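The 2R2 marking follows a common convention in which the letter R stands in for the decimal point. A minimal sketch of decoding such markings is below; the function name is hypothetical, and the three-digit branch assumes the widespread "two significant digits plus multiplier, in microhenries" scheme, which not every manufacturer follows.

```python
def decode_inductor_code(code: str) -> float:
    """Return inductance in microhenries for a marking such as '2R2' or '101'.

    Assumes the common convention: 'R' marks the decimal point, and a
    three-digit code is two significant digits plus a power-of-ten multiplier.
    """
    code = code.strip().upper()
    if "R" in code:
        # 'R' replaces the decimal point: 2R2 -> 2.2, R47 -> 0.47
        return float(code.replace("R", ".", 1))
    if len(code) == 3 and code.isdigit():
        # e.g. 101 -> 10 * 10**1 = 100 uH
        return float(int(code[:2]) * 10 ** int(code[2]))
    raise ValueError(f"unrecognised inductor marking: {code!r}")


if __name__ == "__main__":
    for marking in ("2R2", "R47", "100", "101"):
        print(marking, "->", decode_inductor_code(marking), "uH")
```

If a part reads wildly different from its decoded marking on an LCR meter, that is worth noting, but the marking alone is only a hint.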

In this case, the 16-VQFN package TI TPS62140RGTR buck converter, labeled as “PA71 TI 18i”, is used in this Sigma lens PCB. The layout engineer heeded the advice of the TI datasheet and implemented a very similar layout but added a few components to keep things spicy. Looking over the layout recommendation, “C1” connects Vin to Gnd and serves as the main input filter capacitor for the DC-DC converter. Inquisitive minds can easily determine that the unknown “N” labeled package adjacent to C1 on the input voltage rail is indeed a fuse to protect the DC-DC from damage. A quick check with the multimeter confirms that the fuse was open, so it took one for the team and saved the DC-DC from destruction.

Searching online for an “N” labeled fuse does not turn up many promising results, but it did surface a suggested 2 A rated SMT fuse on AliExpress. The TI TPS62140RGTR datasheet quotes a 2 A output current, and while there’s operational quiescent current, 2 A is likely an appropriate value. Being familiar with the Panasonic Semi SMT fuses used throughout the Lumix camera line, I picked up a 2 A, 32 V fast-blow fuse, part number ERB-RE2R00V. The Lumix GH3, GH4, and GH5 cameras use a mixture of 32 V 2.5 A and 1.5 A fuses, so I knew I was in the right ballpark. I have found that in camera electronics, two-terminal resistor-looking packages with an arbitrary single-letter marking tend to be SMT fuses. They sometimes have scalloped terminals.

The fuse used here is an 0603 part, which makes the repair possible with less expensive and less precise equipment. 0402 and even 0201 fuses very much exist. The layout engineer also left space beside the fuse for ease of access with repair tools. There have been many times when a fuse was put in a pocket or crowded section of a PCB, requiring a nearby component to be desoldered to gain access to the failed part. An example is the Lumix GH3 / GH4 mainboard, which has the battery input fuse sandwiched between an SD card slot and a protruding battery connector. SMT tweezers make this repair a breeze, but wielding two soldering irons also works in a pinch. Desolder the failed fuse, clean the pads, position the new fuse, hold it down, and solder one terminal at a time.

Fuse Investigation

As to why the fuse failed, I have not discovered a specific failure point. Some plausible scenarios can be sketched, such as edge cases where someone leaves the lens in AFC (autofocus continuous) mode and sets the camera to hunt for focus for hours or days on end. It’s possible the lens was never designed to rack focus 24/7, and that doing so caused the buck converter to pull more current than the 2 A fuse could sustain, causing it to open.

An interesting operational condition in the TI datasheet may provide clues to the failure point. On page 11 it states: “The output current of the device is limited by the current limit (see Section 7.5). Due to internal propagation delay, the actual current can exceed the static current limit during that time.” ILIMF is the static current limit: “ILIMF = High-side MOSFET forward current limit. Test conditions: VIN = 12 V, TA = 25°C. Min: 2.45 A, Typ: 3 A, Max: 3.5 A.” Thus, in overcurrent situations, internal propagation delay means the actual current consumed can exceed the static current limit for a very short amount of time. If the lens control PCB designers implemented a 2 A fast-acting SMT fuse as estimated, the DC-DC controller would have been operating outside the 2 A fuse specification. This is speculation at this point, but failure point analysis is certainly interesting.
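To put the datasheet figures next to the estimated fuse rating, here is a small arithmetic sketch. The 2 A rating is the write-up's estimate rather than a confirmed value for the original part, and the comparison is deliberately naive: it ignores that the fuse sits on the input rail rather than at the switch, and it says nothing about how long a fast-acting fuse takes to open.

```python
FUSE_RATING_A = 2.0  # assumed rating of the replacement fuse (an estimate)

# High-side MOSFET forward current limit (ILIMF) figures quoted above.
ILIMF_A = {"min": 2.45, "typ": 3.0, "max": 3.5}


def overload_ratio(current_a: float, fuse_rating_a: float = FUSE_RATING_A) -> float:
    """Return how many times the fuse rating a given current represents."""
    return current_a / fuse_rating_a


if __name__ == "__main__":
    for label, amps in ILIMF_A.items():
        ratio = overload_ratio(amps)
        print(f"ILIMF {label}: {amps:.2f} A is {ratio:.2f}x the fuse rating")
```

Even the minimum current limit sits above the assumed fuse rating, which is consistent with the speculation above but does not prove it.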

Did the re­pair work?

What’s the TLDR? Does she chooch? Most cer­tainly! This is the lens in ac­tion. AFC per­for­mance is not light­ning fast, but I don’t ex­pect it to be! The man­ual fo­cus dial works won­ders and has just the right amount of damp­en­ing to make it en­joy­able to use. The aper­ture ring feels like a close cousin to the Lumix LX100, which is noth­ing short of ex­cel­lent.

Further Troubleshooting

If the fuse had been continuous, the next target for inspection would be the output voltage of the DC-DC. Is the output within operational specifications? Is it below or exceeding the power requirements of the main microcontroller? The main micro on this PCB is labeled “341Fy 551486”, but it is really a Toshiba TMPM341FYXBG. This is a 32-bit Arm Cortex-M3 micro with plenty of features, I/O peripheral support, and motor control communication protocols. A microcontroller serves as the main communication hub on the control PCB. Dedicated microcontrollers require accurate clock signals to speak with other micros and peripherals in circuit, so if you find a dedicated crystal oscillator on a PCB, there’s a good chance there’s a microcontroller nearby!

Traditional quartz crystal oscillators vary in operational frequency and are sealed in silver-colored metal packages. They are commonly found in surface-mount or through-hole packages in many shapes and sizes. MEMS oscillators are a possibility, but they are slightly more expensive and thus not as common. MEMS oscillators are usually a very tiny square reflective chip-scale / flip-chip package. Some microcontrollers feature on-chip oscillators, but they are not as consistent as external crystals, so external crystals are still preferred.

Building on the previous troubleshooting, the TMPM341FYXBG needs to be checked for input power. Failure of any component between the output rail of the DC-DC controller and the power input of the TMPM341FYXBG can cause the main micro to malfunction. The TMPM341FYXBG is a 0.5 mm pitch, 6 x 6 mm, 113-ball P-TFBGA113 BGA package, meaning there is no easy probing of Vin and Gnd. The nearby circuit will have to be investigated instead. Working with what we know, the main micro is an Arm Cortex-M3, meaning it’s a 3.3 V component. The Toshiba datasheet quotes an operational voltage range of 2.7 to 3.6 V. If input power to the micro is not within that range, or if there’s a short between Vin and ground near the micro, that’s definitely a red flag.

The easiest way to probe for voltages live is to install the lens contact flex back into the control PCB and 3D print a fake lens as a jig, in order to get the lens PCB in contact with a camera body on hand. Surprisingly, this is easier than expected, since Sigma uploaded nearly all of their camera bodies and accessories to GrabCAD for free. STEP files for all y’all. Hats off to Sigma for uploading these; most camera companies do not do this. Once the lens control PCB is in position and mounted to the camera, probing can begin. Power traces are conventionally wider than signal traces, so one can go about probing wide traces near the TMPM341FYXBG for 3.3 V.

If the mi­cro­con­troller is re­ceiv­ing power within spec­i­fi­ca­tion, an ad­di­tional range of trou­bleshoot­ing would be nec­es­sary to di­ag­nose the lens PCB fail­ure. Luckily, there are cir­cu­lar test pads near the main mi­cro, in­di­cat­ing these con­trol PCBs were pro­grammed and tested on a bed-of-nails jig be­fore as­sem­bly. Unfortunately, these test pads are not la­beled so find­ing the cor­rect ones be­comes a game of trial and er­ror. A logic an­a­lyzer would be needed to probe the test points near the mi­cro­con­troller. If UART can be dis­cov­ered on any of the test pins, the logic an­a­lyzer can help pro­vide in­put on whether the mi­cro­con­troller is boot­ing cor­rectly by de­ci­pher­ing the boot-up bit se­quence.

Depending on how far down the rabbit hole you want to go, there is an 8-pin SPI flash package labeled “GD V4CE 2030”, which is likely an 8 to 32 Mbit NOR package of the kind commonly used with ARM microcontrollers to extend program memory space. While I do not have the exact datasheet for this package, I was able to determine it was indeed flash because GD is the designated company prefix for “GigaDevice”, a known memory manufacturer, and 8-pin packages are a staple of small external flash. In this case, the package measures a very small 3 x 2 mm XY footprint and closely matches the USON8 LGA8 packages offered in the GigaDevice datasheet (page 29). The flash chip can be desoldered and its contents read or cloned onto another flash package if there is any inkling that it might have failed. The analysis of the flash package exceeds the scope of this repair, but it is interesting to investigate the extent of failure analysis.
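If the flash were ever read out with an SPI programmer, the three-byte response to the standard JEDEC Read Identification command (0x9F) would be a quick way to confirm the manufacturer and density guess. The sketch below only decodes those bytes; it does not talk to hardware. It assumes the common convention that the third byte is the base-2 logarithm of the size in bytes, which GigaDevice's GD25Q parts follow but which is not guaranteed for this exact chip. The function name is hypothetical.

```python
GIGADEVICE_MANUFACTURER_ID = 0xC8  # JEDEC manufacturer ID for GigaDevice


def decode_jedec_id(raw: bytes) -> dict:
    """Decode a 3-byte JEDEC ID (manufacturer, memory type, capacity code).

    Assumes capacity code = log2(size in bytes), a common but not
    universal convention for SPI NOR flash.
    """
    if len(raw) != 3:
        raise ValueError("expected exactly 3 bytes from the 0x9F command")
    manufacturer, memory_type, capacity_code = raw
    size_bytes = 1 << capacity_code
    return {
        "is_gigadevice": manufacturer == GIGADEVICE_MANUFACTURER_ID,
        "memory_type": memory_type,
        "size_mbit": size_bytes * 8 // (1024 * 1024),
    }


if __name__ == "__main__":
    # Example bytes as a 16 Mbit GigaDevice part would typically report them.
    print(decode_jedec_id(bytes([0xC8, 0x40, 0x15])))
```

Capacity codes 0x14 through 0x16 correspond to the 8 to 32 Mbit range guessed at above.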

If the main microcontroller seems to be working correctly, there would likely be some input values displayed on the camera LCD when the lens control PCB is installed, such as aperture or focus distance information. The values may be incorrect, but some values should be displayed on the camera display from the lens controller micro. Continuing down the line, the next target would be the motor controller IC. In this case, the “U24020 202184” package is a Rohm BU24020GU motor controller configured as an SPI peripheral. SPI is a synchronous form of communication and requires a clock signal between the master controller and slave device.

Looking at the PCB macro shot, the BU24020GU has plenty of surrounding passive components but also an unpopulated 4-pin footprint that was likely intended for a 4-terminal slot-style photointerrupter and abandoned later in the development cycle. This is the kind of simple sensor that is positioned around a coded rotating disc to log positioning information. Looking over the Rohm datasheet, this is another 3.3 V part. Just as before, thou shalt check voltages! If the part is not receiving an input within the 2.7 to 3.6 V range, the chip’s not going to work properly. This is unfortunately another BGA component, so probing is going to be challenging. However, physics never left your side, and there’s mobile electrons on this here highway. Thicker traces = more electrons!

Looking around the U24020 chip, you can see a pattern repeated three times: three pairs of capacitors tied to ground. These are decoupling capacitors, sized “big” and “small.” The big decoupling capacitor is usually between 0.1 µF and 1 µF, while the small capacitor is in the nanofarad range. This arrangement of capacitors is very common and is designed to reduce noise at different frequencies. The datasheet tells us this is a four-channel motor controller, and the capacitor arrangement tells us that power to all channels has been laid out on the PCB. Thus, a measurement across any of the highlighted power pairs will provide all you need to know. The datasheet lists Vin = DVDD, MVCC12, MVCC34, while Gnd = DVSS, MGND12, MGND34. Vin and Gnd are highlighted in the traditional color scheme: red for Vin, black for Gnd.

If the mo­tor con­troller is re­ceiv­ing 3.3v but the lens fo­cus mo­tor is not mov­ing when a fo­cus pull is trig­gered, make sure to op­ti­cally in­spect the lens fo­cus flex ca­ble. Flex ca­bles tend to fa­tigue and break if the flex ra­dius is com­pressed too small. Fatiguing ca­bles can be prob­lem­atic be­cause de­pend­ing on the move­ment of the ca­ble, the fo­cus mech­a­nism can be­gin to tem­porar­ily work again, but it’s a false pos­i­tive as the next flex may kill lens com­mu­ni­ca­tion.

Other Lens PCB Tidbits

Looking over the lens PCB macro shots, there are a slew of tiny holes scat­tered through­out the board. These tiny holes are through hole vias and a large num­ber of them are drilled into the green top layer ground pour. These vias serve as re­turn paths for noisy com­po­nents on the PCB and con­nect the outer layer ground pour to in­ner ground lay­ers. You may have won­dered why there are such large clus­ters of vias in cer­tain parts of the board but not in oth­ers. Clusters of vias, or via stitch­ing, is used to pro­vide a low-im­ped­ance path for the re­turn cur­rent in­duced by a par­tic­u­larly noisy com­po­nent. By sur­round­ing a noisy com­po­nent or re­gion of the PCB, stitch­ing vias can block prop­a­ga­tion of elec­tro­mag­netic waves up to some max­i­mum fre­quency. The stitch­ing vias im­ple­mented here do not fully en­close any trace or spe­cific com­po­nent, so they do not work as a Faraday cage or as a guard ring, but they do pro­vide noise re­turn paths to help lower ra­di­ated EMI dur­ing the fi­nal de­sign process.

Conclusion

I completed this repair about 2 months ago, just in time for spring’s first blooms in the northeastern United States. The 45mm lens has been working a treat in and around the garden and for documenting other electronics projects. I added some sample photos in the behind-the-scenes photoset to show this lens in action. It’s amazing that a tiny 0603 component can keep a good lens down, and I’m glad I was able to thoroughly document the repair process for others to learn from. All in all, the complete lens teardown and fuse swap took less than an hour! In comparison, this writeup took well over an order of magnitude longer to complete than the repair. Would I do it all again? Most certainly!

Thanks for Reading!

Want more? Here’s a be­hind the scenes look at my work­space and some of the im­ages that did not make the cut to be in­cluded in the write-up:

Moving beyond fork() + exec()

lwn.net

fork() is a rel­a­tively ex­pen­sive sys­tem call; it must copy the en­tire process state (including mem­ory) for the child process. Many op­ti­miza­tions have been made over the years, but a fork is still a fun­da­men­tally costly op­er­a­tion. To make things worse, a fork() call is of­ten im­me­di­ately fol­lowed by an exec(), which will dis­card all of that mem­ory that was so care­fully copied for the child. Attempts (such as vfork()) have been made over the years to op­ti­mize for this case, but the pat­tern still is more ex­pen­sive than it could be.

Spawn tem­plates

Chen’s patch set takes an in­ter­est­ing ap­proach to op­ti­mize the fork() and exec() pat­tern. It is fo­cused on ap­pli­ca­tions that re­peat­edly launch processes run­ning the same ex­e­cutable; imag­ine, for ex­am­ple, a pro­gram that must run Git re­peat­edly to ob­tain in­for­ma­tion about the con­tents of a repos­i­tory. In such cases, the pro­gram could es­tab­lish a tem­plate to ac­cel­er­ate those in­vo­ca­tions, spread­ing the setup cost across mul­ti­ple op­er­a­tions. This tem­plate would be cre­ated with the spawn_tem­plate_cre­ate() sys­tem call:

struct spawn_template_create_args {
    __aligned_u64 flags;
    __s32 execfd;
    __u32 exec_flags;
    __aligned_u64 filename;
    /* Some fields elided */
};

int spawn_template_create(struct spawn_template_create_args *args, size_t args_size);

This call will re­turn a file de­scrip­tor rep­re­sent­ing a tem­plate for the ex­e­cutable file, which can be spec­i­fied as ei­ther a file de­scrip­tor (execfd) or an ab­solute path (filename), but not both. To cre­ate the tem­plate, the ker­nel will open the in­di­cated file and cache a bunch of in­for­ma­tion that will al­low a process to run that file more quickly in the fu­ture.

The ap­pli­ca­tion in ques­tion may run a given ex­e­cutable many times, but each in­vo­ca­tion is dif­fer­ent in a num­ber of ways. The de­tails of a spe­cific in­vo­ca­tion must be placed into an in­stance of this struc­ture:

struct spawn_template_spawn_args {
    __aligned_u64 flags;
    __aligned_u64 pidfd;
    __aligned_u64 argv;
    __aligned_u64 envp;
    __aligned_u64 actions;
    __aligned_u64 actions_len;
    __aligned_u64 reserved[4];
};

The argv field is a pointer to the ar­gu­ment list to be passed to the pro­gram, while envp points to its en­vi­ron­ment. Changes to file de­scrip­tors and sig­nal han­dling, in­stead, are passed through ac­tions, which is a pointer to an ar­ray of:

struct spawn_template_action {
    __u32 type;
    __u32 flags;
    __s32 fd;
    __s32 newfd;
    __aligned_u64 arg;
};

If, for ex­am­ple, file de­scrip­tor four should be closed in the child, the as­so­ci­ated spawn_tem­plate_ac­tion struc­ture would have type set to SPAWN_TEMPLATE_ACTION_CLOSE and fd set to four. Other ac­tions ex­ist for du­pli­cat­ing file de­scrip­tors, open­ing files, chang­ing the work­ing di­rec­tory, and chang­ing sig­nal han­dling.

Once the spawn_tem­plate_s­pawn_args struc­ture has been filled in, the new process can be run with:

int spawn_template_spawn(int template_fd, struct spawn_template_spawn_args *args, int args_size);

Internally, this sys­tem call fol­lows some­thing close to the nor­mal fork()/​exec() path. Chen is care­ful to point out that all of the nor­mal checks ap­plied when ex­e­cut­ing a new file re­main in place. But the cached in­for­ma­tion in the tem­plate makes the whole process faster than it was be­fore.

How much faster? Benchmark re­sults pro­vided in the cover let­ter show an im­prove­ment of about 2%, which may not seem like a lot, but it may make a dif­fer­ence for ap­pli­ca­tions that fit the ex­pected pat­tern.

Toward posix_s­pawn()

The most detailed review of this work was posted by Mateusz Guzik, who said: “This problem is dear to my heart and I have been pondering it on and off for some time now. The entire fork + exec idiom is terrible and needs to be retired”. He pointed out that the focus of the patch set was a bit strange in that it left the fork() part of the problem untouched. That is where most of the cost lies, he said, so optimization efforts should seek to remove it from the picture. Rather than copying the current process, “creating a pristine process is the way to go”.

Christian Brauner was favorable toward the goal, saying: “The idea of having a builder api for exec isn’t all that crazy”. His suggestion, though, was that a new API should be built on top of the existing pidfd abstraction. Without getting into any degree of detail, he said that the right approach would be to create an option to pidfd_open() to create an empty process. A series of calls to a new pidfd_config() system call would then configure this new process as desired, setting up its environment, image to execute, and more. pidfd_config() would thus be analogous to fsconfig().

An im­por­tant ob­jec­tive for a new in­ter­face, Brauner said, would be the abil­ity to sup­port an im­ple­men­ta­tion of posix_s­pawn() in user space. posix_s­pawn() is well suited as a re­place­ment for the fork()/​exec() pat­tern; de­vel­op­ers would likely wel­come a na­tive im­ple­men­ta­tion that is­n’t (unlike the cur­rent im­ple­men­ta­tion) hid­ing fork() and exec() un­der the cov­ers.

Chen agreed that the API as broadly sketched out by Brauner seemed bet­ter, and said that fu­ture work would be in that di­rec­tion. So there will be no spawn tem­plates in the Linux ker­nel but, if Chen’s fu­ture work comes to fruition, Linux may fi­nally gain a proper posix_s­pawn() im­ple­men­ta­tion in­stead.


pokeemerald-wasm

pokeemerald.com

The Smart TV in Your Living Room Is a Node in the AI Scraping Economy

blog.includesecurity.com

The work at Include Security has us work­ing with AI day in and day out (hacking it, us­ing it, train­ing it, etc).

We’re all aware of the recent community-level opposition to new datacenters being built to improve AI capabilities. What you might not be aware of are the distributed efforts to train AI that could be using the devices inside your home.

In this post, we’re go­ing to ex­plore how the com­pany Bright Data fa­cil­i­tates mod­ern AI mod­els scrap­ing train­ing data from the Internet us­ing its res­i­den­tial proxy net­work.

Bright Data is a data-col­lec­tion com­pany that sells ac­cess to what it mar­kets as the world’s largest res­i­den­tial proxy net­work of 400M+ home IP ad­dresses that its cus­tomers route web-scrap­ing traf­fic through. The sup­ply be­hind that net­work comes from an SDK: a piece of soft­ware em­bed­ded in con­sumer apps that, with the user’s con­sent, turns their phone or smart TV into one of those exit nodes.

We’ll doc­u­ment what you, the av­er­age user, should know about what this com­pa­ny’s SDK does on your sys­tems such as your mo­bile phone and your smart TV. We’re go­ing to ex­plore how their SDK works, which plat­forms have shipped it, and why your Internet-connected TV is the ul­ti­mate proxy for AI mod­els look­ing to train on data scraped from the Internet.

Why This Matters Now

AI companies depend on web-scraped content: for pre-training, for retrieval, for agent grounding, for search. But the modern web isn’t scrapeable from a datacenter. Cloudflare, DataDome, HUMAN, and others throttle or block requests from known cloud IPs.

The workaround is residential proxies. A scraping job routed through a Comcast or T-Mobile subscriber’s connection arrives at the target site from an IP that belongs to a paying residential customer. Krebs reported in October 2025 that “a glut of proxies from Aisuru and other sources is fueling large-scale data harvesting efforts tied to various AI projects.” Academic measurement going back to 2019 shows these networks are overwhelmingly misused. The FBI issued a formal advisory earlier this year.

Most of the ex­ist­ing press has fo­cused on the il­le­gal res­i­den­tial-proxy sup­ply: bot­nets (Aisuru, Kimwolf), tro­janized apps (HUMAN Security’s PROXYLIB dis­clo­sure), pre-in­fected IoT hard­ware (Google/Mandiant’s IPIDEA take­down). These are the bad ac­tors.

On the other hand, the legal supply side has received far less scrutiny. Today Bright Data is the largest residential proxy network in the world by its own marketing, advertising “150M+ IPs” sourced via a consent SDK embedded in partner apps. This research documents how that SDK works, which platforms have shipped it, and why the connected TV is the ultimate residential proxy.

Why Connected TV (CTV) is the Ideal Proxy

Connected TV, a.k.a Smart TV, is a near-per­fect res­i­den­tial proxy. Compared to a mo­bile phone:

A TV never hits 1% battery, jumps between WiFi networks, or gets locked when the user is asleep. Some partner publishers do disclose the Bright Data relationship in their privacy policies; PlayWorks is one example. But privacy-policy disclosure is the wrong control surface for a TV. It is hard to scroll through a legal document navigated by arrow keys on a remote, and the in-app consent dialog doesn’t convey that a paying Bright Data customer is about to route their scraping traffic through the user’s home internet.

Petflix, a Roku app documented by The Verge, is a representative case. Its opt-in screen reads: “To enjoy Petflix for free with fewer ads, you are allowing Bright Data to occasionally use your device’s free resources and IP address to download public web data from the internet. Bright Data will only use your IP address for approved business-related use cases. None of your personal information is accessed or collected except your IP address. Period.” The Petflix dialog says “occasionally.” The SDK’s publicly queryable config sets max_bw_monthly_wifi: 200,000,000,000 bytes, a 200 GB default monthly WiFi budget.
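To get a feel for what a 200 GB monthly budget means next to the word "occasionally," here is a small arithmetic sketch. It assumes a 30-day month and decimal units, and it describes the ceiling the config allows, not the traffic any device actually relays.

```python
MAX_BW_MONTHLY_WIFI_BYTES = 200_000_000_000  # value quoted from the config


def average_rate_mbit_per_s(monthly_bytes: int, days: int = 30) -> float:
    """Sustained rate in Mbit/s that would use up the budget in `days` days."""
    seconds = days * 24 * 60 * 60
    return monthly_bytes * 8 / seconds / 1_000_000


def gigabytes_per_day(monthly_bytes: int, days: int = 30) -> float:
    """Average daily transfer in decimal gigabytes."""
    return monthly_bytes / days / 1_000_000_000


if __name__ == "__main__":
    rate = average_rate_mbit_per_s(MAX_BW_MONTHLY_WIFI_BYTES)
    daily = gigabytes_per_day(MAX_BW_MONTHLY_WIFI_BYTES)
    print(f"{rate:.2f} Mbit/s sustained, or {daily:.2f} GB per day")
```

Spread evenly, the cap works out to a bit over 0.6 Mbit/s around the clock, or several gigabytes every day.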

Who Bright Data Names as Partners

Bright Data ex­poses a part­ner man­i­fest end­point. The end­point is unau­then­ti­cated and any­one can fetch it. Names in the man­i­fest that I was able to iden­tify with high con­fi­dence from pub­lic sources:

Others (desoline, free_time, ott_studio, global_microtrading, m_m_media, easystaff_lp) are present but less identifiable from public sources. bright_screensavers, bright_videos, and brightdata are Bright Data’s own apps.

A note on what the partner list proves: being listed in Bright Data’s config means an integration might have existed at some point. It does not by itself prove that a specific publisher’s currently shipping app(s) include the SDK in production. For any named publisher, per-app verification is required.

What the part­ner list does di­rectly prove:

Bright Data ships this ros­ter in an unau­then­ti­cated pub­lic end­point.

At least three CTV-focused entities (PlayWorks, CloudTV, Longvision) monetized their users’ devices as residential proxy exit nodes. PlayWorks in particular reports CTV distribution across major TV platforms and ISPs, with reach figures in the hundreds of millions of households per its own marketing materials.

How does the Bright Data SDK turn a user’s de­vice into a res­i­den­tial proxy exit node?

The Bright Data SDK is a pub­licly doc­u­mented com­mer­cial prod­uct, of­fered to pub­lish­ers via Bright Data’s SDK in­te­gra­tion docs (with a JavaScript vari­ant for web). What fol­lows builds on that pub­lic sur­face with find­ings from re­verse-en­gi­neer­ing the ship­ping iOS frame­work and in­stru­ment­ing 30 days of its run­time traf­fic.

The SDK ships as an iOS frame­work (brdsdk.framework) in­side part­ner apps. I re­verse-en­gi­neered the bi­nary and cap­tured 30 days of traf­fic from a re­search fleet run­ning the SDK in­side a con­sent-in­stalled part­ner app.

The Unauthenticated Config

On every launch the SDK calls:

GET https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>

The endpoint is unauthenticated in any meaningful sense. The server gates only on two query parameters: appid (an app bundle ID, which can be found in the App Store listing of the partner app) and ver (the SDK version string). Supply those and any randomly generated UUID, and the server returns the same response a real device gets: feature flags, idle-detection thresholds (battery %, CPU/memory ceilings, WiFi-vs-cellular rules), per-country bandwidth tiers, and the partner manifest described above. Each of these branches is worth examining on its own: the idle rules that decide when your device is eligible to relay, a flag that routes peer traffic around your VPN, a map that stitches your installs across platforms into one identity, and the per-country bandwidth caps.
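The request shape described above can be sketched as a URL builder. This only constructs the string and performs no network access; the bundle ID and version passed in the example are placeholders rather than values from a real partner app, and the function name is hypothetical.

```python
import uuid
from urllib.parse import urlencode

CONFIG_URL = "https://clientsdk.bright-sdk.com/sdk_config_ios.json"


def build_config_url(app_id: str, sdk_version: str) -> str:
    """Build the config URL in the shape described above (no request is made)."""
    params = {
        "appid": app_id,
        "ver": sdk_version,
        # "sdk-ios-" followed by 32 random hex characters.
        "uuid": f"sdk-ios-{uuid.uuid4().hex}",
    }
    return f"{CONFIG_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values, purely for illustration.
    print(build_config_url("com.example.app", "1.0.0"))
```

Nothing in the URL is a secret, which is the point the section is making.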

The Peer Tunnel

After con­fig fetch, the SDK opens a per­sis­tent WebSocket to:

wss://proxyjs.brdtnet.com:443

This hostname resolves to AWS Global Accelerator IPs (3.33.193.183, 15.197.193.114 as of this writing). The TLS certificate is CN=*.luminatinet.com, the domain for Luminati Networks, Bright Data’s pre-2018 corporate name. The rebrand was publicly announced in 2018. Active SDK infrastructure still runs on the legacy cert, which is a useful detection pivot: the current customer-facing proxy service lives on brightdata.com-branded domains, so any luminatinet.com / brdtnet.com traffic on your network is specifically the peer-tunnel plane, not customer-side Bright Data usage. The server identifies itself as uWebSockets: 20.
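A defender could turn that detection pivot into a simple check over DNS logs, TLS SNI values, or certificate common names. The sketch below is one way to do the matching; the suffix list comes from the domains named above, while the function name and the idea of feeding it log entries are assumptions for illustration.

```python
PEER_TUNNEL_SUFFIXES = ("luminatinet.com", "brdtnet.com")


def is_peer_tunnel_host(hostname: str) -> bool:
    """True if a hostname or certificate CN falls under a peer-tunnel domain."""
    host = hostname.strip().rstrip(".").lower()
    if host.startswith("*."):
        host = host[2:]  # treat a wildcard CN like its parent domain
    # Match the domain itself or any subdomain, but not lookalike names.
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in PEER_TUNNEL_SUFFIXES
    )


if __name__ == "__main__":
    for name in ("proxyjs.brdtnet.com", "*.luminatinet.com", "brightdata.com"):
        print(name, is_peer_tunnel_host(name))
```

Matching on a dot boundary avoids flagging unrelated names that merely end in the same characters.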

The peer end­point re­quires no au­then­ti­ca­tion to up­grade. The server ac­cepts any TLS-valid WebSocket up­grade and im­me­di­ately pushes the con­nect­ing client an ap­pli­ca­tion-layer frame with the clien­t’s pub­lic IP echoed back. From there, a hand­shake un­folds:

Server → client: tunnel_init. Establishes the session and returns the client’s public IP.

Server → client: cid_set. The server assigns the client a session-tracking identifier in the format <IP>-<token>/ls<N>c<M>p443_<IP>_<counter>. We confirmed this format matches the cid field present in the SDK’s captured telemetry traffic from real devices.

Server → client: status_get. The server polls the device for its idle state, battery, network type, and available bandwidth. The device responds with a continuous telemetry feed: idle, wifi_connected, mobile_connected, mobile_type (LTE/5G), roaming, battery_level, using_battery, screen_on, on_call, cpu_usage, mem_usage, raw_bw, bw, ipv6_supported, appid (the host app), sdk_version, platform, and the assigned cid. This is a continuous feed of physical-device state to a third party, delivered via a consent dialog whose text is chosen by the host app publisher.

Handshake complete. Once the device reports favorable status, the server’s job-matching layer is free to push cmd_tun frames: individual scraping-job instructions that the SDK executes as HTTP requests against third-party sites, using the user’s residential IP as the source.
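The cid format from the cid_set step lends itself to a regular expression, which may be handy when searching captured traffic. The sketch below is a guess at the field boundaries based only on the template shown: it assumes IPv4 addresses and a token without slashes, and the sample identifier in the example is invented from documentation address ranges, not taken from a capture.

```python
import re

# Assumed reading of <IP>-<token>/ls<N>c<M>p443_<IP>_<counter> (IPv4 only).
CID_PATTERN = re.compile(
    r"^(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3})-(?P<token>[^/]+)"
    r"/ls(?P<n>\d+)c(?P<m>\d+)p443_"
    r"(?P<second_ip>\d{1,3}(?:\.\d{1,3}){3})_(?P<counter>\d+)$"
)


def parse_cid(cid: str) -> dict:
    """Split a cid string into its named parts, or raise ValueError."""
    match = CID_PATTERN.match(cid)
    if match is None:
        raise ValueError(f"not a recognised cid: {cid!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Invented sample built from documentation IP ranges.
    print(parse_cid("203.0.113.7-abc123/ls1c2p443_198.51.100.9_42"))
```

What N, M, and the second IP actually denote is not stated in the write-up, so the group names here are neutral placeholders.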

Every frame on the WebSocket is plain JSON with a fixed en­ve­lope:

{"type": "ipc_call"|"ipc_post"|"ipc_result"|"ipc_error", "cmd": <command>, "cookie": <correlation-id>, "err_code": 0, "msg": { …payload… }}
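For anyone inspecting captured frames, a small validator for this envelope might look like the sketch below. It assumes every frame carries all five keys, which is how the fixed envelope reads but is not something stated outright; the sample frame and its payload fields are invented for illustration.

```python
import json

FRAME_TYPES = {"ipc_call", "ipc_post", "ipc_result", "ipc_error"}
REQUIRED_KEYS = {"type", "cmd", "cookie", "err_code", "msg"}


def parse_frame(raw: str) -> dict:
    """Parse one JSON frame and check it against the envelope described above."""
    frame = json.loads(raw)
    if not isinstance(frame, dict):
        raise ValueError("frame is not a JSON object")
    missing = REQUIRED_KEYS - frame.keys()
    if missing:
        raise ValueError(f"missing envelope keys: {sorted(missing)}")
    if frame["type"] not in FRAME_TYPES:
        raise ValueError(f"unknown frame type: {frame['type']!r}")
    return frame


if __name__ == "__main__":
    # Invented sample frame; only the envelope keys come from the write-up.
    sample = '{"type": "ipc_call", "cmd": "status_get", "cookie": 1, "err_code": 0, "msg": {}}'
    print(parse_frame(sample)["cmd"])
```

Because the frames are plain JSON, any WebSocket capture can be fed through a check like this line by line.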

The full com­mand vo­cab­u­lary ex­tracted from the bi­nary and ver­i­fied on the wire:

There’s no message signing, HMAC, client certificate, or device attestation. Only the TLS layer and the server’s IP-reputation filter gate which peers actually receive jobs. For readers familiar with commercial malware protocol design: this is substantially less secure than typical C2.

When the SDK considers you “idle”

The con­fig ships an ex­plicit rule­book for when the de­vice is el­i­gi­ble to re­lay some­one else’s traf­fic:

"idle_metrics": {
  "ignore_screen_on": true,      // relay even with the screen on
  "ignore_on_call": true,        // relay while the user is on a phone call
  "max_bw_ratio": 1,
  "min_battery": 0.2,
  "wifi_on_battery": true,
  "min_battery_wifi": 0.2,
  "max_cpu_usage": 70,
  "max_mem_usage": 90,
  "mem_screen_off": true,
  "idle_timeout": 30,
  "not_idle_timeout": 10
}

The ignore_screen_on and ignore_on_call flags are notable: “idle” does not mean the user is away from the device. It means the device’s CPU, memory, and battery are within the SDK’s thresholds. A user on a phone call, actively reading the screen, is considered idle for relay purposes.
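One way to read the rulebook is as a predicate over the status_get telemetry. The Python sketch below is an assumption about how the thresholds might combine, not the SDK's actual logic. It assumes battery_level is a fraction between 0 and 1 (to match min_battery), that cpu_usage and mem_usage are percentages, and that the battery floor applies only while the device is on battery. It ignores the bandwidth, WiFi-specific, and timeout fields.

```python
# Subset of the shipped idle_metrics values quoted above.
IDLE_METRICS = {
    "ignore_screen_on": True,
    "ignore_on_call": True,
    "min_battery": 0.2,
    "max_cpu_usage": 70,
    "max_mem_usage": 90,
}


def is_idle(status: dict, rules: dict = IDLE_METRICS) -> bool:
    """Hypothetical eligibility check; the combination logic is assumed."""
    # Screen and call state only block relaying if the ignore flags are off.
    if status["screen_on"] and not rules["ignore_screen_on"]:
        return False
    if status["on_call"] and not rules["ignore_on_call"]:
        return False
    # Assumed: the battery floor matters only while running on battery.
    if status["using_battery"] and status["battery_level"] < rules["min_battery"]:
        return False
    if status["cpu_usage"] > rules["max_cpu_usage"]:
        return False
    if status["mem_usage"] > rules["max_mem_usage"]:
        return False
    return True
```

Under these assumptions, a device with the screen on, a call in progress, 50% battery, and moderate CPU and memory load still counts as idle, which is the point the two ignore flags make.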

Cross-Platform Identity Linkage

The con­fig also ships a dual_­pair­ing map:

"dual_pairing": { "ios_com.brd.earnapp": ["win_earnapp.com", "mac_com.earnapp"] }

That’s a server-side map tying a user’s iOS, Windows, and macOS installations of the same brand into one entity. It’s cross-platform identity stitching documented inside a public config file.

One more forward-looking field: "http3_enabled": true. The SDK is already shipping the flag for QUIC-based peer transport. A future version may move the peer tunnel from TCP/443 to UDP/443, which would break any defender relying on TCP connection tracking to detect the WebSocket.

The Inspection Bypass

The SDK’s config ships a flag, "use_netifs": true. That flag triggers code in the SDK binary that constructs its NWConnection with a specific required interface: en0 (WiFi) or pdp_ip0 (cellular), rather than using the system default route.

On iOS, this bypasses any configured VPN’s tun0 interface entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app’s HTTPS traffic does.

We ob­served this em­pir­i­cally. My re­search setup in­cludes trans­par­ent TLS in­ter­cep­tion. It cap­tured every HTTPS call the SDK made, ex­cept the peer tun­nel to prox­yjs.brdt­net.com:443, even though port 443 is ex­plic­itly redi­rected to the in­spec­tor. The by­pass uses Apple’s doc­u­mented NWParameters.requiredInterface API.

It’s worth em­pha­siz­ing that the SDK uses two in­de­pen­dent in­spec­tion by­passes, one per plane:

Control plane (config fetch, telemetry pings): built on CFNetwork’s CFHTTPMessage primitives rather than URLSession/NSURLConnection. This defeats URLSession-level instrumentation (swizzling, network extensions, URLProtocol subclasses) commonly used in mobile app-sec tooling, while still respecting the system proxy and so remaining visible to TLS-intercepting researchers.

Data plane (peer tun­nel): built on NWConnection with re­quired­In­ter­face set to the phys­i­cal in­ter­face. This is what de­feats VPNs and en­sures the scrap­ing is ex­e­cuted from a res­i­den­tial IP.

Both choices are legitimate Apple APIs. The combination is the interesting artifact: the data plane is invisible to VPN-based inspection and the control plane is invisible to URLSession-based hooks. Researchers who rely on either single technique see only half the SDK’s behavior.

For en­ter­prise se­cu­rity teams run­ning MDM, cor­po­rate-VPN-based traf­fic in­spec­tion, or home-router parental con­trols: the most sen­si­tive chan­nel this SDK op­er­ates is de­signed to go around your vis­i­bil­ity layer.

The ge­og­ra­phy tiers

The con­fig ships per-coun­try band­width thresh­olds. Four coun­tries get ex­plicit non-de­fault poli­cies:

Looking at the con­fig, Uzbekistan and Oman de­vices are per­mit­ted to re­lay down to 1% bat­tery, with daily caps 20× the de­fault and monthly caps 60× the de­fault. Qatar and UAE de­vices are throt­tled be­low de­fault.  We can only spec­u­late as to why the tiers are drawn this way. One read­ing is de­lib­er­ate mar­ket seg­men­ta­tion, re­lax­ing lim­its where grid power is sta­ble and throt­tling where mo­bile data is ex­pen­sive. The de­fault-world­wide al­lowance still per­mits 500 MB of some­one else’s traf­fic per month over the user’s home in­ter­net.

Testing Setup and Methodology

Three data sources:

Thirty days of TLS-inspecting proxy captures from an iOS device running consent-installed partner apps (including XYO COIN, which embeds the Bright SDK).

Static analy­sis of the SDK bi­nary (brdsdk.framework, ver­sion 1.532.120, iOS ar­m64).

All spe­cific Bright Data host­names, cert fin­ger­prints, and TLS in­fra­struc­ture de­scribed are pub­licly ob­serv­able by any­one mak­ing the same re­quests. No ses­sion-spe­cific iden­ti­fy­ing data from ei­ther the re­search fleet or the re­search client ap­pears in this doc­u­ment.

Timeline

May 11, 2026 — Email no­tice sent to pri­vacy@bright­data.com no­ti­fy­ing their team about the re­lease of this blog post. No re­sponse to the no­ti­fi­ca­tion has been re­ceived at the time of this ar­ti­cle’s pub­lish­ing.

Defense Approaches

The traf­fic leaves clear fin­ger­prints at the net­work bound­ary, and the SDK leaves iden­ti­fi­able sym­bols in the app bi­nary. The ap­proaches be­low let you de­tect and block the peer tun­nel — at the net­work level or on the de­vice it­self. Three ap­proaches, or­dered by ease of de­ploy­ment:

Approach 1: DNS block (trivial, ef­fec­tive for net­work-routed de­vices):

proxyjs.brdtnet.com
proxyjs.luminatinet.com
proxyjs.bright-sdk.com
clientsdk.bright-sdk.com
clientsdk.brdtnet.com

Blocking prox­yjs.* kills the peer tun­nel with­out af­fect­ing any cus­tomer who le­git­i­mately uses Bright Data’s cus­tomer-fac­ing proxy ser­vice on a dif­fer­ent do­main.

Approach 2: TLS SNI fil­ter­ing: Drop or alert on TLS hand­shakes where serv­er_­name matches *.brdtnet.com, *.luminatinet.com, or *.luminati.io. Works at the net­work bound­ary with­out TLS in­spec­tion.

Approach 3: TLS cer­tifi­cate fin­ger­print:

*.brdtnet.com → SHA256 313ce4ec7d5a51e5…

*.luminatinet.com → SHA256 5028612e625befea…

Stable un­til Sectigo cert ro­ta­tion (current certs valid through mid-2026).
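Approaches 1 and 2 both reduce to matching a hostname against a list. A minimal Python sketch of that matching follows, using only the hostnames and wildcard suffixes named above. The function names are hypothetical, and a "*." pattern is assumed to match subdomains only, not the bare domain.

```python
# Exact hostnames from Approach 1.
EXACT_HOSTS = {
    "proxyjs.brdtnet.com",
    "proxyjs.luminatinet.com",
    "proxyjs.bright-sdk.com",
    "clientsdk.bright-sdk.com",
    "clientsdk.brdtnet.com",
}

# Wildcard suffixes from Approach 2 (*.brdtnet.com and so on).
SNI_SUFFIXES = ("brdtnet.com", "luminatinet.com", "luminati.io")


def normalize(host: str) -> str:
    """Lowercase the name and drop whitespace and any trailing root dot."""
    return host.strip().rstrip(".").lower()


def dns_blocked(host: str) -> bool:
    """Approach 1: exact match against the DNS blocklist."""
    return normalize(host) in EXACT_HOSTS


def sni_matches(server_name: str) -> bool:
    """Approach 2: server_name is a subdomain of a listed suffix."""
    name = normalize(server_name)
    return any(name.endswith("." + suffix) for suffix in SNI_SUFFIXES)
```

Checking for "." plus the suffix, rather than the suffix alone, avoids false positives on unrelated domains that merely end in the same characters.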

The use_netifs caveat: All three layers only work on traffic that crosses your network boundary. The SDK’s use_netifs binding means that on iOS, when the device is on cellular, peer traffic bypasses corporate WiFi entirely. For managed fleets, the complementary control is MDM-based app binary scanning: search installed apps for the Swift symbols BrdWebSocketFacade and BrdNetwork.DNSResolver, and prohibit apps containing them on corporate-issued devices.

For household users concerned about a specific smart TV or mobile app: block the hostnames above at your router’s DNS settings (Pi-hole, NextDNS, Cloudflare Gateway, your ISP’s equivalent).
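A rough Python sketch of the binary scan, assuming the scanner already has each app's executable available as a plain file. Mangled Swift symbol names length-prefix each component rather than joining them with a dot, so the dotted name is searched for here as its two components. Both that split and the helper names are assumptions for illustration, not a vetted detection rule.

```python
from pathlib import Path
from typing import List

# Symbol named in the text -> byte fragments that must all be present.
MARKERS = {
    "BrdWebSocketFacade": (b"BrdWebSocketFacade",),
    "BrdNetwork.DNSResolver": (b"BrdNetwork", b"DNSResolver"),
}


def find_markers(data: bytes) -> List[str]:
    """Return the marker names whose fragments all occur in the bytes."""
    return [name for name, parts in MARKERS.items()
            if all(part in data for part in parts)]


def scan_file(path: str) -> List[str]:
    """Read one binary from disk and report which markers it contains."""
    return find_markers(Path(path).read_bytes())
```

A non-empty result from scan_file is a reason to look closer at the app, not proof on its own, since plain substring matching can hit unrelated code that reuses the same names.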

This blog post was writ­ten in part­ner­ship with our guest au­thor and in­de­pen­dent se­cu­rity re­searcher Buchodi.

Collections: Pre-Modern Armies for Worldbuilders, Part I: Why They Fight

acoup.blog

This week I want to try some­thing a lit­tle dif­fer­ent. Rather than tak­ing apart a par­tic­u­lar fan­tasy mil­i­tary sys­tem, I thought I might try to lay out a more gen­eral sense of how mil­i­tary sys­tems tend to map on to so­ci­eties, both be­cause such gen­eral his­tor­i­cal frame­works are handy for think­ing about the past, but also be­cause they make use­ful rules of thumb for imag­in­ing fan­tas­ti­cal so­ci­eties. So es­sen­tially here we are ask­ing: how do so­ci­eties end up with the sort of armies they have?

This is going to take a few posts to get through because there are actually quite a few key components to cover: the why and how of recruitment (both ‘why do these people feel obligated to serve’ and ‘how do you get them into the army’), how a society pays for that (or doesn’t), who leads it and how, and how once formed any army coheres in the field. Finally, we’ll wrap up with some ‘historical archetypes’ to show how these different facets link together with the underlying civilian society and also how that shapes what they look like on the battlefield (including weapons and tactics).

This series is also going to be a bit unusual because in some ways its purpose is to link up and summarize a bunch of other posts. We’ve had a lot of posts and series over the years which examined this or that historical or fictional military and discussed the ways in which their militaries reflected civilian society and I wanted to pull a lot of that together in one place. As a result in this series — more than most — the links are going to be ‘load bearing.’ Likewise a lot of the heavy bibliography here is going to live in the links, although I think for someone looking to get a handle on how pre-modern societies and pre-modern militaries come together, the two key readings I would suggest are P. Crone, Pre-Industrial Societies: Anatomy of the Pre-Modern World (1989) and then J. Landers, The Field and the Forge: Population, Production and Power in the Pre-Industrial West (2003). Also well worth reading as an overview is Azar Gat, War in Human Civilization (2006).

Now we’re going to restrict ourselves a bit here in that we are going to stick to pre-modern or more correctly pre-industrial armies. The rules change a lot for industrial and post-industrial armies, though by the same token we really don’t have nearly the same range of examples for industrial armies either: we really have a single dominant model for industrial armies that emerged in Europe from 1914 to 1945 and then a bunch of reactions to that model (along with what we might term an ‘industrial transitional’ period from ~1800 to 1914). It is thus hard to build a complete typology, because the industrial sample size is so small.

By con­trast, the sam­ple for pre-in­dus­trial agrar­ian armies is re­ally big, so it be­comes a bit eas­ier to spot re­cur­ring pat­terns of or­ga­ni­za­tion and struc­ture as dif­fer­ent so­ci­eties stum­ble on to the same so­lu­tions for gen­er­at­ing force. So that’s what we’re go­ing to do this week: look at some of the pat­terns, keep­ing in mind that these are gen­eral rules with many com­pli­ca­tions and ex­cep­tions. In the process, we’re go­ing to pull to­gether a lot of the in­di­vid­ual dis­cus­sions of spe­cific sys­tems — his­tor­i­cal and fan­tas­ti­cal — as ex­am­ples.

Fans of fictional worlds will have often run into the most egregious examples of the failure to think in these terms: professional or seemingly professional armies employed by societies that lack the administrative structure to manage them, armies that are too large or too small for their parent societies, ‘guards’ that seem to spring out of holes in the ground rather than organically fit into society anywhere and so on.

But first, as al­ways, re­cruit­ing and main­tain­ing large pre-mod­ern armies is ex­pen­sive! Much like many of those pre-mod­ern armies, this pro­ject is sup­ported by de­volv­ing the costs of my ru­inous book-buy­ing habit on to re­cruits read­ers. You can help by spread­ing the word to new read­ers and by sup­port­ing this pro­ject over at Patreon. If you want up­dates when­ever a new post ap­pears or want to hear my more bite-sized mus­ings on his­tory, se­cu­rity af­fairs and cur­rent events, you can fol­low me on Bluesky (@bretdevereaux.bsky.social). I am also ac­tive on Threads (bretdevereaux) and main­tain a de min­imis pres­ence on Twitter (@bretdevereaux).

Armies and Societies

I have writ­ten this maxim a few dif­fer­ent ways, but it is worth writ­ing again: no army can help but recre­ate its civil­ian so­cial struc­tures on the bat­tle­field.

When analyzing a historical army or creating a fictional one, everything must begin with that idea, that military systems grow out of and reflect their ‘civilian’ societies or — for societies that lack civilians as such — reflect the civilian side of the lives of their members. That means that armies tend to recreate civilian hierarchies, with similar — often identical — lines of status between the two.

So to un­der­stand what kind of mil­i­tary our so­ci­ety might come up with, we first need to ask some key ques­tions about the civil­ian so­ci­ety.

First: is this so­ci­ety agrar­ian? Which is to say, are they farm­ers? In most cases, the an­swer will be yes be­cause with only a hand­ful of ex­cep­tions, if they’re not farm­ers you’re not go­ing to have cities or states and most set­tings have those. That said, if your so­ci­ety con­sists of no­mads — ei­ther hunter-gath­er­ers or pas­toral no­mads — they aren’t go­ing to have a state (which is a crea­ture of the agrar­ian world) and so you want to think about non-state forms of mil­i­tary or­ga­ni­za­tion, which is go­ing to chan­nel them to­wards some spe­cific so­lu­tions to our prob­lems be­low.

Next: is this a state? Is mil­i­tary force in this so­ci­ety col­lected into a sin­gle po­lit­i­cal en­tity or is it frag­mented among many dif­fer­ent cen­ters of power? One odd choice I see in a lot of fan­tasy set­tings is to have huge, sprawl­ing cities with non-state sys­tems of or­ga­ni­za­tion (power in­for­mally di­vided among a bunch of dif­fer­ent groups that all wield force), but that’s not a pat­tern we see of­ten his­tor­i­cally. Instead, the more ur­ban a so­ci­ety is, the more likely it is that mil­i­tary power is con­cen­trated into a sin­gle po­lit­i­cal en­tity — the state. At the same time, non-state poli­ties may lack a sin­gle po­lit­i­cal en­tity with a mo­nop­oly on the use of force, but that does­n’t mean they lack a mil­i­tary sys­tem, it just means that power is frag­mented in that sys­tem.

Third: what kind of aristocracy does this society have? Every society has a socio-economic elite, but there are different kinds. Does aristocratic wealth mostly flow upwards from large landholdings or flow downwards from employment in a royal bureaucracy (the former is much more common)? Likewise, to what degree does this society have a bureaucracy as such and how much power does it wield? It can be easy to assume modern bureaucratic administrative structures, but these are rare in pre-modern societies: power is more often wielded by local grandees than by employed representatives of the state and if the power is wielded by those grandees, the military system is likely to run through them to some extent as well.

Your aristocrats are going to assume that — since they lead society in peace — they lead society in war, but how they do so depends on their self-conception. Here, I distinguish sometimes between military aristocrats — aristocracies who understand their primary purpose is warfare generally (often leadership), as distinct from religious or bureaucratic aristocracies that might be of a non-military character — and warrior aristocrats, who understand their primary purpose in society as personally fighting in a specific way (usually but not always mounted).

Note that while warrior aristocrats’ legitimacy in claiming aristocratic status comes from their personal practice of violence, the source of their power is almost invariably wealth from large landholdings: they’re not aristocrats because they’re good at fighting. Instead, they’re aristocrats because they’re rich and then to justify the wealth and power they wield, they practice a certain direct, personal kind of warfare. A guy who is really good at fighting but is poor and without title is not a knight; a guy who has wealth and title but is terrible at fighting is a bad knight, but a knight nonetheless. Warrior-elites are thus elites-who-are-warriors, not necessarily warriors-who-are-elite-at-war, though since their social class places a lot of emphasis on being good at fighting, they’re often very good at fighting (in a specific way, again, usually but not always mounted).

Fourth: how do the reg­u­lar farm­ers (who are 90+% of the pop­u­la­tion) con­nect to the aris­toc­racy? Are they mostly free-hold­ers who own their own land, but are eco­nom­i­cally de­pen­dent on the Big Man? Or does the lo­cal Big Man — that is, the aris­to­crat who is near­est them — own their land it­self? Or does the king (or state, in some other form; it might be a tem­ple!) own their land, in which case the aris­to­crat they en­gage with is an ad­min­is­tra­tor rather than a land-owner?

For the aris­toc­racy to ex­ist (and for the state to ex­ist, if it does), it has to be si­phon­ing agri­cul­tural pro­duc­tion from these smaller farm­ers, so con­sider how that hap­pens as well. Aristocrats col­lect rents on the lands they own or con­trol. The state may col­lect taxes, but in many pre-mod­ern states, royal rev­enues are dom­i­nated by the lands the king owns rather than taxes. Naturally, if taxes are be­ing col­lected, that im­plies some kind of bu­reau­cracy col­lect­ing them, which non-state so­ci­eties may not have and which may be un­der­de­vel­oped in weak-state so­ci­eties.

What we’re trying to get at with all of these questions is how the peasantry and the aristocracy relate to each other and how that relationship is understood and justified. Those questions are important because civil society comes first — armies are built out of existing subsistence systems and social structures, not usually the other way around — and because the structure of a society limits the possible military systems it can house.

Recruitment Principles

Once we have a sense of our civil­ian so­ci­ety, the next thing we need to think about is how do we get re­cruits?

Landers (op. cit.) breaks down recruitment systems based on the principle they function on, distinguishing between general compulsion (conscription by force, levies), the entitlement principle (service as the flip-side of the coin for some set of rights or status), the vocational principle (standing armies or military aristocracies that served because that was their role in society) or devolution (devolve the problem downward onto vassals, communities or households). That’s a useful framework, but I want to shift it around somewhat for our purposes, because I want to separate clearly why the recruits fight from how you get them (and because I think ‘general compulsion’ is actually not the most useful category here).

So we can start with what I am going to call the recruitment principle (as distinct from the recruitment method), which is the why of your recruitment: why do these fellows feel like they must or ought to serve. A lot of historical fiction or fantasy settings fail to address this particular question or else answer it with a very crude ‘because they have to’ (that is, compulsion) but that’s not usually how this works. After all, this society is about to give these fellows weapons, so without some broader social structure that encourages or constrains them to remain at the standard, there is very little preventing them from deserting or revolting. Compulsion can get men into the ranks, but it struggles to keep them there.

The first place most modern folks’ minds go, of course, is to pattern this task off of their own jobs and so to assume that these fellows are under arms because they are paid to be, which I am going to term the employment principle (separate from the vocational principle). We may sum it up with, “recruits show up purely as an economic transaction: service for money” — it’s a job. These may be foreign troops (in which case they’re mercenaries) or domestic troops, but the key thing here is that the bond which holds them to the army is monetary: they get paid.

The problem is this is not actually the most common recruitment principle. Indeed, while many armies may employ mercenaries as auxiliary troops or maintain some small standing employment-based component (like non-noble professional retainers, for instance), it is fairly rare for pre-modern armies to function purely as ‘a job.’ The exceptions are professional armies, but professional armies are the exception, not the rule: the later Han dynasty, the Roman Empire (but not the Republic) and early modern Europe feature professional armies, but otherwise these are uncommon. Crucially — and we’ll come back to this as we move along — professional armies require a strong state with a capable bureaucracy and extensive revenues, because the state is taking on the whole administrative and financial burden of maintaining the army. Early modern European states famously struggled horribly under those burdens, while the Roman Army of the imperial period consumed well over half of the state’s budget.

Note that war­riors and sol­diers re­cruited by other prin­ci­ples might also get paid (although of­ten not as much), the dif­fer­ence is that there is some other so­cial con­nec­tion that is un­der­ly­ing their re­cruit­ment.

Instead, it is more common that the core of military forces in pre-modern societies arises out of three basic sets of principles (two of which I am borrowing from Landers): the entitlement principle, the vocational principle and what I am going to call the clientage principle. All three share an element in that what ties an individual to recruitment is who they are, which in pre-modern societies (generally extremely low social-mobility societies) is almost invariably a product of what family they were born into.

In entitlement principle recruiting, liability for military service is an expectation that corresponds to a set of social rights and privileges, most often citizenship. Note that we’re not talking about citizenship as a reward for service, but rather service as a requirement of citizens. Naturally, for an entitlement system like this to really function, there needs to be some socially valuable position, with connected rights and privileges, available for common folk (we’ll talk about aristocrats in a second). That tends to make entitlement principle service a creature of smaller citizenship-based communities: a Greek polis recruiting hoplites, the Roman Republic recruiting its legions, or medieval town and commune governments establishing service requirements amongst the townsfolk (the burghers), whose citizenship in the town marks them apart from the regular peasantry.

The great ad­van­tage of en­ti­tle­ment prin­ci­ple sys­tems is that, be­cause so­cial sta­tus and mil­i­tary ser­vice are tightly in­ter­con­nected, get­ting sol­diers to muster and keep­ing them in the ranks is rel­a­tively eas­ier. Think about a Roman cit­i­zen sol­dier in the Middle Republic: if he deserts, where does he even desert to — his home­town where every­one knows he’s sup­posed to be with the army and where he and his fam­i­ly’s en­tire so­cial iden­tity is tied up with his li­a­bil­ity for mil­i­tary ser­vice? The sys­tem cre­ates re­ally strong so­cial pres­sures that make this eas­ier.

The lim­i­ta­tion of such sys­tems is that they re­quire that en­ti­tle­ment in the first place and that en­ti­tle­ment al­most al­ways comes with the ex­pec­ta­tion of a po­lit­i­cal voice through some kind of vot­ing or com­mu­nal con­sen­sus de­ci­sion-mak­ing. That may not sound like a trade­off to you, but it cer­tainly is to the elites of this so­ci­ety: to re­cruit on this ba­sis they have to cede power to the com­mons to some de­gree in or­der to cre­ate the po­lit­i­cal en­ti­tle­ment worth fight­ing for. In prac­tice, it should be noted, the sys­tems don’t gen­er­ally seem to form that way: they are not grants from the aris­toc­racy to the com­mons (‘fight for me and I’ll let you vote!’) but rather con­ces­sions wrested from the aris­toc­racy by the com­mons through col­lec­tive ac­tion (‘let us vote or we won’t fight!’), which then ac­quire the heavy re­in­force­ment of be­com­ing the tra­di­tional rights and priv­i­leges of the cit­i­zenry.

The next option is what we can call (following Landers) the vocational principle, which also connects service to who you are, but rather than connecting it to your place in a political order, it connects service to a place in the broader social order: the vocational principle is one in which a certain class of people fight because they are the warrior class, typically because they were born into the warrior class.

The vocational principle can come in two forms. First, in many non-agrarian societies (hunter-gatherers or pastoral nomads, like Steppe nomads) or relatively less complex ‘horticultural’ societies, it is often the case that the entire free adult male population is part of the ‘warrior class.’ These are, after all, generally very small clan- or tribal-based societies with a lot less social stratification so ‘everybody’ (that is, all free adult males) fights. For men, participating in communal warfare is a core component to belonging to the tribe, camp, clan or village.

The mistake one sees in a lot of speculative fiction (and also certain reactionary political movements) is assuming that this sort of ‘everyone is a warrior’ social structure can be transplanted to more complex societies with greater degrees of specialization. The reductio ad absurdum of this are some portrayals of Star Trek’s Klingons: an entire post-industrial multi-planet empire that can design starships (and so must be hyper-specialized) but where also somehow everyone is a warrior trained in close-combat weapons. Real societies do not train their starship designers (or their blacksmiths) to also be master swordsmen because that isn’t worth anyone’s time. But they pretty clearly can’t: the moment a society begins specializing its labor (required to achieve high population densities), ‘fighting’ becomes to one degree or another a specialized role too.

The thing is, as we’ve discussed, while non-specialized ‘all warrior’ societies can sometimes overwhelm highly specialized agrarian societies, by and large since the advent of farming the most resource-rich parts of the world have been dominated by complex, stratified and specialized agrarian societies, because of their higher population densities — pre-modern agrarian societies can get into the 30 – 70 people per square mile range, compared to something like 0.5 person per square mile for hunter-gatherers outside of very resource rich zones and something like 2 – 5 per square mile for nomadic pastoralists. It usually doesn’t matter if everyone in your tribe is trained to be a warrior if those farmers over there can triple your numbers by mobilizing just 10% of their peasants. There are exceptions, of course, but they’re rare.

Instead in more specialized societies we see the second form of the vocational principle: a warrior class in which a distinct specialized class in society are warriors (or military leaders), usually by birth (because, again, these are low social mobility societies). In essence, this is a case where in the more complex society, just as ‘farmer’ and ‘blacksmith’ and so on have become both specialized jobs and also basically hereditary classes (because who is picking ‘subsistence farmer’ if ‘pampered noble’ is an option?), ‘warrior’ becomes just one more specialist social class, defined largely by heredity.

That can take a number of forms, the most common of which is the military aristocracy. The aristocracy — or some part of it (there may be a parallel civic or religious aristocracy) — has as its justification for its existence that it is the part of society that fights or at least that specializes in warfare. These fellows are aristocrats, to be clear, because they’re rich, not because they fight well — but to be a member of the aristocratic class in good standing with the disproportionate access to prestige and resources that implies also requires being a military specialist and so they develop those skills and are available for privileged military positions (like cavalry or command). We’ll get into, in a later part of this series, the differences between warrior aristocracies and what I’m going to call officer aristocracies (does the noble primarily fight or lead?).

That said, this category also includes some other ways of structuring a military vocation for a society. One we’ve discussed only a little bit is military slaves (like the Mamluks), a low-status class of vocational warriors, though these fellows have a habit of not remaining low-status or slaves for very long, because — of course — they have weapons.

Alternately, conquering empires might seek to create a vocational military class by putting soldiers on plots of land (complete with laborers) in the expectation that they and their children will remain liable for an elite kind of military service. These we call military settlers and they are usually a feature of a regime moving in — societies usually do not impose military settlers on themselves. The ‘Macedonians’ in Hellenistic kingdoms make for a good example of this, as do Arab garrison cities in the Rashidun Caliphate. For ‘everyone is a warrior’ societies that do end up overrunning larger, more complex agrarian societies, this is often what happens: the tribal ethnic group becomes a military aristocracy settled as overlords over the resource rich land of the conquered.

Finally, we have clientage principle recruitment, where the recruiting principle is that the men being pulled into the ranks are — in their civilian society — dependents of the fellows recruiting them. In this case military service is part of the obligations of the dependent towards their superior. That may seem strange in some cases — as a condition of giving the local Big Man a chunk of your food, you also sometimes have to fight for him? — but it’s important to remember that these societies do not see the exchange that way. Instead, they’d frame it that, as a condition of having the Big Man’s protection and being able to farm his land, you give him a chunk of the produce and are also expected to fight for him. It’s important to remember that these principles for recruitment are not laws about the physical universe, but fundamentally questions of psychology and culture: if the entire culture agrees that the land belongs to the lord or the king or the temple and you are paying (in a way) for the privilege of farming it, then that is the reality for all concerned.

Dependents here can come in a few varieties. The highest status such dependents might be retainers, men maintained in an aristocrat’s household as full time ‘muscle.’ While these fellows might be paid mercenaries, in a lot of societies they’re not getting paid in cash but rather in status and a living: they get to live as part of the Big Man’s household, they get their food and other necessities and they’re a more important person than the peasantry. Crucially, retainers of this sort are not ‘free agents’ to the highest bidder, but often tightly bound by formal ties (clientage, hospitality, familial bonds, homage and so on) to a specific aristocrat.

Below that, a Big Man might ex­pect that as part of the un­equal rec­i­p­ro­cal ex­change of clien­t­age, his clients — the poor farm­ers around him — might owe him sup­port which would in­clude fol­low­ing his lead in war­fare. At the same time, as we’ll see, we can flip this sort of think­ing around and say that for the com­mu­nity, the Big Man forms a nat­ural leader around which the com­mu­nity, if it is un­der threat, can rally (and the flip­side of that, the Big Man is prob­a­bly a vo­ca­tional war­rior, as above). Finally, the de­pen­dents here might be some form of non-free per­sons — not usu­ally slaves, but rather ten­ants or serfs. Often the pack­age of oblig­a­tions these folks owed their over­lord in­cluded corvée la­bor of some sort, so mil­i­tary ser­vice as such an oblig­a­tion makes some sense.

We can see these sorts of systems at work with the Carolingian general and select levies or the Anglo-Saxon fyrd. In both the Carolingian and Anglo-Saxon systems, there was a ‘general levy’ of all free men called up as a local defense militia, but households were also brigaded together and required collectively to furnish a man for the select levy to provide a standing or expeditionary force. It is striking how these systems required the active participation of local magnates in order to act as focal points for organization and leadership. As a result, these systems tend to be fundamentally local: while the king has the authority to call up a whole bunch of regional select-levies or fyrds to make up a field army, in practice these are local units, not a ‘national’ conscription system. Notably, Charlemagne’s effort to impose a royal bureaucracy on the Carolingian levy using royal officials (the missi, ‘those having been sent [by the king]’) emerges as a kind of last-gasp effort to keep this system running as it comes apart and never quite works as a centralized system.

That said, this sort of system could be centralized and extended to form a ‘national’ conscription system, with the example that springs to mind being the early Han dynasty (202 BC-220 AD) military system in China, which emerged out of the mass conscription systems of the Warring States period, where very large armies were raised for specific campaigns against peer competitor states. Notably, as the Han dynasty’s primary security challenges lay with holding frontiers (the Qin dynasty having already removed all of the peer competitors before being replaced by the Han), the Han system steadily transformed into a professional standing army composed of a mix of paid professionals and military settlers. That said — and we’ll come right back to this next week — mass conscription requires record-keeping, bureaucracy and state centralization that relatively few pre-modern polities have. Still, it certainly is possible to have a society with at least the notion that the common peasant is simply obligated to perform some amount of military service.

Putting Society and Principle Together

So to re­cap, we can list our re­cruit­ment prin­ci­ples with a very rough sense of how com­mon they are and where:

The Employment Principle (because they get paid): fre­quently used to sup­ple­ment armies that have a core re­cruited an­other way but only rarely the main re­cruit­ment prin­ci­ple. Where it is used as such (professional armies), it re­quires a strong state with a lot of rev­enue and state ca­pac­ity. Examples: Imperial Rome, the later Han Dynasty, some early mod­ern European armies.

The Entitlement Principle (because it is the converse of some set of rights these fellows have): common for city-states or other sorts of republics, but requires having a legal/political status like citizenship which is valuable enough to fight for. Troops recruited on this principle can be expected to basically recruit and arm themselves in many cases, but they’re ‘paid’ in political rights as much as cash. Examples: The Roman Republic, Greek polis-armies, medieval town militias.

The Vocational Principle (because it is their so­cial role/​class):

All-Warrior Society (every free adult male is a war­rior): com­mon in largely non-spe­cial­ized so­ci­eties — hunter-gath­er­ers, no­madic pas­toral­ists, very early agri­cul­ture. Troops re­cruited on this ba­sis arm, or­ga­nize and largely re­cruit them­selves, but these so­ci­eties tend to be small, low pop­u­la­tion den­sity and com­par­a­tively poor. Examples: Plains Native Americans, Steppe no­mads, hunter-gath­erer so­ci­eties.

Warrior Class or Officer Class (specialized society with a dedicated fighting or military-leadership class): extremely common among complex agrarian societies; a military aristocracy of some sort is practically the default mode of leadership in such societies, but note that warrior-aristocrats and officer-aristocrats may have very different expectations of what that means. Often these fellows provide the leadership for otherwise employment-, entitlement- or clientage-based armies or alternately a core of specialist warriors around which such levies are grafted. Examples: Almost too numerous to provide — non-state Gallic aristocrats, medieval European knights and nobility, the Roman Senate (an ‘officer class’ example!), and so on.

Military Settlers (an im­posed mil­i­tary aris­toc­racy of fight­ers given land in ex­change for fu­ture ser­vice): a fairly com­mon so­lu­tion for con­sol­i­dat­ing con­quest (especially for so­ci­eties which sim­ply lack the bu­reau­cratic in­fra­struc­ture for di­rect gov­er­nance), cre­at­ing a new up­per-stra­tum of mil­i­tary-aris­to­crats that are of­ten eth­ni­cally dis­tinct from the ruled. Examples: Macedonian mil­i­tary-set­tlers af­ter Alexander’s con­quests; the gar­ri­son-cities of the Rashidun Caliphate.

Military Slaves (a sub­or­di­nate class of spe­cial­ist war­riors): a rel­a­tively un­com­mon and his­tor­i­cally un­sta­ble sys­tem, but hardly an un­known one, heav­ily de­pen­dent on the avail­abil­ity of an eth­ni­cally dis­tinct class of war­riors avail­able to be en­slaved. Examples: Mamluks, Janissaries.

We might also put Prisoner Armies (recruitment as pun­ish­ment for a crime) in this cat­e­gory. These tend to be some­what more sta­ble, but their mil­i­tary per­for­mance is not al­ways stel­lar. Example: the armies of the Song Dynasty.2

The Clientage Principle (because it is an oblig­a­tion they have to­wards so­cial su­pe­ri­ors)

Retainers and Clientage (little men have spe­cific ties of loy­alty to Big Men who can call them to arms): as far as I can tell, the pri­mary way com­plex non-state so­ci­eties raise mil­i­tary force. Because it re­lies on per­sonal ties, it tends to stay frag­mented. Examples: non-state Gaul and Spain, but also vas­salage-based me­dieval poli­ties.

Universal Military Service (little men owe mil­i­tary ser­vice to their lord, king or the state): com­mon al­though rarely as uni­ver­sal or cen­tral­ized as the name im­plies. Often takes the form of re­gional mili­tias ag­glom­er­ated into a larger army (examples: Carolingian se­lect-levy, the Anglo-Saxon fyrd), but there are rare ex­am­ples of truly mass con­scrip­tion sys­tems, par­tic­u­larly in China (examples: Warring States pe­riod, Qin Dynasty, early Han Dynasty).

What I hope emerges from this quick comparison is how sensitive these principles are to the structure of the underlying society: for most societies, the options whittle down to just a handful almost immediately. A fragmented state with a weak central bureaucracy will almost inevitably need to rely on military aristocrats, their retainers and clients because it hasn’t the revenues or the political structure for anything else, for instance. A society with specialized economic roles isn’t going to be able to set up as an ‘all warrior’ society and a society without specialized economic roles isn’t going to be able to use any other system. A society without a tradition of universal military service is going to have a hard time conscripting its peasantry and a society without a citizenship-like legal/political status is going to have a hard time recruiting on an entitlement basis. Likewise, if a society lacks a large warrior-aristocrat class, then it lacks a large warrior-aristocrat class and cannot recruit on that basis.

Next week, we’ll look at putting these prin­ci­ples into ac­tion, think­ing about how armies are raised and paid for.

The Klingons actually make sense if you assume the Klingons we see are actually the other kind of vocational warriors — a military aristocracy — and that we simply never meet any ‘blue collar’ inhabitants of the Klingon empire. Of course, that would change the ‘noble warriors’ of the Klingon aristocracy into the cruel, brutish slave-masters of an empire in which they exist solely to oppress and exploit their highly productive, specialized victims; one of TNG-and-later Star Trek’s problems is that it is very hard to square the circle whereby coexisting in alliance with the Klingon Empire as we see it is the right and moral thing for the Federation to do.

I am not re­ally an ex­pert on these sys­tems, which is why I haven’t said much about them, but you can get a sense of the Song sys­tem read­ing E. Alyagon, Inked: Tattooed Soldiers and the Song Empire’s Penal Military Complex (2023).
