10 interesting stories served every morning and every evening.

S&P 500 rejects SpaceX, also blocking entry for OpenAI and Anthropic

arstechnica.com

Such rule changes would have ac­com­mo­dated SpaceX’s plan to only of­fer ap­prox­i­mately 3 per­cent of its IPO shares to pub­lic in­vestors, and the fact that SpaceX is cur­rently un­prof­itable with a grow­ing debt load that has reached $29 bil­lion be­cause of its spend­ing spree on AI in­fra­struc­ture.

But in its final decision, the S&P Dow Jones Indices stated that “no changes will be made to the eligibility criteria including financial viability screens, seasoning period, or minimum IWF.” Even after the standard yearlong wait, SpaceX, Anthropic, and OpenAI may struggle to deliver the consistent profitability necessary to qualify for the S&P 500.

Money rules and ex­cep­tions

Swift en­try into the S&P 500 would have trig­gered $14 bil­lion of pas­sive fund buy­ing for SpaceX, ac­cord­ing to Bloomberg Intelligence. The in­vest­ment re­search arm of Bloomberg also es­ti­mated that OpenAI could have gained more than $8 bil­lion, and Anthropic could have net­ted $4.6 bil­lion from sim­i­lar pas­sive buy­ing sprees trig­gered by their S&P 500 en­tries.

This is be­cause $7.5 tril­lion in pas­sively man­aged funds—pop­u­lar among both in­di­vid­ual in­vestors and in­sti­tu­tional in­vestors—fol­low the S&P 500 by pur­chas­ing shares of com­pa­nies ac­cord­ing to their pro­por­tional rep­re­sen­ta­tion in the S&P 500 in­dex. For ex­am­ple, the Vanguard and Fidelity bro­ker­age gi­ants both of­fer pas­sive in­vest­ment funds that track the S&P 500 com­po­si­tion.

However, the S&P Dow Jones Indices did carve out “one concession” by changing the investable weight factor rules for “lower-profile benchmarks” such as the S&P Total Market Index and Dow Jones US Total Stock Market Index, according to Quartz. That could allow an IPO faster entry into those indexes.

By con­trast, the Nasdaq stock ex­change changed its rules to al­low SpaceX to en­ter the Nasdaq-100 Index within 15 trad­ing days as op­posed to the usual three months. Similarly, the FTSE Russell in­dex provider de­cided to give SpaceX and other fol­low-on com­pa­nies ac­cel­er­ated en­try to the Russell Top 500 Index af­ter the close of the fifth trad­ing day fol­low­ing an IPO.

The denial of accelerated S&P 500 entry for SpaceX comes just days after Morningstar analysts described SpaceX as having been “significantly overvalued” in the lead-up to its IPO. The investment research firm valued SpaceX at $780 billion, less than half of SpaceX’s $1.75 trillion IPO goal, primarily based on the strengths of SpaceX’s Starlink satellite service and rocket launch business.

This story was up­dated on June 6, 2026 to more clearly de­scribe the pro­posed rule changes that would have ap­plied to all MegaCap com­pa­nies.

How LLMs Actually Work

www.0xkato.xyz

Monday, June 01, 2026

26 mins

This post is a walk­through of how LLMs work. Modern LLMs are mostly built by stack­ing trans­former blocks over and over, so un­der­stand­ing the trans­former ma­chin­ery gets you most of the way there.

I’ll cover the core mech­a­nisms in­side mod­ern trans­former-based LLMs, with­out all that sticky math stuff. Don’t get me wrong, you should learn the math, but this can serve as an in­tro­duc­tion.

Most mod­ern LLMs share the same trans­former-fam­ily skele­ton. The dif­fer­ences come from what each one was trained on, the scale and con­fig­u­ra­tion choices, and the post-train­ing done on top. By the end, you should be able to read many mod­ern LLM pa­pers or model cards and know which piece of the ar­chi­tec­ture each sec­tion is talk­ing about.

Here’s the path:

Tokens, how a string of text be­comes a se­quence of in­te­gers

Embeddings, how those in­te­gers get mean­ing

Positional en­cod­ing, how the model knows what or­der the to­kens came in

Attention, how to­kens share in­for­ma­tion with each other

Multi-head at­ten­tion, how the model tracks many kinds of re­la­tion­ships at once

The feed-for­ward net­work, where a large share of the mod­el’s stored struc­ture lives

The resid­ual stream and layer nor­mal­iza­tion, what makes deep stacks train­able

Predicting the next to­ken, what the model ac­tu­ally out­puts and how the gen­er­a­tion loop works

Architecture vs trained weights, what’s broadly shared across mod­ern LLMs, and what’s dif­fer­ent

Tiny ex­plain­ers ap­pear through­out so any­one can fol­low along, re­gard­less of back­ground.

Tokenization

Models don’t read text directly. They read integer IDs, so something has to convert your prompt into a sequence of those integers.

That con­ver­sion step is called to­k­eniza­tion. A to­k­enizer takes a string and pro­duces a se­quence of in­te­gers, where each in­te­ger points to an en­try in a fixed vo­cab­u­lary. Modern LLM vo­cab­u­lar­ies usu­ally con­tain tens of thou­sands to a few hun­dred thou­sand en­tries.

Tiny ex­plainer: to­ken ID A to­ken ID is the in­te­ger the model uses for one vo­cab­u­lary en­try. The model works with the num­ber, not the writ­ten word it­self.

Tokens aren’t usually whole words. They’re usually subword pieces. The word “tokenization” might split into [“token”, “ization”]. The word “running” might split into [“run”, “ning”]. The reason is efficiency. Whole-word vocabularies are too big and don’t generalize to new words. Character-level vocabularies are too small and force the model to learn even the simplest patterns from scratch. Subword tokenization sits in the middle. The most common pieces become single tokens, and rare or novel words get composed from smaller pieces.

Tiny ex­plainer: vo­cab­u­lary The vo­cab­u­lary is the to­k­eniz­er’s fixed list of pieces. Each piece has an ID, and the model can only di­rectly re­ceive IDs from that list.
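To make the "text in, integers out" idea concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary `TOY_VOCAB` and the function `tokenize` are made up for illustration. Real tokenizers such as BPE learn their pieces from data and apply them differently, so treat this as a toy, not as how any particular model does it.

```python
# Toy vocabulary: piece -> token ID. Entirely hypothetical, for illustration only.
TOY_VOCAB = {"token": 0, "ization": 1, "run": 2, "ning": 3,
             "t": 4, "o": 5, "k": 6, "e": 7, "n": 8, "s": 9}

def tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of `text` into pieces found in `vocab`."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No piece in the vocabulary starts at this character.
            raise ValueError(f"no vocabulary piece matches {text[i]!r}")
    return ids

print(tokenize("tokenization"))  # [0, 1]
print(tokenize("running"))       # [2, 3]
```

The model only ever sees the integers on the right. The letters inside each piece are not directly visible to it.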

The trade-off shows up in places people don’t expect. The classic example: ask an LLM how many R’s are in “strawberry.” LLMs used to get it wrong. That’s not the model failing at counting. It’s the model not operating on letters directly, only token IDs that happen to spell out a word a human would split letter by letter.

Different model fam­i­lies use dif­fer­ent to­k­eniz­ers. GPT mod­els use Byte Pair Encoding vari­ants. SentencePiece is com­mon in LLaMA-style mod­els. The choice mat­ters for com­pute (fewer to­kens means less work) and for things like mul­ti­lin­gual cov­er­age, but the ba­sic shape is the same. Text in, in­te­gers out.

Now that the prompt is a se­quence of in­te­gers, the next step is to give those in­te­gers mean­ing.

Embeddings

A to­ken ID like 1024 is just a row in­dex. It does­n’t mean any­thing by it­self. The thing that gives it mean­ing is a gi­ant table called the em­bed­ding ma­trix.

Every model has one. It has one row per en­try in the vo­cab­u­lary, and each row is a long vec­tor of num­bers. The length of each row is the mod­el’s hid­den size. In many 7B-class mod­els, that means 4,096 num­bers per to­ken. Larger mod­els usu­ally use wider vec­tors.

Tiny ex­plainer: vec­tor A vec­tor is a list of num­bers. In a trans­former, each to­ken be­comes a vec­tor so the model can do math with it.

When the tokenizer hands the model an integer, the model looks up that row and uses the vector instead. That vector is the token’s embedding. It’s the model’s representation of what that token “means,” learned during training.

Tiny ex­plainer: em­bed­ding ma­trix The em­bed­ding ma­trix is a lookup table. Token ID in, learned vec­tor out.
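The lookup itself is about as simple as it sounds. A minimal sketch, assuming a toy vocabulary of 8 entries and a hidden size of 4, with random numbers standing in for weights that training would normally learn (the names `embedding_matrix` and `embed` are mine):

```python
import random

VOCAB_SIZE = 8    # toy value; real vocabularies are far larger
HIDDEN_SIZE = 4   # toy value; many 7B-class models use 4,096

rng = random.Random(0)
# One row per vocabulary entry, each row a vector of HIDDEN_SIZE numbers.
# Random values are placeholders for learned weights.
embedding_matrix = [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
                    for _ in range(VOCAB_SIZE)]

def embed(token_ids):
    """Replace each token ID with its row of the embedding matrix."""
    return [embedding_matrix[t] for t in token_ids]

vectors = embed([3, 1, 3])
print(len(vectors), len(vectors[0]))  # 3 4
print(vectors[0] == vectors[2])       # True: same ID, same vector
```

Note that token ID 3 gets the identical vector both times it appears. The lookup knows nothing about where in the sequence the token sits.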

The interesting property of these embeddings is that semantically similar tokens end up with similar vectors. The vector for “king” is close in space to the vector for “queen,” and the vector for “Paris” is close to “France.” None of this is hard-coded. It emerges from training on enough text, and the model learns these positions because they let it predict text well.

You can do arith­metic on em­bed­dings and it some­times works. The fa­mous ex­am­ple is king − man + woman ≈ queen. The geom­e­try of em­bed­ding space car­ries real se­man­tic struc­ture, even though no­body told the model to build it that way.
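Here is a sketch of that arithmetic on hand-made 2-D vectors. The numbers in `toy_embeddings` are invented so that one axis loosely means "royalty" and the other "gender". Real embeddings are learned, have thousands of dimensions, and the analogy only works some of the time.

```python
import math

# Hand-made 2-D "embeddings" (hypothetical values, not from any model).
toy_embeddings = {
    "king":  [0.9, 0.1],
    "queen": [0.9, 0.9],
    "man":   [0.1, 0.1],
    "woman": [0.1, 0.9],
    "apple": [-0.8, 0.5],
}

def nearest(vector, exclude=()):
    """Return the word whose embedding is closest to `vector` (Euclidean distance)."""
    candidates = [w for w in toy_embeddings if w not in exclude]
    return min(candidates, key=lambda w: math.dist(vector, toy_embeddings[w]))

# king - man + woman, computed coordinate by coordinate.
target = [k - m + w for k, m, w in zip(toy_embeddings["king"],
                                       toy_embeddings["man"],
                                       toy_embeddings["woman"])]
result = nearest(target, exclude=("king", "man", "woman"))
print(result)  # queen
```

Excluding the three input words from the search is a common convention in these demos, since the nearest neighbor is otherwise often one of the inputs.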

Worth being clear on: at this stage every token has been replaced by its embedding, but the embedding alone says nothing about where the token sits in the sequence. The vector for “dog” is the same vector whether “dog” is the first word in your prompt or the fifth. That’s a problem.

That’s the gap po­si­tional en­cod­ing fills.

Positional en­cod­ing

Plain self-attention doesn’t have a built-in representation of word order. Without some positional signal, it has no direct way to know that “dog” came before “bites” instead of after it.

Word or­der changes mean­ing. So the model needs an­other piece. It needs a way to in­ject the po­si­tion of each to­ken into the math.

Tiny ex­plainer: po­si­tional en­cod­ing Positional en­cod­ing is how the model gets or­der in­for­ma­tion. It tells the model where each to­ken sits in the se­quence.

The original transformer paper (Vaswani et al. 2017) did this by giving each position its own pattern of numbers and adding it directly to each token’s embedding before any other processing. Position 1 had one pattern, position 5 had a different pattern, position 100 had another. The patterns came from sine and cosine waves at different frequencies. Now the embedding for “dog” at position 1 was different from the embedding for “dog” at position 5, just because the position pattern added to it was different.
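A minimal sketch of that additive scheme, using the sine and cosine formula from the paper with its base of 10000. The 4-number `dog` vector is a made-up embedding, and the function names are mine.

```python
import math

def sinusoidal_encoding(position, d_model):
    """Position pattern: sine on even dimensions, cosine on odd dimensions."""
    pattern = []
    for i in range(0, d_model, 2):
        # Each pair of dimensions uses a different wavelength.
        angle = position / (10000 ** (i / d_model))
        pattern.append(math.sin(angle))
        pattern.append(math.cos(angle))
    return pattern[:d_model]

def add_position(embedding, position):
    """Add the position pattern to a token embedding, element by element."""
    pattern = sinusoidal_encoding(position, len(embedding))
    return [e + p for e, p in zip(embedding, pattern)]

dog = [0.2, -0.1, 0.4, 0.3]        # hypothetical embedding for "dog"
dog_at_1 = add_position(dog, 1)
dog_at_5 = add_position(dog, 5)
print(dog_at_1 != dog_at_5)        # True: same token, different position
```

The same token now produces a different input vector depending on where it appears, which is the whole point.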

That worked, and sinusoidal encodings were chosen partly in the hope that they would extrapolate beyond the exact sequence lengths seen during training. But additive position schemes still had two problems that became important as models scaled up.

First, the em­bed­ding had to carry both mean­ing and po­si­tion in the same set of num­bers. There’s only so much you can pack in.

Second, learned ab­solute po­si­tion em­bed­dings in par­tic­u­lar don’t gen­er­al­ize cleanly. If you trained on prompts up to 2,048 to­kens long, the model never saw po­si­tion 5,000 dur­ing train­ing, and the em­bed­ding for that po­si­tion was not learned in the same way.

Modern mod­els mostly use a dif­fer­ent scheme called Rotary Position Embeddings (RoPE), in­tro­duced by Su et al. in 2021 and now used in LLaMA, Mistral, Gemma, Qwen, and most other open-weight fam­i­lies. The in­tu­ition: in­stead of adding po­si­tion info to each to­ken’s vec­tor, RoPE ro­tates the Query and Key vec­tors by an an­gle that de­pends on the to­ken’s po­si­tion. A to­ken at po­si­tion 1 gets a small turn, a to­ken at po­si­tion 100 gets a big­ger turn. When two to­kens are later com­pared dur­ing at­ten­tion, what mat­ters is the dif­fer­ence be­tween their Query and Key ro­ta­tions, which en­codes how far apart they are.

Tiny ex­plainer: RoPE RoPE stands for Rotary Position Embeddings. Instead of adding a po­si­tion vec­tor, it ro­tates Query and Key vec­tors so rel­a­tive dis­tance shows up dur­ing at­ten­tion.
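The "only the difference matters" claim is easy to check numerically. Below is a minimal sketch, assuming the adjacent-pair layout and the base of 10000 from the RoPE paper (some implementations pair dimensions differently). The vectors `q` and `k` are arbitrary made-up numbers.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair of `vec` by an angle proportional to `position`."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)   # each pair has its own frequency
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -0.7, 0.5, 0.1]   # hypothetical Query vector
k = [0.2, 0.4, -0.6, 0.9]   # hypothetical Key vector

# Same offset of 4 positions, at two very different absolute positions.
score_near_start = dot(rope(q, 7), rope(k, 3))
score_far_along = dot(rope(q, 104), rope(k, 100))
print(abs(score_near_start - score_far_along) < 1e-9)  # True
```

The two scores agree up to floating-point error because rotating both vectors by the same extra angle leaves their dot product unchanged.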

The prac­ti­cal ad­van­tages are real. RoPE en­codes rel­a­tive po­si­tion nat­u­rally (which is closer to what at­ten­tion ac­tu­ally wants). It gen­er­al­izes bet­ter to longer con­texts. And it does­n’t add new pa­ra­me­ters to the model.

Even with good positional encoding, modern LLMs have a documented “lost in the middle” problem (Liu et al. 2023). They use information at the start and end of long prompts more reliably than information buried in the middle. That’s why prompt engineering tips like “put important context first” or “repeat key info at the end” actually help. The model isn’t using every part of your prompt equally well.

With to­ken mean­ing and po­si­tion both en­coded, the next ques­tion is how do to­kens ac­tu­ally ex­change in­for­ma­tion?

Attention

This is the mechanism that gave the original transformer paper, “Attention Is All You Need,” its name. Attention.

Inside every trans­former layer, at­ten­tion does one thing. It lets each to­ken look at the other to­kens it is al­lowed to see and de­cide which ones mat­ter for what comes next.

It does this by giv­ing each to­ken three roles at once. Each to­ken gets trans­formed into three new vec­tors, called Query, Key, and Value (Q, K, V).

Tiny explainer: Q, K, V Query means “what am I looking for,” Key means “what do I match with,” and Value is the information that gets copied when the match is strong.

The Query asks, “what am I looking for from other tokens?”

The Key says, “this is what I offer to tokens looking at me.”

The Value carries, “this is what gets passed along when a match happens.”

The same to­ken plays all three roles at the same time. The Q, K, V trans­for­ma­tions are learned ma­tri­ces, so the model fig­ures out dur­ing train­ing what each to­ken should look for and what it should of­fer.

Matching hap­pens through a sim­i­lar­ity score. Each to­ken’s Query is com­pared against the Key of each to­ken it is al­lowed to see, us­ing a scaled dot prod­uct. Intuitively, this mea­sures how much the two vec­tors line up. The scal­ing keeps the num­bers sta­ble be­fore soft­max.

Tiny ex­plainer: dot prod­uct A dot prod­uct is a sim­ple way to score how aligned two vec­tors are. Higher align­ment means a stronger match.

The match scores then get turned into weights us­ing soft­max. Softmax takes any set of num­bers and turns them into a prob­a­bil­ity-like dis­tri­b­u­tion that sums to 1. Tokens with higher match scores get higher weights, and the weights are then used to take a weighted av­er­age of the value vec­tors.

Tiny ex­plainer: soft­max Softmax turns raw scores into weights that add up to 1. Big scores get big weights, small scores get small weights.

An example. Consider the sentence “The cat that I saw yesterday was sleeping.” When the model processes “was,” it needs to figure out what’s doing the sleeping. The Query vector for “was” gets compared against the Key vectors of the tokens it is allowed to see. The dot product with “cat” is high, because the model has learned that verbs like “was” need a subject and that subjects like “cat” produce Key vectors that line up well. The dot product with “yesterday” is low. Softmax turns those scores into weights, “cat” gets a high weight, “yesterday” gets a low one. The model then takes a weighted sum of the corresponding value vectors, so the value for “cat” dominates the result. The new representation of “was” is now mostly shaped by the value of “cat.” That’s how a token several positions back becomes the referent.
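The same walk-through as a sketch in code, for a single Query. All the vectors are hand-picked toy numbers chosen so that “cat” lines up with the Query. In a real model they come from learned projections, and only three of the visible tokens are included here to keep it short.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)                           # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query over a list of keys and values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

tokens = ["cat", "yesterday", "was"]
query_was = [2.0, 0.0]                               # hypothetical Query for "was"
keys = [[3.0, 0.0], [0.0, 3.0], [0.5, 0.5]]          # hypothetical Keys
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical Values

weights, output = attend(query_was, keys, values)
for token, weight in zip(tokens, weights):
    print(f"{token:>10}: {weight:.3f}")
```

With these numbers “cat” takes well over 90 percent of the weight, so the output vector is dominated by the Value for “cat.”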

There’s a con­straint spe­cific to GPT-style lan­guage mod­els, which is that they gen­er­ate text left to right. A to­ken at po­si­tion 5 is only al­lowed to at­tend to po­si­tions 1 through 5. It can­not at­tend to to­kens at po­si­tions 6, 7, 8, be­cause those haven’t been gen­er­ated yet. This is called causal mask­ing. The im­ple­men­ta­tion is sim­ple: fu­ture to­kens get match scores so low they end up with ef­fec­tively zero weight af­ter soft­max.
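A minimal sketch of the mask, assuming the common trick of setting future scores to negative infinity before softmax so that they come out as exactly zero weight. The 3x3 `scores` matrix is made up. Row i holds the raw match scores of token i against every token.

```python
import math

NEG_INF = float("-inf")

def causal_attention_weights(scores):
    """Mask out future positions in a square score matrix, then softmax each row."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Token i may only see positions 0..i. Later positions are masked.
        row = [scores[i][j] if j <= i else NEG_INF for j in range(n)]
        m = max(row)
        exps = [math.exp(s - m) for s in row]   # math.exp(-inf) is 0.0
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

scores = [[0.5, 2.0, 1.0],
          [1.5, 0.2, 3.0],
          [0.1, 0.4, 0.9]]
weights = causal_attention_weights(scores)
print(weights[0])  # [1.0, 0.0, 0.0]
```

The first token can only attend to itself, so all of its weight lands on position 0 no matter how high the masked scores were.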

Tiny ex­plainer: causal mask­ing Causal mask­ing hides fu­ture to­kens. It keeps a de­coder-only lan­guage model from look­ing ahead while pre­dict­ing the next to­ken.

One of the most interesting findings in interpretability research is about specialized attention heads called induction heads, found by Anthropic in 2022. These heads learn to spot patterns of the form “A B … A” in the prompt and predict that B comes next. When the model sees “A” the second time, the induction head looks back to where “A” appeared before, sees what came after, and copies that. They’re one of the clearest known mechanisms behind in-context learning, the ability of an LLM to pick up a pattern from your prompt and continue it.
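As a rough caricature of the behavior (not of the mechanism, which lives in learned attention weights), the "look back, see what came after, copy it" rule can be written as a plain function. The name `induction_guess` and the example tokens are mine.

```python
def induction_guess(tokens):
    """Return the token that followed the most recent earlier occurrence
    of the last token, or None if the last token has not appeared before."""
    current = tokens[-1]
    # Scan backwards over earlier positions, most recent first.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None

prompt = ["Mr", "Dursley", "went", "home", ".", "Mr"]
print(induction_guess(prompt))  # Dursley
```

A real induction head does this softly, through attention weights over learned representations, and its output is only one of many signals feeding the next-token prediction.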

Tiny ex­plainer: in­duc­tion head An in­duc­tion head is an at­ten­tion head that no­tices re­peated pat­terns in the prompt and helps con­tinue them.

Attention has one big cost. In full at­ten­tion, each to­ken com­pares against all the to­kens it is al­lowed to see, so dou­bling the prompt length roughly quadru­ples the work. This is why long prompts are ex­pen­sive to run, and why a lot of re­cent re­search is about mak­ing at­ten­tion more ef­fi­cient (FlashAttention, sparse at­ten­tion, lin­ear at­ten­tion).
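The "roughly quadruples" claim is just counting. A small sketch, assuming causal attention where token i is compared against tokens 1 through i (the function name is mine):

```python
def causal_comparisons(n_tokens):
    """Number of Query-Key pairs when token i may see tokens 1..i."""
    return n_tokens * (n_tokens + 1) // 2

for n in (1_000, 2_000, 4_000):
    print(f"{n:>5} tokens -> {causal_comparisons(n):>10,} comparisons")

ratio = causal_comparisons(2_000) / causal_comparisons(1_000)
print(f"doubling the prompt multiplies the work by about {ratio:.2f}")
```

This counts comparisons per head per layer, so the real total is this number multiplied by the number of heads and layers.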

But one at­ten­tion head only gives the model one learned view of those re­la­tion­ships.

Multi-head at­ten­tion

A sin­gle at­ten­tion pass gives the model one way of de­cid­ing which to­kens mat­ter to which other to­kens. That’s not enough. Language has many re­la­tion­ships hap­pen­ing at the same time. Subject and verb agree­ment. Pronouns and the names they re­fer to. Long-range ref­er­ences be­tween sen­tences. Word or­der and lo­cal phrases.

Multi-head at­ten­tion solves this by run­ning at­ten­tion many times in par­al­lel, with each par­al­lel pass op­er­at­ing in its own smaller space. Each par­al­lel pass is called a head.

Tiny ex­plainer: at­ten­tion head An at­ten­tion head is one in­de­pen­dent at­ten­tion pass with its own learned pro­jec­tions.

Here’s the part that’s often described wrong, including in plenty of tutorials. Each head doesn’t get a literal slice of the original token vector. Each head has its own learned projection matrices that map the full token vector down to its own smaller Q, K, and V vectors. So if a model has 4,096 numbers per token and 32 heads, each head usually works in a 128-dimensional space, but those 128 numbers are a learned projection of the full 4,096, not a fixed slice. “Different views” of the same token, not different chunks of it.

Each head runs its at­ten­tion pass in­de­pen­dently. Then the out­puts of all the heads get con­cate­nated and passed through a fi­nal lin­ear layer that mixes them back into one full-size vec­tor. The model learns that fi­nal mix­ing too.
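A shape-level sketch of the projection-then-concatenate idea, with toy sizes (8 numbers per token, 2 heads) and random matrices standing in for learned weights. To keep it short, only the Query projection is shown and the attention step inside each head is skipped, so the per-head vectors here merely stand in for per-head attention outputs. All names are mine.

```python
import random

D_MODEL, N_HEADS = 8, 2            # toy sizes; the example in the text is 4,096 and 32
HEAD_DIM = D_MODEL // N_HEADS      # 4 here; 4,096 // 32 == 128 in the text's example

rng = random.Random(0)

def random_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def project(vec, matrix):
    """Multiply a row vector by a (len(vec) x cols) matrix."""
    return [sum(vec[r] * matrix[r][c] for r in range(len(vec)))
            for c in range(len(matrix[0]))]

# Each head owns a full D_MODEL x HEAD_DIM projection (Keys and Values work the same way).
W_q = [random_matrix(D_MODEL, HEAD_DIM) for _ in range(N_HEADS)]
W_out = random_matrix(N_HEADS * HEAD_DIM, D_MODEL)   # final mixing layer

token = [rng.uniform(-1.0, 1.0) for _ in range(D_MODEL)]
per_head = [project(token, W) for W in W_q]

# Concatenate the heads and mix them back into one full-size vector.
concatenated = [x for head in per_head for x in head]
mixed = project(concatenated, W_out)

# Changing only the LAST coordinate of the token still changes head 0,
# which a fixed slice of the first HEAD_DIM coordinates could never do.
nudged = token[:-1] + [token[-1] + 1.0]
head0_changed = project(nudged, W_q[0]) != per_head[0]
print(len(per_head[0]), len(mixed), head0_changed)  # 4 8 True
```

Production code usually fuses all the per-head matrices into one big matrix multiply for speed, but the math is the same as giving each head its own projection.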

What makes this in­ter­est­ing is that dif­fer­ent heads of­ten end up par­tially spe­cial­ized. The model is never told what each head should do. Specialization emerges nat­u­rally dur­ing train­ing. Researchers have found heads that track gram­mar (linking verbs to their ob­jects, ar­ti­cles to their nouns), heads that fig­ure out which pro­noun refers to which name, heads that track po­si­tional pat­terns, in­duc­tion heads, and many more. A sin­gle trans­former layer might have 32 heads. A mod­ern fron­tier model has dozens of lay­ers. So a typ­i­cal LLM has thou­sands of at­ten­tion heads in to­tal, each adding its own learned view.

There’s a prac­ti­cal cost con­cern that drove a re­cent ar­chi­tec­tural change. Each head needs to keep its Key and Value vec­tors in mem­ory for all the to­kens al­ready gen­er­ated, so that when a new to­ken gets gen­er­ated the model does­n’t have to re­com­pute every­thing from scratch. This is called the KV cache, and it’s the main mem­ory cost of run­ning an LLM at long con­text lengths.

Tiny ex­plainer: KV cache The KV cache stores old Key and Value vec­tors dur­ing gen­er­a­tion. It saves the model from re­com­put­ing the whole prompt every time it adds a to­ken.

Modern de­coder-only LLMs mostly use a vari­ant called Grouped-Query Attention (GQA). Instead of every head hav­ing its own keys and val­ues, groups of heads share the same key and value heads. LLaMA-2 70B has 64 query heads but only 8 key/​value heads. Mistral 7B has 32 query heads and 8 key/​value heads. The re­sult is nearly the same ac­cu­racy as full multi-head at­ten­tion but with much less mem­ory pres­sure and in­fer­ence cost.
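Some back-of-the-envelope arithmetic shows why this matters. The sketch below uses the 64 versus 8 key/value heads mentioned above, plus three assumptions that are not stated in this post: 80 layers, a head dimension of 128, and 16-bit (2-byte) cached values. The function name and the 4,096-token context are also just illustrative choices.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache Keys and Values for one sequence."""
    keys_and_values = 2   # one Key vector and one Value vector per KV head
    return (keys_and_values * n_layers * n_kv_heads * head_dim
            * bytes_per_value * n_tokens)

# Assumed model shape (not from the text): 80 layers, head_dim 128, fp16 cache.
N_LAYERS, HEAD_DIM, CONTEXT = 80, 128, 4096

full_mha = kv_cache_bytes(CONTEXT, N_LAYERS, n_kv_heads=64, head_dim=HEAD_DIM)
gqa = kv_cache_bytes(CONTEXT, N_LAYERS, n_kv_heads=8, head_dim=HEAD_DIM)

print(f"64 KV heads: {full_mha / 2**30:.2f} GiB")
print(f" 8 KV heads: {gqa / 2**30:.2f} GiB")
print(f"reduction: {full_mha // gqa}x")
```

Under those assumptions the cache for a single 4,096-token sequence drops from 10 GiB to 1.25 GiB, an 8x saving that scales linearly with context length and with the number of sequences served at once.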

Tiny ex­plainer: GQA Grouped-Query Attention lets mul­ti­ple query heads share fewer key/​value heads. That cuts KV-cache mem­ory while keep­ing many query views.

Feed-forward net­work

After at­ten­tion fin­ishes mix­ing in­for­ma­tion be­tween to­kens, every layer has a sec­ond step that no­body talks about as much. The feed-for­ward net­work.

GrapheneOS user reported to authorities for using GrapheneOS

discuss.grapheneos.org

GrapheneOS Discussion Forum

Meta confirms thousands of Instagram accounts were hacked by abusing its AI chatbot

this.weekinsecurity.com

Meta is no­ti­fy­ing thou­sands of peo­ple whose Instagram ac­counts were hi­jacked dur­ing the months-long abuse of the com­pa­ny’s AI chat­bot, which hack­ers re­peat­edly tricked into tak­ing con­trol of a per­son’s ac­count.

In a new data breach no­ti­fi­ca­tion let­ter, seen by this week in se­cu­rity, Meta has re­vealed for the first time how many peo­ple had their ac­counts hi­jacked as part of the long-run­ning hack­ing cam­paign, which was dis­cov­ered ear­lier this week and first re­ported by 404 Media ($) and TechCrunch ($). The num­ber of af­fected ac­counts gives some clar­ity as to how wide­spread this hack­ing cam­paign was, and for how long it op­er­ated.

According to the data breach no­tice filed with Maine’s at­tor­ney gen­er­al’s of­fice late on Friday, Meta no­ti­fied at least 20,225 peo­ple that their ac­counts had been com­pro­mised, in­clud­ing 30 peo­ple in Maine.

The com­pro­mises al­lowed the hack­ers to take over the per­son’s en­tire Instagram and any linked ac­counts, in­clud­ing ob­tain­ing con­tact in­for­ma­tion, dates of birth, and pro­file in­for­ma­tion, as well as the abil­ity to ac­cess the per­son’s posts, di­rect mes­sages, and ac­count ac­tiv­ity, the no­tice reads.

Meta’s notice confirmed that the breach relates to “a vulnerability in an AI-assisted account recovery system for Instagram,” which was exploited to “perform password resets on Instagram user accounts.”

As pre­vi­ously re­ported, hack­ers abused a flaw in Meta’s chat­bot that al­lowed any­one to re­set the pass­word of any ac­count that did not have two-fac­tor au­then­ti­ca­tion switched on. The bug tricked the chat­bot into send­ing a ver­i­fi­ca­tion code to an email ad­dress con­trolled by the hacker, rather than the ac­count hold­er’s email ad­dress on file, sim­ply by ask­ing it. The chat­bot com­plied any­way.

“The tool itself worked properly and functioned as intended; however due to a bug in a separate code path, the system did not properly verify that the email address provided by the individual requesting a password reset matched the email address associated with that user’s Instagram account,” said Meta in its breach notice.

“As a result, when an individual provided an email address not previously associated with the account, the system incorrectly sent a password reset link to that unassociated email rather than rejecting the request. This allowed unauthorized third parties to receive a password reset link for accounts they did not own,” the company added.

At this point, Meta says, the hack­ers could re­set some­one’s pass­word and take over their ac­count as if they were the right­ful owner.

Meta said that it is “unaware” of what, if any, personal information was accessed during the hacks. (An email to Meta’s press line asking for clarity on this was unreturned as of early Saturday.)

According to Maine’s list­ing, the hacks be­gan around April 17 and lasted un­til this week, when Meta said that it had se­cured the chat­bot. Instagram re­port­edly started no­ti­fy­ing af­fected in­di­vid­u­als ear­lier this week by send­ing a pass­word re­set no­ti­fi­ca­tion, even as some re­ported that the hacks were on­go­ing.

Meta also confirmed in the notice that it alerted users to secure their accounts, saying it “instructed impacted users to reset their passwords and re-authenticate through secure, verified channels.”

Meta said that it has disabled the AI chatbot for now and removed the code path that allowed the chatbot to reset user accounts, and said it’s also checking other chatbots across its platforms to prevent a repeat incident. It’s not yet clear what circumstances led up to the chatbot being abused, but the incident comes soon after Meta laid off thousands of employees while rewarding top executives with stock incentives, as the company continues to double down on AI.

Pentagon raised threat of Israeli spying on U.S. to highest level, sources say

www.nbcnews.com

WASHINGTON — The Pentagon is in­creas­ingly con­cerned about Israel ramp­ing up its spy­ing on the U.S., re­cently rais­ing the coun­ter­in­tel­li­gence threat level from America’s top ally in the Middle East to the high­est level, ac­cord­ing to two U.S. of­fi­cials and one for­mer U.S. of­fi­cial.

The Pentagon’s Defense Intelligence Agency in recent weeks issued the new counterintelligence threat assessment amid rising tensions between Israel and the U.S. over the way forward in the war with Iran, the officials said. They said the DIA posted an internal message, viewed by one of the current officials, that raised the level for Israel to “critical.”

The des­ig­na­tion stems from con­cerns within the Pentagon that Israel is mak­ing a par­tic­u­lar ef­fort to sur­veil top U.S. of­fi­cials to get in­for­ma­tion on the Trump ad­min­is­tra­tion’s in­ter­nal de­lib­er­a­tions and de­ci­sion-mak­ing on the con­flicts in the Middle East, the of­fi­cials said.

The DIA assessment includes a seven-page document and features a chart, according to one of the current U.S. officials. The document says the assessment of Israel is that its ability to conduct human espionage and technical collection is at a “critical level,” according to the official.

It also iden­ti­fies a se­ries of spe­cific in­ci­dents that height­ened U.S. con­cerns, the of­fi­cial said.

A spokesperson for the Israeli Embassy in Washington, D.C., said in a statement that it is “completely false” that Israel spies on the U.S. “Israel does not gather intelligence on American entities, let alone US government officials,” the spokesperson said. “Israel intelligence collection efforts are aimed at its enemies, not its allies. Any claims to the contrary are either misinformed or politically motivated.”

The Pentagon de­clined to com­ment.

A White House official said in a statement, “This entire story is false and sourced to someone who doesn’t have any knowledge of what’s going on.”

The Office of the Director of National Intelligence, which over­sees all the U.S. in­tel­li­gence agen­cies in­clud­ing the DIA, did not re­spond to a re­quest for com­ment.

While it is commonplace for allies and adversaries across the globe to spy on each other, the current and former U.S. officials said Israel’s recent efforts have gone well beyond what is typical and expected espionage. The officials did not know if a specific incident triggered the DIA’s decision to raise the counterintelligence threat level.

The heightened alert comes as President Donald Trump and Israeli Prime Minister Benjamin Netanyahu have clashed over the war with Iran and Israel’s military operations in Lebanon, including in a tense phone call this past week, NBC News reported. Trump acknowledged afterward to reporters that he called Netanyahu “crazy” during the call as questions mount about whether the two countries’ objectives in the Middle East are beginning to significantly diverge.

Since a cease­fire deal was reached in early April, Trump has been pur­su­ing a diplo­matic deal with Iran to end the war Israel and the U.S. launched on Feb. 28. Israel has pub­licly ex­pressed skep­ti­cism that Iran would abide by any ne­go­ti­ated deal. Netanyahu has pushed for a re­sump­tion of bomb­ing raids against Iran and dis­agreed with Trump, who has pressed him to scale back at­tacks against Hezbollah in Lebanon, ac­cord­ing to Western of­fi­cials.

Israel is keenly in­ter­ested in whether Trump de­cides to re­sume ma­jor com­bat op­er­a­tions against Iran or to end the con­flict, the cur­rent and for­mer U.S. of­fi­cials and out­side ex­perts said.

The most prac­ti­cal out­come for the Pentagon is that U.S. of­fi­cials will use ex­tra cau­tion when trav­el­ing to Israel or vis­it­ing with Israeli of­fi­cials, the cur­rent and for­mer U.S. of­fi­cials said. They said there did not ap­pear to be any im­pact on the high-level in­tel­li­gence-shar­ing that oc­curs on a daily ba­sis be­tween the two coun­tries, par­tic­u­larly as­so­ci­ated with the Iran war.

“The U.S. already takes extra precautions when visiting Israel,” one of the current U.S. officials said. “They’re well-known to aggressively collect.”

The U.S., like other countries, maintains elaborate counterintelligence, or “spy catcher,” efforts to prevent and track espionage by foreign adversaries as well as by allies and partners, seeking to safeguard state secrets and monitor attempts to recruit or coerce U.S. officials. Under U.S. law, the FBI has the leading role in counterintelligence efforts, but they also involve a range of government agencies and the military.

According to cur­rent and for­mer diplo­mats and for­mer na­tional se­cu­rity of­fi­cials, Israel for years has had a rep­u­ta­tion for ag­gres­sive es­pi­onage even against the U.S., its clos­est ally. It’s a prac­tice that has long raised con­cerns among na­tional se­cu­rity and diplo­matic of­fi­cials, and U.S. in­tel­li­gence of­fi­cials closely mon­i­tor the is­sue, ac­cord­ing to ex­perts and the cur­rent and for­mer U.S. of­fi­cials.

Top U.S. of­fi­cials of­ten take ex­tra care when trav­el­ing to Israel, some­times us­ing burner phones and com­put­ers and tak­ing ex­treme cau­tion when speak­ing in ho­tel rooms dur­ing of­fi­cial trips, the cur­rent and for­mer U.S. of­fi­cials and ex­perts said.

“Israel has a hyper-aggressive intelligence service,” said Emily Harding, vice president of the Defense and Security Department and director of the intelligence, national security and technology program at the Center for Strategic and International Studies, a think tank in Washington. “They are exceedingly interested in what we are up to,” Harding said of the Israelis.

In the 1980s, spy­ing by Israel caused a rift with Washington, with U.S. Navy in­tel­li­gence an­a­lyst Jonathan Pollard spend­ing 30 years in prison af­ter he was found to have sold suit­cases of top-se­cret doc­u­ments to Israel.

The U.S. also spies on its al­lies and seeks to gather in­tel­li­gence on for­eign part­ners, as ev­i­denced in 2013 by leaks from in­tel­li­gence con­trac­tor Edward Snowden.

Those leaks showed that the U.S. was eaves­drop­ping on European lead­ers, in­clud­ing then-Ger­man Chancellor Angela Merkel’s mo­bile phone, spark­ing out­rage in Berlin.

The U.S. and Israel re­main close al­lies, and the two coun­tries’ in­tel­li­gence ser­vices have forged a close work­ing re­la­tion­ship over decades. But con­cerns about pos­si­ble Israeli es­pi­onage at such a sen­si­tive mo­ment — when the two gov­ern­ments are not in full agree­ment about the war with Iran — carry the risk of un­der­min­ing trust be­tween the two coun­tries, two ad­di­tional for­mer U.S. of­fi­cials said.

Google will pay SpaceX $920M per month for compute

techcrunch.com

SpaceX has lined up an­other com­pute deal ahead of its his­toric IPO, this time with Google. The com­pany an­nounced the deal in a reg­u­la­tory fil­ing on Friday.

Under the terms of the deal, Google will pay SpaceX $920 million per month from October 2026 through June 2029 for access to “approximately 110,000 NVIDIA GPUs, CPUs, memory, and other related components.”

The deal is sim­i­lar in length and scope to the one SpaceX an­nounced with Anthropic in late May. As part of that deal, Anthropic agreed to pay SpaceX $1.25 bil­lion per month through 2029 to rent all the avail­able com­pute from its Colossus 1 data cen­ter near Memphis, Tennessee, that xAI — now part of SpaceX — orig­i­nally built for its own ar­ti­fi­cial in­tel­li­gence ef­forts.

Google’s deal ap­pears to be pay­ing for roughly half the amount of com­pute that Anthropic has ac­cess to at Colossus 1. SpaceX did­n’t say which spe­cific data cen­ter Google would be us­ing. CEO Elon Musk has pre­vi­ously sug­gested his com­pany would re­serve the Colossus 2 data cen­ter for xAI.

Anthropic was sig­nif­i­cantly lim­ited in its com­pute ca­pac­ity prior to its deal with SpaceX, rais­ing us­age lim­its on the same day the deal was an­nounced. Google is in a very dif­fer­ent po­si­tion, with some es­ti­mates nam­ing it as the world’s largest sin­gle owner of AI com­pute.

In a statement, a Google representative described the deal as a result of unexpected demand for its recently launched AI products. “Google Cloud and SpaceX are long-time partners,” Google said. “This is a short-term, timely agreement to ensure we have bridge capacity to meet surging customer demand for our agent platform, Gemini Enterprise, which has been even higher than we expected.”

But its parent company Alphabet is on a spending spree. Alphabet has already committed to more than $180 billion in capital expenditures this year and has said it expects that to “significantly increase” in 2027. To help with that, Alphabet recently announced an $80 billion equity sale.

Also like the Anthropic deal, the agreement with Google includes a cancellation clause. Both SpaceX and Google have the option to terminate the agreement with 90 days’ notice after December 31, 2026. Google’s access to the data center will ramp up through September “at a reduced fee,” according to the filing.

“If we fail to deliver access to the committed amount of GPUs by September 30, 2026, then following a one-month grace period, Google may immediately terminate the agreement or accept the number of GPUs provided” with a reduction in the monthly fees, it reads.

SpaceX announced the deal just one week before the company’s stock is expected to start trading on the Nasdaq exchange. Paperwork filed with the Securities and Exchange Commission shows the company is aiming to raise around $75 billion at a valuation of around $1.75 trillion, which would make it the largest IPO in history.

Google is a long­time in­vestor in SpaceX. Its stake in Musk’s com­pany is ex­pected to be worth more than $100 bil­lion af­ter the IPO. The com­pa­nies are also re­port­edly in talks to try to build or­bital data cen­ters — a ma­jor com­po­nent of SpaceX’s fu­ture plans post-IPO.

Sean O’Kane is a re­porter who has spent a decade cov­er­ing the rapidly-evolv­ing busi­ness and tech­nol­ogy of the trans­porta­tion in­dus­try, in­clud­ing Tesla and the many star­tups chas­ing Elon Musk. Most re­cently, he was a re­porter at Bloomberg News where he helped break sto­ries about some of the most no­to­ri­ous EV SPAC flops. He pre­vi­ously worked at The Verge, where he also cov­ered con­sumer tech­nol­ogy, hosted many short- and long-form videos, per­formed prod­uct and ed­i­to­r­ial pho­tog­ra­phy, and once nearly passed out in a Red Bull Air Race plane.

You can con­tact or ver­ify out­reach from Sean by email­ing sean.okane@techcrunch.com or via en­crypted mes­sage at okane.01 on Signal.

pokeemerald-wasm

pokeemerald.com

ntsc-rs - an accurate VHS video effect

ntsc.rs

ntsc-rs is a free, open-source video ef­fect which ac­cu­rately em­u­lates ana­log TV and VHS ar­ti­facts.

Other pop­u­lar ef­fects eye­ball the look of VHS tapes us­ing sim­ple color lookup ta­bles and over­lays. ntsc-rs uses al­go­rithms that model how NTSC trans­mis­sion and VHS en­cod­ing ac­tu­ally work, based on al­go­rithms de­vel­oped in com­pos­ite-video-sim­u­la­tor, zhuker/​ntsc, and ntscQT.

ntsc-rs is writ­ten in Rust, and is mul­ti­threaded and SIMD-accelerated. Unlike sim­i­lar ef­fects such as ntscQT, it can run in real time at much higher res­o­lu­tions than ac­tual NTSC footage.

ntsc-rs is avail­able not just as a stand­alone and web ap­pli­ca­tion, but also as a plu­gin for After Effects, Premiere, and all OpenFX-compatible soft­ware. This in­cludes DaVinci Resolve, Hitfilm, and Vegas.

Moving beyond fork() + exec()

lwn.net

fork() is a rel­a­tively ex­pen­sive sys­tem call; it must copy the en­tire process state (including mem­ory) for the child process. Many op­ti­miza­tions have been made over the years, but a fork is still a fun­da­men­tally costly op­er­a­tion. To make things worse, a fork() call is of­ten im­me­di­ately fol­lowed by an exec(), which will dis­card all of that mem­ory that was so care­fully copied for the child. Attempts (such as vfork()) have been made over the years to op­ti­mize for this case, but the pat­tern still is more ex­pen­sive than it could be.
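
For readers who have not written the pattern by hand, the sketch below shows a minimal fork()-then-exec() launch in Python, whose os module exposes both calls directly. The helper name run_fork_exec() is purely illustrative; the point is that the child begins as a copy of the parent, only to replace itself immediately.

```python
import os


def run_fork_exec(argv):
    """Run argv in a child process using fork() followed by exec()."""
    pid = os.fork()  # the child starts as a copy of the calling process
    if pid == 0:
        # Child: replace the copied process image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave without running parent cleanup.
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

Everything fork() set up for the child is thrown away at the execvp() call, which is the waste the paragraph above describes.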

Spawn tem­plates

Chen’s patch set takes an in­ter­est­ing ap­proach to op­ti­mize the fork() and exec() pat­tern. It is fo­cused on ap­pli­ca­tions that re­peat­edly launch processes run­ning the same ex­e­cutable; imag­ine, for ex­am­ple, a pro­gram that must run Git re­peat­edly to ob­tain in­for­ma­tion about the con­tents of a repos­i­tory. In such cases, the pro­gram could es­tab­lish a tem­plate to ac­cel­er­ate those in­vo­ca­tions, spread­ing the setup cost across mul­ti­ple op­er­a­tions. This tem­plate would be cre­ated with the spawn_tem­plate_cre­ate() sys­tem call:

    struct spawn_template_create_args {
        __aligned_u64 flags;
        __s32 execfd;
        __u32 exec_flags;
        __aligned_u64 filename;
        /* Some fields elided */
    };

    int spawn_template_create(struct spawn_template_create_args *args, size_t args_size);

This call will re­turn a file de­scrip­tor rep­re­sent­ing a tem­plate for the ex­e­cutable file, which can be spec­i­fied as ei­ther a file de­scrip­tor (execfd) or an ab­solute path (filename), but not both. To cre­ate the tem­plate, the ker­nel will open the in­di­cated file and cache a bunch of in­for­ma­tion that will al­low a process to run that file more quickly in the fu­ture.

The ap­pli­ca­tion in ques­tion may run a given ex­e­cutable many times, but each in­vo­ca­tion is dif­fer­ent in a num­ber of ways. The de­tails of a spe­cific in­vo­ca­tion must be placed into an in­stance of this struc­ture:

    struct spawn_template_spawn_args {
        __aligned_u64 flags;
        __aligned_u64 pidfd;
        __aligned_u64 argv;
        __aligned_u64 envp;
        __aligned_u64 actions;
        __aligned_u64 actions_len;
        __aligned_u64 reserved[4];
    };

The argv field is a pointer to the ar­gu­ment list to be passed to the pro­gram, while envp points to its en­vi­ron­ment. Changes to file de­scrip­tors and sig­nal han­dling, in­stead, are passed through ac­tions, which is a pointer to an ar­ray of:

    struct spawn_template_action {
        __u32 type;
        __u32 flags;
        __s32 fd;
        __s32 newfd;
        __aligned_u64 arg;
    };

If, for ex­am­ple, file de­scrip­tor four should be closed in the child, the as­so­ci­ated spawn_tem­plate_ac­tion struc­ture would have type set to SPAWN_TEMPLATE_ACTION_CLOSE and fd set to four. Other ac­tions ex­ist for du­pli­cat­ing file de­scrip­tors, open­ing files, chang­ing the work­ing di­rec­tory, and chang­ing sig­nal han­dling.

Once the spawn_tem­plate_s­pawn_args struc­ture has been filled in, the new process can be run with:

    int spawn_template_spawn(int template_fd, struct spawn_template_spawn_args *args, int args_size);

Internally, this sys­tem call fol­lows some­thing close to the nor­mal fork()/​exec() path. Chen is care­ful to point out that all of the nor­mal checks ap­plied when ex­e­cut­ing a new file re­main in place. But the cached in­for­ma­tion in the tem­plate makes the whole process faster than it was be­fore.

How much faster? Benchmark re­sults pro­vided in the cover let­ter show an im­prove­ment of about 2%, which may not seem like a lot, but it may make a dif­fer­ence for ap­pli­ca­tions that fit the ex­pected pat­tern.

Toward posix_s­pawn()

The most detailed review of this work was posted by Mateusz Guzik, who said: “This problem is dear to my heart and I have been pondering it on and off for some time now. The entire fork + exec idiom is terrible and needs to be retired”. He pointed out that the focus of the patch set was a bit strange in that it left the fork() part of the problem untouched. That is where most of the cost lies, he said, so optimization efforts should seek to remove it from the picture. Rather than copying the current process, “creating a pristine process is the way to go”.

Christian Brauner was favorable toward the goal, saying: “The idea of having a builder api for exec isn’t all that crazy”. His suggestion, though, was that a new API should be built on top of the existing pidfd abstraction. Without getting into any degree of detail, he said that the right approach would be to create an option to pidfd_open() to create an empty process. A series of calls to a new pidfd_config() system call would then configure this new process as desired, setting up its environment, image to execute, and more. pidfd_config() would thus be analogous to fsconfig().

An im­por­tant ob­jec­tive for a new in­ter­face, Brauner said, would be the abil­ity to sup­port an im­ple­men­ta­tion of posix_s­pawn() in user space. posix_s­pawn() is well suited as a re­place­ment for the fork()/​exec() pat­tern; de­vel­op­ers would likely wel­come a na­tive im­ple­men­ta­tion that is­n’t (unlike the cur­rent im­ple­men­ta­tion) hid­ing fork() and exec() un­der the cov­ers.
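
As an illustration of what that interface looks like from user space today, here is a small sketch using Python's os.posix_spawn(), which wraps the C library's posix_spawn(). The helper name spawn_to_file() and the choice of redirecting output are assumptions made for the example; the file actions play a role similar to the spawn_template_action entries described earlier.

```python
import os


def spawn_to_file(argv, out_path):
    """Run argv[0] with stdout and stderr redirected to out_path."""
    file_actions = [
        # Open out_path as file descriptor 1 (stdout) in the child.
        (os.POSIX_SPAWN_OPEN, 1, out_path,
         os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644),
        # Duplicate descriptor 1 onto descriptor 2 (stderr).
        (os.POSIX_SPAWN_DUP2, 1, 2),
    ]
    # posix_spawn() does no PATH search, so argv[0] must be a usable path.
    pid = os.posix_spawn(argv[0], argv, os.environ, file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

The caller describes the child's setup as data instead of running code between fork() and exec(), which is what makes the interface a candidate for a kernel implementation that never copies the parent.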

Chen agreed that the API as broadly sketched out by Brauner seemed bet­ter, and said that fu­ture work would be in that di­rec­tion. So there will be no spawn tem­plates in the Linux ker­nel but, if Chen’s fu­ture work comes to fruition, Linux may fi­nally gain a proper posix_s­pawn() im­ple­men­ta­tion in­stead.

The Smart TV in Your Living Room Is a Node in the AI Scraping Economy

blog.includesecurity.com

The work at Include Security has us work­ing with AI day in and day out (hacking it, us­ing it, train­ing it, etc).

We’re all aware of the recent community-level opposition to the datacenters being built to improve AI capabilities. What you might not be aware of are the distributed efforts to train AI that could be using the devices inside your home.

In this post, we’re go­ing to ex­plore how the com­pany Bright Data fa­cil­i­tates mod­ern AI mod­els scrap­ing train­ing data from the Internet us­ing its res­i­den­tial proxy net­work.

Bright Data is a data-col­lec­tion com­pany that sells ac­cess to what it mar­kets as the world’s largest res­i­den­tial proxy net­work of 400M+ home IP ad­dresses that its cus­tomers route web-scrap­ing traf­fic through. The sup­ply be­hind that net­work comes from an SDK: a piece of soft­ware em­bed­ded in con­sumer apps that, with the user’s con­sent, turns their phone or smart TV into one of those exit nodes.

We’ll doc­u­ment what you, the av­er­age user, should know about what this com­pa­ny’s SDK does on your sys­tems such as your mo­bile phone and your smart TV. We’re go­ing to ex­plore how their SDK works, which plat­forms have shipped it, and why your Internet-connected TV is the ul­ti­mate proxy for AI mod­els look­ing to train on data scraped from the Internet.

Why This Matters Now

AI companies depend on web-scraped content: for pre-training, for retrieval, for agent grounding, for search. But the modern web isn’t scrapeable from a datacenter. Cloudflare, DataDome, and HUMAN, among others, throttle or block requests from known cloud IPs.

The workaround is residential proxies. A scraping job routed through a Comcast or T-Mobile subscriber’s connection arrives at the target site from an IP that belongs to a paying residential customer. Krebs reported in October 2025 that “a glut of proxies from Aisuru and other sources is fueling large-scale data harvesting efforts tied to various AI projects.” Academic measurement going back to 2019 shows these networks are overwhelmingly misused. The FBI issued a formal advisory earlier this year.

Most of the ex­ist­ing press has fo­cused on the il­le­gal res­i­den­tial-proxy sup­ply: bot­nets (Aisuru, Kimwolf), tro­janized apps (HUMAN Security’s PROXYLIB dis­clo­sure), pre-in­fected IoT hard­ware (Google/Mandiant’s IPIDEA take­down). These are the bad ac­tors.

On the other hand, the legal supply side has received far less scrutiny. Today Bright Data is the largest residential proxy network in the world by its own marketing, advertising “150M+ IPs” sourced via a consent SDK embedded in partner apps. This research documents how that SDK works, which platforms have shipped it, and why the connected-TV is the ultimate residential proxy.

Why Connected TV (CTV) is the Ideal Proxy

Connected TV, a.k.a Smart TV, is a near-per­fect res­i­den­tial proxy. Compared to a mo­bile phone:

A TV never hits 1% battery, jumps between WiFi networks, or gets locked when the user is asleep.

Some partner publishers do disclose the Bright Data relationship in their privacy policies; PlayWorks is one example. But privacy-policy disclosure is the wrong control surface for a TV. It is hard to scroll through a legal document navigated by arrow keys on a remote, and the in-app consent dialog doesn’t convey that a paying Bright Data customer is about to route their scraping traffic through the user’s home internet.

Petflix, a Roku app documented by The Verge, is a representative case. Its opt-in screen reads: “To enjoy Petflix for free with fewer ads, you are allowing Bright Data to occasionally use your device’s free resources and IP address to download public web data from the internet. Bright Data will only use your IP address for approved business-related use cases. None of your personal information is accessed or collected except your IP address. Period.” The Petflix dialog says “occasionally.” The SDK’s publicly queryable config sets max_bw_monthly_wifi: 200,000,000,000 bytes, a 200 GB default monthly WiFi budget.
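
To put that number in everyday terms, the short calculation below converts the configured byte budget into decimal gigabytes, binary gibibytes, and an average per-day figure. The 30-day month and the function name are assumptions for illustration; only the 200,000,000,000-byte value comes from the config.

```python
# Value taken from the config field max_bw_monthly_wifi.
MAX_BW_MONTHLY_WIFI = 200_000_000_000  # bytes


def monthly_budget(bytes_per_month, days=30):
    """Express a monthly byte budget in more familiar units.

    days=30 is an assumed month length, used only for the daily average.
    """
    gb = bytes_per_month / 1e9       # decimal gigabytes
    gib = bytes_per_month / 2**30    # binary gibibytes
    return {"gb": gb, "gib": gib, "gb_per_day": gb / days}


budget = monthly_budget(MAX_BW_MONTHLY_WIFI)
```

Spread evenly, that is between 6 and 7 GB of relayed traffic per day, which is hard to reconcile with the word the dialog uses.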

Who Bright Data Names as Partners

Bright Data ex­poses a part­ner man­i­fest end­point. The end­point is unau­then­ti­cated and any­one can fetch it. Names in the man­i­fest that I was able to iden­tify with high con­fi­dence from pub­lic sources:

Others (desoline, free_­time, ot­t_s­tu­dio, glob­al_mi­cro­trad­ing, m_m_­me­dia, easystaff_lp) are pre­sent but less iden­ti­fi­able from pub­lic sources. bright_screen­savers, bright_videos, and bright­data are Bright Data’s own apps.

A note on what the part­ner list proves: Being listed in Bright Data’s con­fig means an in­te­gra­tion might have ex­isted at some point. It does not by it­self prove that a spe­cific pub­lish­er’s cur­rently-ship­ping app(s) in­cludes the SDK in pro­duc­tion. For any named pub­lisher, per-app ver­i­fi­ca­tion is re­quired.

What the part­ner list does di­rectly prove:

Bright Data ships this ros­ter in an unau­then­ti­cated pub­lic end­point.

At least three CTV-focused entities (PlayWorks, CloudTV, Longvision) monetized their users’ devices as residential proxy exit nodes. PlayWorks in particular reports CTV distribution across major TV platforms and ISPs, with reach figures in the hundreds of millions of households per its own marketing materials.

How does the Bright Data SDK turn a user’s de­vice into a res­i­den­tial proxy exit node?

The Bright Data SDK is a pub­licly doc­u­mented com­mer­cial prod­uct, of­fered to pub­lish­ers via Bright Data’s SDK in­te­gra­tion docs (with a JavaScript vari­ant for web). What fol­lows builds on that pub­lic sur­face with find­ings from re­verse-en­gi­neer­ing the ship­ping iOS frame­work and in­stru­ment­ing 30 days of its run­time traf­fic.

The SDK ships as an iOS frame­work (brdsdk.framework) in­side part­ner apps. I re­verse-en­gi­neered the bi­nary and cap­tured 30 days of traf­fic from a re­search fleet run­ning the SDK in­side a con­sent-in­stalled part­ner app.

The Unauthenticated Config

On every launch the SDK calls:

GET https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>

The endpoint is unauthenticated in any meaningful sense. The server gates only on two query parameters: appid (an app bundle ID, which can be found in the App Store listing of the partner app) and ver (the SDK version string). Supply those and any randomly generated UUID, and the server returns the same response a real device gets: feature flags, idle-detection thresholds (battery %, CPU/memory ceilings, WiFi-vs-cellular rules), per-country bandwidth tiers, and the partner manifest I showcased above. Each of these branches is worth examining on its own: the idle rules that decide when your device is eligible to relay, a flag that routes peer traffic around your VPN, a map that stitches your installs across platforms into one identity, and the per-country bandwidth caps.
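
To make the request shape concrete, here is a minimal Python sketch that only builds the URL and sends nothing. The function name and the example bundle ID com.example.app are illustrative assumptions; the host, path, and parameter names are the ones shown in the request above.

```python
import uuid
from urllib.parse import urlencode

# Host and path as observed in the SDK's config request.
CONFIG_URL = "https://clientsdk.bright-sdk.com/sdk_config_ios.json"


def build_config_url(appid, sdk_version):
    """Build the config URL for a bundle ID and SDK version string."""
    params = {
        "appid": appid,        # app bundle ID, e.g. from an App Store listing
        "ver": sdk_version,    # SDK version string
        # Any random identifier works: "sdk-ios-" plus 32 hex characters.
        "uuid": "sdk-ios-" + uuid.uuid4().hex,
    }
    return CONFIG_URL + "?" + urlencode(params)


# Hypothetical bundle ID, for illustration only.
example_url = build_config_url("com.example.app", "1.532.120")
```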

The Peer Tunnel

After con­fig fetch, the SDK opens a per­sis­tent WebSocket to:

wss://proxyjs.brdtnet.com:443

This host­name re­solves to AWS Global Accelerator IPs (3.33.193.183, 15.197.193.114 as of this writ­ing). The TLS cer­tifi­cate is CN=*.luminatinet.com — the do­main for Luminati Networks, Bright Data’s pre-2018 cor­po­rate name. The re­brand was pub­licly an­nounced in 2018. Active SDK in­fra­struc­ture still runs on the legacy cert, which is a use­ful de­tec­tion pivot: the cur­rent cus­tomer-fac­ing proxy ser­vice lives on bright­data.com-branded do­mains, so any lu­mi­natinet.com / brdt­net.com traf­fic on your net­work is specif­i­cally the peer-tun­nel plane, not cus­tomer-side Bright Data us­age. The server iden­ti­fies it­self as uWeb­Sock­ets: 20.

The peer end­point re­quires no au­then­ti­ca­tion to up­grade. The server ac­cepts any TLS-valid WebSocket up­grade and im­me­di­ately pushes the con­nect­ing client an ap­pli­ca­tion-layer frame with the clien­t’s pub­lic IP echoed back. From there, a hand­shake un­folds:

Server → client: tunnel_init. Establishes the session and returns the client’s public IP.

Server → client: cid_set. The server assigns the client a session-tracking identifier in the format <IP>-<token>/ls<N>c<M>p443_<IP>_<counter>. We confirmed this format matches the cid field present in the SDK’s captured telemetry traffic from real devices.

Server → client: status_get. The server polls the device for its idle state, battery, network type, and available bandwidth. The device responds with a continuous telemetry feed: idle, wifi_connected, mobile_connected, mobile_type (LTE/5G), roaming, battery_level, using_battery, screen_on, on_call, cpu_usage, mem_usage, raw_bw, bw, ipv6_supported, appid (the host app), sdk_version, platform, and the assigned cid. This is a continuous feed of physical-device state to a third party, delivered via a consent dialog whose text is chosen by the host app publisher.

Handshake com­plete. Once the de­vice re­ports fa­vor­able sta­tus, the server’s job-match­ing layer is free to push cmd_­tun frames: in­di­vid­ual scrap­ing-job in­struc­tions that the SDK ex­e­cutes as HTTP re­quests against third-party sites, us­ing the user’s res­i­den­tial IP as the source.

Every frame on the WebSocket is plain JSON with a fixed en­ve­lope:

{"type": "ipc_call"|"ipc_post"|"ipc_result"|"ipc_error", "cmd": <command>, "cookie": <correlation-id>, "err_code": 0, "msg": { …payload… }}
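
For anyone writing a detector or a traffic parser, the envelope above is simple enough to validate in a few lines. The sketch below is our own illustration, not code from the SDK: the function name is made up, and treating type and cmd as the only mandatory fields is an assumption.

```python
import json

# The four frame types listed in the envelope.
FRAME_TYPES = {"ipc_call", "ipc_post", "ipc_result", "ipc_error"}


def parse_frame(raw):
    """Decode one WebSocket text frame and check the envelope fields.

    Assumption: "type" and "cmd" are required; the other fields are left
    unchecked because their exact types are not documented here.
    """
    frame = json.loads(raw)
    if not isinstance(frame, dict):
        raise ValueError("frame is not a JSON object")
    if frame.get("type") not in FRAME_TYPES:
        raise ValueError(f"unexpected frame type: {frame.get('type')!r}")
    if not isinstance(frame.get("cmd"), str):
        raise ValueError("frame has no string 'cmd' field")
    return frame
```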

The full com­mand vo­cab­u­lary ex­tracted from the bi­nary and ver­i­fied on the wire:

There’s no message signing, HMAC, client certificate, or device attestation. Only the TLS layer and the server’s IP-reputation filter gate which peers actually receive jobs. For readers familiar with commercial malware protocol design: this is substantially less secure than typical C2.

When the SDK considers you “idle”

The con­fig ships an ex­plicit rule­book for when the de­vice is el­i­gi­ble to re­lay some­one else’s traf­fic:

    "idle_metrics": {
      "ignore_screen_on": true,   // relay even with the screen on
      "ignore_on_call": true,     // relay while the user is on a phone call
      "max_bw_ratio": 1,
      "min_battery": 0.2,
      "wifi_on_battery": true,
      "min_battery_wifi": 0.2,
      "max_cpu_usage": 70,
      "max_mem_usage": 90,
      "mem_screen_off": true,
      "idle_timeout": 30,
      "not_idle_timeout": 10
    }

The ignore_screen_on and ignore_on_call flags are notable: “idle” does not mean the user is away from the device. It means the device’s CPU, memory, and battery are within the SDK’s thresholds. A user on a phone call, actively reading the screen, is considered idle for relay purposes.
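
A rough way to see what this rulebook implies is to write the decision out as code. The sketch below is our reading of the config, not decompiled SDK logic: the function name, the way the fields combine, and the units (battery as a 0 to 1 fraction, CPU and memory as percentages) are all assumptions. Only the field names and values come from the config and telemetry shown above.

```python
# Subset of idle_metrics from the config; values copied as shipped.
IDLE_RULES = {
    "ignore_screen_on": True,
    "ignore_on_call": True,
    "min_battery": 0.2,
    "max_cpu_usage": 70,
    "max_mem_usage": 90,
}


def is_relay_eligible(status, rules=IDLE_RULES):
    """Assumed interpretation of the idle rulebook for one status report."""
    # Screen and call state only matter if the matching ignore flag is off.
    if status["screen_on"] and not rules["ignore_screen_on"]:
        return False
    if status["on_call"] and not rules["ignore_on_call"]:
        return False
    # Otherwise eligibility comes down to resource thresholds.
    return (
        status["battery_level"] >= rules["min_battery"]
        and status["cpu_usage"] <= rules["max_cpu_usage"]
        and status["mem_usage"] <= rules["max_mem_usage"]
    )


# A user on a call with the screen on, but with resources to spare.
on_call_user = {"screen_on": True, "on_call": True,
                "battery_level": 0.5, "cpu_usage": 20, "mem_usage": 50}
```

Under these assumptions, is_relay_eligible(on_call_user) is true, and only a low battery or a busy CPU or memory would change the answer.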

Cross-Platform Identity Linkage

The con­fig also ships a dual_­pair­ing map:

    "dual_pairing": {
      "ios_com.brd.earnapp": ["win_earnapp.com", "mac_com.earnapp"]
    }

That’s a server-side map tying a user’s iOS, Windows, and macOS installations of the same brand into one entity. It’s cross-platform identity stitching documented inside a public config file.

One more forward-looking field: http3_enabled: true. The SDK is already shipping the flag for QUIC-based peer transport. A future version may move the peer tunnel from TCP/443 to UDP/443, which would break any defender relying on TCP connection tracking to detect the WebSocket.

The Inspection Bypass

The SDK’s config ships a flag, "use_netifs": true. That flag triggers code in the SDK binary that constructs its NWConnection with a specific required interface: en0 (WiFi) or pdp_ip0 (cellular), rather than using the system default route.

On iOS, this bypasses any configured VPN’s tun0 interface entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app’s HTTPS traffic does.

We ob­served this em­pir­i­cally. My re­search setup in­cludes trans­par­ent TLS in­ter­cep­tion. It cap­tured every HTTPS call the SDK made, ex­cept the peer tun­nel to prox­yjs.brdt­net.com:443, even though port 443 is ex­plic­itly redi­rected to the in­spec­tor. The by­pass uses Apple’s doc­u­mented NWParameters.requiredInterface API.

It’s worth em­pha­siz­ing that the SDK uses two in­de­pen­dent in­spec­tion by­passes, one per plane:

Control plane (config fetch, telemetry pings): built on CFNetwork’s CFHTTPMessage primitives rather than URLSession/NSURLConnection. This defeats URLSession-level instrumentation (swizzling, network extensions, URLProtocol subclasses) commonly used in mobile app-sec tooling, while still respecting the system proxy and so remaining visible to TLS-intercepting researchers.

Data plane (peer tun­nel): built on NWConnection with re­quired­In­ter­face set to the phys­i­cal in­ter­face. This is what de­feats VPNs and en­sures the scrap­ing is ex­e­cuted from a res­i­den­tial IP.

Both choices are legitimate Apple APIs. The combination is the interesting artifact: the data plane is invisible to VPN-based inspection and the control plane is invisible to URLSession-based hooks. Researchers who rely on either single technique see only half the SDK’s behavior.

For en­ter­prise se­cu­rity teams run­ning MDM, cor­po­rate-VPN-based traf­fic in­spec­tion, or home-router parental con­trols: the most sen­si­tive chan­nel this SDK op­er­ates is de­signed to go around your vis­i­bil­ity layer.

The ge­og­ra­phy tiers

The con­fig ships per-coun­try band­width thresh­olds. Four coun­tries get ex­plicit non-de­fault poli­cies:

Looking at the config, Uzbekistan and Oman devices are permitted to relay down to 1% battery, with daily caps 20× the default and monthly caps 60× the default. Qatar and UAE devices are throttled below default. We can only speculate as to why the tiers are drawn this way. One reading is deliberate market segmentation, relaxing limits where grid power is stable and throttling where mobile data is expensive. The default-worldwide allowance still permits 500 MB of someone else’s traffic per month over the user’s home internet.

Testing Setup and Methodology

Three data sources:

Thirty days of TLS-inspecting proxy captures from iOS devices running consent-installed partner apps (including XYO COIN, which embeds the Bright SDK).

Static analy­sis of the SDK bi­nary (brdsdk.framework, ver­sion 1.532.120, iOS ar­m64).

All spe­cific Bright Data host­names, cert fin­ger­prints, and TLS in­fra­struc­ture de­scribed are pub­licly ob­serv­able by any­one mak­ing the same re­quests. No ses­sion-spe­cific iden­ti­fy­ing data from ei­ther the re­search fleet or the re­search client ap­pears in this doc­u­ment.

Timeline

May 11, 2026 — Email no­tice sent to pri­vacy@bright­data.com no­ti­fy­ing their team about the re­lease of this blog post. No re­sponse to the no­ti­fi­ca­tion has been re­ceived at the time of this ar­ti­cle’s pub­lish­ing.

Defense Approaches

The traf­fic leaves clear fin­ger­prints at the net­work bound­ary, and the SDK leaves iden­ti­fi­able sym­bols in the app bi­nary. The ap­proaches be­low let you de­tect and block the peer tun­nel — at the net­work level or on the de­vice it­self. Three ap­proaches, or­dered by ease of de­ploy­ment:

Approach 1: DNS block (trivial, ef­fec­tive for net­work-routed de­vices):

proxyjs.brdtnet.com
proxyjs.luminatinet.com
proxyjs.bright-sdk.com
clientsdk.bright-sdk.com
clientsdk.brdtnet.com

Blocking prox­yjs.* kills the peer tun­nel with­out af­fect­ing any cus­tomer who le­git­i­mately uses Bright Data’s cus­tomer-fac­ing proxy ser­vice on a dif­fer­ent do­main.

Approach 2: TLS SNI fil­ter­ing: Drop or alert on TLS hand­shakes where serv­er_­name matches *.brdtnet.com, *.luminatinet.com, or *.luminati.io. Works at the net­work bound­ary with­out TLS in­spec­tion.
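
If your boundary device lets you script the match, the wildcard check is a one-liner. The sketch below shows one way to express it in Python; the function name is ours, and how you obtain server_name (from a firewall log, a Zeek record, or similar) is left to your tooling.

```python
from fnmatch import fnmatchcase

# Wildcard patterns from Approach 2.
BLOCKED_SNI_PATTERNS = ("*.brdtnet.com", "*.luminatinet.com", "*.luminati.io")


def is_blocked_sni(server_name):
    """Return True if a TLS server_name matches one of the patterns."""
    # Hostnames are case-insensitive and may carry a trailing dot.
    name = server_name.lower().rstrip(".")
    return any(fnmatchcase(name, pattern) for pattern in BLOCKED_SNI_PATTERNS)
```

Note that these patterns match subdomains only, so a bare brdtnet.com would need its own entry if you want to cover it.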

Approach 3: TLS cer­tifi­cate fin­ger­print:

*.brdtnet.com → SHA256 313ce4ec7d5a51e5…

*.luminatinet.com → SHA256 5028612e625befea…

Stable un­til Sectigo cert ro­ta­tion (current certs valid through mid-2026).
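
As a sketch of how such a check could be automated, the Python below hashes a DER-encoded certificate and compares the result against the fingerprint prefixes listed above. Two assumptions apply: that the fingerprints are SHA-256 digests of the DER certificate (the usual convention), and that prefix matching is acceptable, since only the truncated values are published here. The function names are illustrative.

```python
import hashlib

# Truncated fingerprints exactly as listed above; the full values are not
# reproduced here, so this sketch can only match on the prefix.
KNOWN_FINGERPRINT_PREFIXES = {
    "*.brdtnet.com": "313ce4ec7d5a51e5",
    "*.luminatinet.com": "5028612e625befea",
}


def cert_sha256(der_bytes):
    """Hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def matching_certs(der_bytes):
    """Names of known certificates whose fingerprint prefix matches."""
    fingerprint = cert_sha256(der_bytes)
    return [name for name, prefix in KNOWN_FINGERPRINT_PREFIXES.items()
            if fingerprint.startswith(prefix)]
```

To obtain DER bytes for a live host, Python's ssl.get_server_certificate() returns a PEM string that ssl.PEM_cert_to_DER_cert() converts.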

The use_netifs caveat: All three layers only work on traffic that crosses your network boundary. The SDK’s use_netifs binding means that on iOS, when the device is on cellular, peer traffic bypasses corporate WiFi entirely. For managed fleets, the complementary control is MDM-based app binary scanning: search installed apps for the Swift symbols BrdWebSocketFacade and BrdNetwork.DNSResolver, and prohibit apps containing them on corporate-issued devices.
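
A very rough version of that scan can be done with a byte search over an extracted app binary. The sketch below assumes the symbol names appear as plain ASCII in the file, which may not hold: Swift name mangling can store module and type names separately, so a production scanner would likely need to demangle symbols first. The function names are illustrative.

```python
from pathlib import Path

# Symbol names from the paragraph above, searched for as raw bytes.
SDK_SYMBOLS = (b"BrdWebSocketFacade", b"BrdNetwork.DNSResolver")


def find_sdk_symbols(data):
    """Return the SDK symbol names found verbatim in a blob of bytes."""
    return [symbol.decode() for symbol in SDK_SYMBOLS if symbol in data]


def scan_file(path):
    """Scan one file, such as an app's main executable, for the symbols."""
    return find_sdk_symbols(Path(path).read_bytes())
```

An empty result is therefore weak evidence of absence, while a hit is a strong signal that the app deserves a closer look.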

For household users concerned about a specific smart TV or mobile app: block the hostnames above at your router’s DNS settings (Pi-hole, NextDNS, Cloudflare Gateway, your ISP’s equivalent).

This blog post was writ­ten in part­ner­ship with our guest au­thor and in­de­pen­dent se­cu­rity re­searcher Buchodi.
