10 interesting stories served every morning and every evening.




1 797 shares, 1 trendiness

jQuery 4.0.0

On January 14, 2006, John Resig introduced a JavaScript library called jQuery at BarCamp in New York City. Now, 20 years later, the jQuery team is happy to announce the final release of jQuery 4.0.0. After a long development cycle and several pre-releases, jQuery 4.0.0 brings many improvements and modernizations. It is the first major version release in almost 10 years and includes some breaking changes, so be sure to read through the details below before upgrading. Still, we expect that most users will be able to upgrade with minimal changes to their code.

Many of the breaking changes are ones the team has wanted to make for years, but couldn’t in a patch or minor release. We’ve trimmed legacy code, removed some previously-deprecated APIs, removed some internal-only parameters to public functions that were never documented, and dropped support for some “magic” behaviors that were overly complicated.

We have an upgrade guide and jQuery Migrate plugin release ready to assist with the transition. Please upgrade and let us know if you encounter any issues.

As usual, the release is available on our CDN and the npm package manager. Other third-party CDNs will probably have it available soon as well, but remember that we don’t control their release schedules and they will need some time. Here are the highlights for jQuery 4.0.0.

jQuery 4.0 drops support for IE 10 and older. Some may be asking why we didn’t remove support for IE 11. We plan to remove support in stages, and the next step will be released in jQuery 5.0. For now, we’ll start by removing code specifically supporting IE versions older than 11.

We also dropped support for other very old browsers, including Edge Legacy, iOS versions earlier than the last 3, Firefox versions earlier than the last 2 (aside from Firefox ESR), and Android Browser. No changes should be required on your end. If you need to support any of these browsers, stick with jQuery 3.x.

jQuery 4.0 adds support for Trusted Types, ensuring that HTML wrapped in TrustedHTML can be used as input to jQuery manipulation methods in a way that doesn’t violate the require-trusted-types-for Content Security Policy directive.

Along with this, while some AJAX requests were already using <script> tags to maintain attributes such as crossdomain, we have since switched most asynchronous script requests to use <script> tags to avoid any CSP errors caused by using inline scripts. There are still a few cases where XHR is used for asynchronous script requests, such as when the “headers” option is passed (use scriptAttrs instead!), but we now use a <script> tag whenever possible.
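For illustration, here is a minimal sketch of how a page might prepare TrustedHTML before handing it to a jQuery manipulation method. The policy name “jquery-html” and the pass-through createHTML are placeholders, not anything prescribed by jQuery (a real policy should sanitize its input), and the fallback branch exists only so the sketch runs outside a browser:

```javascript
// Sketch: wrapping markup in TrustedHTML before using it with jQuery.
// "jquery-html" is an arbitrary policy name; a real createHTML should sanitize.
const policy =
  typeof trustedTypes !== "undefined"
    ? trustedTypes.createPolicy("jquery-html", {
        createHTML: (input) => input, // placeholder: sanitize in real code
      })
    : { createHTML: (input) => String(input) }; // non-browser fallback for this sketch

const markup = policy.createHTML("<p>hello</p>");

// In a browser with a require-trusted-types-for 'script' CSP in effect,
// jQuery 4 accepts the wrapped value:
// $("#target").html(markup);
```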

It was a special day when the jQuery source on the main branch was migrated from AMD to ES modules. The jQuery source has always been published with jQuery releases on npm and GitHub, but could not be imported directly as modules without RequireJS, which was jQuery’s build tool of choice. We have since switched to Rollup for packaging jQuery, and we do run all tests on the ES modules separately. This makes jQuery compatible with modern build tools, development workflows, and browsers through the use of <script type="module">.
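As a sketch of what this enables, a page can now load jQuery as a native module in the browser. The src URL below is illustrative only — the exact module build path is published with the release — so treat this as an assumption rather than the official snippet:

```html
<!-- Sketch: loading jQuery 4 as a native ES module in the browser.
     The src URL is a placeholder; check the release notes for the real path. -->
<script type="module">
  import $ from "https://code.jquery.com/jquery-4.0.0.module.min.js";
  $(() => $("body").addClass("jquery-ready"));
</script>
```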

...

Read the original on blog.jquery.com »

2 532 shares, 29 trendiness

G4 (Severe) Geomagnetic Storm Levels Reached 19 Jan, 2026


Weak or minor degradation of HF radio communication on sunlit side, occasional loss of radio contact.

Low-frequency navigation signals degraded for brief intervals.


G4 Levels were first reached at 2:38pm EST (1938 UTC) on 19 January, 2026 upon CME shock arrival. CME passage is expected to continue through the evening with G4 levels remaining possible.

...

Read the original on www.swpc.noaa.gov »

3 498 shares, 21 trendiness

Apple testing new App Store design that blurs the line between ads and search results

Apple is testing a new design for App Store search ads on iPhone. Some users on iOS 26.3 are noticing that the blue background around sponsored results is no longer shown, blurring the line between what paid ad results look like and the real search results that follow.

This means the only differentiator between organic results and the promoted ad is the presence of the small ‘Ad’ banner next to the app icon. Right now, it appears to be in some kind of A/B test phase.

We have asked Apple for clarity on the change, and whether this will roll out more widely in the future.

It may be related to the company’s announcement from December that App Store search results will soon start including more than one sponsored result for a given search query. The removal of the blue background will mean all of the ads will appear in the list in a more integrated fashion.

Of course, this also has the effect of making it harder for users to quickly distinguish at a glance what is an ad and what isn’t, potentially misleading some users into not realising that the first result is a paid ad placement. While not great for user experience, it probably helps increase click-through rates, which ultimately boosts Apple’s revenue in its ads business.

...

Read the original on 9to5mac.com »

4 430 shares, 47 trendiness

The Incredible Overcomplexity of the Shadcn Radio Button

The other day I was asked to update the visual design of radio buttons in a web app at work. I figured it couldn’t be that complicated. It’s just a radio button, right?

Boom! Done. Radio buttons are a built-in HTML element. They’ve been around for 30 years. The browser makes it easy. Time for a coffee.

I dug into our codebase and realized we were using two React components from Shadcn to power our radio buttons: <RadioGroup> and <RadioGroupItem>.

For those unfamiliar with Shadcn, it’s a UI framework that provides a bunch of prebuilt UI components for use in your websites. Unlike traditional UI frameworks like Bootstrap, you don’t import it with a script tag or npm install. Instead you run a command that copies the components into your codebase.

Here’s the code that was exported from Shadcn into our project:

"use client";

import * as React from "react";
import * as RadioGroupPrimitive from "@radix-ui/react-radio-group";
import { CircleIcon } from "lucide-react";

import { cn } from "@/lib/utils";

function RadioGroup({
  className,
  ...props
}: React.ComponentProps<typeof RadioGroupPrimitive.Root>) {

Woof… 3 imports and 45 lines of code. And it’s importing a third-party icon library just to render a circle. (Who needs CSS border-radius or the SVG <circle> element when you can add a third-party dependency instead?)

All of the styling is done by the 30 different Tailwind classes in the markup. I should probably just tweak those to fix the styling issues.

But now I’m distracted, annoyed, and curious. Where’s the actual <input>? What’s the point of all this? Let’s dig a little deeper.

The Shadcn components import components from another library called Radix. For those unfamiliar with Radix, it’s a UI framework that provides a bunch of prebuilt UI components…

Wait a second! Isn’t that what I just said about Shadcn? What gives? Why do we need both? Let’s see what the Radix docs say:

Radix Primitives is a low-level UI component library with a focus on accessibility, customization and developer experience. You can use these components either as the base layer of your design system, or adopt them incrementally.

So Radix provides unstyled components, and then Shadcn adds styles on top of that. How does Radix work? You can see for yourself on GitHub:

https://​github.com/​radix-ui/

This is getting even more complicated: 215 lines of React code importing 7 other files. But what does it actually do?

Taking a look in the browser

Let’s look in the browser dev tools to see if we can tell what’s going on.

Okay, instead of a radio input it’s rendering a button with an SVG circle inside it? Weird.

It’s also using ARIA attributes to tell screen readers and other assistive tools that the button is actually a radio button.

ARIA attributes allow you to change the semantic meaning of HTML elements. For example, you can say that a button is actually a radio button. (If you wanted to do that for some strange reason.)

If you can use a native HTML element or attribute with the semantics and behavior you require already built in, instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so.

Despite that, Radix is repurposing an element and adding an ARIA role instead of using a native HTML element.
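To make the contrast concrete, here is roughly what the two approaches look like in markup. The second snippet is a simplified sketch of the Radix-style output, not its exact rendering (the precise attributes vary):

```html
<!-- Native: semantics, keyboard handling, and form submission come for free -->
<input type="radio" name="size" value="md" checked>

<!-- ARIA-repurposed (simplified sketch): a button the library must teach,
     via JavaScript, to behave like a radio button -->
<button type="button" role="radio" aria-checked="true" value="md">
  <svg aria-hidden="true"><!-- indicator circle --></svg>
</button>
```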

Finally, the component also includes a hidden <input>, but only if it’s used inside of a <form> element. Weird!

This is getting pretty complicated just to render a radio button. Why would you want to do this?

Styling radio buttons is hard (Wait, is it?)

My best guess is that Radix rebuilds the radio button from scratch in order to make it easier to style. Radio buttons used to be difficult to style consistently across browsers. But for several years we’ve been able to style radio buttons however we want using a few CSS tools:

appearance: none removes the radio button’s default styling, allowing us to do whatever we want.

We can use the ::before pseudo-element to render a “dot” inside of the unstyled radio button.

We can use the :checked pseudo-class to show and hide that dot depending on whether the radio button is checked.

input[type="radio"] {
  /* Disable the browser's default radio button styles */
  appearance: none;
  margin: 0;

  /* Recreate the circle container */
  border: 1px solid black;
  background: white;
  border-radius: 50%;

  /* Center our dot in the container */
  display: inline-grid;
  place-content: center;

  /* Use a pseudo-element to display our "dot" */
  &::before {
    content: "";
    width: 0.75rem;
    height: 0.75rem;
    border-radius: 50%;
  }

  /* And display it when the radio button is checked */
  &:checked::before {
    background: black;
  }
}

This doesn’t require any dependencies, JavaScript, or ARIA roles. It’s just an input element with some styles. (You can do the same thing with Tailwind if that’s your jam.)

It does require knowledge of CSS, but this isn’t some arcane secret. Googling “how to style a radio button” shows several blog posts explaining these techniques. You may say this is a lot of CSS, but the Shadcn component we were using had 30 Tailwind classes!

I’m not trying to convince you to write your own component styles

Look, I get it. You’ve got a lot going on. You’re not big on CSS. You just want to grab some prebuilt components so you can focus on the actual problem you’re solving.

I totally understand why people reach for component libraries like Shadcn and I don’t blame them at all. But I wish these component libraries would keep things simple and reuse the built-in browser elements where possible.

Web development is hard. There’s inherent complexity in building quality sites that solve problems and work well across a wide range of devices and browsers.

But some things don’t have to be hard. Browsers make things like radio buttons easy. Let’s not overcomplicate it.

To understand how our radio buttons work I need to understand two separate component libraries and hundreds of lines of React.

Website visitors need to wait for JavaScript to load, parse, and run in order to be able to toggle a radio button. (In my testing, just adding these components added several KB of JS to a basic app.)

Why am I making such a big deal out of this? It’s just a radio button.

But these small decisions add up to more complexity, more cognitive load, more bugs, and worse website performance.

We have strayed so far from the light

Look at it. It’s beautiful:

...

Read the original on paulmakeswebsites.com »

5 411 shares, 23 trendiness

Porsche delivers 279,449 sports cars to customers in 2025

With a balanced sales structure across individual markets, Dr. Ing. h.c. F. Porsche AG, Stuttgart, delivered a total of 279,449 cars to customers around the world in 2025. The figure was 310,718 for the previous year, representing a decline of 10 per cent. Porsche’s top priority remains a value-oriented derivative mix.

“After several record years, our deliveries in 2025 were below the previous year’s level. This development is in line with our expectations and is due to supply gaps for the 718 and Macan combustion-engined models, the continuing weaker demand for exclusive products in China, and our value-oriented supply management,” says Matthias Becker, Member of the Executive Board for Sales and Marketing at Porsche AG. “In 2025, we delighted our customers with outstanding cars — such as the 911 Turbo S with its T-Hybrid drive system.” The response to the launch of the Cayenne Electric at the end of 2025 also shows, Becker adds, that Porsche is meeting customer expectations with its innovative and high-performance products.

With 84,328 deliveries, the Macan was the best-selling model line. North America remains the largest sales region with 86,229 deliveries — a figure that is in line with the previous year.

Porsche repositioned itself in 2025 and made forward-looking strategic product decisions. The delivery mix in 2025 underscores that the sports car manufacturer is consistently responding to global customer preferences by expanding its drivetrain strategy to offer combustion-engined, plug-in hybrid, and fully electric cars. In 2025, 34.4 per cent of Porsche cars delivered worldwide were electrified (+7.4 percentage points), with 22.2 per cent being fully electric and 12.1 per cent being plug-in hybrids. This puts the global share of fully electric vehicles at the upper end of the target range of 20 to 22 per cent for 2025. In Europe, for the first time, more electrified cars were delivered than pure combustion-engined models (57.9 per cent electrification share), with every third car being fully electric. Among the Panamera and Cayenne models, plug-in hybrid derivatives dominate the European delivery figures. At the same time, the combustion-engined and T-Hybrid 911 set a new benchmark with 51,583 deliveries worldwide.

With 86,229 deliveries, North America remains the largest sales region, as it was the year prior. After record deliveries in 2024, the Overseas and Emerging Markets region also largely maintained its previous-year levels, with 54,974 cars delivered (-1 per cent). In Europe (excluding Germany), Porsche delivered 66,340 cars by the end of the year, down 13 per cent year-on-year. In the German home market, 29,968 customers took delivery of new cars — a decline of 16 per cent. Reasons for the decrease in both regions include supply gaps for the combustion-engined 718 and Macan models due to EU cybersecurity regulations.

In China, 41,938 cars were delivered to customers (-26 per cent). Key reasons for the decline remain challenging market conditions, especially in the luxury segment, as well as intense competition in the Chinese market, particularly for fully electric models. Porsche continues to focus on value-oriented sales.

Deliveries of the Macan totaled 84,328 units (+2 per cent), with fully electric versions accounting for over half at 45,367 vehicles. In most markets outside the EU, the combustion-engined Macan continues to be offered, with 38,961 of these being delivered. Some 27,701 Panamera models were delivered by the end of December (-6 per cent).

The 911 sports car icon recorded 51,583 deliveries by year-end (+1 per cent), setting another delivery record. The 718 Boxster and 718 Cayman totaled 18,612 deliveries, down 21 per cent from the previous year due to the model line’s phase-out. Production ended in October 2025.

The Taycan accounted for 16,339 deliveries (-22 per cent), mainly due to the slowdown in the adoption of electromobility. The keys to 80,886 Cayenne models were handed to customers in 2025, a decline of 21 per cent, partly due to catch-up effects the previous year. The new fully electric Cayenne celebrated its world premiere in November, with the first markets to offer the model beginning to deliver to customers from this spring. It will be offered alongside combustion-engined and plug-in hybrid versions of the Cayenne.

Looking ahead, Matthias Becker says: “In 2026, we have a clear focus; we want to manage demand and supply according to our ‘value over volume’ strategy. At the same time, we are planning our volumes for 2026 realistically, considering the production phase-out of the combustion-engined 718 and Macan models.” In parallel, Porsche is consistently investing in its three-pronged powertrain strategy and will continue to inspire customers with unique sports cars in 2026. An important component is the expansion of the brand’s customization offering — via both the Exclusive Manufaktur and Sonderwunsch programs. In doing so, the company is responding to customers’ ever-increasing desire for individualization.

All amounts are individually rounded to the nearest cent; this may result in minor discrepancies when summed.

This press release contains forward-looking statements and information on the currently expected business development of Porsche AG. These statements are subject to risks and uncertainties. They are based on assumptions about the development of economic, political and legal conditions in individual countries, economic regions and markets, in particular for the automotive industry, which we have made based on the information available to us and which we consider to be realistic at the time of publication. If any of these or other risks materialise, or if the assumptions underlying these statements prove incorrect, the actual results could be significantly different from those expressed or implied by such statements. Forward-looking statements in this press release are based solely on the information available on the day of publication.

These forward-looking statements will not be updated later. Such statements are valid on the day of publication and may be overtaken by later events.

This information does not constitute an offer to exchange or sell or an offer to exchange or purchase securities.


...

Read the original on newsroom.porsche.com »

6 374 shares, 51 trendiness

America Is Slow-Walking Into a Polymarket Disaster

For the past week, I’ve found myself playing the same 23-second CNN clip on repeat. I’ve watched it in bed, during my commute to work, at the office, midway through making carrot soup, and while brushing my teeth. In the video, Harry Enten, the network’s chief data analyst, stares into the camera and breathlessly tells his audience about the gambling odds that Donald Trump will buy any of Greenland. “The people who are putting their money where their mouth is—they are absolutely taking this seriously,” Enten says. He taps the giant touch screen behind him and pulls up a made-for-TV graphic: Based on how people were betting online at the time, there was a 36 percent chance that the president would annex Greenland. “Whoa, way up there!” Enten yells, slapping his hands together. “My goodness gracious!” The ticker at the bottom of the screen speeds through other odds: Will Gavin Newsom win the next presidential election? 19 percent chance. Will Viktor Orbán be out as the leader of Hungary before the end of the year? 48 percent chance.

These odds were pulled from Kalshi, which hilariously claims not to be a gambling platform: It’s a “prediction market.” People go to sites such as Kalshi and Polymarket—another big prediction market—in order to put money down on a given news event. Nobody would bet on something that they didn’t believe would happen, the thinking goes, and so the markets are meant to forecast the likelihood of a given outcome.

Listen: Prediction markets and the “suckerification” crisis, with Max Read

Prediction markets let you wager on basically anything. Will Elon Musk father another baby by June 30? Will Jesus return this year? Will Israel strike Gaza tomorrow? Will the longevity guru Bryan Johnson’s next functional sperm count be “greater than 20.0 M/ejac”? These sites have recently boomed in popularity—particularly among terminally online young men who trade meme stocks and siphon from their 401(k)s to buy up bitcoin. But now prediction markets are creeping into the mainstream. CNN announced a deal with Kalshi last month to integrate the site’s data into its broadcasts, which has led to betting odds showing up in segments about Democrats possibly retaking the House, credit-card interest rates, and Federal Reserve Chair Jerome Powell. At least twice in the past two weeks, Enten has told viewers about the value of data from people who are “putting their money where their mouth is.”

On January 7, the media giant Dow Jones announced its own collaboration with Polymarket and said that it will begin integrating the site’s odds across its publications, including The Wall Street Journal. CNBC has a prediction-market deal, as do Yahoo Finance, Sports Illustrated, and Time. Last week, MoviePass announced that it will begin testing a betting platform. On Sunday, the Golden Globes featured Polymarket’s forecasts throughout the broadcast—because apparently Americans wanted to know whether online gamblers favored Amy Poehler or Dax Shepard to win Best Podcast.

Media is a ruthless, unstable business, and revenue streams are drying up; if you squint, you can see why CNN or Dow Jones might sign a contract that, after all, provides its audience with some kind of data. On air, Enten cites Kalshi odds alongside Gallup polls and Google searches—what’s the difference? “The data featured through our partnership with Kalshi is just one of many sources used to provide context around the stories or topics we are covering and has no impact on editorial judgment,” Brian Poliakoff, a CNN spokesperson, told me in a statement. Nolly Evans, the Journal’s digital general manager, told me that Polymarket provides the newspaper’s journalists with “another way to quantify collective expectations—especially around financial or geopolitical events.” In an email, Jack Suh, a Kalshi spokesperson, told me that the company’s partnerships are designed to inform the public, not to encourage more trading. Polymarket declined to comment.

The problem is that prediction markets are ushering in a world in which news becomes as much about gambling as about the event itself. This kind of thing has already happened to sports, where the language of “parlays” and “covering the spread” has infiltrated every inch of commentary. ESPN partners with DraftKings to bring its odds to SportsCenter and Monday Night Football; CBS Sports has a betting vertical; FanDuel runs its own streaming network. But the stakes of Greenland’s future are more consequential than the NFL playoffs.

The more that prediction markets are treated like news, especially heading into another election, the more every dip and swing in the odds may end up wildly misleading people about what might happen, or influencing what happens in the real world. Yet it’s unclear whether these sites are meaningful predictors of anything. After the Golden Globes, Polymarket CEO Shayne Coplan excitedly posted that his site had correctly predicted 26 of 28 winners, which seems impressive—but Hollywood awards shows are generally predictable. One recent study found that Polymarket’s forecasts in the weeks before the 2024 election were not much better than chance.

These markets are also manipulable. In 2012, one bettor on the now-defunct prediction market Intrade placed a series of huge wagers on Mitt Romney in the two weeks preceding the election, generating a betting line indicative of a tight race. The bettor did not seem motivated by financial gain, according to two researchers who examined the trades. “More plausibly, this trader could have been attempting to manipulate beliefs about the odds of victory in an attempt to boost fundraising, campaign morale, and turnout,” they wrote. The trader lost at least $4 million but might have shaped media attention of the race for less than the price of a prime-time ad, they concluded.

A billionaire congressional candidate can’t just send a check to Quinnipiac University and suddenly find himself as the polling front-runner, but he can place enormous Polymarket bets on himself that move the odds in his favor. Or consider this hypothetical laid out by the Stanford political scientist Andrew Hall: What if, a month before the 2028 presidential election, the race is dead even between J. D. Vance and Mark Cuban? Inexplicably, Vance’s odds of winning surge on Kalshi, possibly linked to shady overseas bets. CNN airs segment after segment about the spike, turning it into an all-consuming national news story. Democrats and Republicans point fingers at each other, and no one knows what’s really going on. Such a scenario is “plausible—maybe even likely—in the coming years,” Hall writes. It doesn’t help that the Trump Media and Technology Group, the owner of the president’s social-media platform, Truth Social, is set to launch its own platform, Truth Predict. (Donald Trump Jr. is an adviser to both Kalshi and Polymarket.)

The irony of prediction markets is that they are supposed to be a more trustworthy way of gleaning the future than internet clickbait and half-baked punditry, but they risk shredding whatever shared trust we still have left. The suspiciously well-timed bets that one Polymarket user placed right before the capture of Nicolás Maduro may have been just a stroke of phenomenal luck that netted a roughly $400,000 payout. Or maybe someone with inside information was looking for easy money. Last week, when White House Press Secretary Karoline Leavitt abruptly ended her briefing after 64 minutes and 30 seconds, many traders were outraged, because they had predicted (with 98 percent odds) that the briefing would run past 65 minutes. Some suspected, with no evidence, that Leavitt had deliberately stopped before the 65-minute mark to turn a profit. (When I asked the White House about this, the spokesperson Davis Ingle told me in a statement, “This is a 100% Fake News narrative.”)

Read: The Polymarket bets on Maduro are a warning

Unintentionally or not, this is what happens when media outlets normalize treating every piece of news and entertainment as something to wager on. As Tarek Mansour, Kalshi’s CEO, has said, his long-term goal is to “financialize everything and create a tradable asset out of any difference in opinion.” (Kalshi means “everything” in Arabic.) What could go wrong? As one viral post on X recently put it, “Got a buddy who is praying for world war 3 so he can win $390 on Polymarket.” It’s a joke. I think.

...

Read the original on www.theatlantic.com »

7 330 shares, 24 trendiness

...

Read the original on lemdro.id »

8 218 shares, 9 trendiness

Notes on Apple's Nano Texture

up & to the right

2024 Nano Texture Macbook Pro on the left; 2021 Glossy Macbook Pro on the right

TLDR: the Nano Texture performs wonderfully anywhere light used to be a factor, forcing me to shade my screen or avoid the place entirely.

I’m less con­cerned with where I sit in­doors. Coffee shops / of­fices with sky­lights or in­tense light­ing are much more com­fort­able

Coding and work­ing out­side is now fea­si­ble: brows­ing the in­ter­net, writ­ing in Obsidian; all de­light­ful

The screen needs more ef­fort to keep clean than a nor­mal screen and comes with a spe­cial wipe that needs to be used in­stead of mi­crofiber

Black text on white back­ground (light mode) is con­sid­er­ably more read­able than white text on black back­ground (dark mode)

Big thanks to Julie Kruger for the com­par­i­son pho­tos and CJ for draft feed­back.

A few months af­ter I got the Daylight Computer (read my thoughts here), two friends sent me this post com­par­ing the old Macbook Pro dis­plays to the new Nano Texture glass ones. That post con­vinced me to up­grade my com­puter in short or­der, to the dis­may of my wal­let.

In the four months I’ve had it I’ve told at least a dozen peo­ple about it, and I’m gonna keep telling peo­ple. Being able to take my en­tire com­put­ing en­vi­ron­ment to places with­out be­ing wor­ried about glare has ex­panded the range of en­vi­ron­ments I can cre­ate in. It means I get to be in en­vi­ron­ments that are more in­ter­est­ing, fun, and in tune with my body.

What fol­lows are some thoughts about how this dis­play has fit into my day to day life in the cou­ple of months I’ve had it.

Typical matte displays have a coating added to their surface that scatters light. However, these coatings lower contrast while producing unwanted haze and sparkle. Etched into the glass at the nanometre level, the nano-texture scatters light to further minimise glare — for outstanding image quality even in challenging lighting conditions.

Basically, it’s a coat­ing phys­i­cally etched into the screen that re­flects light dif­fer­ently from the glossy fin­ish of the tra­di­tional screen.

Cursor on the 2021 MBP (Glossy) on the left; 2024 MBP (Nano Texture) on the right

First off, this isn’t an apples-to-apples comparison - these are different technologies that, in my mind, serve different purposes. The Daylight Computer is an Android tablet; the Macbook Pro is a full MacOS laptop.

The transflective LCD in the Daylight Computer is grayscale but it needs no light to function. It has a backlight, but where it does really well is in direct sunlight with the backlight turned off. When outside in direct sunlight, toggling the Daylight’s backlight on and off doesn’t make a difference because it works fundamentally differently from a laptop screen.

white text on black back­ground has about the same read­abil­ity as black text on white back­ground

the back­light can be low­ered to 0% out­side with no im­pact to vis­i­bil­ity and mak­ing the bat­tery last won­der­fully long

grayscale + lower DPI lim­its how much text can fit on the screen

Daylight being a tablet form factor means I have to fiddle around with a configuration that will hold my screen at an ideal angle. It’s reasonably forgiving, but certain angles are harder to see than others

2024 MBP on left; 2021 MBP on right. Dark mode is less ideal on both.

The Nano Texture MacBook Pro is still ultimately a traditional LCD screen. This means the only way to see the screen is if the backlight is powered on: having the backlight off in direct sunlight results in a black screen. Also, it’s worth noting:

white text on black bg is a lot less read­able than black text on white bg

the back­light gen­er­ally has to be at 90%+ to be com­fort­able

retina dis­play + wide swath of the color spec­trum means most of what I can do in­doors, I can do out­doors as well

be­ing a lap­top with a hinge, it’s very easy to find the ex­act an­gle I want that min­i­mizes glare & max­i­mizes com­fort

Both, however, are an incredible upgrade over outdoor computing options from just a year ago. I believe these are both massive steps in terms of ergonomics and freedom to be in more places as we compute.

some down­sides to con­sider

fin­ger­prints, splat­ters, and smudges are mildly an­noy­ing in­doors but al­most flu­o­res­cent out­doors

rub­bing al­co­hol cleans them off when fric­tion alone does­n’t do the trick but it still takes some rub­bing. as far as I can tell, it’s not de­grad­ing the fin­ish but I also try to clean it with the cloth be­fore ap­ply­ing al­co­hol

they give you one special screen cleaning cloth; only this cloth can be used on Nano Texture screens. I think the ideal number is more like 5.

I read somewhere that this is because traditional microfiber cloths will shred into the screen, degrading visibility (but I can’t remember where, so don’t quote me on this)

I’ve learned to bring my spe­cial wipe when I bring my lap­top, and I slip a few rub­bing al­co­hol wipes in there as well. I wet the Special Cloth with the al­co­hol wipes, and then ap­ply the Special Cloth to the screen. This is def­i­nitely high main­te­nance

I have to swat other peo­ple’s hands away when they try to point some­thing out on my screen with their pizza fin­gers

I’m more para­noid about swing­ing a USB C ca­ble up against my screen or clos­ing my lap­top down on a grain of rice. I was less wor­ried with my old screen

The Nano Texture up­grade is an ex­tra $150 on an al­ready-ex­pen­sive com­puter

Closing the MacBook re­sults in slight rub­bing on the screen at the bot­tom of the key­board / top of the track­pad, leav­ing scratches on the screen. So far this is­n’t detri­men­tal when the bright­ness is up; it’s only vis­i­ble with the back­light off

I don’t think this is a new thing be­cause my old MacBook Pro (glossy screen) has scratches in the same ex­act place but I am wor­ried about them be­ing more vis­i­ble on the Nano Texture screen in the long run

If you get an­noyed by the glare of your screen and don’t mind a bit of ex­tra men­tal band­width to keep your screen clean, I would highly rec­om­mend con­sid­er­ing a Nano Texture dis­play up­grade on your next lap­top pur­chase. If you have a chaotic en­vi­ron­ment and can’t be both­ered to keep your screen clean, or you aren’t both­ered much by glare or re­flec­tions in the en­vi­ron­ments you work in, then the Nano Texture is prob­a­bly not for you.

toms­guide.com: I put the M4 MacBook Pro’s nano-tex­ture dis­play to the test and it’s a game-changer

...

Read the original on jon.bo »

9 198 shares, 11 trendiness

On the Coming Industrialisation of Exploit Generation with LLMs

Recently I ran an ex­per­i­ment where I built agents on top of Opus 4.5 and GPT-5.2 and then chal­lenged them to write ex­ploits for a ze­ro­day vul­ner­a­bil­ity in the QuickJS Javascript in­ter­preter. I added a va­ri­ety of mod­ern ex­ploit mit­i­ga­tions, var­i­ous con­straints (like as­sum­ing an un­known heap start­ing state, or for­bid­ding hard­coded off­sets in the ex­ploits) and dif­fer­ent ob­jec­tives (spawn a shell, write a file, con­nect back to a com­mand and con­trol server). The agents suc­ceeded in build­ing over 40 dis­tinct ex­ploits across 6 dif­fer­ent sce­nar­ios, and GPT-5.2 solved every sce­nario. Opus 4.5 solved all but two. I’ve put a tech­ni­cal write-up of the ex­per­i­ments and the re­sults on Github, as well as the code to re­pro­duce the ex­per­i­ments.

In this post I’m go­ing to fo­cus on the main con­clu­sion I’ve drawn from this work, which is that we should pre­pare for the in­dus­tri­al­i­sa­tion of many of the con­stituent parts of of­fen­sive cy­ber se­cu­rity. We should start as­sum­ing that in the near fu­ture the lim­it­ing fac­tor on a state or group’s abil­ity to de­velop ex­ploits, break into net­works, es­ca­late priv­i­leges and re­main in those net­works, is go­ing to be their to­ken through­put over time, and not the num­ber of hack­ers they em­ploy. Nothing is cer­tain, but we would be bet­ter off hav­ing wasted ef­fort think­ing through this sce­nario and have it not hap­pen, than be un­pre­pared if it does.

A Brief Overview of the Experiment

All of the code to re-run the ex­per­i­ments, a de­tailed write-up of them, and the raw data the agents pro­duced are on Github, but just to give a flavour of what the agents ac­com­plished:

Both agents turned the QuickJS vul­ner­a­bil­ity into an API to al­low them to read and ar­bi­trar­ily mod­ify the ad­dress space of the tar­get process. As the vul­ner­a­bil­ity is a ze­ro­day with no pub­lic ex­ploits for it, this ca­pa­bil­ity had to be de­vel­oped by the agents through read­ing source code, de­bug­ging and trial and er­ror. A sam­ple of the no­table ex­ploits is here and I have writ­ten up one of them in de­tail here.

They solved most challenges in less than an hour and relatively cheaply. I set a token limit of 30M per agent run and ran ten runs per agent. This was more than enough to solve all but the hardest task. With Opus 4.5, 30M total tokens (input and output) end up costing about $30 USD.

In the hardest task I challenged GPT-5.2 to figure out how to write a specified string to a specified path on disk, while the following protections were enabled: address space layout randomisation, non-executable memory, full RELRO, fine-grained CFI on the QuickJS binary, a hardware-enforced shadow-stack, a seccomp sandbox to prevent shell execution, and a build of QuickJS where I had stripped all functionality for accessing the operating system and file system. To write a file you need to chain multiple function calls, but the shadow-stack prevents ROP and the sandbox prevents simply spawning a shell process to solve the problem. GPT-5.2 came up with a clever solution involving chaining 7 function calls through glibc’s exit handler mechanism. The full exploit is here and an explanation of the solution is here. It took the agent 50M tokens and just over 3 hours to solve this, for a cost of about $50 for that agent run. (As I was running four agents in parallel the true cost was closer to $150.)

Before go­ing on there are two im­por­tant caveats that need to be kept in mind with these ex­per­i­ments:

While QuickJS is a real Javascript in­ter­preter, it is an or­der of mag­ni­tude less code, and at least an or­der of mag­ni­tude less com­plex, than the Javascript in­ter­preters in Chrome and Firefox. We can ob­serve the ex­ploits pro­duced for QuickJS and the man­ner in which they were pro­duced and con­clude, as I have, that it ap­pears that LLMs are likely to solve these prob­lems ei­ther now or in the near fu­ture, but we can’t say de­fin­i­tively that they can with­out spend­ing the to­kens and see­ing it hap­pen.

The ex­ploits gen­er­ated do not demon­strate novel, generic breaks in any of the pro­tec­tion mech­a­nisms. They take ad­van­tage of known flaws in those pro­tec­tion mech­a­nisms and gaps that ex­ist in real de­ploy­ments of them. These are the same gaps that hu­man ex­ploit de­vel­op­ers take ad­van­tage of, as they also typ­i­cally do not come up with novel breaks of ex­ploit mit­i­ga­tions for each ex­ploit. I’ve ex­plained those gaps in de­tail here. What is novel are the over­all ex­ploit chains. This is true by de­f­i­n­i­tion as the QuickJS vul­ner­a­bil­ity was pre­vi­ously un­known un­til I found it (or, more cor­rectly: my Opus 4.5 vul­ner­a­bil­ity dis­cov­ery agent found it). The ap­proach GPT-5.2 took to solv­ing the hard­est chal­lenge men­tioned above was also novel to me at least, and I haven’t been able to find any ex­am­ple of it writ­ten down on­line. However, I would­n’t be sur­prised if it’s known by CTF play­ers and pro­fes­sional ex­ploit de­vel­op­ers, and just not writ­ten down any­where.

By ‘industrialisation’ I mean that the ability of an organisation to complete a task will be limited by the number of tokens they can throw at that task. In order for a task to be ‘industrialised’ in this way it needs two things:

An LLM-based agent must be able to search the solution space. It must have an environment in which to operate, appropriate tools, and not require human assistance. The ability to do ‘true search’, and cover more of the solution space as more tokens are spent, also requires some baseline capability from the model to process information, react to it, and make sensible decisions that move the search forward. It looks like Opus 4.5 and GPT-5.2 possess this in my experiments. It will be interesting to see how they do against a much larger space, like v8 or Firefox.

The agent must have some way to ver­ify its so­lu­tion. The ver­i­fier needs to be ac­cu­rate, fast and again not in­volve a hu­man.

Exploit de­vel­op­ment is the ideal case for in­dus­tri­al­i­sa­tion. An en­vi­ron­ment is easy to con­struct, the tools re­quired to help solve it are well un­der­stood, and ver­i­fi­ca­tion is straight­for­ward. I have writ­ten up the ver­i­fi­ca­tion process I used for the ex­per­i­ments here, but the sum­mary is: an ex­ploit tends to in­volve build­ing a ca­pa­bil­ity to al­low you to do some­thing you should­n’t be able to do. If, af­ter run­ning the ex­ploit, you can do that thing, then you’ve won. For ex­am­ple, some of the ex­per­i­ments in­volved writ­ing an ex­ploit to spawn a shell from the Javascript process. To ver­ify this the ver­i­fi­ca­tion har­ness starts a lis­tener on a par­tic­u­lar lo­cal port, runs the Javascript in­ter­preter and then pipes a com­mand into it to run a com­mand line util­ity that con­nects to that lo­cal port. As the Javascript in­ter­preter has no abil­ity to do any sort of net­work con­nec­tions, or spawn­ing of an­other process in nor­mal ex­e­cu­tion, you know that if you re­ceive the con­nect back then the ex­ploit works as the shell that it started has run the com­mand line util­ity you sent to it.
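As a concrete illustration, the connect-back check described above can be sketched as a small harness. This is a reconstruction from the description, not the author’s actual code; the port number, binary path, and file names are assumptions.

```python
import socket
import subprocess
import threading

# Illustrative values -- assumptions, not the author's actual setup.
LISTEN_PORT = 47831        # local port the harness listens on
INTERPRETER = "./qjs"      # target QuickJS build
EXPLOIT_JS = "exploit.js"  # candidate exploit produced by the agent

def wait_for_connect(result, timeout=30.0, port=LISTEN_PORT):
    """Listen on a local port; any inbound connection proves the
    exploit spawned a shell that ran our connect-back command."""
    with socket.socket() as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.settimeout(timeout)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        try:
            conn, _ = srv.accept()
            conn.close()
            result["connected"] = True
        except socket.timeout:
            result["connected"] = False

def verify_exploit():
    """Run the interpreter on the exploit while listening locally.
    The stripped QuickJS build has no networking or process-spawning
    API, so a connect-back can only come from a spawned shell."""
    result = {}
    listener = threading.Thread(target=wait_for_connect, args=(result,))
    listener.start()
    # The piped command is executed by the spawned shell, which
    # inherits the interpreter's stdin.
    cmd = f"nc -z 127.0.0.1 {LISTEN_PORT}\n"
    subprocess.run([INTERPRETER, EXPLOIT_JS], input=cmd.encode(),
                   timeout=120, check=False)
    listener.join()
    return result.get("connected", False)
```

The key design property is that success cannot be faked by the agent: the verifier observes a side effect (the inbound connection) that the unexploited interpreter is incapable of producing.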

There is a third at­tribute of prob­lems in this space that may in­flu­ence how/​when they are in­dus­tri­al­is­able: if an agent can solve a prob­lem in an of­fline set­ting and then use its so­lu­tion, then it maps to the sort of large scale so­lu­tion search that mod­els seem to be good at to­day. If of­fline search is­n’t fea­si­ble, and the agent needs to find a so­lu­tion while in­ter­act­ing with the real en­vi­ron­ment, and that en­vi­ron­ment has the at­tribute that cer­tain ac­tions by the agent per­ma­nently ter­mi­nate the search, then in­dus­tri­al­i­sa­tion may be more dif­fi­cult. Or, at least, it’s less ap­par­ent that the ca­pa­bil­i­ties of cur­rent LLMs map di­rectly to prob­lems with this at­tribute.

There are sev­eral tasks in­volved in cy­ber in­tru­sions that have this third prop­erty: ini­tial ac­cess via ex­ploita­tion, lat­eral move­ment, main­tain­ing ac­cess, and the use of ac­cess to do es­pi­onage (i.e. ex­fil­trate data). You can’t per­form the en­tire search ahead of time and then use the so­lu­tion. Some amount of search has to take place in the real en­vi­ron­ment, and that en­vi­ron­ment is ad­ver­sar­ial in that if a wrong ac­tion is taken it can ter­mi­nate the en­tire search. i.e. the agent is de­tected and kicked out of the net­work, and po­ten­tially the en­tire op­er­a­tion is burned. For these tasks I think my cur­rent ex­per­i­ments pro­vide less in­for­ma­tion. They are fun­da­men­tally not about trad­ing to­kens for search space cov­er­age. That said, if we think we can build mod­els for au­tomat­ing cod­ing and SRE work, then it would seem un­usual to think that these sorts of hack­ing-re­lated tasks are go­ing to be im­pos­si­ble.

Where are we now?

We are al­ready at a point where with vul­ner­a­bil­ity dis­cov­ery and ex­ploit de­vel­op­ment you can trade to­kens for real re­sults. There’s ev­i­dence for this from the Aardvark pro­ject at OpenAI where they have said they’re see­ing this sort of re­sult: the more to­kens you spend, the more bugs you find, and the bet­ter qual­ity those bugs are. You can also see it in my ex­per­i­ments. As the chal­lenges got harder I was able to spend more and more to­kens to keep find­ing so­lu­tions. Eventually the lim­it­ing fac­tor was my bud­get, not the mod­els. I would be more sur­prised if this is­n’t in­dus­tri­alised by LLMs, than if it is.

For the other tasks in­volved in hack­ing/​cy­ber in­tru­sion we have to spec­u­late. There’s less pub­lic in­for­ma­tion on how LLMs per­form on these tasks in real en­vi­ron­ments (for ob­vi­ous rea­sons). We have the re­port from Anthropic on the Chinese hack­ing team us­ing their API to or­ches­trate at­tacks, so we can at least con­clude that or­gan­i­sa­tions are try­ing to get this to work. One hint that we might not be yet at a place where post-ac­cess hack­ing-re­lated tasks are au­to­mated is that there don’t ap­pear to be any com­pa­nies that have en­tirely au­to­mated SRE work (or at least, that I am aware of).

The types of problems that you encounter if you want to automate the work of SREs, system admins and developers that manage production networks are conceptually similar to those of a hacker operating within an adversary’s network. An agent for SRE can’t just do arbitrary search for solutions without considering the consequences of actions. There are actions that, if it takes them, terminate the search and it loses permanently (i.e. dropping the production database). While we might not get public confirmation that the hacking-related tasks with this third property are now automatable, we do have a ‘canary’. If there are companies successfully selling agents to automate the work of an SRE, and using general purpose models from frontier labs, then it’s more likely that those same models can be used to automate a variety of hacking-related tasks where an agent needs to operate within the adversary’s network.

These ex­per­i­ments shifted my ex­pec­ta­tions re­gard­ing what is and is not likely to get au­to­mated in the cy­ber do­main, and my time line for that. It also left me with a bit of a wish list from the AI com­pa­nies and other en­ti­ties do­ing eval­u­a­tions.

Right now, I don’t think we have a clear idea of the real abilities of current generation models. The reason is that CTF-based evaluations and evaluations using synthetic data or old vulnerabilities just aren’t that informative when your question relates to finding and exploiting zerodays in hard targets. I would strongly urge the teams at frontier labs that are evaluating model capabilities, as well as those at AI Security Institutes, to consider evaluating their models against real, hard targets using zeroday vulnerabilities and reporting those evaluations publicly. With the next major release from a frontier lab I would love to read something like “We spent X billion tokens running our agents against the Linux kernel and Firefox and produced Y exploits”. It doesn’t matter if Y=0. What matters is that X is some very large number. Both companies have strong security teams so it’s entirely possible they are already moving towards this. OpenAI already have the Aardvark project and it would be very helpful to pair that with a project trying to exploit the vulnerabilities they are already finding.

For the AI Security Institutes, it would be worth spending time identifying gaps in the evaluations that the model companies are doing, and working with them to get those gaps addressed. For example, I’m almost certain that you could drop the firmware from a huge number of IoT devices (routers, IP cameras, etc.) into an agent based on Opus 4.5 or GPT-5.2 and get functioning exploits out the other end in less than a week of work. It’s not ideal that evaluations focus on CTFs, synthetic environments and old vulnerabilities, but don’t provide this sort of direct assessment against real targets.

In gen­eral, if you’re a re­searcher or en­gi­neer, I would en­cour­age you to pick the most in­ter­est­ing ex­ploita­tion re­lated prob­lem you can think of, spend as many to­kens as you can af­ford on it, and write up the re­sults. You may be sur­prised by how well it works.

Hopefully the source code for my ex­per­i­ments will be of some use in that.


...

Read the original on sean.heelan.io »

10 197 shares, 7 trendiness

The Microstructure of Wealth Transfer in Prediction Markets

Slot ma­chines on the Las Vegas Strip re­turn about 93 cents on the dol­lar. This is widely con­sid­ered some of the worst odds in gam­bling. Yet on Kalshi, a CFTC-regulated pre­dic­tion mar­ket, traders have wa­gered vast sums on long­shot con­tracts with his­tor­i­cal re­turns as low as 43 cents on the dol­lar. Thousands of par­tic­i­pants are vol­un­tar­ily ac­cept­ing ex­pected val­ues far lower than a casino slot ma­chine to bet on their con­vic­tions.

The ef­fi­cient mar­ket hy­poth­e­sis sug­gests that as­set prices should per­fectly ag­gre­gate all avail­able in­for­ma­tion. Prediction mar­kets the­o­ret­i­cally pro­vide the purest test of this the­ory. Unlike eq­ui­ties, there is no am­bi­gu­ity about in­trin­sic value. A con­tract ei­ther pays $1 or it does not. A price of 5 cents should im­ply ex­actly a 5% prob­a­bil­ity.

We analyzed 72.1 million trades covering $18.26 billion in volume to test this efficiency. Our findings suggest that collective accuracy relies less on rational actors than on a mechanism for harvesting error. We document a systematic wealth transfer where impulsive Takers pay a structural premium for affirmative YES outcomes while Makers capture an “Optimism Tax” simply by selling into this biased flow. The effect is strongest in high-engagement categories like Sports and Entertainment, while low-engagement categories like Finance approach perfect efficiency.

This pa­per makes three con­tri­bu­tions. First, it con­firms the pres­ence of the long­shot bias on Kalshi and quan­ti­fies its mag­ni­tude across price lev­els. Second, it de­com­poses re­turns by mar­ket role, re­veal­ing a per­sis­tent wealth trans­fer from tak­ers to mak­ers dri­ven by asym­met­ric or­der flow. Third, it iden­ti­fies a YES/NO asym­me­try where tak­ers dis­pro­por­tion­ately fa­vor af­fir­ma­tive bets at long­shot prices, ex­ac­er­bat­ing their losses.

Prediction mar­kets are ex­changes where par­tic­i­pants trade bi­nary con­tracts on real-world out­comes. These con­tracts set­tle at ei­ther $1 or $0, with prices rang­ing from 1 to 99 cents serv­ing as prob­a­bil­ity prox­ies. Unlike eq­uity mar­kets, pre­dic­tion mar­kets are strictly zero-sum: every dol­lar of profit cor­re­sponds ex­actly to a dol­lar of loss.

Kalshi launched in 2021 as the first U.S. prediction market regulated by the CFTC. Initially focused on economic and weather data, the platform stayed niche until 2024. A legal victory over the CFTC secured the right to list political contracts, and the 2024 election cycle triggered explosive growth. Sports markets, introduced in 2025, now dominate trading activity.

Volume dis­tri­b­u­tion across cat­e­gories is highly un­even. Sports ac­counts for 72% of no­tional vol­ume, fol­lowed by pol­i­tics at 13% and crypto at 5%.

Note: Data col­lec­tion con­cluded on 2025-11-25 at 17:00 ET; Q4 2025 fig­ures are in­com­plete.

The dataset, avail­able on GitHub, con­tains 7.68 mil­lion mar­kets and 72.1 mil­lion trades. Each trade records the ex­e­cu­tion price (1-99 cents), taker side (yes/no), con­tract count, and time­stamp. Markets in­clude res­o­lu­tion out­come and cat­e­gory clas­si­fi­ca­tion.

Role as­sign­ment: Each trade iden­ti­fies the liq­uid­ity taker. The maker took the op­po­site po­si­tion. If tak­er_­side = yes at 10 cents, the taker bought YES at 10¢; the maker bought NO at 90¢.

Cost Basis (c): To compare asymmetries between YES and NO contracts, we normalize all trades by capital risked. For a standard YES trade at 5 cents, c = 5. For a NO trade at 5 cents, c = 95. All references to “Price” in this paper refer to this Cost Basis unless otherwise noted.
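To make the role and cost-basis conventions concrete, here is a minimal sketch of the decomposition of one trade; the function and field names are illustrative, not the dataset’s actual schema.

```python
def decompose_trade(price_cents: int, taker_side: str) -> dict:
    """Split one executed trade into its two positions.

    `price_cents` is the recorded execution price (the YES price,
    1-99 cents) and `taker_side` identifies the liquidity taker.
    Names here are illustrative, not the dataset's actual schema.
    """
    assert 1 <= price_cents <= 99
    assert taker_side in ("yes", "no")
    maker_side = "no" if taker_side == "yes" else "yes"

    def cost_basis(side: str) -> int:
        # A YES contract costs the quoted price; the paired NO
        # contract costs the complement, since together they pay $1.
        return price_cents if side == "yes" else 100 - price_cents

    return {
        "taker": {"side": taker_side, "cost": cost_basis(taker_side)},
        "maker": {"side": maker_side, "cost": cost_basis(maker_side)},
    }
```

For example, `decompose_trade(10, "yes")` reproduces the example in the text: the taker holds YES at a 10-cent cost basis and the maker holds NO at 90 cents.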

Mispricing (Δ) measures the divergence between actual win rate and implied probability for a subset of trades S:

Δ(S) = (W(S) − c̄(S)/100) / (c̄(S)/100)

where W(S) is the fraction of trades in S that won and c̄(S) is their mean cost basis in cents.

Gross excess return (R) is the return relative to cost, gross of platform fees, where c is price in cents and y ∈ {0, 1} is the outcome:

R = (100 · y − c) / c

Calculations de­rive from re­solved mar­kets only. Markets that were voided, delisted, or re­main open are ex­cluded. Additionally, trades from mar­kets with less than $100 in no­tional vol­ume were ex­cluded. The dataset re­mains ro­bust across all price lev­els; the spars­est bin (81-90¢) con­tains 5.8 mil­lion trades.
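Under these definitions, both metrics reduce to a few lines of code. The sketch below assumes trades are available as (cost_basis_cents, won) pairs, which is a representation chosen for illustration rather than the dataset’s actual layout.

```python
def mispricing(trades):
    """Delta: relative divergence between the actual win rate and the
    implied probability for a set of (cost_basis_cents, won) pairs."""
    n = len(trades)
    win_rate = sum(1 for _, won in trades if won) / n
    implied = sum(cost for cost, _ in trades) / n / 100.0
    return (win_rate - implied) / implied

def gross_excess_return(cost_cents, won):
    """R = (100*y - c) / c: per-trade return relative to cost,
    gross of platform fees, with y in {0, 1}."""
    y = 1 if won else 0
    return (100 * y - cost_cents) / cost_cents
```

As a sanity check against the paper’s figures: 5-cent contracts winning 4.18% of the time give Δ = (0.0418 − 0.05)/0.05 ≈ −16.4%, matching the reported −16.36% up to rounding of the win rate.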

First documented by Griffith (1949) in horse racing and later formalized by Thaler & Ziemba (1988) in their analysis of parimutuel betting markets, the longshot bias describes the tendency for bettors to overpay for low-probability outcomes. In efficient markets, a contract priced at c cents should win approximately c% of the time. In markets exhibiting longshot bias, low-priced contracts win less than their implied probability, while high-priced contracts win more.

The data con­firms this pat­tern on Kalshi. Contracts trad­ing at 5 cents win only 4.18% of the time, im­ply­ing mis­pric­ing of -16.36%. Conversely, con­tracts at 95 cents win 95.83% of the time. This pat­tern is con­sis­tent; all con­tracts priced be­low 20 cents un­der­per­form their odds, while those above 80 cents out­per­form.

Note: The cal­i­bra­tion curve above demon­strates that pre­dic­tion mar­kets are ac­tu­ally quite ef­fi­cient and ac­cu­rate, with the slight ex­cep­tion of the tails. The close align­ment be­tween im­plied and ac­tual prob­a­bil­i­ties con­firms that pre­dic­tion mar­kets are well-cal­i­brated price dis­cov­ery mech­a­nisms.

The ex­is­tence of the long­shot bias raises a ques­tion unique to zero-sum mar­kets: if some traders sys­tem­at­i­cally over­pay, who cap­tures the sur­plus?

Market mi­crostruc­ture de­fines two pop­u­la­tions based on their in­ter­ac­tion with the or­der book. A Maker pro­vides liq­uid­ity by plac­ing limit or­ders that rest on the book. A Taker con­sumes this liq­uid­ity by ex­e­cut­ing against rest­ing or­ders.

The di­ver­gence is most pro­nounced at the tails. At 1-cent con­tracts, tak­ers win only 0.43% of the time against an im­plied prob­a­bil­ity of 1%, cor­re­spond­ing to a mis­pric­ing of -57%. Makers on the same con­tracts win 1.57% of the time, re­sult­ing in a mis­pric­ing of +57%. At 50 cents, mis­pric­ing com­presses; tak­ers show -2.65%, and mak­ers show +2.66%.

Takers ex­hibit neg­a­tive ex­cess re­turns at 80 of 99 price lev­els. Makers ex­hibit pos­i­tive ex­cess re­turns at the same 80 lev­els. The mar­ket’s ag­gre­gate mis­cal­i­bra­tion is con­cen­trated in a spe­cific pop­u­la­tion; tak­ers bear the losses while mak­ers cap­ture the gains.

An obvious objection arises: makers earn the bid-ask spread as compensation for providing liquidity. Their positive returns may simply reflect spread capture rather than the exploitation of biased flow. While plausible, two observations suggest otherwise.

The first ob­ser­va­tion sug­gests the ef­fect ex­tends be­yond pure spread cap­ture; maker re­turns de­pend on which side they take. If prof­its were purely spread-based, it should not mat­ter whether mak­ers bought YES or NO. We test this by de­com­pos­ing maker per­for­mance by po­si­tion di­rec­tion:

Makers who buy NO outperform makers who buy YES 59% of the time. The volume-weighted excess return is +0.77 pp for makers buying YES versus +1.25 pp for makers buying NO, a gap of 0.47 percentage points. The effect is minuscule (Cohen’s d = 0.02-0.03) but consistent. At minimum, this suggests spread capture is not the whole story.

A second observation strengthens the case further: the maker-taker gap varies substantially by market category.

We ex­am­ine whether the maker-taker gap varies by mar­ket cat­e­gory. If the bias re­flects un­in­formed de­mand, cat­e­gories at­tract­ing less so­phis­ti­cated par­tic­i­pants should show larger gaps.

The vari­a­tion is strik­ing. Finance shows a gap of merely 0.17 pp; the mar­ket is ex­tremely ef­fi­cient, with tak­ers los­ing only 0.08% per trade. At the other ex­treme, World Events and Media show gaps ex­ceed­ing 7 per­cent­age points. Sports, the largest cat­e­gory by vol­ume, ex­hibits a mod­er­ate gap of 2.23 pp. Given $6.1 bil­lion in taker vol­ume, even this mod­est gap gen­er­ates sub­stan­tial wealth trans­fer.

Why is Finance efficient? The likely explanation is participant selection: financial questions attract traders who think in probabilities and expected values rather than fans betting on their favorite team or partisans betting on a preferred candidate. The questions themselves are dry (“Will the S&P close above 6000?”), which filters out emotional bettors.

The maker-taker gap is not a fixed feature of the market; rather, it emerged as the platform grew. In Kalshi’s early days, the pattern was reversed: takers earned positive excess returns while makers lost money.

From launch through 2023, taker re­turns av­er­aged +2.0% while maker re­turns av­er­aged -2.0%. Without so­phis­ti­cated coun­ter­par­ties, tak­ers won; am­a­teur mak­ers de­fined the early pe­riod and were the los­ing pop­u­la­tion. This be­gan to re­verse in 2024 Q2, with the gap cross­ing zero and then widen­ing sharply af­ter the 2024 elec­tion.

The inflection point coincides with two events: Kalshi’s legal victory over the CFTC in October 2024, which permitted political contracts, and the subsequent 2024 election cycle. Volume exploded from $30 million in 2024 Q3 to $820 million in 2024 Q4. The new volume attracted sophisticated market makers, and with them, the extraction of value from taker flow.

Pre-election, the av­er­age gap was -2.9 pp (takers win­ning); post-elec­tion, it flipped to +2.5 pp (makers win­ning), a swing of 5.3 per­cent­age points.

The com­po­si­tion of taker flow pro­vides fur­ther ev­i­dence. If the wealth trans­fer arose be­cause new par­tic­i­pants ar­rived with stronger long­shot pref­er­ences, we would ex­pect the dis­tri­b­u­tion to shift to­ward low-prob­a­bil­ity con­tracts. It did not:

The share of taker vol­ume in long­shot con­tracts (1-20¢) re­mained es­sen­tially flat; 4.8% pre-elec­tion ver­sus 4.6% post-elec­tion. The dis­tri­b­u­tion ac­tu­ally shifted to­ward the mid­dle; the 91-99¢ bucket fell from 40-50% in 2021-2023 to un­der 20% in 2025, while mid-range prices (31-70¢) grew sub­stan­tially. Taker be­hav­ior did not be­come more bi­ased; if any­thing, it be­came less ex­treme. Yet taker losses in­creased; new mar­ket mak­ers ex­tract value more ef­fi­ciently across all price lev­els.

This evo­lu­tion re­frames the ag­gre­gate re­sults. The wealth trans­fer from tak­ers to mak­ers is not in­her­ent to pre­dic­tion mar­ket mi­crostruc­ture; it re­quires so­phis­ti­cated mar­ket mak­ers, and so­phis­ti­cated mar­ket mak­ers re­quire suf­fi­cient vol­ume to jus­tify par­tic­i­pa­tion. In the low-vol­ume early pe­riod, mak­ers were likely un­so­phis­ti­cated in­di­vid­u­als who lost to rel­a­tively in­formed tak­ers. The vol­ume surge at­tracted pro­fes­sional liq­uid­ity providers ca­pa­ble of ex­tract­ing value from taker flow at all price points.

The maker-taker decomposition identifies who absorbs the losses, but leaves open how the takers' selection bias operates. Why is taker flow so consistently mispriced? The answer is not that makers possess superior foresight, but rather that takers exhibit a costly preference for affirmative outcomes.

Standard ef­fi­ciency mod­els im­ply that mis­pric­ing should be sym­met­ric across con­tract types at equiv­a­lent prices; a 1-cent YES con­tract and a 1-cent NO con­tract should the­o­ret­i­cally re­flect sim­i­lar ex­pected val­ues. The data con­tra­dicts this as­sump­tion. At a price of 1 cent, a YES con­tract car­ries a his­tor­i­cal ex­pected value of -41%; buy­ers lose nearly half their cap­i­tal in ex­pec­ta­tion. Conversely, a NO con­tract at the same 1-cent price de­liv­ers a his­tor­i­cal ex­pected value of +23%. The di­ver­gence be­tween these seem­ingly iden­ti­cal prob­a­bil­ity es­ti­mates is 64 per­cent­age points.
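The expected-value arithmetic is worth spelling out. A sketch, with the win rates reverse-engineered from the article's -41% and +23% figures rather than taken from the underlying data:

```python
# Sketch: expected return of buying a binary contract at price_cents that
# pays 100 cents with probability win_rate (and 0 otherwise).

def expected_return_pct(price_cents, win_rate):
    return 100.0 * (100.0 * win_rate - price_cents) / price_cents

# A perfectly calibrated 1-cent contract wins 1% of the time and breaks even.
assert abs(expected_return_pct(1, 0.01)) < 1e-9

# Win rates implied by the article's figures (hypothetical back-solve):
# a 1-cent YES returning -41% implies a ~0.59% win rate, while a 1-cent NO
# returning +23% implies a ~1.23% win rate -- despite both prices claiming 1%.
yes_ev = expected_return_pct(1, 0.0059)
no_ev = expected_return_pct(1, 0.0123)
```

The asymmetry is entirely in the realized win rates, not in the payoff formula, which is identical for both sides.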

The ad­van­tage for NO con­tracts is per­sis­tent. NO out­per­forms YES at 69 of 99 price lev­els, with the ad­van­tage con­cen­trat­ing at the mar­ket ex­tremes. NO con­tracts gen­er­ate su­pe­rior re­turns at every price in­cre­ment from 1 to 10 cents and again from 91 to 99 cents.

Despite the mar­ket be­ing zero-sum, dol­lar-weighted re­turns are -1.02% for YES buy­ers com­pared to +0.83% for NO buy­ers, a 1.85 per­cent­age point gap dri­ven by the over­pric­ing of YES con­tracts.

The un­der­per­for­mance of YES con­tracts may be linked to taker be­hav­ior. Breaking down the trad­ing data re­veals a struc­tural im­bal­ance in or­der flow com­po­si­tion.

In the 1-10 cent range, where YES rep­re­sents the long­shot out­come, tak­ers ac­count for 41-47% of YES vol­ume; mak­ers ac­count for only 20-24%. This im­bal­ance in­verts at the op­po­site end of the prob­a­bil­ity curve. When con­tracts trade at 99 cents, im­ply­ing that NO is the 1-cent long­shot, mak­ers ac­tively pur­chase NO con­tracts at 43% of vol­ume. Takers par­tic­i­pate at a rate of only 23%.

One might hy­poth­e­size that mak­ers ex­ploit this asym­me­try through su­pe­rior di­rec­tional fore­cast­ing—that they sim­ply know when to buy NO. The ev­i­dence does not sup­port this. When de­com­pos­ing maker per­for­mance by po­si­tion di­rec­tion, re­turns are nearly iden­ti­cal. Statistically sig­nif­i­cant dif­fer­ences emerge only at the ex­treme tails (1–10¢ and 91–99¢), and even there, ef­fect sizes are neg­li­gi­ble (Cohen’s d = 0.02–0.03). This sym­me­try is telling: mak­ers do not profit by know­ing which way to bet, but through some mech­a­nism that ap­plies equally to both di­rec­tions.
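For concreteness, here is a sketch of the pooled-standard-deviation form of Cohen's d used to size that difference; the input return lists are invented, not the article's data.

```python
# Sketch: Cohen's d with a pooled standard deviation, the effect-size
# measure cited for the maker YES-vs-NO return comparison.
import math

def cohens_d(xs, ys):
    """Standardized difference in means between two samples."""
    nx, ny = len(xs), len(ys)
    mean_x = sum(xs) / nx
    mean_y = sum(ys) / ny
    var_x = sum((x - mean_x) ** 2 for x in xs) / (nx - 1)  # sample variance
    var_y = sum((y - mean_y) ** 2 for y in ys) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2))
    return (mean_x - mean_y) / pooled_sd

# Identical return distributions give d = 0; a d of 0.02-0.03, as reported,
# means direction-of-bet explains almost none of the return variation.
```

By the usual rule of thumb, d below 0.2 is "small"; 0.02 is an order of magnitude below even that.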

The analy­sis of 72.1 mil­lion trades on Kalshi re­veals a dis­tinct mar­ket mi­crostruc­ture where wealth sys­tem­at­i­cally trans­fers from liq­uid­ity tak­ers to liq­uid­ity mak­ers. This phe­nom­e­non is dri­ven by spe­cific be­hav­ioral bi­ases, mod­u­lated by mar­ket ma­tu­rity, and con­cen­trated in cat­e­gories that elicit high emo­tional en­gage­ment.

A cen­tral ques­tion in zero-sum mar­ket analy­sis is whether prof­itable par­tic­i­pants win through su­pe­rior in­for­ma­tion (forecasting) or su­pe­rior struc­ture (market mak­ing). Our data strongly sup­ports the lat­ter. When de­com­pos­ing maker re­turns by po­si­tion di­rec­tion, the per­for­mance gap is neg­li­gi­ble: mak­ers buy­ing YES earn an ex­cess re­turn of +0.77%, while those buy­ing NO earn +1.25% (Cohen’s d ≈ 0.02). This sta­tis­ti­cal sym­me­try in­di­cates that mak­ers do not pos­sess a sig­nif­i­cant abil­ity to pick win­ners. Instead, they profit via a struc­tural ar­bi­trage: pro­vid­ing liq­uid­ity to a taker pop­u­la­tion that ex­hibits a costly pref­er­ence for af­fir­ma­tive, long­shot out­comes.

This extraction mechanism relies on the "Optimism Tax." Takers disproportionately purchase YES contracts at longshot prices, accounting for nearly half of all volume in that range, despite YES longshots underperforming NO longshots by up to 64 percentage points. Makers, therefore, do not need to predict the future; they simply need to act as the counterparty to optimism. This aligns with findings by Reichenbach and Walther (2025) on Polymarket and Whelan (2025) on Betfair, suggesting that in prediction markets, makers accommodate biased flow rather than out-forecast it.

The tem­po­ral evo­lu­tion of maker-taker re­turns chal­lenges the as­sump­tion that long­shot bias in­evitably leads to wealth trans­fer. From 2021 through 2023, the bias ex­isted, yet tak­ers main­tained pos­i­tive ex­cess re­turns. The re­ver­sal of this trend co­in­cides pre­cisely with the ex­plo­sive vol­ume growth fol­low­ing Kalshi’s October 2024 le­gal vic­tory.

The wealth trans­fer ob­served in late 2024 is a func­tion of mar­ket depth. In the plat­for­m’s in­fancy, low liq­uid­ity likely de­terred so­phis­ti­cated al­go­rith­mic mar­ket mak­ers, leav­ing the or­der book to be pop­u­lated by am­a­teurs who were sta­tis­ti­cally in­dis­tin­guish­able from tak­ers. The mas­sive vol­ume surge fol­low­ing the 2024 elec­tion in­cen­tivized the en­try of pro­fes­sional liq­uid­ity providers ca­pa­ble of sys­tem­at­i­cally cap­tur­ing the spread and ex­ploit­ing the bi­ased flow. The long­shot bias it­self may have per­sisted for years, but it was only once mar­ket depth grew suf­fi­ciently to at­tract these so­phis­ti­cated mak­ers that the bias be­came a re­li­able source of profit ex­trac­tion.

The variation in maker-taker gaps across categories reveals how participant selection shapes market efficiency. At one extreme, Finance exhibits a gap of just 0.17 percentage points: nearly perfect efficiency. At the other, World Events and Media exceed 7 percentage points. This difference cannot be explained by the longshot bias alone; it reflects who chooses to trade in each category.

Finance (0.17 pp) serves as a control group demonstrating that prediction markets can approach efficiency. Questions like "Will the S&P close above 6000?" attract participants who think in probabilities and expected values, likely the same population that trades options or follows macroeconomic data. The barrier to informed participation is high, and casual bettors, having no edge and likely recognizing it, filter themselves out.

Politics (1.02 pp) shows mod­er­ate in­ef­fi­ciency de­spite high emo­tional stakes. Political bet­tors fol­low polling closely and have prac­ticed cal­i­brat­ing be­liefs through elec­tion cy­cles. The gap is larger than Finance but far smaller than en­ter­tain­ment cat­e­gories, sug­gest­ing that po­lit­i­cal en­gage­ment, while emo­tional, does not en­tirely erode prob­a­bilis­tic rea­son­ing.

Sports (2.23 pp) rep­re­sents the modal pre­dic­tion mar­ket par­tic­i­pant. The gap is mod­er­ate but con­se­quen­tial given the cat­e­go­ry’s 72% vol­ume share. Sports bet­tors ex­hibit well-doc­u­mented bi­ases, in­clud­ing home team loy­alty, re­cency ef­fects, and nar­ra­tive at­tach­ment to star play­ers. A fan bet­ting on their team to win the cham­pi­onship is not cal­cu­lat­ing ex­pected value; they are pur­chas­ing hope.

Crypto (2.69 pp) attracts participants conditioned by the "number go up" mentality of retail crypto markets, a population overlapping with meme stock traders and NFT speculators. Questions like "Will Bitcoin reach $100k?" invite narrative-driven betting rather than probability estimation.

Entertainment, Media, and World Events (4.79–7.32 pp) ex­hibit the largest gaps and share a com­mon fea­ture: min­i­mal bar­ri­ers to per­ceived ex­per­tise. Anyone who fol­lows celebrity gos­sip feels qual­i­fied to bet on award show out­comes; any­one who reads head­lines feels in­formed about geopol­i­tics. This cre­ates a par­tic­i­pant pool that con­flates fa­mil­iar­ity with cal­i­bra­tion.

The pat­tern sug­gests ef­fi­ciency de­pends on two fac­tors: the tech­ni­cal bar­rier to in­formed par­tic­i­pa­tion and the de­gree to which ques­tions in­vite emo­tional rea­son­ing. When bar­ri­ers are high and fram­ing is clin­i­cal, mar­kets ap­proach ef­fi­ciency; when bar­ri­ers are low and fram­ing in­vites sto­ry­telling, the op­ti­mism tax reaches its max­i­mum.
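The category comparison above can be reproduced with the same gap arithmetic, grouped by category. The schema and the numbers below are illustrative stand-ins, not the article's data.

```python
# Sketch: per-category maker-taker gap from trade records
# (hypothetical schema: (category, taker_cost, taker_payout) in dollars).
from collections import defaultdict

def category_gaps(trades):
    """Maker-taker gap in percentage points per category. Maker return is
    taken as the zero-sum mirror of the dollar-weighted taker return,
    so the gap is -2x the taker return."""
    cost = defaultdict(float)
    payout = defaultdict(float)
    for category, c, p in trades:
        cost[category] += c
        payout[category] += p
    return {cat: -2.0 * 100.0 * (payout[cat] - cost[cat]) / cost[cat]
            for cat in cost}

gaps = category_gaps([
    ("Finance", 1000.0, 999.0),  # takers nearly break even: gap ~0.2 pp
    ("Media", 1000.0, 965.0),    # takers lose 3.5%: gap 7.0 pp
])
```

Sorting the resulting dict by gap size reproduces the efficiency ranking the article describes, from Finance at the bottom to Media and World Events at the top.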

While the data is robust, several limitations persist. First, the absence of unique trader IDs forces us to rely on the "Maker/Taker" classification as a proxy for "Sophisticated/Unsophisticated." While standard in microstructure literature, this imperfectly captures instances where sophisticated traders cross the spread to act on time-sensitive information. Second, we cannot directly observe the bid-ask spread in historical trade data, making it difficult to strictly decouple spread capture from exploitation of biased flow. Finally, these results are specific to a US-regulated environment; offshore venues with different leverage caps and fee structures may exhibit different dynamics.

The promise of pre­dic­tion mar­kets lies in their abil­ity to ag­gre­gate di­verse in­for­ma­tion into a sin­gle, ac­cu­rate prob­a­bil­ity. However, our analy­sis of Kalshi demon­strates that this sig­nal is of­ten dis­torted by sys­tem­atic wealth trans­fer dri­ven by hu­man psy­chol­ogy and mar­ket mi­crostruc­ture.

The market is split into two distinct populations: a taker class that systematically overpays for low-probability, affirmative outcomes, and a maker class that extracts this premium through passive liquidity provision. This dynamic is not an inherent flaw of the "wisdom of the crowd," but rather a feature of how human psychology interacts with market microstructure. When the topic is dry and quantitative (Finance), the market is efficient. When the topic allows for tribalism and hope (Sports, Entertainment), the market transforms into a mechanism for transferring wealth from the optimistic to the calculated.

* Fama, E.F., "Efficient Capital Markets: A Review of Theory and Empirical Work", Journal of Finance, 1970. Available: https://www.jstor.org/stable/2325486

* Griffith, R.M., "Odds Adjustments by American Horse-Race Bettors", American Journal of Psychology, 1949. Available: https://www.jstor.org/stable/1418469

* Reichenbach, F. & Walther, M., "Exploring Decentralized Prediction Markets: Accuracy, Skill, and Bias on Polymarket", SSRN, 2025. Available: https://ssrn.com/abstract=5910522

* Thaler, R.H. & Ziemba, W.T., "Anomalies: Parimutuel Betting Markets: Racetracks and Lotteries", Journal of Economic Perspectives, 1988. Available: https://www.aeaweb.org/articles?id=10.1257/jep.2.2.161

* Whelan, K., "Agreeing to Disagree: The Economics of Betting Exchanges", MPRA, 2025. Available: https://mpra.ub.uni-muenchen.de/126351/1/MPRA_paper_126351.pdf

* U.S. Court of Appeals for the D.C. Circuit, Kalshi, Inc. v. CFTC, Oct 2024. Available: https://media.cadc.uscourts.gov/opinions/docs/2024/10/24-5205-2077790.pdf


Read the original on jbecker.dev »
