What is the origin of the lake tank image that has become a meme?

It’s a Panzer IV D of the 31st Panzer Regiment, assigned to the 5th Panzer Division and commanded by Lt. Heinz Zobel, lost on May 13th, 1940. The “lake” is the Meuse River. The man is a German pioneer.

All credit for finding the Panzer of the Lake goes to ConeOfArc for coordinating the search, and to miller786 and their team for finding the Panzer. Full sources and details are in Panzer Of The Lake - Meuse River Theory.

The photo was taken at approximately 50.29092467073664, 4.893099128823844, on the Meuse River in modern-day Wallonia, Belgium. The tank was not recovered until much later, in 1941. The man is an unnamed German pioneer, likely photographed at the time of recovery.

Comparison of an al­ter­na­tive orig­i­nal photo and the most re­cent im­age avail­able of the lo­ca­tion (July 2020, Google Street View)

On May 12th, 1940 the 31st Panzer Regiment, as­signed to the 5th Panzer Division, at­tempted to cap­ture a bridge over the Meuse River at Yvoir. The bridge was de­mol­ished by 1st Lieutenant De Wispelaere of the Belgian Engineers.

Werner Advance Detachment (under Oberst Paul Hermann Werner, com­man­der, 31st Panzer Regiment), which be­longed to the 5th Panzer Division, un­der Rommel’s com­mand… Werner re­ceived a mes­sage from close sup­port air re­con­nais­sance in the af­ter­noon that the bridge at Yvoir (seven kilo­me­ters north of Dinant) was still in­tact. He (Werner) im­me­di­ately or­dered Leutnant [Heinz] Zobel’s ar­mored as­sault team of two ar­mored scout cars and one Panzer pla­toon to head to the bridge at top speed… Belgian en­gi­neers un­der the com­mand of 1st Lieutenant de Wispelaere had pre­pared the bridge for de­mo­li­tion while a pla­toon of Ardennes Light Infantry and el­e­ments of a French in­fantry bat­tal­ion screened the bridge… Although the last sol­diers had al­ready passed the bridge, de Wispelaere de­layed the de­mo­li­tion be­cause civil­ian refugees were still ap­proach­ing… two German ar­mored scout cars charged to­ward the bridge while the fol­low­ing three Panzers opened fire. De Wispelaere im­me­di­ately pushed the elec­tri­cal ig­ni­tion, but there was no ex­plo­sion… Wispelaere now left his shel­ter and worked the man­ual ig­ni­tion de­vice. Trying to get back to his bunker, he was hit by a burst from a German ma­chine gun and fell to the ground, mor­tally wounded. At the same time, the ex­plo­sive charge went off. After the gi­gan­tic smoke cloud had drifted away, only the rem­nants of the pil­lars could be seen.

A few kilo­me­ters south at Houx, the Germans used a por­tion of a pon­toon bridge (Bruckengerat B) rated to carry 16 tons to ferry their 25 ton tanks across.

By noon on May 13, Pioniere com­pleted an eight-ton ferry and crossed twenty anti-tank guns to the west bank, how­ever to main­tain the tempo of his di­vi­sions ad­vance, he needed ar­mor and mo­tor­ized units across the river. Rommel per­son­ally or­dered the ferry con­verted to a heav­ier six­teen-ton vari­ant to fa­cil­i­tate the cross­ing of the light Panzers and ar­mored cars. Simultaneously, the Pioniere be­gan con­struc­tion on a bridge ca­pa­ble of cross­ing the di­vi­sion’s heav­ier Panzers and mo­tor­ized units.

Major Erich Schnee in “The German Pionier: Case Study of the Combat Engineer’s Employment During Sustained Ground Combat”

On the evening of the 13th, Lt. Zobel’s tank is cross­ing. Approaching the shore, the ferry lifts, the load shifts, and the tank falls into the river.

The panzer IV of Lieutenant Zabel [sic] of the 31. Panzer Regiment of the 5. Panzer-Division, on May 13, 1940, in Houx, as good as un­der­wa­ter ex­cept for the ve­hi­cle com­man­der’s cupola. Close to the west bank, at the pon­toon cross­ing site and later site of 5. Panzer Division bridge, a 16 tonne ferry (Bruckengerat B) gave way to the ap­proach­ing shore­line, likely due to the ro­tat­ing move­ment of the panzer, which turned right when dis­em­bark­ing (the only pos­si­ble di­rec­tion to quickly leave the Meuse’s shore due to the wall cre­ated by the rail line). The tank would be fished out in 1941 dur­ing the re­con­struc­tion of the bridge.

Sometime later the pho­to­graph was taken of a German pi­o­neer in­fantry­man look­ing at the tank. Later the tank was re­cov­ered and its ul­ti­mate fate is un­known.

Available evidence suggests the soldier in the photo is a member of a pioneer or tank recovery crew, holding a Kar98k and wearing an EM/NCO’s Drill & Work uniform, more commonly known as “Drillich”.

His role is proven by the pres­ence of pon­toon fer­ries on the Meuse river, used by the 5th Panzer Division. That is also proven by his uni­form, which, as ev­i­dence sug­gests, was used dur­ing work to pre­vent dam­age to their stan­dard woolen uni­form.

An early ver­sion of the Drillich

While I can’t iden­tify the photo, I can nar­row down the tank. I be­lieve it is a Panzer IV D.

It has the short bar­relled 7.5 cm KwK 37 nar­row­ing it down to a Panzer IV Ausf. A through F1 or a Panzer III N.

Both had very sim­i­lar tur­rets, but the Panzer III N has a wider gun mant­let, a more an­gu­lar shroud, and lacked (or cov­ered) the dis­tinc­tive an­gu­lar view ports (I be­lieve they’re view ports) on ei­ther side of the tur­ret face.

This leaves the Panzer IV. The dis­tinc­tive cupola was added in model B. The ex­ter­nal gun mant­let was added in model D.

Panzer IV model D in France 1940 with the ex­ter­nal gun mant­let and periscope. source

Note the front half of the tur­ret top is smooth. There is a pro­tru­sion to the front left of the cupola (I be­lieve it’s a periscope sight) and an­other cir­cu­lar open­ing to the front right. Finally, note the large ven­ti­la­tion hatch just in front of the cupola.

Model E would elim­i­nate the ven­ti­la­tion hatch and re­place it with a fan. The periscope was re­placed with a hatch for sig­nal flags.

Panzer IV model D en­tered mass pro­duc­tion in October 1939 which means it would be too late for Poland, but could have seen ser­vice in France, Norway, or the Soviet Union.

As for the sol­dier…

The rifle has a turned down bolt handle, a bayonet lug (missing from late rifles), a distinctive disassembly disc on the side of the stock (also missing from late rifles), no front sight hood (indicative of an early rifle), and you can just about make out extra detail in the nose cap (also early). This is likely an early Karabiner 98k which is missing its cleaning rod. See Forgotten Weapons: Evolution of the Karabiner 98k, From Prewar to Kriegsmodell.

ConeOfArc posted a video The Search for Panzer of the Lake.

He broke down what he could identify about the soldier, probably German.

For the tank, he confirms it’s a Panzer IV D using criteria similar to mine, and he found two additional photos of what appears to be the same tank, claimed to be from the Western front in 1940.

He then found a Russian source claim­ing it was found in Romania at the on­set of Barbarossa in 1941.

Unfortunately that’s all for now. ConeOfArc has put a bounty of $100 US for de­fin­i­tive proof of the tank’s lo­ca­tion. More de­tail can be had on ConeOfArc’s Discord.


auonsson (@auonsson.bsky.social)

Chinese-flagged cargo ship Yi Peng 3 crossed both submarine cables C-Lion 1 and BSC at times matching when they broke.

She was shad­owed by Danish navy for a while dur­ing night and is now in Danish Straits leav­ing Baltics.

No signs of board­ing. AIS-caveats ap­ply.


Delivering SSL/TLS Everywhere

Vital per­sonal and busi­ness in­for­ma­tion flows over the Internet more fre­quently than ever, and we don’t al­ways know when it’s hap­pen­ing. It’s clear at this point that en­crypt­ing is some­thing all of us should be do­ing. Then why don’t we use TLS (the suc­ces­sor to SSL) every­where? Every browser in every de­vice sup­ports it. Every server in every data cen­ter sup­ports it. Why don’t we just flip the switch?

The chal­lenge is server cer­tifi­cates. The an­chor for any TLS-protected com­mu­ni­ca­tion is a pub­lic-key cer­tifi­cate which demon­strates that the server you’re ac­tu­ally talk­ing to is the server you in­tended to talk to. For many server op­er­a­tors, get­ting even a ba­sic server cer­tifi­cate is just too much of a has­sle. The ap­pli­ca­tion process can be con­fus­ing. It usu­ally costs money. It’s tricky to in­stall cor­rectly. It’s a pain to up­date.

Let’s Encrypt is a new free cer­tifi­cate au­thor­ity, built on a foun­da­tion of co­op­er­a­tion and open­ness, that lets every­one be up and run­ning with ba­sic server cer­tifi­cates for their do­mains through a sim­ple one-click process.

Mozilla Corporation, Cisco Systems, Inc., Akamai Technologies, Electronic Frontier Foundation, IdenTrust, Inc., and researchers at the University of Michigan are working through the Internet Security Research Group (“ISRG”), a California public benefit corporation, to deliver this much-needed infrastructure in Q2 2015. The ISRG welcomes other organizations dedicated to the same ideal of ubiquitous, open Internet security.

The key prin­ci­ples be­hind Let’s Encrypt are:

* Free: Anyone who owns a do­main can get a cer­tifi­cate val­i­dated for that do­main at zero cost.

* Automatic: The en­tire en­roll­ment process for cer­tifi­cates oc­curs pain­lessly dur­ing the server’s na­tive in­stal­la­tion or con­fig­u­ra­tion process, while re­newal oc­curs au­to­mat­i­cally in the back­ground.

* Secure: Let’s Encrypt will serve as a plat­form for im­ple­ment­ing mod­ern se­cu­rity tech­niques and best prac­tices.

* Transparent: All records of cer­tifi­cate is­suance and re­vo­ca­tion will be avail­able to any­one who wishes to in­spect them.

* Open: The au­to­mated is­suance and re­newal pro­to­col will be an open stan­dard and as much of the soft­ware as pos­si­ble will be open source.

* Cooperative: Much like the un­der­ly­ing Internet pro­to­cols them­selves, Let’s Encrypt is a joint ef­fort to ben­e­fit the en­tire com­mu­nity, be­yond the con­trol of any one or­ga­ni­za­tion.

If you want to help these or­ga­ni­za­tions in mak­ing TLS Everywhere a re­al­ity, here’s how you can get in­volved:

To learn more about the ISRG and our part­ners, check out our About page.


Analytical Anti-Aliasing

Today’s jour­ney is Anti-Aliasing and the des­ti­na­tion is Analytical Anti-Aliasing. Getting rid of ras­ter­i­za­tion jag­gies is an art-form with decades upon decades of maths, cre­ative tech­niques and non-stop in­no­va­tion. With so many years of re­search and de­vel­op­ment, there are many fla­vors.

From the simple but resource intensive SSAA, over theory dense SMAA, to using machine learning with DLAA. Same goal - vastly different approaches. We’ll take a look at how they work, before introducing a new way to look at the problem - the ✨analytical🌟 way. The perfect Anti-Aliasing exists and is simpler than you think.

Having im­ple­mented it mul­ti­ple times over the years, I’ll also share some juicy se­crets I have never read any­where be­fore.

To understand the Anti-Aliasing algorithms, we will implement them along the way! The following WebGL canvases draw a moving circle. Anti-Aliasing cannot be fully understood with just images; movement is essential. The red box has 4x zoom. Rendering is done at the native resolution of your device, which is important for judging sharpness.

Please pixel-peep to judge sharp­ness and alias­ing closely. Resolution of your screen too high to see alias­ing? Lower the res­o­lu­tion with the fol­low­ing but­tons, which will in­te­ger-scale the ren­der­ing.

Let’s start out simple. Using GLSL Shaders we tell the GPU of your device to draw a circle in the most simple and naive way possible, as seen in circle.fs above: If the length() from the middle point is bigger than 1.0, we discard the pixel.

The circle is blocky, especially at smaller resolutions. More painfully, there is strong “pixel crawling”, an artifact that’s very obvious when there is any kind of movement. As the circle moves, rows of pixels pop in and out of existence and the stair steps of the pixelation move along the side of the circle like beads of different speeds.

The low ¼ and ⅛ res­o­lu­tions aren’t just there for ex­treme pixel-peep­ing, but also to rep­re­sent small el­e­ments or ones at large dis­tance in 3D.

At lower resolutions these artifacts come together to destroy the circular form. The combination of slow movement and low resolution causes one side’s pixels to come into existence, before the other side’s pixels disappear, causing a wobble. Axis-alignment with the pixel grid causes “plateaus” of pixels at every 90° and 45° position.

Understanding the GPU code is not necessary to follow this article, but will help to grasp what’s happening when we get to the analytical bits.

4 vertices making up a quad are sent to the GPU in the vertex shader circle.vs, where they are received as attribute vec2 vtx. The coordinates are of a “unit quad”, meaning the coordinates look like the following image. With one famous exception, all GPUs use triangles, so the quad is actually made up of two triangles.

The vertices here are given to the fragment shader circle.fs via varying vec2 uv. The fragment shader is called per fragment (here fragments are pixel-sized) and the varying is interpolated linearly with perspective corrected, barycentric coordinates, giving us a uv coordinate per pixel from -1 to +1 with zero at the center.

By performing the check if (length(uv) < 1.0) we draw our color for fragments inside the circle and reject fragments outside of it. What we are doing is known as “Alpha testing”. Without diving too deeply and just to hint at what’s to come, what we have created with length(uv) is the signed distance field of a point.
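The same per-fragment check can be mimicked on the CPU. This is a minimal sketch, not the article's shader: `rasterizeCircle` and `size` are hypothetical names, and pixel centers are mapped onto the -1 to +1 range by hand instead of being interpolated by the GPU.

```javascript
// CPU-side sketch of the naive alpha test: one yes/no decision per pixel, no blending.
// `rasterizeCircle` and `size` are illustrative names, not taken from circle.fs.
function rasterizeCircle(size) {
  const rows = [];
  for (let y = 0; y < size; y++) {
    let row = "";
    for (let x = 0; x < size; x++) {
      // Map the pixel center onto the unit quad's -1..+1 range, like the interpolated uv.
      const u = ((x + 0.5) / size) * 2 - 1;
      const v = ((y + 0.5) / size) * 2 - 1;
      // Same check as the shader: keep the fragment only inside the unit circle.
      row += Math.hypot(u, v) < 1.0 ? "#" : ".";
    }
    rows.push(row);
  }
  return rows;
}

console.log(rasterizeCircle(8).join("\n"));
```

Because every pixel is either fully kept or fully rejected, the printed shape shows the same stair steps as the canvas, and they get coarser as `size` shrinks.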

Just to clarify, the circle isn’t “drawn with geometry”, which would have finite resolution of the shape, depending on how many vertices we use. It’s “drawn by the shader”.

SSAA stands for Super Sampling Anti-Aliasing. Render it bigger, downsample to be smaller. The idea is as old as 3D rendering itself. In fact, the first movies with CGI all relied on this with the most naive of implementations. One example is the 1986 movie “Flight of the Navigator”, as covered by Captain Disillusion in the video below.

1986 did it, so can we. Implemented in mere sec­onds. Easy, right?

circleSSAA.js draws at twice the resolution to a texture, which fragment shader post.fs reads from at standard resolution with GL_LINEAR to perform SSAA. So we have four input pixels for every one output pixel we draw to the screen. But it’s somewhat strange: There is definitely Anti-Aliasing happening, but less than expected.

There should be 4 steps of trans­parency, but we only get two!

Especially at lower resolutions, we can see the circle does actually have 4 steps of transparency, but mainly at the 45° “diagonals” of the circle. A circle has of course no sides, but at the axis-aligned “bottom” there are only 2 steps of transparency: fully opaque and 50% transparent; the 25% and 75% transparency steps are missing.

We aren’t sampling against the circle shape at twice the resolution, we are sampling against the quantized result of the circle shape. Twice the resolution, but discrete pixels nonetheless. The combination of pixelation and sample placement doesn’t hold enough information where we need it the most: at the axis-aligned “flat parts”.
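The missing steps can be reproduced with straight edges on the CPU. The sketch below is an illustration under assumptions, not the code from circleSSAA.js: `ssaa2xLevels` is a hypothetical helper that renders a yes/no shape at 2x and averages each 2x2 block, which is what the linear filter effectively does when it samples exactly between four texels.

```javascript
// Render a binary shape at 2x resolution, then average each 2x2 block.
// `ssaa2xLevels` and `inside` are illustrative names.
function ssaa2xLevels(inside, size) {
  const levels = new Set();
  for (let py = 0; py < size; py++) {
    for (let px = 0; px < size; px++) {
      let hits = 0;
      // The four high-res pixel centers covered by this output pixel.
      for (const oy of [0.25, 0.75]) {
        for (const ox of [0.25, 0.75]) {
          if (inside(px + ox, py + oy)) hits++;
        }
      }
      levels.add(hits / 4);
    }
  }
  return [...levels].sort((a, b) => a - b);
}

// Axis-aligned edge: both sample columns always agree, so quarter steps never occur.
const flatLevels = ssaa2xLevels((x, y) => y < 3.6, 8);
// 45° edge: a single sample can end up alone on one side of the edge.
const diagonalLevels = ssaa2xLevels((x, y) => x + y < 7.7, 8);

console.log(flatLevels, diagonalLevels);
```

For the horizontal edge only the levels 0, 0.5 and 1 show up, while the slanted edge also produces a quarter step. The edge positions 3.6 and 7.7 are arbitrary example values.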

Four times the mem­ory and four times the cal­cu­la­tion re­quire­ment, but only a half-assed re­sult.

Implementing SSAA properly is a minute craft. Here we are drawing to a 2x resolution texture and down-sampling it with linear interpolation. So actually, this implementation needs 5x the amount of VRAM: 4x for the 2x resolution texture plus 1x for the regular framebuffer. A proper implementation samples the scene multiple times and combines the result without an intermediary buffer.

With our implementation, we can’t even do more than 2xSSAA with one texture read, as linear interpolation happens only with 2x2 samples.

To com­bat axis-align­ment ar­ti­facts like with our cir­cle above, we need to place our SSAA sam­ples bet­ter. There are mul­ti­ple ways to do so, all with pros and cons. To im­ple­ment SSAA prop­erly, we need deep in­te­gra­tion with the ren­der­ing pipeline. For 3D prim­i­tives, this hap­pens be­low API or en­gine, in the realm of ven­dors and dri­vers.

In fact, some of the best im­ple­men­ta­tions were dis­cov­ered by ven­dors on ac­ci­dent, like SGSSAA. There are also ways in which SSAA can make your scene look worse. Depending on im­ple­men­ta­tion, SSAA messes with mip-map cal­cu­la­tions. As a re­sult the mip-map lod-bias may need ad­just­ment, as ex­plained in the ar­ti­cle above.

WebXR UI package three-mesh-ui, a package mature enough to be used by Meta, uses shader-based rotated grid super sampling to achieve sharp text rendering in VR, as seen in the code.
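Why sample placement matters can be shown with a single pixel and a horizontal edge. This is a hedged sketch: the rotated-grid offsets below are a commonly used pattern and are an assumption here, not values read from three-mesh-ui, and `coverage` and `distinctLevels` are hypothetical helpers.

```javascript
// Coverage of one pixel (spanning 0..1) cut by a horizontal edge at `edgeY`,
// estimated from sample offsets given relative to the pixel center.
function coverage(edgeY, offsets) {
  let hits = 0;
  for (const [, oy] of offsets) {
    if (0.5 + oy < edgeY) hits++;
  }
  return hits / offsets.length;
}

// Ordered 2x2 grid: the four samples share only two distinct heights.
const ORDERED_GRID = [[-0.25, -0.25], [0.25, -0.25], [-0.25, 0.25], [0.25, 0.25]];
// Rotated grid (assumed, commonly used offsets): four distinct heights and columns.
const ROTATED_GRID = [[0.125, 0.375], [-0.375, 0.125], [-0.125, -0.375], [0.375, -0.125]];

function distinctLevels(offsets) {
  const levels = new Set();
  // Slide the edge across the pixel in 1/16 px steps.
  for (let i = 0; i <= 16; i++) levels.add(coverage(i / 16, offsets));
  return [...levels].sort((a, b) => a - b);
}

console.log(distinctLevels(ORDERED_GRID));
console.log(distinctLevels(ROTATED_GRID));
```

With the same four samples per pixel, the rotated pattern resolves five coverage levels on an axis-aligned edge, where the ordered grid resolves three.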

MSAA is super sampling, but only at the silhouette of models, overlapping geometry, and texture edges if “Alpha to Coverage” is enabled. MSAA is implemented by the graphics card in hardware by the graphics vendors and what is supported depends on hardware. In the select box below you can choose different MSAA levels for our circle.

There is up to MSAA x64, but what is avail­able is im­ple­men­ta­tion de­fined. WebGL 1 has no sup­port, which is why the next can­vas ini­tial­izes a WebGL 2 con­text. In WebGL, NVIDIA lim­its MSAA to 8x on Windows, even if more is sup­ported, whilst on Linux no such limit is in place. On smart­phones you will only get ex­actly 4x, as dis­cussed be­low.

What is edge smoothing and how does MSAA even know what to sample against? For now we skip the shader code and implementation. First let’s take a look at MSAA’s pros and cons in general.

We rely on hardware to do the Anti-Aliasing, which obviously leads to the problem that user hardware may not support what we need. The sampling patterns MSAA uses may also do things we don’t expect. Depending on what your hardware does, you may see the circle’s edge transparency steps appearing in the “wrong order”.

When MSAA became required with the OpenGL 3 & DirectX 10 era of hardware, support was especially hit & miss. Even the latest Intel GMA iGPUs expose the OpenGL extension EXT_framebuffer_multisample, but don’t in fact support MSAA, which led to confusion. But also in more recent smartphones, support just wasn’t that clear-cut.

Mobile chips support exactly MSAAx4 and things are weird. Android will let you pick 2x, but the driver will force 4x anyways. iPhones & iPads do something rather stupid: Choosing 2x will make it 4x, but transparency will be rounded to the nearest 50% multiple, leading to double edges in our example. There is a hardware-specific reason:

Looking at mod­ern video games, one might be­lieve that MSAA is of the past. It usu­ally brings a hefty per­for­mance penalty af­ter all. Surprisingly, it’s still the king un­der cer­tain cir­cum­stances and in very spe­cific sit­u­a­tions, even per­for­mance free.

As a gamer, this goes against in­stinct…

Rahul Prasad: Use MSAA […] It’s ac­tu­ally not as ex­pen­sive on mo­bile as it is on desk­top, it’s one of the nice things you get on mo­bile. […] On some (mobile) GPUs 4x (MSAA) is free, so use it when you have it.

As ex­plained by Rahul Prasad in the above talk, in VR 4xMSAA is a must and may come free on cer­tain mo­bile GPUs. The spe­cific rea­son would de­rail the blog post, but in case you want to go down that par­tic­u­lar rab­bit hole, here is Epic Games’ Niklas Smedberg giv­ing a run-down.

In short, this is possible under the condition of forward rendering with geometry that is not too dense and the GPU having a tile-based rendering architecture, which allows the GPU to perform MSAA calculations without heavy memory access, thus latency-hiding the cost of the calculation. Here’s a deep dive, if you are interested.

MSAA gives you access to the samples, making custom MSAA filtering curves a possibility. It also allows you to merge both standard mesh-based and signed-distance-field rendering via alpha to coverage. This complex feature set made possible the most out-of-the-box thinking I ever witnessed in graphics programming:

Assassin’s Creed Unity used MSAA to render at half resolution and reconstruct only some buffers to full-res from MSAA samples, as described on page 48 of the talk “GPU-Driven Rendering Pipelines” by Ulrich Haar and Sebastian Aaltonen. Kinda like variable rate shading, but implemented with duct-tape and without vendor support.

The brain-melt­ing lengths to which graph­ics pro­gram­mers go to uti­lize hard­ware ac­cel­er­a­tion to the last drop has me some­times in awe.

In 2009 a pa­per by Alexander Reshetov struck the graph­ics pro­gram­ming world like a ton of bricks: take the blocky, aliased re­sult of the ren­dered im­age, find edges and clas­sify the pix­els into tetris-like shapes with per-shape fil­ter­ing rules and re­move the blocky edge. Anti-Aliasing based on the mor­phol­ogy of pix­els - MLAA was born.

Computationally cheap, easy to implement. Later it was refined with more emphasis on removing sub-pixel artifacts to become SMAA. It became a fan favorite, with an injector being developed early on to put SMAA into games that didn’t support it. Some considered these too blurry; the saying “vaseline on the screen” was coined.

It was the future, a sign of things to come. No more shaky hardware support. Just as Fixed-Function pipelines died in favor of programmable shaders, Anti-Aliasing too became “shader based”.

We’ll take a close look at an algorithm that was inspired by MLAA, developed by Timothy Lottes: “Fast approximate anti-aliasing”, FXAA. In fact, when it came into wide circulation, it received some incredible press. Among others, Jeff Atwood pulled neither bold fonts nor punches in his 2011 blog post, later republished by Kotaku.

Jeff Atwood: The FXAA method is so good, in fact, it makes all other forms of full-screen anti-alias­ing pretty much ob­so­lete overnight. If you have an FXAA op­tion in your game, you should en­able it im­me­di­ately and ig­nore any other AA op­tions.

Let’s see what the hype was about. The fi­nal ver­sion pub­licly re­leased was FXAA 3.11 on August 12th 2011 and the fol­low­ing demos are based on this. First, let’s take a look at our cir­cle with FXAA do­ing the Anti-Aliasing at de­fault set­tings.

A bit of a weird result. It would look good if the circle didn’t move. Perfectly smooth edges. But the circle distorts as it moves. The axis-aligned top and bottom especially have a little nub that appears and disappears. And switching to lower resolutions, the circle even loses its round shape, wobbling like PlayStation 1 graphics.

Per-pixel, FXAA considers only the 3x3 neighborhood, so it can’t possibly know that this area is part of a big shape. But it also doesn’t “just blur edges”, as often said. As explained in the official whitepaper, it finds the edge’s direction and shifts the pixel’s coordinates to let the performance free linear interpolation do the blending.

For our demo here, wrong tool for the job. Really, we did­n’t do FXAA jus­tice with our ex­am­ple. FXAA was cre­ated for an­other use case and has many set­tings and pre­sets. It was cre­ated to anti-alias more com­plex scenes. Let’s give it a fair shot!

A scene from my fa­vorite piece of soft­ware in ex­is­tence: NeoTokyo°. I cre­ated a bright area light in an NT° map and moved a bench to cre­ate an area of strong alias­ing. The fol­low­ing demo uses the aliased out­put from NeoTokyo°, cal­cu­lates the re­quired lu­mi­nance chan­nel and ap­plies FXAA. All FXAA pre­sets and set­tings at your fin­ger tips.

This has fixed resolution and will only be at your device’s native resolution if your device has no dpi scaling and the browser is at 100% zoom.

Just looking at the full FXAA 3.11 source, you can see the passion in every line. Portable across OpenGL and DirectX, a PC version, an Xbox 360 version, two finely optimized PS3 versions fighting for every GPU cycle, including shader disassembly. Such a level of professionalism and dedication, shared with the world in plain text.

The shar­ing and open­ness is why I’m in love with graph­ics pro­gram­ming.

It may be performance cheap, but only if you already have post-processing in place or do deferred shading. Especially in mobile graphics, memory access is expensive, so saving the framebuffer to perform post processing is not always a given. If you need to set up render-to-texture in order to have FXAA, then the “F” in FXAA evaporates.

In this ar­ti­cle we won’t jump into mod­ern tem­po­ral anti-alias­ing, but be­fore FXAA was even de­vel­oped, TAA was al­ready ex­per­i­mented with. In fact, FXAA was sup­posed to get a new ver­sion 4 and in­cor­po­rate tem­po­ral anti alias­ing in ad­di­tion to the stan­dard spa­tial one, but in­stead it evolved fur­ther and re­branded into TXAA.

Now we get to the good stuff. Analytical Anti-Aliasing ap­proaches the prob­lem back­wards - it knows the shape you need and draws the pixel al­ready Anti-Aliased to the screen. Whilst draw­ing the 2D or 3D shape you need, it fades the shape’s bor­der by ex­actly one pixel.

Always smooth with­out ar­ti­facts and you can ad­just the amount of fil­ter­ing. Preserves shape even at low res­o­lu­tions. No ex­tra buffers or ex­tra hard­ware re­quire­ments.

Even runs on ba­sic WebGL 1.0 or OpenGLES 2.0, with­out any ex­ten­sions.

With the above buttons, you can set the smoothing to be equal to one pixel. This gives a sharp result, but comes with the caveat that axis-aligned 90° sides may still be perceived as “flat” in specific combinations of screen resolution, size and circle position.

Filtering based on the diagonal pixel size of √2 px = 1.4142… ensures the “tip” of the circle in axis-aligned pixel rows and columns is always non-opaque. This removes the perception of flatness, but makes the shape ever so slightly more blurry.

Or in other words: as soon as the border has an opaque pixel, there is already a transparent pixel “in front” of it.
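A one-dimensional sketch can illustrate the difference between the two filter widths. It assumes a linear fade that starts at the shape's edge and runs outward over `filterWidth` pixels; that convention, and the helper names `alphaAt` and `partialPixels`, are assumptions for illustration and may differ from the demo's shader.

```javascript
// 1D sketch: pixel centers sit at k + 0.5. The shape is opaque up to `edge`
// and fades linearly to transparent over `filterWidth` pixels (assumed convention).
function alphaAt(center, edge, filterWidth) {
  const t = 1 - (center - edge) / filterWidth;
  return Math.min(1, Math.max(0, t));
}

// Count pixels in a column that are neither fully opaque nor fully transparent.
function partialPixels(edge, filterWidth, pixelCount = 8) {
  let count = 0;
  for (let k = 0; k < pixelCount; k++) {
    const a = alphaAt(k + 0.5, edge, filterWidth);
    if (a > 0 && a < 1) count++;
  }
  return count;
}

// Edge exactly on a pixel center: the worst case for a 1 px fade.
console.log(partialPixels(2.5, 1));
console.log(partialPixels(2.5, Math.SQRT2));
```

With a fade of exactly one pixel, an edge that lines up with the pixel grid can jump straight from opaque to transparent, which reads as flat. A fade wider than one pixel always catches at least one pixel center inside the fade, at the price of a slightly softer edge.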

This style of Anti-Aliasing is usu­ally im­ple­mented with 3 in­gre­di­ents:

But if you look at the code box above, you will find circle-analytical.fs having none of those. And this is the secret sauce we will look at. Before we dive into the implementation, let’s clear the elephants in the room…

In graph­ics pro­gram­ming, Analytical refers to ef­fects cre­ated by know­ing the make-up of the in­tended shape be­fore­hand and per­form­ing cal­cu­la­tions against the rigid math­e­mat­i­cal de­f­i­n­i­tion of said shape. This term is used very loosely across com­puter graph­ics, sim­i­lar to su­per sam­pling re­fer­ring to mul­ti­ple things, de­pend­ing on con­text.

Very soft soft-shadows which include contact-hardening, implemented by algorithms like percentage-closer soft shadows, are very computationally intense and require high resolution shadow maps and/or very aggressive filtering to not produce shimmering during movement.

This is why Naughty Dog’s The Last of Us relied on getting soft-shadows on the main character by calculating the shadow from a rigidly defined formula of a stretched sphere, multiple of which were arranged in the shape of the main character, shown in red. An improved implementation with shader code can be seen in this Shadertoy demo by romainguy, with the more modern capsule, as opposed to a stretched sphere.

This is now an in­te­gral part of mod­ern game en­gines, like Unreal. As op­posed to stan­dard shadow map­ping, we don’t ren­der the scene from the per­spec­tive of the light with fi­nite res­o­lu­tion. We eval­u­ate the shadow per-pixel against the math­e­mat­i­cal equa­tion of the stretched sphere or cap­sule. This makes cap­sule shad­ows an­a­lyt­i­cal.

Staying with the Last of Us, The Last of Us Part II uses the same logic for blurry real-time re­flec­tions of the main char­ac­ter, where Screen Space Reflections aren’t de­fined. Other op­tions like ray­trac­ing against the scene, or us­ing a real-time cube­map like in GTA V are ei­ther noisy and low res­o­lu­tion or high res­o­lu­tion, but low per­for­mance.

Here the re­flec­tion cal­cu­la­tion is part of the ma­te­r­ial shader, ren­der­ing against the rigidly de­fined math­e­mat­i­cal shape of the cap­sule per-pixel, mul­ti­ple of which are arranged in the shape of the main char­ac­ter. This makes cap­sule re­flec­tions an­a­lyt­i­cal.

An on­line demo with is worth at least a mil­lion…

…yeah the joke is get­ting old.

Ambient Occlusion is essential in modern rendering, bringing contact shadows and approximating global illumination. Another topic as deep as the ocean, with so many implementations. Usually implemented by some form of “raytrace a bunch of rays and blur the result”.

In this Shadertoy demo, the floor is eval­u­ated per-pixel against the rigidly de­fined math­e­mat­i­cal de­scrip­tion of the sphere to get a soft, non-noisy, non-flick­er­ing oc­clu­sion con­tri­bu­tion from the hov­er­ing ball. This im­ple­men­ta­tion is an­a­lyt­i­cal. Not just spheres, there are an­a­lyt­i­cal ap­proaches also for com­plex geom­e­try.

By ex­ten­sion, Unreal Engine has dis­tance field ap­proaches for Soft Shadows and Ambient Occlusion, though one may ar­gue, that this type of signed dis­tance field ren­der­ing does­n’t fit the de­scrip­tion of an­a­lyt­i­cal, con­sid­er­ing the dis­tance field is pre­cal­cu­lated into a 3D tex­ture.

Let’s dive into the sauce. We work with signed distance fields, where for every point that we sample, we know the distance to the desired shape. This information may be baked into a texture as done for SDF text rendering or may be derived per-pixel from a mathematical formula for simpler shapes like bezier curves or hearts.

Based on that distance we fade out the border of the shape. If we fade by the size of one pixel, we get perfectly smooth edges, without any strange side effects. The secret sauce is in the implementation and under the sauce is where the magic is. How does the shader know the size of a pixel? How do we blend based on distance?

This ap­proach gives mo­tion-sta­ble pixel-per­fec­tion, but does­n’t work with tra­di­tional ras­ter­i­za­tion. The full shape re­quires a signed dis­tance field.

Specifically, by how much do we fade the border? If we hardcode a static value, e.g. fade at 95% of the circle’s radius, we may get a pleasing result for that circle size at that screen resolution, but too much smoothing when the circle is bigger or closer to the camera and aliasing if the circle becomes small.
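The arithmetic behind that trade-off is short. The sketch below uses the 95% example value from the text; `fadeBandInPixels` and the three radii are hypothetical and chosen only for illustration.

```javascript
// Width of the fade band in screen pixels when the fade start is hardcoded
// as a fraction of the radius (0.95 is the example value from the text).
function fadeBandInPixels(radiusPx, fadeStart = 0.95) {
  return radiusPx * (1 - fadeStart);
}

// Illustrative on-screen radii: small, medium, large.
for (const radiusPx of [5, 20, 200]) {
  console.log(radiusPx, fadeBandInPixels(radiusPx).toFixed(2));
}
```

Only around a 20 px radius does the band come out at roughly one pixel. The small circle gets a quarter of a pixel, which aliases, and the large one gets about ten pixels, which blurs.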

We need to know the size of a pixel. This is in part what Screen Space derivatives were created for. Shader functions like dFdx, dFdy and fwidth allow you to get the size of a screen pixel relative to some vector. In the above circle-analyticalCompare.fs we determine by how much the distance changes via two methods:

pixelSize = fwidth(dist);

/* or */

pixelSize = length(vec2(dFdx(dist), dFdy(dist)));

Relying on Screen Space de­riv­a­tives has the ben­e­fit, that we get the pixel size de­liv­ered to us by the graph­ics pipeline. It prop­erly re­spects any trans­for­ma­tions we might throw at it, in­clud­ing 3D per­spec­tive.

The downside is that it is not supported by the WebGL 1 standard and has to be pulled in via the extension GL_OES_standard_derivatives or requires the jump to WebGL 2.

Luckily I have never wit­nessed any de­vice that sup­ported WebGL 1, but not the Screen Space de­riv­a­tives. Even the GMA based Thinkpad X200 & T500 I hard­ware mod­ded do.

Generally, there are some nasty pitfalls when using Screen Space derivatives: how the calculation happens is up to the implementation. This led to the split into dFdxFine() and dFdxCoarse() in later OpenGL revisions. The default case can be set via GL_FRAGMENT_SHADER_DERIVATIVE_HINT, but the standard hates you:

OpenGL Docs: The im­ple­men­ta­tion may choose which cal­cu­la­tion to per­form based upon fac­tors such as per­for­mance or the value of the API GL_FRAGMENT_SHADER_DERIVATIVE_HINT hint.

Why do we have stan­dards again? As a graph­ics pro­gram­mer, any­thing with hint has me trau­ma­tized.

Luckily, neither case concerns us, as the difference doesn’t show itself in the context of Anti-Aliasing. Performance-wise, dFdx and dFdy are technically free (or rather, their cost is already part of the rendering pipeline), though the pixel size calculation using length() or fwidth() is not. It is performed per-pixel.

This is why there ex­ist two ways of do­ing this: get­ting the length() of the vec­tor that dFdx and dFdy make up, a step in­volv­ing the his­tor­i­cally per­for­mance ex­pen­sive sqrt() func­tion or us­ing fwidth(), which is the ap­prox­i­ma­tion abs(dFdx()) + abs(dFdy()) of the above.

It de­pends on con­text, but on semi-mod­ern hard­ware a call to length() should be per­for­mance triv­ial though, even per-pixel.
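To get a feel for how far the two methods drift apart, here is a minimal sketch in plain Python rather than GLSL. It assumes a distance field whose gradient has length 1 on screen and only varies the direction of that gradient. The function names are made up for illustration:

```python
import math


def pixel_size_fwidth(ddx: float, ddy: float) -> float:
    """Mirrors GLSL fwidth(): abs(dFdx) + abs(dFdy)."""
    return abs(ddx) + abs(ddy)


def pixel_size_length(ddx: float, ddy: float) -> float:
    """Mirrors length(vec2(dFdx, dFdy)): the Euclidean length of the gradient."""
    return math.hypot(ddx, ddy)


def overestimate(angle_deg: float) -> float:
    """Ratio fwidth / length for a unit-length gradient pointing along
    angle_deg on screen."""
    angle = math.radians(angle_deg)
    ddx, ddy = math.cos(angle), math.sin(angle)
    return pixel_size_fwidth(ddx, ddy) / pixel_size_length(ddx, ddy)


if __name__ == "__main__":
    for angle_deg in (0, 15, 30, 45, 60, 90):
        print(f"{angle_deg:3d} deg: fwidth is {overestimate(angle_deg):.3f}x length")
```

Along the screen axes both methods agree. At 45 degrees the sum of absolute values is larger by a factor of the square root of 2, so the approximation is at its worst exactly on the diagonals.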

To show­case the dif­fer­ence, the above Radius ad­just slider works off of the Pixel size method and ad­justs the SDF dis­tance. If you go with fwidth() and a strong ra­dius shrink, you’ll see some­thing weird.

The diagonals shrink more than they should, as the approximation using addition scales too much diagonally. We’ll talk about professional implementations further below, but using fwidth() for AAA is what the Unity extension “Shapes” by Freya Holmér calls “Fast Local Anti-Aliasing”, with the following text:

Fast LAA has a slight bias in the di­ag­o­nal di­rec­tions, mak­ing cir­cu­lar shapes ap­pear ever so slightly rhom­bous and have a slightly sharper cur­va­ture in the or­thog­o­nal di­rec­tions, es­pe­cially when small. Sometimes the edges in the di­ag­o­nals are slightly fuzzy as well.

This affects our fading, which will fade more on diagonals. Luckily, we fade by the amount of one pixel and thus the difference is really only visible when flicking between the methods. What to choose depends on what you care more about: Performance or Accuracy? But what if I told you that you can have your cake and eat it too…

…Calculate it your­self! For the 2D case, this is triv­ial and eas­ily ab­stracted away. We know the size our con­text is ren­der­ing at and how big our quad is that we draw on. Calculating the size of the pixel is thus done per-ob­ject, not per-pixel. This is what hap­pens in the above cir­cle­An­a­lyt­i­cal­Com­par­i­son.js.

/* Calculate pixel size based on height.
   Simple case: Assumes Square pixels and a square quad. */
gl.uniform1f(pixelSizeCircle, (2.0 / (canvas.height / resDiv)));

No WebGL 2, no ex­ten­sions, works on an­cient hard­ware.
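The arithmetic behind that uniform, spelled out as a sketch in Python. The reading of the constant 2.0 is an assumption on my part: it fits a square quad whose local coordinates run from -1 to 1 and which fills the canvas height. `pixel_size_for_quad` is a made-up name, and `res_div` mirrors the `resDiv` resolution divider from the snippet:

```python
def pixel_size_for_quad(canvas_height_px: float, res_div: float = 1.0) -> float:
    """Size of one rendered pixel in the quad's local units.

    Assumes square pixels and a square quad that spans 2 units (-1 to 1)
    across the full canvas height. res_div lowers the rendered resolution.
    """
    rendered_height_px = canvas_height_px / res_div
    return 2.0 / rendered_height_px


if __name__ == "__main__":
    for height, div in ((500, 1), (500, 2), (1000, 1)):
        print(f"height={height}, resDiv={div}: {pixel_size_for_quad(height, div)}")
```

Because this only depends on the canvas size and the quad, it is computed once per object on the CPU and handed to the shader as a uniform, instead of being derived again for every pixel.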


Undergraduates with family income below $200,000 can expect to attend MIT tuition-free starting in 2025

Undergraduates with fam­ily in­come be­low $200,000 can ex­pect to at­tend MIT tu­ition-free start­ing next fall, thanks to newly ex­panded fi­nan­cial aid. Eighty per­cent of American house­holds meet this in­come thresh­old.

And for the 50 per­cent of American fam­i­lies with in­come be­low $100,000, par­ents can ex­pect to pay noth­ing at all to­ward the full cost of their stu­dents’ MIT ed­u­ca­tion, which in­cludes tu­ition as well as hous­ing, din­ing, fees, and an al­lowance for books and per­sonal ex­penses.

This $100,000 thresh­old is up from $75,000 this year, while next year’s $200,000 thresh­old for tu­ition-free at­ten­dance will in­crease from its cur­rent level of $140,000.

These new steps to enhance MIT’s affordability for students and families are the latest in a long history of efforts by the Institute to free up more resources to make an MIT education as affordable and accessible as possible. Toward that end, MIT has earmarked $167.3 million in need-based financial aid this year for undergraduate students, up some 70 percent from a decade ago.

“MIT’s distinctive model of education (intense, demanding, and rooted in science and engineering) has profound practical value to our students and to society,” MIT President Sally Kornbluth says. “As the Wall Street Journal recently reported, MIT is better at improving the financial futures of its graduates than any other U.S. college, and the Institute also ranks number one in the world for the employability of its graduates.”

“The cost of college is a real concern for families across the board,” Kornbluth adds, “and we’re determined to make this transformative educational experience available to the most talented students, whatever their financial circumstances. So, to every student out there who dreams of coming to MIT: Don’t let concerns about cost stand in your way.”

MIT is one of only nine colleges in the US that does not consider applicants’ ability to pay as part of its admissions process and that meets the full demonstrated financial need for all undergraduates. MIT does not expect students on aid to take loans, and, unlike many other institutions, MIT does not provide an admissions advantage to the children of alumni or donors. Indeed, 18 percent of current MIT undergraduates are first-generation college students.

“We believe MIT should be the preeminent destination for the most talented students in the country interested in an education centered on science and technology, and accessible to the best students regardless of their financial circumstances,” says Stu Schmill, MIT’s dean of admissions and student financial services.

“With the need-based financial aid we provide today, our education is much more affordable now than at any point in the past,” adds Schmill, who graduated from MIT in 1986, “even though the ‘sticker price’ of MIT is higher now than it was when I was an undergraduate.”

Last year, the median annual cost paid by an MIT undergraduate receiving financial aid was $12,938, allowing 87 percent of students in the Class of 2024 to graduate debt-free. Those who did borrow graduated with median debt of $14,844. At the same time, graduates benefit from the lifelong value of an MIT degree, with an average starting salary of $126,438 for graduates entering industry, according to MIT’s most recent survey of its graduating students.

MIT’s endowment, made up of generous gifts made by individual alumni and friends, allows the Institute to provide this level of financial aid, both now and into the future.

“Today’s announcement is a powerful expression of how much our graduates value their MIT experience,” Kornbluth says, “because our ability to provide financial aid of this scope depends on decades of individual donations to our endowment, from generations of MIT alumni and other friends. In effect, our endowment is an inter-generational gift from past MIT students to the students of today and tomorrow.”

What MIT fam­i­lies can ex­pect in 2025

As noted ear­lier: Starting next fall, for fam­i­lies with in­come be­low $100,000, with typ­i­cal as­sets, par­ents can ex­pect to pay noth­ing for the full cost of at­ten­dance, which in­cludes tu­ition, hous­ing, din­ing, fees, and al­lowances for books and per­sonal ex­penses.

For fam­i­lies with in­come from $100,000 to $200,000, with typ­i­cal as­sets, par­ents can ex­pect to pay on a slid­ing scale from $0 up to a max­i­mum of around $23,970, which is this year’s to­tal cost for MIT hous­ing, din­ing, fees, and al­lowances for books and per­sonal ex­penses.

Put another way, next year all MIT families with income below $200,000 can expect to contribute well below $27,146, which is the annual average cost for in-state students to attend and live on campus at public universities in the US, according to the Education Data Initiative. And even among families with income above $200,000, many still receive need-based financial aid from MIT, based on their unique financial circumstances. Families can use MIT’s online calculators to estimate the cost of attendance for their specific family.


Pandas but 100x faster

My main background is as a hedge fund professional, so I deal with financial data all the time, and so far the Pandas library has been an indispensable tool in my workflow and my most used Python library.

Then came along Polars (written in Rust, btw!), which shook the ground of the Python ecosystem with its speed and efficiency. You can check some of the Polars benchmarks here.

I have around 30 thousand lines of Pandas code, so you can understand why I’ve been hesitant to rewrite them to Polars, despite my enthusiasm for speed and optimization. The sheer scale of the task has led to repeated delays, as I weigh the potential benefits of a faster and more efficient library against the significant effort required to refactor my existing code.

There has al­ways been this thought in the back of my mind:

Pandas is written in C and Cython, which means the main engine is King C…there has got to be a way to optimize Pandas and leverage the C engine!

Here comes FireDucks, the answer to my prayer. It was launched in October 2023 by a team of programmers from NEC Corporation who have 30+ years of experience developing supercomputers; read the announcement here.

Quickly check the benchmark page here! I’ll let the numbers speak for themselves.

* This is the cra­zi­est bench, FireDucks even beat DuckDB! Also check Pandas & Polars ranks.

* It’s even faster than Polars!

Alrighty, those bench numbers from FireDucks look amazing, but a good rule of thumb is to never take numbers for granted…don’t trust, verify! Hence I’m making my own set of benchmarks on my machine.
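A harness for this kind of check does not need much. The sketch below is not the benchmark code behind my numbers. `best_time` and `speedup` are hypothetical helper names, and the two lambdas are stand-ins for the same workload run once with Pandas and once with FireDucks:

```python
import time
from typing import Callable


def best_time(fn: Callable[[], object], repeats: int = 5) -> float:
    """Run fn several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline: Callable[[], object], candidate: Callable[[], object],
            repeats: int = 5) -> float:
    """How many times faster candidate is than baseline (above 1 means faster)."""
    return best_time(baseline, repeats) / best_time(candidate, repeats)


if __name__ == "__main__":
    # Stand-in workloads; swap in the real Pandas and FireDucks calls here.
    slow = lambda: sorted(range(200_000, 0, -1))
    fast = lambda: sorted(range(2_000, 0, -1))
    print(f"speedup: {speedup(slow, fast):.1f}x")
```

Taking the minimum over a few repeats filters out most of the noise from other processes, which matters when the claimed difference is two orders of magnitude.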

Yes, the last two benchmark numbers are 130x and 200x faster than Pandas…are you not amused with this performance impact?! So yeah, the title of this post is not clickbait, it’s real. Another key point I need to highlight, the most important one:

you can just plug FireDucks into your existing Pandas code and expect massive speed improvements…impressive indeed!

I’m lost for words…frankly! What else would Pandas users want?

A note for that group of people bashing Python for being slow…yes, pure Python is super slow, I agree. But it has been proven time and again that it can be optimized, and once it’s been properly optimized (FireDucks, Codon, Cython, etc.) it can be speedy as well, since the Python backend uses a C engine!

Be smart, folks! No one sane would use “pure Python” for a serious workload…leverage the vast ecosystem!


Understanding the BM25 full text search algorithm

BM25, or Best Match 25, is a widely used algorithm for full text search. It is the default in Lucene/Elasticsearch and SQLite, among others. Recently, it has become common to combine full text search and vector similarity search into “hybrid search”. I wanted to understand how full text search works, and specifically BM25, so here is my attempt at understanding by re-explaining.

For a quick bit of con­text on why I’m think­ing about search al­go­rithms, I’m build­ing a per­son­al­ized con­tent feed that scours noisy sources for con­tent re­lated to your in­ter­ests. I started off us­ing vec­tor sim­i­lar­ity search and wanted to also in­clude full-text search to im­prove the han­dling of ex­act key­words (for ex­am­ple, a friend has Solid.js” as an in­ter­est and us­ing vec­tor sim­i­lar­ity search alone, that turns up more con­tent re­lated to React than Solid).

The ques­tion that mo­ti­vated this deep dive into BM25 was: can I com­pare the BM25 scores of doc­u­ments across mul­ti­ple queries to de­ter­mine which query the doc­u­ment best matches?

Initially, both ChatGPT and Claude told me no — though an­noy­ingly, af­ter do­ing this deep dive and for­mu­lat­ing a more pre­cise ques­tion, they both said yes 🤦‍♂️. Anyway, let’s get into the de­tails of BM25 and then I’ll share my con­clu­sions about this ques­tion.

At the most ba­sic level, the goal of a full text search al­go­rithm is to take a query and find the most rel­e­vant doc­u­ments from a set of pos­si­bil­i­ties.

However, we don’t really know which documents are “relevant”, so the best we can do is guess. Specifically, we can rank documents based on the probability that they are relevant to the query. (This is called The Probability Ranking Principle.)

How do we cal­cu­late the prob­a­bil­ity that a doc­u­ment is rel­e­vant?

For full text or lex­i­cal search, we are only go­ing to use qual­i­ties of the search query and each of the doc­u­ments in our col­lec­tion. (In con­trast, vec­tor sim­i­lar­ity search might use an em­bed­ding model trained on an ex­ter­nal cor­pus of text to rep­re­sent the mean­ing or se­man­tics of the query and doc­u­ment.)

BM25 uses a cou­ple of dif­fer­ent com­po­nents of the query and the set of doc­u­ments:

* Query terms: if a search query is made up of mul­ti­ple terms, BM25 will cal­cu­late a sep­a­rate score for each term and then sum them up.

* Inverse Document Frequency (IDF): how rare is a given search term across the entire document collection? We assume that common words (such as “the” or “and”) are less informative than rare words. Therefore, we want to boost the importance of rare words.

* Term fre­quency in the doc­u­ment: how many times does a search term ap­pear in a given doc­u­ment? We as­sume that more rep­e­ti­tion of a query term in a given doc­u­ment in­creases the like­li­hood that that doc­u­ment is re­lated to the term. However, BM25 also ad­justs this so that there are di­min­ish­ing re­turns each time a term is re­peated.

* Document length: how long is the given doc­u­ment com­pared to oth­ers? Long doc­u­ments might re­peat the search term more, just by virtue of be­ing longer. We don’t want to un­fairly boost long doc­u­ments, so BM25 ap­plies some nor­mal­iza­tion based on how the doc­u­men­t’s length com­pares to the av­er­age.

These four com­po­nents are what make up BM25. Now, let’s look at ex­actly how they’re used.

The BM25 al­go­rithm might look scary to non-math­e­mati­cians (my eyes glazed over the first time I saw it), but I promise, it’s not too hard to un­der­stand!

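Here is the full equation:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$

Now, let’s go through it piece-by-piece.

$$\text{score}(D, Q) = \sum_{i=1}^{n} \ldots$$

* $Q$ is the full query, potentially composed of multiple query terms

* $n$ is the number of query terms

* $q_i$ is each of the query terms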

This part of the equa­tion says: given a doc­u­ment and a query, sum up the scores for each of the query terms.

Now, let’s dig into how we cal­cu­late the score for each of the query terms.

The first component of the score calculates how rare the query term is within the whole collection of documents using the Inverse Document Frequency (IDF).

$$\text{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)$$

The key elements to focus on in this equation are:

* $N$ is the total number of documents in our collection

* $n(q_i)$ is the number of documents that contain the query term

* $N - n(q_i)$ therefore is the number of documents that do not contain the query term

In simple language, this part boils down to the following: common terms will appear in many documents. If the term appears in many documents, we will have a small number ($N - n(q_i)$, or the number of documents that do not have the term) divided by $n(q_i)$. As a result, common terms will have a small effect on the score.

In contrast, rare terms will appear in few documents so $n(q_i)$ will be small and $N - n(q_i)$ will be large. Therefore, rare terms will have a greater impact on the score.

The constants $0.5$ and $1$ are there to smooth out the equation and ensure that we don’t end up with wildly varying results if the term is either very rare or very common.
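To see the effect with numbers, here is a minimal sketch. It assumes the IDF variant with the constants 0.5 and 1 described above, that is ln((N - n + 0.5) / (n + 0.5) + 1). The collection size and the document counts are made up for illustration:

```python
import math


def idf(total_docs: int, docs_with_term: int) -> float:
    """IDF = ln((N - n + 0.5) / (n + 0.5) + 1), where N is the collection size
    and n is the number of documents that contain the term."""
    docs_without_term = total_docs - docs_with_term
    return math.log((docs_without_term + 0.5) / (docs_with_term + 0.5) + 1.0)


if __name__ == "__main__":
    total_docs = 1000  # hypothetical collection size
    for docs_with_term in (1, 10, 500, 900, 1000):
        print(f"term in {docs_with_term:4d} docs -> IDF {idf(total_docs, docs_with_term):.3f}")
```

A term that appears in a handful of documents gets a large weight, and a term that appears in almost every document gets a weight close to zero. Thanks to the added 1, the value never goes negative, even for a term that appears in every document.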

In the previous step, we looked at how rare the term is across the whole set of documents. Now, let’s look at how frequent the given query term is in the given document.

$$\frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1}$$

The terms in this equation are:

* $f(q_i, D)$ is the frequency of the given query term in the given document

* $k_1$ is a tuning parameter that is generally set between $1.2$ and $2.0$

This equation takes the term frequency within the document into account, but ensures that term repetition has diminishing returns. The intuition here is that, at some point, the document is probably related to the query term and we don’t want an infinite amount of repetition to be weighted too heavily in the score.

The $k_1$ parameter controls how quickly the returns to term repetition diminish: the slope of this curve changes based on the setting.
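In place of a plot, a few numbers show the same thing. This sketch assumes the term frequency component has the form f * (k1 + 1) / (f + k1), without length normalization. The function name and the sample frequencies are made up for illustration:

```python
def tf_component(freq: float, k1: float) -> float:
    """f * (k1 + 1) / (f + k1): rises with the term frequency f but levels
    off below k1 + 1."""
    return freq * (k1 + 1.0) / (freq + k1)


if __name__ == "__main__":
    for k1 in (1.2, 2.0):
        cells = ", ".join(f"f={f}: {tf_component(f, k1):.2f}" for f in (1, 2, 5, 10, 50))
        print(f"k1={k1} (ceiling {k1 + 1.0:.1f}): {cells}")
```

The second occurrence of a term adds far more than the eleventh. With a smaller $k_1$ the curve gets close to its ceiling after fewer repetitions, while a larger $k_1$ keeps rewarding repetition for longer.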

The last thing we need is to compare the length of the given document to the lengths of the other documents in the collection.

$$1 - b + b \cdot \frac{|D|}{\text{avgdl}}$$

From right to left this time, the parameters are:

* $|D|$ is the length of the given document

* $\text{avgdl}$ is the average document length in our collection

* $b$ is another tuning parameter that controls how much we normalize by the document length

Long documents are likely to contain the search term more frequently, just by virtue of being longer. Since we don’t want to unfairly boost long documents, this whole term is going to go in the denominator of our final equation. That is, a document that is longer than average ($\frac{|D|}{\text{avgdl}} > 1$) will be penalized by this adjustment.

$b$ can be adjusted by the user. Setting $b = 0$ turns off document length normalization, while setting $b = 1$ applies it fully. It is normally set to $0.75$.
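A quick sketch of what the normalization does, assuming the factor has the form 1 - b + b * |D| / avgdl. The document lengths below are made up for illustration:

```python
def length_norm(doc_len: float, avg_doc_len: float, b: float = 0.75) -> float:
    """1 - b + b * |D| / avgdl. Values above 1 penalize the document because
    the factor sits in the denominator of the score."""
    return 1.0 - b + b * doc_len / avg_doc_len


if __name__ == "__main__":
    avg_doc_len = 100.0  # hypothetical average length in words
    for doc_len in (50.0, 100.0, 200.0):
        row = ", ".join(f"b={b}: {length_norm(doc_len, avg_doc_len, b):.2f}"
                        for b in (0.0, 0.75, 1.0))
        print(f"|D|={doc_len:5.0f}: {row}")
```

A document of exactly average length always gets a factor of 1, whatever the value of b. With b set to 0 every document gets 1, and with b set to 1 the factor is simply the ratio of the document length to the average.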

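If we take all of the components we’ve just discussed and put them together, we arrive back at the full BM25 equation:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$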

Reading from left to right, you can see that we are sum­ming up the scores for each query term. For each, we are tak­ing the Inverse Document Frequency, mul­ti­ply­ing it by the term fre­quency in the doc­u­ment (with di­min­ish­ing re­turns), and then nor­mal­iz­ing by the doc­u­ment length.
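Putting the pieces together in code: the following is a minimal sketch, not how Lucene or SQLite implement it. It assumes pre-tokenized documents, the IDF variant with the constants 0.5 and 1, and the defaults k1 = 1.2 and b = 0.75. The tiny corpus is made up for illustration:

```python
import math
from collections import Counter
from typing import Sequence


def bm25_score(query: Sequence[str], doc: Sequence[str],
               corpus: Sequence[Sequence[str]],
               k1: float = 1.2, b: float = 0.75) -> float:
    """Score one tokenized document against a tokenized query."""
    total_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / total_docs
    term_counts = Counter(doc)
    # Document length normalization: 1 - b + b * |D| / avgdl
    norm = 1.0 - b + b * len(doc) / avgdl
    score = 0.0
    for term in query:  # one score per query term, summed up
        docs_with_term = sum(1 for d in corpus if term in d)
        idf = math.log((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0)
        freq = term_counts[term]  # 0 if the term is absent from the document
        score += idf * freq * (k1 + 1.0) / (freq + k1 * norm)
    return score


corpus = [
    "the quick brown fox".split(),
    "the lazy dog sleeps all day".split(),
    "solid js is a reactive ui library".split(),
]

if __name__ == "__main__":
    for doc in corpus:
        print(f"{bm25_score(['the', 'fox'], doc, corpus):.3f}  {' '.join(doc)}")
```

A real search engine would precompute the document frequencies and the average length in an index instead of scanning the corpus for every query term, but the arithmetic per term is the same.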

We’ve just gone through the com­po­nents of the BM25 equa­tion, but I think it’s worth paus­ing to em­pha­size two of its most in­ge­nious as­pects.

As men­tioned ear­lier, BM25 is based on an idea called the Probability Ranking Principle. In short, it says:

If re­trieved doc­u­ments are or­dered by de­creas­ing prob­a­bil­ity of rel­e­vance on the data avail­able, then the sys­tem’s ef­fec­tive­ness is the best that can be ob­tained for the data.

Unfortunately, calculating the “true” probability that a document is relevant to a query is nearly impossible.

However, we re­ally care about the or­der of the doc­u­ments more than we care about the ex­act prob­a­bil­ity. Because of this, re­searchers re­al­ized that you could sim­plify the equa­tions and make it prac­ti­ca­ble. Specifically, you could drop terms from the equa­tion that would be re­quired to cal­cu­late the full prob­a­bil­ity but where leav­ing them out would not af­fect the or­der.

Even though we are using the Probability Ranking Principle, we are actually calculating a “weight” instead of a probability.

$$w_i(tf) = \log \frac{P(TF_i = tf \mid rel) \cdot P(TF_i = 0 \mid \overline{rel})}{P(TF_i = tf \mid \overline{rel}) \cdot P(TF_i = 0 \mid rel)}$$

This equation calculates the weight using term frequencies. Specifically:

* $w_i(tf)$ is the weight for a given document

* $P(TF_i = tf \mid rel)$ is the probability that the query term would appear in the document with a given frequency ($tf$) if the document is relevant ($rel$)

The var­i­ous terms boil down to the prob­a­bil­ity that we would see a cer­tain query term fre­quency within the doc­u­ment if the doc­u­ment is rel­e­vant or not rel­e­vant, and the prob­a­bil­i­ties that the term would not ap­pear at all if the doc­u­ment is rel­e­vant or not.

The Robertson/Sparck Jones Weight is a way of estimating these probabilities but only using the counts of different sets of documents:

$$w_i^{RSJ} = \log \frac{(r_i + 0.5)(N - R - n(q_i) + r_i + 0.5)}{(n(q_i) - r_i + 0.5)(R - r_i + 0.5)}$$

The terms here are:

* $r_i$ is the number of relevant documents that contain the query term

* $N$ is the total number of documents in the collection

* $R$ is the number of relevant documents in the collection

* $n(q_i)$ is the number of documents that contain the query term

The big, glar­ing prob­lem with this equa­tion is that you first need to know which doc­u­ments are rel­e­vant to the query. How are we go­ing to get those?

The question about how to make use of the Robertson/Sparck Jones weight apparently stumped the entire research field for about 15 years. The equation was built up from a solid theoretical foundation, but relying on already having relevance information made it nearly impossible to put to use.

The BM25 de­vel­op­ers made a very clever as­sump­tion to get to the next step.

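For any given query, we can assume that most documents are not going to be relevant. If we assume that the number of relevant documents is so small as to be negligible, we can just set those numbers (the number of relevant documents, $R$, and the number of relevant documents that contain the term, $r_i$) to zero!

If we substitute this into the Robertson/Sparck Jones Weight equation, we get nearly the IDF term used in BM25:

$$\log \frac{(0 + 0.5)(N - 0 - n(q_i) + 0 + 0.5)}{(n(q_i) - 0 + 0.5)(0 - 0 + 0.5)} = \log \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}$$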
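The substitution is easy to check numerically. This sketch assumes the usual form of the Robertson/Sparck Jones weight with 0.5 smoothing, log((r + 0.5)(N - R - n + r + 0.5) / ((n - r + 0.5)(R - r + 0.5))). The function names and the counts are made up for illustration:

```python
import math


def rsj_weight(r: int, n: int, R: int, N: int) -> float:
    """Robertson/Sparck Jones weight with 0.5 smoothing.

    r: relevant documents containing the term, n: documents containing the term,
    R: relevant documents, N: all documents in the collection.
    """
    numerator = (r + 0.5) * (N - R - n + r + 0.5)
    denominator = (n - r + 0.5) * (R - r + 0.5)
    return math.log(numerator / denominator)


def idf_without_relevance(n: int, N: int) -> float:
    """What remains of the weight once r and R are set to zero."""
    return math.log((N - n + 0.5) / (n + 0.5))


if __name__ == "__main__":
    N = 1000  # hypothetical collection size
    for n in (10, 400, 900):
        print(n, rsj_weight(0, n, 0, N), idf_without_relevance(n, N))
```

For a term that appears in more than half of the documents this value is negative. As far as I understand, that is why the IDF shown earlier adds 1 inside the logarithm, and it is the reason for the “nearly”.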

Not relying on relevance information made BM25 much more useful, while keeping the same theoretical underpinnings. Victor Lavrenko described this as “a very impressive leap of faith”, and I think this is quite a neat bit of BM25’s backstory.

As I men­tioned at the start, my mo­ti­vat­ing ques­tion was whether I could com­pare BM25 scores for a doc­u­ment across queries to un­der­stand which query the doc­u­ment best matches.

In gen­eral, BM25 scores can­not be di­rectly com­pared (and this is what ChatGPT and Claude stressed to me in re­sponse to my ini­tial in­quiries 🙂‍↔️). The al­go­rithm does not pro­duce a score from 0 to 1 that is easy to com­pare across sys­tems, and it does­n’t even try to es­ti­mate the prob­a­bil­ity that a doc­u­ment is rel­e­vant. It only fo­cuses on rank­ing doc­u­ments within a cer­tain col­lec­tion in an or­der that ap­prox­i­mates the prob­a­bil­ity of their rel­e­vance to the query. A higher BM25 score means the doc­u­ment is likely to be more rel­e­vant, but it is­n’t the ac­tual prob­a­bil­ity that it is rel­e­vant.

As far as I un­der­stand now, it is pos­si­ble to com­pare the BM25 scores across queries for the same doc­u­ment within the same col­lec­tion of doc­u­ments.

My hint that this was the case was the fact that BM25 sums the scores of each query term. There should not be a semantic difference between comparing the scores for two query terms and two whole queries.

The im­por­tant caveat to stress, how­ever, is the same doc­u­ment within the same col­lec­tion. BM25 uses the IDF or rar­ity of terms as well as the av­er­age doc­u­ment length within the col­lec­tion. Therefore, you can­not nec­es­sar­ily com­pare scores across time be­cause any mod­i­fi­ca­tions to the over­all col­lec­tion could change the scores.

For my pur­poses, though, this is use­ful enough. It means that I can do a full text search for each of a user’s in­ter­ests in my col­lec­tion of con­tent and com­pare the BM25 scores to help de­ter­mine which pieces best match their in­ter­ests.

I’ll write more about rank­ing al­go­rithms and how I’m us­ing the rel­e­vance scores in fu­ture posts, but in the mean­time I hope you’ve found this back­ground on BM25 use­ful or in­ter­est­ing!

Thanks to Alex Kesling and Natan Last for feed­back on drafts of this post.

If you are in­ter­ested in div­ing fur­ther into the the­ory and his­tory of BM25, I would highly rec­om­mend watch­ing Elastic en­gi­neer Britta Weber’s 2016 talk Improved Text Scoring with BM25 and read­ing The Probabilistic Relevance Framework: BM25 and Beyond by Stephen Robertson and Hugo Zaragoza.

Also, I had ini­tially in­cluded com­par­isons be­tween BM25 and some other al­go­rithms in this post. But, as you know, it was al­ready a bit long 😅. So, you can now find those in this other post: Comparing full text search al­go­rithms: BM25, TF-IDF, and Postgres.


Building a Large Geospatial Model to Achieve Spatial Intelligence

At Niantic, we are pi­o­neer­ing the con­cept of a Large Geospatial Model that will use large-scale ma­chine learn­ing to un­der­stand a scene and con­nect it to mil­lions of other scenes glob­ally.

When you look at a familiar type of structure, whether it’s a church, a statue, or a town square, it’s fairly easy to imagine what it might look like from other angles, even if you haven’t seen it from all sides. As humans, we have “spatial understanding” that means we can fill in these details based on countless similar scenes we’ve encountered before. But for machines, this task is extraordinarily difficult. Even the most advanced AI models today struggle to visualize and infer missing parts of a scene, or to imagine a place from a new angle. This is about to change: Spatial intelligence is the next frontier of AI models.

As part of Niantic’s Visual Positioning System (VPS), we have trained more than 50 mil­lion neural net­works, with more than 150 tril­lion pa­ra­me­ters, en­abling op­er­a­tion in over a mil­lion lo­ca­tions. In our vi­sion for a Large Geospatial Model (LGM), each of these lo­cal net­works would con­tribute to a global large model, im­ple­ment­ing a shared un­der­stand­ing of ge­o­graphic lo­ca­tions, and com­pre­hend­ing places yet to be fully scanned.

The LGM will en­able com­put­ers not only to per­ceive and un­der­stand phys­i­cal spaces, but also to in­ter­act with them in new ways, form­ing a crit­i­cal com­po­nent of AR glasses and fields be­yond, in­clud­ing ro­bot­ics, con­tent cre­ation and au­tonomous sys­tems. As we move from phones to wear­able tech­nol­ogy linked to the real world, spa­tial in­tel­li­gence will be­come the world’s fu­ture op­er­at­ing sys­tem.

Large Language Models (LLMs) are having an undeniable impact on our everyday lives and across multiple industries. Trained on internet-scale collections of text, LLMs can understand and generate written language in a way that challenges our understanding of “intelligence”.

Large Geospatial Models will help com­put­ers per­ceive, com­pre­hend, and nav­i­gate the phys­i­cal world in a way that will seem equally ad­vanced. Analogous to LLMs, geospa­tial mod­els are built us­ing vast amounts of raw data: bil­lions of im­ages of the world, all an­chored to pre­cise lo­ca­tions on the globe, are dis­tilled into a large model that en­ables a lo­ca­tion-based un­der­stand­ing of space, struc­tures, and phys­i­cal in­ter­ac­tions.

The shift from text-based models to those based on 3D data mirrors the broader trajectory of AI’s growth in recent years: from understanding and generating language, to interpreting and creating static and moving images (2D vision models), and, with current research efforts increasing, towards modeling the 3D appearance of objects (3D vision models).

Geospatial mod­els are a step be­yond even 3D vi­sion mod­els in that they cap­ture 3D en­ti­ties that are rooted in spe­cific ge­o­graphic lo­ca­tions and have a met­ric qual­ity to them. Unlike typ­i­cal 3D gen­er­a­tive mod­els, which pro­duce un­scaled as­sets, a Large Geospatial Model is bound to met­ric space, en­sur­ing pre­cise es­ti­mates in scale-met­ric units. These en­ti­ties there­fore rep­re­sent next-gen­er­a­tion maps, rather than ar­bi­trary 3D as­sets. While a 3D vi­sion model may be able to cre­ate and un­der­stand a 3D scene, a geospa­tial model un­der­stands how that scene re­lates to mil­lions of other scenes, ge­o­graph­i­cally, around the world. A geospa­tial model im­ple­ments a form of geospa­tial in­tel­li­gence, where the model learns from its pre­vi­ous ob­ser­va­tions and is able to trans­fer knowl­edge to new lo­ca­tions, even if those are ob­served only par­tially.

While AR glasses with 3D graphics are still several years away from the mass market, there are opportunities for geospatial models to be integrated with audio-only or 2D display glasses. These models could guide users through the world, answer questions, provide personalized recommendations, help with navigation, and enhance real-world interactions. Large language models could be integrated so understanding and space come together, giving people the opportunity to be more informed and engaged with their surroundings and neighborhoods. Geospatial intelligence, as emerging from a large geospatial model, could also enable generation, completion or manipulation of 3D representations of the world to help build the next generation of AR experiences. Beyond gaming, Large Geospatial Models will have widespread applications, ranging from spatial planning and design to logistics, audience engagement, and remote collaboration.

Our work so far

Over the past five years, Niantic has fo­cused on build­ing our Visual Positioning System (VPS), which uses a sin­gle im­age from a phone to de­ter­mine its po­si­tion and ori­en­ta­tion us­ing a 3D map built from peo­ple scan­ning in­ter­est­ing lo­ca­tions in our games and Scaniverse.

With VPS, users can po­si­tion them­selves in the world with cen­time­ter-level ac­cu­racy. That means they can see dig­i­tal con­tent placed against the phys­i­cal en­vi­ron­ment pre­cisely and re­al­is­ti­cally. This con­tent is per­sis­tent in that it stays in a lo­ca­tion af­ter you’ve left, and it’s then share­able with oth­ers. For ex­am­ple, we re­cently started rolling out an ex­per­i­men­tal fea­ture in Pokémon GO, called Pokémon Playgrounds, where the user can place Pokémon at a spe­cific lo­ca­tion, and they will re­main there for oth­ers to see and in­ter­act with.

Niantic’s VPS is built from user scans, taken from dif­fer­ent per­spec­tives and at var­i­ous times of day, at many times dur­ing the years, and with po­si­tion­ing in­for­ma­tion at­tached, cre­at­ing a highly de­tailed un­der­stand­ing of the world. This data is unique be­cause it is taken from a pedes­trian per­spec­tive and in­cludes places in­ac­ces­si­ble to cars.

Today we have 10 mil­lion scanned lo­ca­tions around the world, and over 1 mil­lion of those are ac­ti­vated and avail­able for use with our VPS ser­vice. We re­ceive about 1 mil­lion fresh scans each week, each con­tain­ing hun­dreds of dis­crete im­ages.

As part of the VPS, we build clas­si­cal 3D vi­sion maps us­ing struc­ture from mo­tion tech­niques - but also a new type of neural map for each place. These neural mod­els, based on our re­search pa­pers ACE (2023) and ACE Zero (2024) do not rep­re­sent lo­ca­tions us­ing clas­si­cal 3D data struc­tures any­more, but en­code them im­plic­itly in the learn­able pa­ra­me­ters of a neural net­work. These net­works can swiftly com­press thou­sands of map­ping im­ages into a lean, neural rep­re­sen­ta­tion. Given a new query im­age, they of­fer pre­cise po­si­tion­ing for that lo­ca­tion with cen­time­ter-level ac­cu­racy.

Niantic has trained more than 50 mil­lion neural nets to date, where mul­ti­ple net­works can con­tribute to a sin­gle lo­ca­tion. All these net­works com­bined com­prise over 150 tril­lion pa­ra­me­ters op­ti­mized us­ing ma­chine learn­ing.

Our current neural map is a viable geospatial model, active and usable right now as part of Niantic’s VPS. It is also most certainly “large”. However, our vision of a “Large Geospatial Model” goes beyond the current system of independent local maps.

Entirely local models might lack complete coverage of their respective locations. No matter how much data we have available on a global scale, locally, it will often be sparse. The main failure mode of a local model is its inability to extrapolate beyond what it has already seen and from where it has seen it. Therefore, local models can only position camera views similar to the views they have already been trained with.

Imagine your­self stand­ing be­hind a church. Let us as­sume the clos­est lo­cal model has seen only the front en­trance of that church, and thus, it will not be able to tell you where you are. The model has never seen the back of that build­ing. But on a global scale, we have seen a lot of churches, thou­sands of them, all cap­tured by their re­spec­tive lo­cal mod­els at other places world­wide. No church is the same, but many share com­mon char­ac­ter­is­tics. An LGM is a way to ac­cess that dis­trib­uted knowl­edge.

An LGM dis­tills com­mon in­for­ma­tion in a global large-scale model that en­ables com­mu­ni­ca­tion and data shar­ing across lo­cal mod­els. An LGM would be able to in­ter­nal­ize the con­cept of a church, and, fur­ther­more, how these build­ings are com­monly struc­tured. Even if, for a spe­cific lo­ca­tion, we have only mapped the en­trance of a church, an LGM would be able to make an in­tel­li­gent guess about what the back of the build­ing looks like, based on thou­sands of churches it has seen be­fore. Therefore, the LGM al­lows for un­prece­dented ro­bust­ness in po­si­tion­ing, even from view­points and an­gles that the VPS has never seen.

The global model im­ple­ments a cen­tral­ized un­der­stand­ing of the world, en­tirely de­rived from geospa­tial and vi­sual data. The LGM ex­trap­o­lates lo­cally by in­ter­po­lat­ing glob­ally.

The process described above is similar to how humans perceive and imagine the world. As humans, we naturally recognize something we’ve seen before, even from a different angle. For example, it takes us relatively little effort to back-track our way through the winding streets of a European old town. We identify all the right junctions although we had only seen them once and from the opposite direction. This takes a level of understanding of the physical world, and cultural spaces, that is natural to us, but extremely difficult to achieve with classical machine vision technology. It requires knowledge of some basic laws of nature: the world is composed of objects which consist of solid matter and therefore have a front and a back. Appearance changes based on time of day and season. It also requires a considerable amount of cultural knowledge: the shape of many man-made objects follows specific rules of symmetry or other generic types of layouts, often dependent on the geographic region.

While early computer vision research tried to decipher some of these rules in order to hard-code them into hand-crafted systems, the consensus now is that the high degree of understanding we aspire to can realistically only be achieved via large-scale machine learning. This is what we aim for with our LGM. We have seen a first glimpse of impressive camera positioning capabilities emerging from our data in our recent research paper MicKey (2024). MicKey is a neural network able to position two camera views relative to each other, even under drastic viewpoint changes.

MicKey can han­dle even op­pos­ing shots that would take a hu­man some ef­fort to fig­ure out. MicKey was trained on a tiny frac­tion of our data — data that we re­leased to the aca­d­e­mic com­mu­nity to en­cour­age this type of re­search. MicKey is lim­ited to two-view in­puts and was trained on com­par­a­tively lit­tle data, but it still rep­re­sents a proof of con­cept re­gard­ing the po­ten­tial of an LGM. Evidently, to ac­com­plish geospa­tial in­tel­li­gence as out­lined in this text, an im­mense in­flux of geospa­tial data is needed — a kind of data not many or­ga­ni­za­tions have ac­cess to. Therefore, Niantic is in a unique po­si­tion to lead the way in mak­ing a Large Geospatial Model a re­al­ity, sup­ported by more than a mil­lion user-con­tributed scans of real-world places we re­ceive per week.

An LGM will be useful for more than mere positioning. In order to solve positioning well, the LGM has to encode rich geometrical, appearance and cultural information into scene-level features. These features will enable new ways of scene representation, manipulation and creation. Versatile large AI models like the LGM, which are useful for a multitude of downstream applications, are commonly referred to as “foundation models”.

Different types of foun­da­tion mod­els will com­ple­ment each other. LLMs will in­ter­act with mul­ti­modal mod­els, which will, in turn, com­mu­ni­cate with LGMs. These sys­tems, work­ing to­gether, will make sense of the world in ways that no sin­gle model can achieve on its own. This in­ter­con­nec­tion is the fu­ture of spa­tial com­put­ing — in­tel­li­gent sys­tems that per­ceive, un­der­stand, and act upon the phys­i­cal world.

As we move toward more scalable models, Niantic’s goal remains to lead in the development of a large geospatial model that operates wherever we can deliver novel, fun, enriching experiences to our users. And, as noted, beyond gaming, Large Geospatial Models will have widespread applications, including spatial planning and design, logistics, audience engagement, and remote collaboration.

The path from LLMs to LGMs is another step in AI’s evolution. As wearable devices like AR glasses become more prevalent, the world’s future operating system will depend on the blending of physical and digital realities to create a system for spatial computing that will put people at the center.


4.3 — blender.org

With light link­ing, lights can be set to af­fect only spe­cific ob­jects in the scene.

Shadow linking additionally gives control over which objects act as shadow blockers for a light.

This is now fea­ture par­ity with Cycles.


Z-Library Helps Students to Overcome Academic Poverty, Study Finds

A re­cent study pub­lished in the Journal of University Teaching & Learning Practice sheds light on peo­ple’s mo­ti­va­tions to use Z-Library. Expensive books and lim­ited ac­cess to aca­d­e­mic ma­te­r­ial play a key role among those sur­veyed. That in­cludes a group of Chinese post­grad­u­ate stu­dents who be­lieve that shadow li­braries help to over­come (academic) poverty.

Z-Library is one of the largest shadow li­braries on the Internet, host­ing mil­lions of books and aca­d­e­mic ar­ti­cles that can be down­loaded for free.

The site de­fied all odds over the past two years. It con­tin­ued to op­er­ate de­spite a full-fledged crim­i­nal pros­e­cu­tion by the United States, which re­sulted in the ar­rest of two al­leged op­er­a­tors in Argentina.

These two Russian de­fen­dants are wanted by the United States and ear­lier this year a judge ap­proved their ex­tra­di­tion. However, ac­cord­ing to the most re­cent in­for­ma­tion we have, the de­fen­dants es­caped house ar­rest and van­ished into thin air.

The roles of the two Russians re­main un­clear, but they were not vi­tal to the site’s sur­vival. Z-Library con­tin­ued to ex­pand its reach de­spite their le­gal trou­bles.

Z-Library users don’t seem to be hin­dered by the crim­i­nal pros­e­cu­tion ei­ther, as they con­tinue to sup­port and use the site. For many, Z-Library is sim­ply a con­ve­nient por­tal to down­load free books. For oth­ers, how­ever, it’s a vi­tal re­source to fur­ther an aca­d­e­mic ca­reer.

A recent study published in the Journal of University Teaching & Learning Practice sheds light on the latter. It looks at the ‘piracy’ motivations of Redditors and students in higher education, specifically when it comes to Z-Library.

The paper, published by Dr. Michael Day of the University of Greenwich, labels the use of Z-Library as ‘Academic Cybercrime’. The findings, however, suggest that students are more likely to draw comparisons with “Robin Hood”.

The research looks at the motivations of two groups: Reddit users and Chinese postgraduate students. Despite the vast differences between these groups, their views on Z-Library are quite similar.

The 134 Reddit responses were sampled from the Zlibrary subreddit, which is obviously biased in favor of the site. However, the reasoning goes well beyond simple “I want free stuff” arguments.

Many com­menters high­lighted that they were drawn to the site out of poverty, for ex­am­ple, or they high­lighted that Z-Library was an es­sen­tial tool to ful­fill their aca­d­e­mic goals.

“Living in a 3rd world country, 1 book would cost like 50%- 80% already of my daily wage,” one Redditor wrote.

The idea that Z-Library is a ‘necessary evil’ was also highlighted by other commenters. This includes a student who can barely make ends meet, and a homeless person, who has neither the money nor the space for physical books.

The lack of free access to all study materials, including academic journal subscriptions at university libraries, was also a key motivator. Paired with the notion that journal publishers make billions of dollars, without compensating authors, justification is found for ‘pirate’ alternatives.

“They make massive profits. So stealing from them doesn’t hurt the authors nor reviewers, just the rich greedy publishers who make millions just to design a cover and click ‘publish’,” one Redditor wrote.

The sec­ond part of the study is con­ducted in a more struc­tured for­mat among 103 post­grad­u­ate stu­dents in China. This group joined a sem­i­nar where Z-Library and the crack­down were dis­cussed. In ad­di­tion, the stu­dents par­tic­i­pated in fol­low-up fo­cus group dis­cus­sions, while also com­plet­ing a sur­vey.

Despite not all be­ing users of the shadow li­brary, 41% of the stu­dents agreed that the site’s (temporary) shut­down af­fected their abil­ity to study and find re­sources for de­gree learn­ing.

In gen­eral, the stu­dents have a fa­vor­able view to­ward Z-Library and sim­i­lar sites, and 71% ad­mit that they have used a shadow li­brary in the past. In line with China’s so­cial­ist val­ues, the over­whelm­ing ma­jor­ity of the stu­dents agreed that ac­cess to knowl­edge should be free for every­one.

While the stu­dents are aware of copy­right law, they be­lieve that the need to ac­cess knowl­edge out­weighs right­sh­old­ers’ con­cerns. This is also re­flected in the fol­low­ing re­sponses, among oth­ers.

– Z-Library, or a sim­i­lar web­site, is help­ful to stu­dents liv­ing in poverty (82% agree).

– Academic text­books are too ex­pen­sive, so I can’t af­ford to buy them as a stu­dent (67% agree).

– I have limited access to English medium academic books in my country (63% agree).

– I pre­fer to down­load books with­out re­stric­tions, like [paywalls etc.], as it is dif­fi­cult (77% agree).

All in all, Z-Library and other shadow li­braries are seen as a vi­able op­tion for ex­pen­sive or in­ac­ces­si­ble books, de­spite po­ten­tial copy­right con­cerns.

This research sheds an intriguing light on key motivations to use shadow libraries. However, the small sample sizes, selection bias, and specific characteristics of the groups mean that these findings should be interpreted with caution.

Dr. Michael Day, nonetheless, notes that the responses show clear signs of a Robin Hood mentality. Z-Library users evade the publishers’ ‘tax’ on knowledge by downloading works for free.

Overall, the pa­per sug­gests that uni­ver­si­ties and pub­lish­ers may want to re­con­sider the sta­tus quo and con­sider mak­ing more con­tent freely ac­ces­si­ble, tak­ing a page from Z-Library.

“There is need for universities to re-consider the digital divides faced by socioeconomically and digitally disadvantaged students, alongside publishers, who must rethink their approach by making open access research more commonplace and thus pro-human,” the author concludes.

The pa­per pro­vides a good ex­am­ple, as it is pub­lished un­der a Creative Commons li­cense and is freely ac­ces­si­ble to all.

Day, M. J. (2024). Digital Piracy in Higher Education: Exploring Social Media Users and Chinese Postgraduate Students Motivations for Supporting ‘Academic Cybercrime’ by Shelving ebooks from Z-Library. Journal of University Teaching and Learning Practice.


