10 interesting stories served every morning and every evening.
It’s a Panzer IV Ausf. D of the 31st Panzer Regiment (assigned to the 5th Panzer Division), commanded by Lt. Heinz Zobel and lost on May 13th, 1940. The “lake” is the Meuse River. The man is a German pioneer.
All credit for finding the Panzer of the Lake goes to ConeOfArc for coordinating the search, and to miller786 and their team for finding the Panzer. Full sources and details are in Panzer Of The Lake - Meuse River Theory.
The photo was taken at approximately coordinates 50.29092467073664, 4.893099128823844 on the Meuse River in modern Wallonia, Belgium. The tank was not recovered until much later, in 1941. The man is an unnamed German pioneer, likely photographed at the time of recovery.
Comparison of an alternative original photo and the most recent image available of the location (July 2020, Google Street View)
On May 12th, 1940 the 31st Panzer Regiment, assigned to the 5th Panzer Division, attempted to capture a bridge over the Meuse River at Yvoir. The bridge was demolished by 1st Lieutenant De Wispelaere of the Belgian Engineers.
Werner Advance Detachment (under Oberst Paul Hermann Werner, commander, 31st Panzer Regiment), which belonged to the 5th Panzer Division, under Rommel’s command… Werner received a message from close support air reconnaissance in the afternoon that the bridge at Yvoir (seven kilometers north of Dinant) was still intact. He (Werner) immediately ordered Leutnant [Heinz] Zobel’s armored assault team of two armored scout cars and one Panzer platoon to head to the bridge at top speed… Belgian engineers under the command of 1st Lieutenant de Wispelaere had prepared the bridge for demolition while a platoon of Ardennes Light Infantry and elements of a French infantry battalion screened the bridge… Although the last soldiers had already passed the bridge, de Wispelaere delayed the demolition because civilian refugees were still approaching… two German armored scout cars charged toward the bridge while the following three Panzers opened fire. De Wispelaere immediately pushed the electrical ignition, but there was no explosion… Wispelaere now left his shelter and worked the manual ignition device. Trying to get back to his bunker, he was hit by a burst from a German machine gun and fell to the ground, mortally wounded. At the same time, the explosive charge went off. After the gigantic smoke cloud had drifted away, only the remnants of the pillars could be seen.
A few kilometers south at Houx, the Germans used a portion of a pontoon bridge (Brückengerät B) rated to carry 16 tons to ferry their 25-ton tanks across.
By noon on May 13, Pioniere completed an eight-ton ferry and crossed twenty anti-tank guns to the west bank; however, to maintain the tempo of his division’s advance, he needed armor and motorized units across the river. Rommel personally ordered the ferry converted to a heavier sixteen-ton variant to facilitate the crossing of the light Panzers and armored cars. Simultaneously, the Pioniere began construction on a bridge capable of crossing the division’s heavier Panzers and motorized units.
Major Erich Schnee in “The German Pionier: Case Study of the Combat Engineer’s Employment During Sustained Ground Combat”
On the evening of the 13th, Lt. Zobel’s tank is crossing. Approaching the shore, the ferry lifts, the load shifts, and the tank falls into the river.
The panzer IV of Lieutenant Zabel [sic] of the 31. Panzer Regiment of the 5. Panzer-Division, on May 13, 1940, in Houx, as good as underwater except for the vehicle commander’s cupola. Close to the west bank, at the pontoon crossing site and later site of 5. Panzer Division bridge, a 16 tonne ferry (Bruckengerat B) gave way to the approaching shoreline, likely due to the rotating movement of the panzer, which turned right when disembarking (the only possible direction to quickly leave the Meuse’s shore due to the wall created by the rail line). The tank would be fished out in 1941 during the reconstruction of the bridge.
Sometime later, the photograph of a German pioneer looking at the tank was taken. The tank was eventually recovered, but its ultimate fate is unknown.
Available evidence suggests the soldier in the photo is a member of a pioneer/tank recovery crew, holding a Kar98k and wearing an EM/NCO’s drill and work uniform, more commonly known as “Drillich”.
His role is suggested by the presence of pontoon ferries on the Meuse River, used by the 5th Panzer Division, and by his uniform, which, as evidence suggests, was worn during work to prevent damage to the standard woolen uniform.
An early version of the Drillich
While I can’t identify the photo, I can narrow down the tank. I believe it is a Panzer IV D.
It has the short-barrelled 7.5 cm KwK 37, narrowing it down to a Panzer IV Ausf. A through F1 or a Panzer III N.
Both had very similar turrets, but the Panzer III N has a wider gun mantlet, a more angular shroud, and lacked (or covered) the distinctive angular view ports (I believe they’re view ports) on either side of the turret face.
This leaves the Panzer IV. The distinctive cupola was added in model B. The external gun mantlet was added in model D.
Panzer IV model D in France 1940 with the external gun mantlet and periscope. source
Note the front half of the turret top is smooth. There is a protrusion to the front left of the cupola (I believe it’s a periscope sight) and another circular opening to the front right. Finally, note the large ventilation hatch just in front of the cupola.
Model E would eliminate the ventilation hatch and replace it with a fan. The periscope was replaced with a hatch for signal flags.
The Panzer IV model D entered mass production in October 1939, which means it would be too late for Poland but could have seen service in France, Norway, or the Soviet Union.
As for the soldier…
The rifle has a turned-down bolt handle, a bayonet lug (missing from late rifles), a distinctive disassembly disc on the side of the stock (also missing from late rifles), no front sight hood (indicative of an early rifle), and you can just about make out extra detail in the nose cap (also early). This is likely an early Karabiner 98k which is missing its cleaning rod. See Forgotten Weapons: Evolution of the Karabiner 98k, From Prewar to Kriegsmodell.
ConeOfArc posted a video The Search for Panzer of the Lake.
He broke down what he could identify about the soldier, who is probably German.
For the tank, he confirms it’s a Panzer IV D using criteria similar to mine, and he found two additional photos of what appears to be the same tank, claimed to be from the Western Front in 1940.
He then found a Russian source claiming it was found in Romania at the onset of Barbarossa in 1941.
Unfortunately that’s all for now. ConeOfArc has put a bounty of $100 US for definitive proof of the tank’s location. More detail can be had on ConeOfArc’s Discord.
...
Read the original on history.stackexchange.com »
Chinese-flagged cargo ship Yi Peng 3 crossed both submarine cables C-Lion 1 and BSC at times matching when they broke.
She was shadowed by the Danish navy for a while during the night and is now in the Danish Straits, leaving the Baltic.
No signs of boarding. AIS-caveats apply.
...
Read the original on bsky.app »
Vital personal and business information flows over the Internet more frequently than ever, and we don’t always know when it’s happening. It’s clear at this point that encrypting is something all of us should be doing. Then why don’t we use TLS (the successor to SSL) everywhere? Every browser in every device supports it. Every server in every data center supports it. Why don’t we just flip the switch?
The challenge is server certificates. The anchor for any TLS-protected communication is a public-key certificate which demonstrates that the server you’re actually talking to is the server you intended to talk to. For many server operators, getting even a basic server certificate is just too much of a hassle. The application process can be confusing. It usually costs money. It’s tricky to install correctly. It’s a pain to update.
Let’s Encrypt is a new free certificate authority, built on a foundation of cooperation and openness, that lets everyone be up and running with basic server certificates for their domains through a simple one-click process.
Mozilla Corporation, Cisco Systems, Inc., Akamai Technologies, Electronic Frontier Foundation, IdenTrust, Inc., and researchers at the University of Michigan are working through the Internet Security Research Group (“ISRG”), a California public benefit corporation, to deliver this much-needed infrastructure in Q2 2015. The ISRG welcomes other organizations dedicated to the same ideal of ubiquitous, open Internet security.
The key principles behind Let’s Encrypt are:
* Free: Anyone who owns a domain can get a certificate validated for that domain at zero cost.
* Automatic: The entire enrollment process for certificates occurs painlessly during the server’s native installation or configuration process, while renewal occurs automatically in the background.
* Secure: Let’s Encrypt will serve as a platform for implementing modern security techniques and best practices.
* Transparent: All records of certificate issuance and revocation will be available to anyone who wishes to inspect them.
* Open: The automated issuance and renewal protocol will be an open standard and as much of the software as possible will be open source.
* Cooperative: Much like the underlying Internet protocols themselves, Let’s Encrypt is a joint effort to benefit the entire community, beyond the control of any one organization.
If you want to help these organizations in making TLS Everywhere a reality, here’s how you can get involved:
To learn more about the ISRG and our partners, check out our About page.
...
Read the original on letsencrypt.org »
Today’s journey is Anti-Aliasing and the destination is Analytical Anti-Aliasing. Getting rid of rasterization jaggies is an art-form with decades upon decades of maths, creative techniques and non-stop innovation. With so many years of research and development, there are many flavors.
From the simple but resource-intensive SSAA, over the theory-dense SMAA, to using machine learning with DLAA. Same goal - vastly different approaches. We’ll take a look at how they work, before introducing a new way to look at the problem - the ✨analytical🌟 way. The perfect Anti-Aliasing exists and is simpler than you think.
Having implemented it multiple times over the years, I’ll also share some juicy secrets I have never read anywhere before.
To understand the Anti-Aliasing algorithms, we will implement them along the way! The following WebGL canvases draw a moving circle. Anti-Aliasing cannot be fully understood with just images; movement is essential. The red box has 4x zoom. Rendering is done at the native resolution of your device, which is important for judging sharpness.
Please pixel-peep to judge sharpness and aliasing closely. Resolution of your screen too high to see aliasing? Lower the resolution with the following buttons, which will integer-scale the rendering.
Let’s start out simple. Using GLSL Shaders we tell the GPU of your device to draw a circle in the most simple and naive way possible, as seen in circle.fs above: If the length() from the middle point is bigger than 1.0, we discard the pixel.
The circle is blocky, especially at smaller resolutions. More painfully, there is strong “pixel crawling”, an artifact that’s very obvious when there is any kind of movement. As the circle moves, rows of pixels pop in and out of existence and the stair steps of the pixelation move along the side of the circle like beads of different speeds.
The low ¼ and ⅛ resolutions aren’t just there for extreme pixel-peeping, but also to represent small elements or ones at large distance in 3D.
At lower resolutions these artifacts come together to destroy the circular form. The combination of slow movement and low resolution causes one side’s pixels to come into existence, before the other side’s pixels disappear, causing a wobble. Axis-alignment with the pixel grid causes “plateaus” of pixels at every 90° and 45° position.
Understanding the GPU code is not necessary to follow this article, but it will help to grasp what’s happening when we get to the analytical bits.
4 vertices making up a quad are sent to the GPU in the vertex shader circle.vs, where they are received as attribute vec2 vtx. The coordinates are of a “unit quad”, meaning the coordinates look like the following image. With one famous exception, all GPUs use triangles, so the quad is actually made up of two triangles.
The vertices here are given to the fragment shader circle.fs via varying vec2 uv. The fragment shader is called per fragment (here fragments are pixel-sized) and the varying is interpolated linearly with perspective-corrected barycentric coordinates, giving us a uv coordinate per pixel from -1 to +1, with zero at the center.
By performing the check if (length(uv) < 1.0) we draw our color for fragments inside the circle and reject fragments outside of it. What we are doing is known as “Alpha testing”. Without diving too deeply and just to hint at what’s to come, what we have created with length(uv) is the signed distance field of a point.
Just to clarify, the circle isn’t “drawn with geometry”, which would have finite resolution of the shape, depending on how many vertices we use. It’s “drawn by the shader”.
SSAA stands for Super Sampling Anti-Aliasing. Render it bigger, downsample to be smaller. The idea is as old as 3D rendering itself. In fact, the first movies with CGI all relied on this with the most naive of implementations. One example is the 1986 movie “Flight of the Navigator”, as covered by Captain Disillusion in the video below.
1986 did it, so can we. Implemented in mere seconds. Easy, right?
circleSSAA.js draws at twice the resolution to a texture, which fragment shader post.fs reads from at standard resolution with GL_LINEAR to perform SSAA. So we have four input pixels for every one output pixel we draw to the screen. But it’s somewhat strange: There is definitely Anti-Aliasing happening, but less than expected.
There should be 4 steps of transparency, but we only get two!
Especially at lower resolutions, we can see the circle does actually have 4 steps of transparency, but mainly at the 45° “diagonals” of the circle. A circle has of course no sides, but at the axis-aligned “bottom” there are only 2 steps of transparency: Fully Opaque and 50% transparent, the 25% and 75% transparency steps are missing.
We aren’t sampling against the circle shape at twice the resolution, we are sampling against the quantized result of the circle shape. Twice the resolution, but discrete pixels nonetheless. The combination of pixelation and sample placement doesn’t hold enough information where we need it the most: at the axis-aligned “flat parts”.
Four times the memory and four times the calculation requirement, but only a half-assed result.
Implementing SSAA properly is a minute craft. Here we are drawing to a 2x resolution texture and down-sampling it with linear interpolation. So actually, this implementation needs 5x the amount of VRAM. A proper implementation samples the scene multiple times and combines the result without an intermediary buffer.
With our implementation, we can’t even do more than 2xSSAA with one texture read, as linear interpolation happens only with 2x2 samples
To combat axis-alignment artifacts like with our circle above, we need to place our SSAA samples better. There are multiple ways to do so, all with pros and cons. To implement SSAA properly, we need deep integration with the rendering pipeline. For 3D primitives, this happens below API or engine, in the realm of vendors and drivers.
In fact, some of the best implementations were discovered by vendors on accident, like SGSSAA. There are also ways in which SSAA can make your scene look worse. Depending on implementation, SSAA messes with mip-map calculations. As a result the mip-map lod-bias may need adjustment, as explained in the article above.
WebXR UI package three-mesh-ui, a package mature enough to be used by Meta, uses shader-based rotated grid super sampling to achieve sharp text rendering in VR, as seen in the code
MSAA is super sampling, but only at the silhouette of models, overlapping geometry, and texture edges if “Alpha to Coverage” is enabled. MSAA is implemented by the graphics card in-hardware by the graphics vendors and what is supported depends on hardware. In the select box below you can choose different MSAA levels for our circle.
There is up to MSAA x64, but what is available is implementation defined. WebGL 1 has no support, which is why the next canvas initializes a WebGL 2 context. In WebGL, NVIDIA limits MSAA to 8x on Windows, even if more is supported, whilst on Linux no such limit is in place. On smartphones you will only get exactly 4x, as discussed below.
What is edge smoothing and how does MSAA even know what to sample against? For now we skip the shader code and implementation. First let’s take a look at MSAA’s pros and cons in general.
We rely on hardware to do the Anti-Aliasing, which obviously leads to the problem that user hardware may not support what we need. The sampling patterns MSAA uses may also do things we don’t expect. Depending on what your hardware does, you may see the circle’s edge transparency steps appearing “in the wrong order”.
When MSAA became required with the OpenGL 3 & DirectX 10 era of hardware, support was especially hit & miss. Even the latest Intel GMA iGPUs expose the OpenGL extension EXT_framebuffer_multisample but don’t in fact support MSAA, which led to confusion. But also in more recent smartphones, support just wasn’t that clear-cut.
Mobile chips support exactly MSAAx4 and things are weird. Android will let you pick 2x, but the driver will force 4x anyways. iPhones & iPads do something rather stupid: choosing 2x will make it 4x, but transparency will be rounded to the nearest 50% multiple, leading to double edges in our example. There is a hardware-specific reason:
Looking at modern video games, one might believe that MSAA is of the past. It usually brings a hefty performance penalty after all. Surprisingly, it’s still the king under certain circumstances and in very specific situations, even performance free.
As a gamer, this goes against instinct…
Rahul Prasad: Use MSAA […] It’s actually not as expensive on mobile as it is on desktop, it’s one of the nice things you get on mobile. […] On some (mobile) GPUs 4x (MSAA) is free, so use it when you have it.
As explained by Rahul Prasad in the above talk, in VR 4xMSAA is a must and may come free on certain mobile GPUs. The specific reason would derail the blog post, but in case you want to go down that particular rabbit hole, here is Epic Games’ Niklas Smedberg giving a run-down.
In short, this is possible under the condition of forward rendering with geometry that is not too dense and a GPU with a tile-based rendering architecture, which allows the GPU to perform MSAA calculations without heavy memory access, hiding the cost of the calculation behind latency. Here’s a deep dive, if you are interested.
MSAA gives you access to the samples, making custom MSAA filtering curves a possibility. It also allows you to merge both standard mesh-based and signed-distance-field rendering via alpha to coverage. This complex feature set made possible the most out-of-the-box thinking I ever witnessed in graphics programming:
Assassin’s Creed Unity used MSAA to render at half resolution and reconstruct only some buffers to full-res from MSAA samples, as described on page 48 of the talk “GPU-Driven Rendering Pipelines” by Ulrich Haar and Sebastian Aaltonen. Kinda like variable rate shading, but implemented with duct-tape and without vendor support.
The brain-melting lengths to which graphics programmers go to utilize hardware acceleration to the last drop has me sometimes in awe.
In 2009 a paper by Alexander Reshetov struck the graphics programming world like a ton of bricks: take the blocky, aliased result of the rendered image, find edges and classify the pixels into tetris-like shapes with per-shape filtering rules and remove the blocky edge. Anti-Aliasing based on the morphology of pixels - MLAA was born.
Computationally cheap, easy to implement. Later it was refined with more emphasis on removing sub-pixel artifacts to become SMAA. It became a fan favorite, with an injector being developed early on to put SMAA into games that didn’t support it. Some considered these too blurry, the saying “vaseline on the screen” was coined.
It was the future, a sign of things to come. No more shaky hardware support. Just as Fixed-Function pipelines died in favor of programmable shaders, Anti-Aliasing too became “shader-based”.
We’ll take a close look at an algorithm that was inspired by MLAA, developed by Timothy Lottes. “Fast approximate anti-aliasing”, FXAA. In fact, when it came into wide circulation, it received some incredible press. Among others, Jeff Atwood pulled neither bold fonts nor punches in his 2011 blog post, later republished by Kotaku.
Jeff Atwood: The FXAA method is so good, in fact, it makes all other forms of full-screen anti-aliasing pretty much obsolete overnight. If you have an FXAA option in your game, you should enable it immediately and ignore any other AA options.
Let’s see what the hype was about. The final version publicly released was FXAA 3.11 on August 12th 2011 and the following demos are based on this. First, let’s take a look at our circle with FXAA doing the Anti-Aliasing at default settings.
A bit of a weird result. It would look good if the circle didn’t move. Perfectly smooth edges. But the circle distorts as it moves. The axis-aligned top and bottom especially have a little nub that appears and disappears. And switching to lower resolutions, the circle even loses its round shape, wobbling like PlayStation 1 graphics.
Per-pixel, FXAA considers only the 3x3 neighborhood, so it can’t possibly know that this area is part of a big shape. But it also doesn’t just “blur edges”, as is often said. As explained in the official whitepaper, it finds the edge’s direction and shifts the pixel’s coordinates to let the performance-free linear interpolation do the blending.
For our demo here, wrong tool for the job. Really, we didn’t do FXAA justice with our example. FXAA was created for another use case and has many settings and presets. It was created to anti-alias more complex scenes. Let’s give it a fair shot!
A scene from my favorite piece of software in existence: NeoTokyo°. I created a bright area light in an NT° map and moved a bench to create an area of strong aliasing. The following demo uses the aliased output from NeoTokyo°, calculates the required luminance channel and applies FXAA. All FXAA presets and settings at your finger tips.
This has a fixed resolution and will only be at your device’s native resolution if your device has no dpi scaling and the browser is at 100% zoom.
Just looking at the full FXAA 3.11 source, you can see the passion in every line. Portable across OpenGL and DirectX, a PC version, an Xbox 360 version, two finely optimized PS3 versions fighting for every GPU cycle, including shader disassembly. Such a level of professionalism and dedication, shared with the world in plain text.
The sharing and openness is why I’m in love with graphics programming.
It may be cheap performance-wise, but only if you already have post-processing in place or do deferred shading. Especially in mobile graphics, memory access is expensive, so saving the framebuffer to perform post-processing is not always a given. If you need to set up render-to-texture in order to have FXAA, then the “F” in FXAA evaporates.
In this article we won’t jump into modern temporal anti-aliasing, but before FXAA was even developed, TAA was already experimented with. In fact, FXAA was supposed to get a new version 4 and incorporate temporal anti aliasing in addition to the standard spatial one, but instead it evolved further and rebranded into TXAA.
Now we get to the good stuff. Analytical Anti-Aliasing approaches the problem backwards - it knows the shape you need and draws the pixel already Anti-Aliased to the screen. Whilst drawing the 2D or 3D shape you need, it fades the shape’s border by exactly one pixel.
Always smooth without artifacts and you can adjust the amount of filtering. Preserves shape even at low resolutions. No extra buffers or extra hardware requirements.
Even runs on basic WebGL 1.0 or OpenGLES 2.0, without any extensions.
With the above buttons, you can set the smoothing to be equal to one pixel. This gives a sharp result, but comes with the caveat that axis-aligned 90° sides may still be perceived as “flat” in specific combinations of screen resolution, size and circle position.
Filtering based on the diagonal pixel size of √2 px = 1.4142… ensures the “tip” of the circle in axis-aligned pixel rows and columns is always non-opaque. This removes the perception of flatness, but makes the shape ever so slightly more blurry.
Or in other words: as soon as the border has an opaque pixel, there is already a transparent pixel “in front” of it.
This style of Anti-Aliasing is usually implemented with 3 ingredients:
But if you look at the code box above, you will find circle-analytical.fs having none of those. And this is the secret sauce we will look at. Before we dive into the implementation, let’s clear the elephants in the room…
In graphics programming, Analytical refers to effects created by knowing the make-up of the intended shape beforehand and performing calculations against the rigid mathematical definition of said shape. This term is used very loosely across computer graphics, similar to super sampling referring to multiple things, depending on context.
Very soft soft-shadows which include contact-hardening, implemented by algorithms like percentage-closer soft shadows are very computationally intense and require both high resolution shadow maps and/or very aggressive filtering to not produce shimmering during movement.
This is why Naughty Dog’s The Last of Us relied on getting soft shadows on the main character by calculating the shadow from a rigidly defined formula of a stretched sphere, multiple of which were arranged in the shape of the main character, shown in red. An improved implementation with shader code can be seen in this Shadertoy demo by romainguy, with the more modern capsule, as opposed to a stretched sphere.
This is now an integral part of modern game engines, like Unreal. As opposed to standard shadow mapping, we don’t render the scene from the perspective of the light with finite resolution. We evaluate the shadow per-pixel against the mathematical equation of the stretched sphere or capsule. This makes capsule shadows analytical.
Staying with The Last of Us, The Last of Us Part II uses the same logic for blurry real-time reflections of the main character, where Screen Space Reflections aren’t defined. Other options, like raytracing against the scene or using a real-time cubemap as in GTA V, are either noisy and low resolution, or high resolution but low performance.
Here the reflection calculation is part of the material shader, rendering against the rigidly defined mathematical shape of the capsule per-pixel, multiple of which are arranged in the shape of the main character. This makes capsule reflections analytical.
An online demo is worth at least a million…
…yeah the joke is getting old.
Ambient Occlusion is essential in modern rendering, bringing contact shadows and approximating global illumination. Another topic as deep as the ocean, with so many implementations. Usually implemented by some form of “raytrace a bunch of rays and blur the result”.
In this Shadertoy demo, the floor is evaluated per-pixel against the rigidly defined mathematical description of the sphere to get a soft, non-noisy, non-flickering occlusion contribution from the hovering ball. This implementation is analytical. Not just spheres, there are analytical approaches also for complex geometry.
By extension, Unreal Engine has distance field approaches for Soft Shadows and Ambient Occlusion, though one may argue, that this type of signed distance field rendering doesn’t fit the description of analytical, considering the distance field is precalculated into a 3D texture.
Let’s dive into the sauce. We work with signed distance fields, where for every point that we sample, we know the distance to the desired shape. This information may be baked into a texture as done for SDF text rendering or maybe be derived per-pixel from a mathematical formula for simpler shapes like bezier curves or hearts.
Based on that distance we fade out the border of the shape. If we fade by the size of one pixel, we get perfectly smooth edges, without any strange side effects. The secret sauce is in the implementation and under the sauce is where the magic is. How does the shader know the size of a pixel? How do we blend based on distance?
This approach gives motion-stable pixel-perfection, but doesn’t work with traditional rasterization. The full shape requires a signed distance field.
Specifically, by how much do we fade the border? If we hardcode a static value, eg. fade at 95% of the circle’s radius, we may get a pleasing result for that circle size at that screen resolution, but too much smoothing when the circle is bigger or closer to the camera and aliasing if the circle becomes small.
We need to know the size of a pixel. This is in part what Screen Space derivatives were created for. Shader functions like dFdx, dFdy and fwidth allow you to get the size of a screen pixel relative to some vector. In the above circle-analyticalCompare.fs we determine by how much the distance changes via two methods:
pixelSize = fwidth(dist);
/* or */
pixelSize = length(vec2(dFdx(dist), dFdy(dist)));
Relying on Screen Space derivatives has the benefit, that we get the pixel size delivered to us by the graphics pipeline. It properly respects any transformations we might throw at it, including 3D perspective.
The down side is that it is not supported by the WebGL 1 standard and has to be pulled in via the extension GL_OES_standard_derivatives or requires the jump to WebGL 2.
Luckily I have never witnessed any device that supported WebGL 1, but not the Screen Space derivatives. Even the GMA based Thinkpad X200 & T500 I hardware modded do.
Generally, there are some nasty pitfalls when using Screen Space derivatives: how the calculation happens is up to the implementation. This led to the split into dFdxFine() and dFdxCoarse() in later OpenGL revisions. The default case can be set via GL_FRAGMENT_SHADER_DERIVATIVE_HINT, but the standard hates you:
OpenGL Docs: The implementation may choose which calculation to perform based upon factors such as performance or the value of the API GL_FRAGMENT_SHADER_DERIVATIVE_HINT hint.
Why do we have standards again? As a graphics programmer, anything with hint has me traumatized.
Luckily, neither case concerns us, as the difference doesn’t show itself in the context of Anti-Aliasing. Performance-wise, dFdx and dFdy are technically free (or rather, their cost is already part of the rendering pipeline), though the pixel size calculation using length() or fwidth() is not. It is performed per-pixel.
This is why there exist two ways of doing this: getting the length() of the vector that dFdx and dFdy make up, a step involving the historically performance-expensive sqrt() function, or using fwidth(), which is the approximation abs(dFdx()) + abs(dFdy()) of the above.
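A quick worked example of the difference: along a 45° edge the distance changes equally in x and y, say by d per pixel in each direction. length() then gives √(d² + d²) = √2·d ≈ 1.41·d, while fwidth() gives |d| + |d| = 2·d, overestimating the pixel size by roughly 41% on diagonals. That overestimate is exactly the diagonal bias discussed next.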
It depends on context, but on semi-modern hardware a call to length() should be performance trivial though, even per-pixel.
To showcase the difference, the above Radius adjust slider works off of the Pixel size method and adjusts the SDF distance. If you go with fwidth() and a strong radius shrink, you’ll see something weird.
The diagonals shrink more than they should, as the approximation using addition scales too much diagonally. We’ll talk about professional implementations in a moment, but using fwidth() for AAA is what the Unity extension “Shapes” by Freya Holmér calls “Fast Local Anti-Aliasing”, described with the following text:
Fast LAA has a slight bias in the diagonal directions, making circular shapes appear ever so slightly rhombous and have a slightly sharper curvature in the orthogonal directions, especially when small. Sometimes the edges in the diagonals are slightly fuzzy as well.
This affects our fading, which will fade more on diagonals. Luckily, we fade by the amount of one pixel and thus the difference is really only visible when flicking between the methods. What to choose depends on what you care more about: performance or accuracy? But what if I told you, you can have your cake and eat it too…
…Calculate it yourself! For the 2D case, this is trivial and easily abstracted away. We know the size our context is rendering at and how big our quad is that we draw on. Calculating the size of the pixel is thus done per-object, not per-pixel. This is what happens in the above circleAnalyticalComparison.js.
/* Calculate pixel size based on height.
Simple case: Assumes Square pixels and a square quad. */
gl.uniform1f(pixelSizeCircle, (2.0 / (canvas.height / resDiv)));
No WebGL 2, no extensions, works on ancient hardware.
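To tie the ingredients together, here is a minimal sketch of the same analytical fade, written in Python/NumPy rather than the article’s GLSL so it runs standalone; the variable names are mine. It builds the circle’s signed distance field per pixel, uses the per-object pixel size (equivalent to the length()-style value, since the gradient of a radial distance has magnitude 1), and fades the border by exactly one pixel.

import numpy as np

res = 64                              # square output resolution
px = 2.0 / res                        # size of one pixel in uv units, uv spans -1..+1

# Pixel centers mapped to uv coordinates, like varying vec2 uv in circle.fs
ys, xs = np.mgrid[0:res, 0:res]
uv_x = (xs + 0.5) * px - 1.0
uv_y = (ys + 0.5) * px - 1.0

dist = np.sqrt(uv_x**2 + uv_y**2)     # signed distance field of the unit circle, i.e. length(uv)

pixel_size = px                       # per-object pixel size; equals length(dFdx, dFdy) here
# Fade the border by exactly one pixel: 1 inside, 0 outside, a linear ramp in between.
alpha = np.clip((1.0 - dist) / pixel_size, 0.0, 1.0)

Swapping the linear ramp for a smoothstep() changes the falloff curve, but not the one-pixel width of the transition.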
...
Read the original on blog.frost.kiwi »
Undergraduates with family income below $200,000 can expect to attend MIT tuition-free starting next fall, thanks to newly expanded financial aid. Eighty percent of American households meet this income threshold.
And for the 50 percent of American families with income below $100,000, parents can expect to pay nothing at all toward the full cost of their students’ MIT education, which includes tuition as well as housing, dining, fees, and an allowance for books and personal expenses.
This $100,000 threshold is up from $75,000 this year, while next year’s $200,000 threshold for tuition-free attendance will increase from its current level of $140,000.
These new steps to enhance MIT’s affordability for students and families are the latest in a long history of efforts by the Institute to free up more resources to make an MIT education as affordable and accessible as possible. Toward that end, MIT has earmarked $167.3 million in need-based financial aid this year for undergraduate students — up some 70 percent from a decade ago.
“MIT’s distinctive model of education — intense, demanding, and rooted in science and engineering — has profound practical value to our students and to society,” MIT President Sally Kornbluth says. “As the Wall Street Journal recently reported, MIT is better at improving the financial futures of its graduates than any other U.S. college, and the Institute also ranks number one in the world for the employability of its graduates.”
“The cost of college is a real concern for families across the board,” Kornbluth adds, “and we’re determined to make this transformative educational experience available to the most talented students, whatever their financial circumstances. So, to every student out there who dreams of coming to MIT: Don’t let concerns about cost stand in your way.”
MIT is one of only nine colleges in the US that does not consider applicants’ ability to pay as part of its admissions process and that meets the full demonstrated financial need for all undergraduates. MIT does not expect students on aid to take loans, and, unlike many other institutions, MIT does not provide an admissions advantage to the children of alumni or donors. Indeed, 18 percent of current MIT undergraduates are first-generation college students.
“We believe MIT should be the preeminent destination for the most talented students in the country interested in an education centered on science and technology, and accessible to the best students regardless of their financial circumstances,” says Stu Schmill, MIT’s dean of admissions and student financial services.
“With the need-based financial aid we provide today, our education is much more affordable now than at any point in the past,” adds Schmill, who graduated from MIT in 1986, “even though the ‘sticker price’ of MIT is higher now than it was when I was an undergraduate.”
Last year, the median annual cost paid by an MIT undergraduate receiving financial aid was $12,938, allowing 87 percent of students in the Class of 2024 to graduate debt-free. Those who did borrow graduated with median debt of $14,844. At the same time, graduates benefit from the lifelong value of an MIT degree, with an average starting salary of $126,438 for graduates entering industry, according to MIT’s most recent survey of its graduating students.
MIT’s endowment — made up of generous gifts made by individual alumni and friends — allows the Institute to provide this level of financial aid, both now and into the future.
“Today’s announcement is a powerful expression of how much our graduates value their MIT experience,” Kornbluth says, “because our ability to provide financial aid of this scope depends on decades of individual donations to our endowment, from generations of MIT alumni and other friends. In effect, our endowment is an inter-generational gift from past MIT students to the students of today and tomorrow.”
What MIT families can expect in 2025
As noted earlier: Starting next fall, for families with income below $100,000, with typical assets, parents can expect to pay nothing for the full cost of attendance, which includes tuition, housing, dining, fees, and allowances for books and personal expenses.
For families with income from $100,000 to $200,000, with typical assets, parents can expect to pay on a sliding scale from $0 up to a maximum of around $23,970, which is this year’s total cost for MIT housing, dining, fees, and allowances for books and personal expenses.
Put another way, next year all MIT families with income below $200,000 can expect to contribute well below $27,146, which is the annual average cost for in-state students to attend and live on campus at public universities in the US, according to the Education Data Initiative. And even among families with income above $200,000, many still receive need-based financial aid from MIT, based on their unique financial circumstances. Families can use MIT’s online calculators to estimate the cost of attendance for their specific family.
...
Read the original on news.mit.edu »
My main background is as a hedge fund professional, so I deal with finance data all the time, and so far the Pandas library has been an indispensable tool in my workflow and my most used Python library.
Then along came Polars (written in Rust, btw!), which shook the ground of the Python ecosystem due to its speed and efficiency; you can check some of the Polars benchmarks here.
I have around 30 thousand lines of Pandas code, so you can understand why I’ve been hesitant to rewrite them in Polars, despite my enthusiasm for speed and optimization. The sheer scale of the task has led to repeated delays, as I weigh the potential benefits of a faster and more efficient library against the significant effort required to refactor my existing code.
There has always been this thought in the back of my mind:
Pandas is written in C and Cython, which means the main engine is King C…there’s got to be a way to optimize Pandas and leverage the C engine!
Here comes FireDucks, the answer to my prayer. It was launched in October 2023 by a team of programmers from NEC Corporation who have 30+ years of experience developing supercomputers; read the announcement here.
Quickly check the benchmark page here! I’ll let the numbers speak for themselves.
* This is the craziest bench, FireDucks even beat DuckDB! Also check Pandas & Polars ranks.
* It’s even faster than Polars!
Alrighty, those bench numbers from FireDucks look amazing, but a good rule of thumb is to never take numbers for granted…don’t trust, verify! Hence I’m making my own set of benchmarks on my machine.
Yes, the last two benchmark numbers are 130x and 200x faster than Pandas…are you not amused by this performance impact?! So yeah, the title of this post is not clickbait, it’s real. Another key point I need to highlight, the most important one:
you can just plug FireDucks into your existing Pandas code and expect massive speed improvements..impressive indeed!
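For illustration, a drop-in swap would look something like the sketch below. The module path fireducks.pandas follows FireDucks’ documentation at the time of writing, and the toy DataFrame is my own example; verify the exact import against the current docs before relying on it.

# import pandas as pd             # before
import fireducks.pandas as pd     # after: same API, the rest of the code stays unchanged

# Hypothetical toy example; existing Pandas code should run as-is.
df = pd.DataFrame({"ticker": ["AAA", "BBB", "AAA"], "pnl": [1.0, -0.5, 2.5]})
print(df.groupby("ticker")["pnl"].sum())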
I’m lost for words..frankly! What else would Pandas users want?
A note for those groups of people bashing Python for being slow…yes, pure Python is super slow, I agree. But it has been proven time and again that it can be optimized, and once it’s been properly optimized (FireDucks, Codon, Cython, etc.) it can be speedy as well, since the Python backend uses a C engine!
Be smart, folks! No one sane would use “pure Python” for serious workloads…leverage the vast ecosystem!
...
Read the original on hwisnu.bearblog.dev »
BM25, or Best Match 25, is a widely used algorithm for full text search. It is the default in Lucene/Elasticsearch and SQLite, among others. Recently, it has become common to combine full text search and vector similarity search into “hybrid search”. I wanted to understand how full text search works, and specifically BM25, so here is my attempt at understanding by re-explaining.
For a quick bit of context on why I’m thinking about search algorithms, I’m building a personalized content feed that scours noisy sources for content related to your interests. I started off using vector similarity search and wanted to also include full-text search to improve the handling of exact keywords (for example, a friend has “Solid.js” as an interest and using vector similarity search alone, that turns up more content related to React than Solid).
The question that motivated this deep dive into BM25 was: can I compare the BM25 scores of documents across multiple queries to determine which query the document best matches?
Initially, both ChatGPT and Claude told me no — though annoyingly, after doing this deep dive and formulating a more precise question, they both said yes 🤦♂️. Anyway, let’s get into the details of BM25 and then I’ll share my conclusions about this question.
At the most basic level, the goal of a full text search algorithm is to take a query and find the most relevant documents from a set of possibilities.
However, we don’t really know which documents are “relevant”, so the best we can do is guess. Specifically, we can rank documents based on the probability that they are relevant to the query. (This is called The Probability Ranking Principle.)
How do we calculate the probability that a document is relevant?
For full text or lexical search, we are only going to use qualities of the search query and each of the documents in our collection. (In contrast, vector similarity search might use an embedding model trained on an external corpus of text to represent the meaning or semantics of the query and document.)
BM25 uses a couple of different components of the query and the set of documents:
* Query terms: if a search query is made up of multiple terms, BM25 will calculate a separate score for each term and then sum them up.
* Inverse Document Frequency (IDF): how rare is a given search term across the entire document collection? We assume that common words (such as “the” or “and”) are less informative than rare words. Therefore, we want to boost the importance of rare words.
* Term frequency in the document: how many times does a search term appear in a given document? We assume that more repetition of a query term in a given document increases the likelihood that that document is related to the term. However, BM25 also adjusts this so that there are diminishing returns each time a term is repeated.
* Document length: how long is the given document compared to others? Long documents might repeat the search term more, just by virtue of being longer. We don’t want to unfairly boost long documents, so BM25 applies some normalization based on how the document’s length compares to the average.
These four components are what make up BM25. Now, let’s look at exactly how they’re used.
The BM25 algorithm might look scary to non-mathematicians (my eyes glazed over the first time I saw it), but I promise, it’s not too hard to understand!
Here is the full equation:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$
Now, let’s go through it piece-by-piece.
* Q is the full query, potentially composed of multiple query terms
* n is the number of query terms
* q_i is each of the query terms
This part of the equation says: given a document and a query, sum up the scores for each of the query terms.
Now, let’s dig into how we calculate the score for each of the query terms.
The first component of the score calculates how rare the query term is within the whole collection of documents using the Inverse Document Frequency (IDF):

$$\text{IDF}(q_i) = \ln\left(1 + \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}\right)$$

The key elements to focus on in this equation are:
* N is the total number of documents in our collection
* n(q_i) is the number of documents that contain the query term
* N - n(q_i) therefore is the number of documents that do not contain the query term
In simple language, this part boils down to the following: common terms will appear in many documents. If the term appears in many documents, we will have a small number (N - n(q_i), or the number of documents that do not have the term) divided by a large n(q_i). As a result, common terms will have a small effect on the score.
In contrast, rare terms will appear in few documents, so n(q_i) will be small and N - n(q_i) will be large. Therefore, rare terms will have a greater impact on the score.
The 0.5 constants (and the 1 added inside the logarithm) are there to smooth out the equation and ensure that we don’t end up with wildly varying results if the term is either very rare or very common.
In the previous step, we looked at how rare the term is across the whole set of documents. Now, let’s look at how frequent the given query term is in the given document:

$$\frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1}$$

The terms in this equation are:
* f(q_i, D) is the frequency of the given query term in the given document
* k_1 is a tuning parameter that is generally set between 1.2 and 2.0
This equation takes the term frequency within the document into effect, but ensures that term repetition has diminishing returns. The intuition here is that, at some point, the document is probably related to the query term and we don’t want an infinite amount of repetition to be weighted too heavily in the score.
The parameter k_1 controls how quickly the returns to term repetition diminish. You can see how the slope changes based on this setting.
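For instance, with k_1 = 1.5 (and ignoring length normalization for the moment), the factor f·(k_1 + 1) / (f + k_1) is 1.0 at f = 1, about 1.43 at f = 2, and only about 2.17 at f = 10, approaching but never reaching the ceiling of k_1 + 1 = 2.5. Each additional occurrence of the term contributes less than the one before.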
The last thing we need is to compare the length of the given document to the lengths of the other documents in the collection:

$$1 - b + b \cdot \frac{|D|}{\text{avgdl}}$$

From right to left this time, the parameters are:
* |D| is the length of the given document
* avgdl is the average document length in our collection
* b is another tuning parameter that controls how much we normalize by the document length
Long documents are likely to contain the search term more frequently, just by virtue of being longer. Since we don’t want to unfairly boost long documents, this whole term is going to go in the denominator of our final equation. That is, a document that is longer than average (|D| > avgdl) will be penalized by this adjustment.
b can be adjusted by the user. Setting b = 0 turns off document length normalization, while setting b = 1 applies it fully. It is normally set to 0.75.
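For example, with b = 0.75, a document twice the average length gets a normalization factor of 1 - 0.75 + 0.75·2 = 1.75, which inflates the denominator of the term-frequency component and lowers the score, while a document half the average length gets 1 - 0.75 + 0.75·0.5 = 0.625, which raises it.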
If we take all of the components we’ve just discussed and put them together, we arrive back at the full BM25 equation:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$
Reading from left to right, you can see that we are summing up the scores for each query term. For each, we are taking the Inverse Document Frequency, multiplying it by the term frequency in the document (with diminishing returns), and then normalizing by the document length.
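As a concrete illustration (my own sketch, not code from this post), here is a minimal Python implementation of the scoring function just described. The function and argument names are mine, it uses typical defaults of k_1 = 1.5 and b = 0.75, and it assumes the collection statistics (document frequencies, average document length) have already been computed elsewhere.

import math
from collections import Counter

def bm25_score(query_terms, doc_tokens, doc_freqs, num_docs, avgdl, k1=1.5, b=0.75):
    """Score one document against a query using the BM25 formula above.

    query_terms: list of query term strings
    doc_tokens:  the document, already tokenized into a list of terms
    doc_freqs:   dict mapping term -> number of documents containing it
    num_docs:    total number of documents in the collection (N)
    avgdl:       average document length, in tokens, across the collection
    """
    tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_terms:
        n_q = doc_freqs.get(term, 0)
        # Inverse Document Frequency with 0.5 smoothing ("+1" inside the log, Lucene-style)
        idf = math.log(1 + (num_docs - n_q + 0.5) / (n_q + 0.5))
        f = tf.get(term, 0)
        # Term frequency with diminishing returns, normalized by document length
        score += idf * (f * (k1 + 1)) / (f + k1 * (1 - b + b * doc_len / avgdl))
    return score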
We’ve just gone through the components of the BM25 equation, but I think it’s worth pausing to emphasize two of its most ingenious aspects.
As mentioned earlier, BM25 is based on an idea called the Probability Ranking Principle. In short, it says:
If retrieved documents are ordered by decreasing probability of relevance on the data available, then the system’s effectiveness is the best that can be obtained for the data.
Unfortunately, calculating the “true” probability that a document is relevant to a query is nearly impossible.
However, we really care about the order of the documents more than we care about the exact probability. Because of this, researchers realized that you could simplify the equations and make it practicable. Specifically, you could drop terms from the equation that would be required to calculate the full probability but where leaving them out would not affect the order.
Even though we are using the Probability Ranking Principle, we are actually calculating a “weight” instead of a probability:

$$w = \log \frac{p(tf \mid R) \cdot p(0 \mid \bar{R})}{p(tf \mid \bar{R}) \cdot p(0 \mid R)}$$

This equation calculates the weight using term frequencies. Specifically:
* w is the weight for a given document
* p(tf | R) is the probability that the query term would appear in the document with a given frequency (tf) if the document is relevant (R)
The various terms boil down to the probability that we would see a certain query term frequency within the document if the document is relevant or not relevant, and the probabilities that the term would not appear at all if the document is relevant or not.
The Robertson/Sparck Jones Weight is a way of estimating these probabilities using only the counts of different sets of documents:

$$w^{RSJ} = \log \frac{(r + 0.5)/(R - r + 0.5)}{(n - r + 0.5)/(N - n - R + r + 0.5)}$$

The terms here are:
* r is the number of relevant documents that contain the query term
* N is the total number of documents in the collection
* R is the number of relevant documents in the collection
* n is the number of documents that contain the query term
The big, glaring problem with this equation is that you first need to know which documents are relevant to the query. How are we going to get those?
The question of how to make use of the Robertson/Sparck Jones weight apparently stumped the entire research field for about 15 years. The equation was built up from a solid theoretical foundation, but relying on already having relevance information made it nearly impossible to put to use.
The BM25 developers made a very clever assumption to get to the next step.
For any given query, we can assume that most documents are not going to be relevant. If we assume that the number of relevant documents is so small as to be negligible, we can just set those numbers to zero!
If we substitute this into the Robertson/Sparck Jones Weight equation, we get nearly the IDF term used in BM25:

$$\log \frac{(0 + 0.5)/(0 - 0 + 0.5)}{(n - 0 + 0.5)/(N - n - 0 + 0 + 0.5)} = \log \frac{N - n + 0.5}{n + 0.5}$$
Not relying on relevance information made BM25 much more useful, while keeping the same theoretical underpinnings. Victor Lavrenko described this as a “very impressive leap of faith”, and I think this is quite a neat bit of BM25′s backstory.
As I mentioned at the start, my motivating question was whether I could compare BM25 scores for a document across queries to understand which query the document best matches.
In general, BM25 scores cannot be directly compared (and this is what ChatGPT and Claude stressed to me in response to my initial inquiries 🙂↔️). The algorithm does not produce a score from 0 to 1 that is easy to compare across systems, and it doesn’t even try to estimate the probability that a document is relevant. It only focuses on ranking documents within a certain collection in an order that approximates the probability of their relevance to the query. A higher BM25 score means the document is likely to be more relevant, but it isn’t the actual probability that it is relevant.
As far as I understand now, it is possible to compare the BM25 scores across queries for the same document within the same collection of documents.
My hint that this was the case was the fact that BM25 sums the scores of each query term. There should not be a semantic difference between comparing the scores for two query terms and two whole queries.
The important caveat to stress, however, is the same document within the same collection. BM25 uses the IDF or rarity of terms as well as the average document length within the collection. Therefore, you cannot necessarily compare scores across time because any modifications to the overall collection could change the scores.
For my purposes, though, this is useful enough. It means that I can do a full text search for each of a user’s interests in my collection of content and compare the BM25 scores to help determine which pieces best match their interests.
I’ll write more about ranking algorithms and how I’m using the relevance scores in future posts, but in the meantime I hope you’ve found this background on BM25 useful or interesting!
Thanks to Alex Kesling and Natan Last for feedback on drafts of this post.
If you are interested in diving further into the theory and history of BM25, I would highly recommend watching Elastic engineer Britta Weber’s 2016 talk Improved Text Scoring with BM25 and reading The Probabilistic Relevance Framework: BM25 and Beyond by Stephen Robertson and Hugo Zaragoza.
Also, I had initially included comparisons between BM25 and some other algorithms in this post. But, as you know, it was already a bit long 😅. So, you can now find those in this other post: Comparing full text search algorithms: BM25, TF-IDF, and Postgres.
...
Read the original on emschwartz.me »
At Niantic, we are pioneering the concept of a Large Geospatial Model that will use large-scale machine learning to understand a scene and connect it to millions of other scenes globally.
When you look at a familiar type of structure — whether it’s a church, a statue, or a town square — it’s fairly easy to imagine what it might look like from other angles, even if you haven’t seen it from all sides. As humans, we have “spatial understanding” that means we can fill in these details based on countless similar scenes we’ve encountered before. But for machines, this task is extraordinarily difficult. Even the most advanced AI models today struggle to visualize and infer missing parts of a scene, or to imagine a place from a new angle. This is about to change: Spatial intelligence is the next frontier of AI models.
As part of Niantic’s Visual Positioning System (VPS), we have trained more than 50 million neural networks, with more than 150 trillion parameters, enabling operation in over a million locations. In our vision for a Large Geospatial Model (LGM), each of these local networks would contribute to a global large model, implementing a shared understanding of geographic locations, and comprehending places yet to be fully scanned.
The LGM will enable computers not only to perceive and understand physical spaces, but also to interact with them in new ways, forming a critical component of AR glasses and fields beyond, including robotics, content creation and autonomous systems. As we move from phones to wearable technology linked to the real world, spatial intelligence will become the world’s future operating system.
Large Language Models (LLMs) are having an undeniable impact on our everyday lives and across multiple industries. Trained on internet-scale collections of text, LLMs can understand and generate written language in a way that challenges our understanding of “intelligence”.
Large Geospatial Models will help computers perceive, comprehend, and navigate the physical world in a way that will seem equally advanced. Analogous to LLMs, geospatial models are built using vast amounts of raw data: billions of images of the world, all anchored to precise locations on the globe, are distilled into a large model that enables a location-based understanding of space, structures, and physical interactions.
The shift from text-based models to those based on 3D data mirrors the broader trajectory of AI’s growth in recent years: from understanding and generating language, to interpreting and creating static and moving images (2D vision models), and, with current research efforts increasing, towards modeling the 3D appearance of objects (3D vision models).
Geospatial models are a step beyond even 3D vision models in that they capture 3D entities that are rooted in specific geographic locations and have a metric quality to them. Unlike typical 3D generative models, which produce unscaled assets, a Large Geospatial Model is bound to metric space, ensuring precise estimates in scale-metric units. These entities therefore represent next-generation maps, rather than arbitrary 3D assets. While a 3D vision model may be able to create and understand a 3D scene, a geospatial model understands how that scene relates to millions of other scenes, geographically, around the world. A geospatial model implements a form of geospatial intelligence, where the model learns from its previous observations and is able to transfer knowledge to new locations, even if those are observed only partially.
While AR glasses with 3D graphics are still several years away from the mass market, there are opportunities for geospatial models to be integrated with audio-only or 2D display glasses. These models could guide users through the world, answer questions, provide personalized recommendations, help with navigation, and enhance real-world interactions. Large language models could be integrated so that understanding and space come together, giving people the opportunity to be more informed and engaged with their surroundings and neighborhoods. Geospatial intelligence, as emerging from a large geospatial model, could also enable generation, completion or manipulation of 3D representations of the world to help build the next generation of AR experiences. Beyond gaming, Large Geospatial Models will have widespread applications, including spatial planning and design, logistics, audience engagement, and remote collaboration.
Our work so far
Over the past five years, Niantic has focused on building our Visual Positioning System (VPS), which uses a single image from a phone to determine its position and orientation using a 3D map built from people scanning interesting locations in our games and Scaniverse.
With VPS, users can position themselves in the world with centimeter-level accuracy. That means they can see digital content placed against the physical environment precisely and realistically. This content is persistent in that it stays in a location after you’ve left, and it’s then shareable with others. For example, we recently started rolling out an experimental feature in Pokémon GO, called Pokémon Playgrounds, where the user can place Pokémon at a specific location, and they will remain there for others to see and interact with.
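To make “positioning from a single image” concrete, the sketch below shows the classical recipe such systems build on: match 2D features in the query photo against 3D points in a prebuilt map, then solve a Perspective-n-Point problem with RANSAC to recover the camera’s position and orientation. This is a generic OpenCV illustration with synthetic correspondences, not Niantic’s implementation; the intrinsics, poses, and point sets here are made up for the example.

```python
# Generic single-image localization sketch (not Niantic's VPS): given 2D-3D
# correspondences between a query photo and a prebuilt 3D map, recover the
# camera pose with PnP + RANSAC. All inputs below are synthetic stand-ins.
import numpy as np
import cv2

# Fabricated 3D map points (metres, world frame), pinhole intrinsics, and
# zero lens distortion.
object_points = np.random.uniform(-5, 5, (100, 3)).astype(np.float32)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)

# Project the map points with an arbitrary "true" pose to fabricate the 2D
# observations that a feature matcher would normally provide.
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.5, -0.1, 8.0])
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, dist)
image_points = image_points.reshape(-1, 2).astype(np.float32)

# Estimate the camera pose from the 2D-3D matches, robust to outliers.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, dist,
    reprojectionError=3.0, iterationsCount=1000)

if ok:
    R, _ = cv2.Rodrigues(rvec)              # rotation: world -> camera
    cam_pos_world = (-R.T @ tvec).ravel()   # camera centre in map coordinates
    print("estimated camera position:", cam_pos_world)
    print("inliers:", 0 if inliers is None else len(inliers))
```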
Niantic’s VPS is built from user scans, taken from different perspectives, at various times of day, and across many times of year, with positioning information attached, creating a highly detailed understanding of the world. This data is unique because it is taken from a pedestrian perspective and includes places inaccessible to cars.
Today we have 10 million scanned locations around the world, and over 1 million of those are activated and available for use with our VPS service. We receive about 1 million fresh scans each week, each containing hundreds of discrete images.
As part of the VPS, we build classical 3D vision maps using structure-from-motion techniques, as well as a new type of neural map for each place. These neural models, based on our research papers ACE (2023) and ACE Zero (2024), no longer represent locations using classical 3D data structures, but encode them implicitly in the learnable parameters of a neural network. These networks can swiftly compress thousands of mapping images into a lean neural representation. Given a new query image, they offer precise positioning for that location with centimeter-level accuracy.
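The idea behind such neural maps, in the spirit of scene coordinate regression (which the ACE line of work builds on), is that a network learns to map image features to 3D coordinates in the scene, so the map lives entirely in the network’s weights; at query time the regressed 2D-3D matches are fed to PnP as in the sketch above. The toy PyTorch example below only illustrates that concept with fabricated data; the actual ACE architectures, feature extractors, and training procedures differ.

```python
# Toy "neural map" in the spirit of scene coordinate regression (illustrative
# only, not the ACE models): an MLP learns to map per-pixel feature vectors to
# 3D scene coordinates, so the scene is encoded implicitly in its weights.
import torch
import torch.nn as nn

class NeuralMap(nn.Module):
    """Regresses a 3D scene coordinate from a per-pixel feature vector."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 3),                 # (x, y, z) in the scene frame
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.head(feats)

# Fabricated training data; in a real pipeline the features come from mapping
# images and the target coordinates from structure-from-motion or depth.
feats = torch.randn(10_000, 128)
coords = torch.randn(10_000, 3) * 5.0

model = NeuralMap()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.SmoothL1Loss()                    # robust to outlier targets

for step in range(200):
    loss = loss_fn(model(feats), coords)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Query time: regress scene coordinates for pixels of a new image, then feed
# the resulting 2D-3D correspondences to PnP + RANSAC to get the camera pose.
with torch.no_grad():
    query_scene_coords = model(torch.randn(2048, 128))   # shape (2048, 3)
```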
Niantic has trained more than 50 million neural nets to date, where multiple networks can contribute to a single location. All these networks combined comprise over 150 trillion parameters optimized using machine learning.
Our current neural map is a viable geospatial model, active and usable right now as part of Niantic’s VPS. It is also most certainly “large”. However, our vision of a “Large Geospatial Model” goes beyond the current system of independent local maps.
Entirely local models might lack complete coverage of their respective locations. No matter how much data we have available on a global scale, locally it will often be sparse. The main failure mode of a local model is its inability to extrapolate beyond what it has already seen, and from where it has seen it. Therefore, local models can only position camera views similar to the views they have already been trained on.
Imagine yourself standing behind a church. Let us assume the closest local model has seen only the front entrance of that church, and thus, it will not be able to tell you where you are. The model has never seen the back of that building. But on a global scale, we have seen a lot of churches, thousands of them, all captured by their respective local models at other places worldwide. No church is the same, but many share common characteristics. An LGM is a way to access that distributed knowledge.
An LGM distills common information in a global large-scale model that enables communication and data sharing across local models. An LGM would be able to internalize the concept of a church, and, furthermore, how these buildings are commonly structured. Even if, for a specific location, we have only mapped the entrance of a church, an LGM would be able to make an intelligent guess about what the back of the building looks like, based on thousands of churches it has seen before. Therefore, the LGM allows for unprecedented robustness in positioning, even from viewpoints and angles that the VPS has never seen.
The global model implements a centralized understanding of the world, entirely derived from geospatial and visual data. The LGM extrapolates locally by interpolating globally.
The process described above is similar to how humans perceive and imagine the world. As humans, we naturally recognize something we’ve seen before, even from a different angle. For example, it takes us relatively little effort to backtrack our way through the winding streets of a European old town. We identify all the right junctions even though we have only seen them once, and from the opposing direction. This takes a level of understanding of the physical world, and of cultural spaces, that is natural to us, but extremely difficult to achieve with classical machine vision technology. It requires knowledge of some basic laws of nature: the world is composed of objects which consist of solid matter and therefore have a front and a back. Appearance changes based on time of day and season. It also requires a considerable amount of cultural knowledge: the shapes of many man-made objects follow specific rules of symmetry or other generic types of layouts — often dependent on the geographic region.
While early computer vision research tried to decipher some of these rules in order to hard-code them into hand-crafted systems, it is now consensus that such a high degree of understanding as we aspire to can realistically only be achieved via large-scale machine learning. This is what we aim for with our LGM. We have seen a first glimpse of impressive camera positioning capabilities emerging from our data in our recent research paper MicKey (2024). MicKey is a neural network able to position two camera views relative to each other, even under drastic viewpoint changes.
MicKey can handle even opposing shots that would take a human some effort to figure out. MicKey was trained on a tiny fraction of our data — data that we released to the academic community to encourage this type of research. MicKey is limited to two-view inputs and was trained on comparatively little data, but it still represents a proof of concept regarding the potential of an LGM. Evidently, to accomplish geospatial intelligence as outlined in this text, an immense influx of geospatial data is needed — a kind of data not many organizations have access to. Therefore, Niantic is in a unique position to lead the way in making a Large Geospatial Model a reality, supported by more than a million user-contributed scans of real-world places we receive per week.
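For contrast with what MicKey learns end to end, the classical (non-learned) route to relative pose between two views is to match keypoints, estimate the essential matrix, and decompose it, which recovers the rotation but only the direction of the translation, not its metric length. The OpenCV sketch below shows that baseline; it is not MicKey, and the unknown scale is exactly the gap that metric approaches aim to close.

```python
# Classical two-view relative pose baseline (not MicKey): ORB keypoints ->
# descriptor matching -> essential matrix -> (R, t). The translation t is a
# unit direction; its metric scale cannot be recovered this way.
import numpy as np
import cv2

def relative_pose(img_a, img_b, K):
    """Estimate rotation and translation direction from image A to image B.

    img_a, img_b: grayscale images as numpy arrays; K: 3x3 intrinsic matrix.
    """
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, desc_a = orb.detectAndCompute(img_a, None)
    kp_b, desc_b = orb.detectAndCompute(img_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(desc_a, desc_b)

    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])

    E, inlier_mask = cv2.findEssentialMat(pts_a, pts_b, K,
                                          method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=inlier_mask)
    return R, t
```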
An LGM will be useful for more than mere positioning. In order to solve positioning well, the LGM has to encode rich geometrical, appearance and cultural information into scene-level features. These features will enable new ways of scene representation, manipulation and creation. Versatile large AI models like the LGM, which are useful for a multitude of downstream applications, are commonly referred to as “foundation models”.
Different types of foundation models will complement each other. LLMs will interact with multimodal models, which will, in turn, communicate with LGMs. These systems, working together, will make sense of the world in ways that no single model can achieve on its own. This interconnection is the future of spatial computing — intelligent systems that perceive, understand, and act upon the physical world.
As we move toward more scalable models, Niantic’s goal remains to lead in the development of a large geospatial model that operates wherever we can deliver novel, fun, enriching experiences to our users. And, as noted, beyond gaming Large Geospatial Models will have widespread applications, including spatial planning and design, logistics, audience engagement, and remote collaboration.
The path from LLMs to LGMs is another step in AI’s evolution. As wearable devices like AR glasses become more prevalent, the world’s future operating system will depend on the blending of physical and digital realities to create a system for spatial computing that will put people at the center.
...
Read the original on nianticlabs.com »
With light linking, lights can be set to affect only specific objects in the scene.
Shadow linking additionally gives control over which objects act as shadow blockers for a light.
This brings feature parity with Cycles.
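For scripted setups, the same linking can also be driven from Python. The snippet below is a minimal sketch assuming the Blender 4.x API, where a light object exposes light_linking.receiver_collection and light_linking.blocker_collection; the object names are placeholders, and the exact property names should be checked against the release notes for your Blender version.

```python
# Minimal bpy sketch of light and shadow linking (assumes the Blender 4.x
# Object.light_linking API; "KeyLight", "Hero" and "Pillar" are placeholder
# object names that must already exist in the scene).
import bpy

light = bpy.data.objects["KeyLight"]
hero = bpy.data.objects["Hero"]        # the only object this light should affect
pillar = bpy.data.objects["Pillar"]    # the only object that should block it

# Light linking: the light illuminates only members of its receiver collection.
receivers = bpy.data.collections.new("KeyLight Receivers")
receivers.objects.link(hero)
light.light_linking.receiver_collection = receivers

# Shadow linking: only members of the blocker collection block this light.
blockers = bpy.data.collections.new("KeyLight Blockers")
blockers.objects.link(pillar)
light.light_linking.blocker_collection = blockers
```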
...
Read the original on www.blender.org »
A recent study published in the Journal of University Teaching & Learning Practice sheds light on people’s motivations to use Z-Library. Expensive books and limited access to academic material play a key role among those surveyed. That includes a group of Chinese postgraduate students who believe that shadow libraries help to overcome (academic) poverty.
Z-Library is one of the largest shadow libraries on the Internet, hosting millions of books and academic articles that can be downloaded for free.
The site defied all odds over the past two years. It continued to operate despite a full-fledged criminal prosecution by the United States, which resulted in the arrest of two alleged operators in Argentina.
These two Russian defendants are wanted by the United States and earlier this year a judge approved their extradition. However, according to the most recent information we have, the defendants escaped house arrest and vanished into thin air.
The roles of the two Russians remain unclear, but they were not vital to the site’s survival. Z-Library continued to expand its reach despite their legal troubles.
Z-Library users don’t seem to be hindered by the criminal prosecution either, as they continue to support and use the site. For many, Z-Library is simply a convenient portal to download free books. For others, however, it’s a vital resource to further an academic career.
A recent study published in the Journal of University Teaching & Learning Practice sheds light on the latter. It looks at the ‘piracy’ motivations of Redditors and students in higher education, specifically when it comes to Z-Library.
The paper, published by Dr. Michael Day of the University of Greenwich, labels the use of Z-Library as ‘Academic Cybercrime’. The findings, however, suggest that students are more likely to draw comparisons with “Robin Hood”.
The research looks at the motivations of two groups: Reddit users and Chinese postgraduate students. Despite the vast differences between these groups, their views on Z-Library are quite similar.
The 134 Reddit responses were sampled from the Zlibrary subreddit, which is obviously biased in favor of the site. However, the reasoning goes well beyond a simple “I want free stuff” argument.
Many commenters said they were drawn to the site out of poverty, for example, or noted that Z-Library was an essential tool to fulfill their academic goals.
“Living in a 3rd world country, 1 book would cost like 50%- 80% already of my daily wage,” one Redditor wrote.
The idea that Z-Library is a ‘necessary evil’ was also highlighted by other commenters. This includes a student who can barely make ends meet, and a homeless person, who has neither the money nor the space for physical books.
The lack of free access to all study materials, including academic journal subscriptions at university libraries, was also a key motivator. Paired with the notion that journal publishers make billions of dollars without compensating authors, this provides justification for ‘pirate’ alternatives.
“They make massive profits. So stealing from them doesn’t hurt the authors nor reviewers, just the rich greedy publishers who make millions just to design a cover and click ‘publish’,” one Redditor wrote.
The second part of the study is conducted in a more structured format among 103 postgraduate students in China. This group joined a seminar where Z-Library and the crackdown were discussed. In addition, the students participated in follow-up focus group discussions, while also completing a survey.
Although not all of them use the shadow library, 41% of the students agreed that the site’s (temporary) shutdown affected their ability to study and find resources for degree learning.
In general, the students have a favorable view toward Z-Library and similar sites, and 71% admit that they have used a shadow library in the past. In line with China’s socialist values, the overwhelming majority of the students agreed that access to knowledge should be free for everyone.
While the students are aware of copyright law, they believe that the need to access knowledge outweighs rightsholders’ concerns. This is also reflected in the following responses, among others.
– Z-Library, or a similar website, is helpful to students living in poverty (82% agree).
– Academic textbooks are too expensive, so I can’t afford to buy them as a student (67% agree).
– I have limited access to English-medium academic books in my country (63% agree).
– I prefer to download books without restrictions, like [paywalls etc.], as access is otherwise difficult (77% agree).
All in all, Z-Library and other shadow libraries are seen as a viable option for expensive or inaccessible books, despite potential copyright concerns.
This research sheds an intriguing light on key motivations for using shadow libraries. However, the small sample sizes, selection bias, and specific characteristics of the groups mean that these findings should be interpreted with caution.
Dr. Michael Day, nonetheless, notes that the responses show clear signs of a Robin Hood mentality. Z-Library users evade the publishers’ ‘tax’ on knowledge by downloading works for free.
Overall, the paper suggests that universities and publishers may want to reconsider the status quo and consider making more content freely accessible, taking a page from Z-Library.
“There is need for universities to re-consider the digital divides faced by socioeconomically and digitally disadvantaged students, alongside publishers, who must rethink their approach by making open access research more commonplace and thus pro-human,” the author concludes.
The paper provides a good example, as it is published under a Creative Commons license and is freely accessible to all.
Day, M. J. (2024). Digital Piracy in Higher Education: Exploring Social Media Users and Chinese Postgraduate Students Motivations for Supporting ‘Academic Cybercrime’ by Shelving ebooks from Z-Library. Journal of University Teaching and Learning Practice.
...
Read the original on torrentfreak.com »
10HN is also available as an iOS App
If you visit 10HN only rarely, check out the best articles from the past week.
If you like 10HN, please leave feedback and share.
Visit pancik.com for more.