10 interesting stories served every morning and every evening.

Beyond All Reason ★ RTS

www.beyondallreason.info

Real-Time Strategy

Redefined

Every unit, pro­jec­tile and ex­plo­sion sim­u­lated in real-time

Unmatched

Scale & re­al­ism

All units and pro­jec­tiles are sim­u­lated in real-time. The game of­fers fully sim­u­lated pro­jec­tile bal­lis­tics, ex­plo­sion physics and ter­rain de­for­ma­tion.

Enjoy an im­mer­sive RTS ex­pe­ri­ence, whether you are com­mand­ing in­di­vid­ual units, or armies of thou­sands. Take con­trol as you en­gage in an epic strug­gle for dom­i­na­tion!

Strategic

Importance of ter­rain

The shape of every bat­tle­field in-game im­poses which strate­gies work and which units are ef­fec­tive. No two maps will play the same. Radar can­not pen­e­trate moun­tains and nu­clear war­fare will phys­i­cally al­ter the ter­rain.

Utilize over 10 dif­fer­ent unit classes, in­clud­ing all-ter­rain Experimental units, to work your way to vic­tory.

Countless

World-class con­trols

Your power lies in the care­ful bal­ance of ex­po­nen­tially grow­ing your re­source in­come and the pro­duc­tion of dev­as­tat­ing war ma­chines.

You de­cide if you want to dis­arm your en­e­mies with a few pre­cise early strikes or to build a thou­sand bombers and oblit­er­ate them.

Immerse your­self in a vi­o­lent world where tac­ti­cal and strate­gi­cal su­premacy are needed in your fight to­wards vic­tory.

Relentless game de­sign

Unique and with a pur­pose

Each and every unit in the game has a role to fill. Mix-and-match units to cre­ate in­fi­nite pos­si­ble tac­tics. Experiment with your own com­bi­na­tions and show off the new strate­gies you de­velop in bat­tle.

Google hits 50% IPv6 | APNIC Blog

blog.apnic.net

You may have seen head­lines not­ing that Google’s mea­sure­ments have shown IPv6 reach­ing 50% for the first time. These mea­sure­ments are based on Google’s con­tin­u­ous mon­i­tor­ing of the avail­abil­ity of IPv6 con­nec­tiv­ity among its users, and re­flect the pro­por­tion of users who ac­cess Google ser­vices over IPv6. Reaching the 50% mark is a sig­nif­i­cant mile­stone, demon­strat­ing that IPv6 is a ma­ture, fully ca­pa­ble pro­to­col that is be­ing de­ployed at a global scale and used ef­fec­tively in real-world net­works.

The shape of IPv6 adop­tion is­n’t evenly dis­trib­uted

The global up­take of IPv6 fol­lows a com­plex and var­ied path that is­n’t ap­par­ent in a sin­gle, ag­gre­gated trend line. Google does not pub­lish per‑re­gion IPv6 sta­tis­tics, and its per‑econ­omy data is lim­ited to over­all to­tals, so these nu­ances are hard to see in Google’s fig­ures alone. To un­der­stand how adop­tion re­ally un­folds, it’s more in­struc­tive to look at the APNIC Labs data. Individual economies such as India, Viet Nam, and Saudi Arabia ex­hibit adop­tion curves that dif­fer markedly from the global av­er­age. As the APNIC Labs data shows, this global trend does not nec­es­sar­ily re­flect the ex­pe­ri­ence of in­di­vid­ual economies.

APNIC’s own measurement records 42% worldwide IPv6 capability (Figure 2). That’s a substantial difference, which also needs clarifying.

Measurement dif­fer­ences

APNIC’s measurement program is run by APNIC Labs and uses online advertising distributed through Google Ads, which appears in end users’ web browsers, games, and apps wherever Google advertisements are placed. APNIC does not select specific users and seeks the broadest possible exposure in every economy, 24/7. Normal advertising tracking systems are used with APNIC Labs logic, which ensures that a unique set of tests is run, measuring IP, BGP routing and DNS, amongst other technology choices. No end-user Personally Identifiable Information (PII) is held, and raw measurements are never shared, only collations at the ISP, economy and region level.

This work is carried out with the assistance of Google Research, ICANN, and others who help fund and support the activity. Given this close involvement, it’s natural to ask why APNIC’s measurement results don’t always align with Google’s own published statistics. If Google is used to conduct the research, how can the results differ?

APNIC’s measurement approach applies statistical weighting to the collected data and uses external sources, such as World Bank statistics, to model Internet usage by economy. This is necessary because the number of measurement samples APNIC Labs receives each day is not uniform. Advertising placements are optimized by Google to maximize delivery and revenue, which means that, on any given day, more advertisements, and therefore more measurement samples, may be shown in certain economies than others. For example, if advertising demand is higher in North African economies such as Egypt or Tunisia on a particular day, more measurements will be collected there, while fewer may be gathered from South America or Asia.

As a re­sult, the raw sam­ple counts can­not sim­ply be ag­gre­gated to cal­cu­late global IPv6 ca­pa­bil­ity. Instead, APNIC Labs ag­gre­gates the mea­sured IPv6 ca­pa­bil­ity for each econ­omy and then weights it ac­cord­ing to that econ­o­my’s es­ti­mated Internet user pop­u­la­tion.

In prac­tice, this means that large Internet pop­u­la­tions, such as those in India, China, Indonesia, and other ma­jor economies, con­tribute pro­por­tion­ally more to the global re­sult than smaller economies, even if the raw sam­ple vol­umes on a given day might sug­gest oth­er­wise. This weight­ing en­sures that the fi­nal mea­sure­ments re­flect global Internet us­age more ac­cu­rately, rather than daily ad­ver­tis­ing dis­tri­b­u­tion pat­terns.
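The difference between pooling raw samples and weighting by user population can be sketched in a few lines. This is a minimal illustration, not APNIC Labs' actual pipeline: the economy names, sample counts, and population figures below are invented, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EconomySample:
    """One day's measurements for a single economy (all values invented)."""
    name: str
    samples: int           # ad impressions that ran the measurement
    ipv6_capable: int      # of those, how many could use IPv6
    internet_users: float  # estimated Internet user population


def raw_capability(data):
    """Naive estimate: pool every sample regardless of where it came from."""
    return sum(e.ipv6_capable for e in data) / sum(e.samples for e in data)


def weighted_capability(data):
    """Per-economy capability, weighted by estimated Internet user population."""
    total_users = sum(e.internet_users for e in data)
    return sum((e.ipv6_capable / e.samples) * e.internet_users for e in data) / total_users


# A small economy is heavily oversampled on this (hypothetical) day.
day = [
    EconomySample("small", samples=9000, ipv6_capable=900, internet_users=10e6),
    EconomySample("large", samples=1000, ipv6_capable=700, internet_users=90e6),
]

print(f"raw:      {raw_capability(day):.2f}")       # raw:      0.16
print(f"weighted: {weighted_capability(day):.2f}")  # weighted: 0.64
```

With these made-up inputs the pooled figure is dominated by whichever economy happened to receive the most ad impressions, while the weighted figure tracks where the users actually are.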

At the level of individual economies, APNIC Labs’ measurements generally align with the totals published by Google and with data from Cloudflare, Akamai, Cisco, and others. This suggests that the underlying measurements are comparable and that the larger differences observed at the global level are likely due to APNIC’s weighting model, which may be why we see large variances between the two measurements.

In practice, APNIC’s measurements tend to be lower than Google’s. As a result, it’s useful to view the two data sets together, as they effectively bracket the likely range of actual IPv6 capability at any given point in time.

Is IPv6 adop­tion pro­gress­ing as ex­pected?

Some point to the long path to­ward a 50% adop­tion mile­stone as ev­i­dence of a sys­temic fail­ure in IPv6. Nothing could be fur­ther from the truth. Deploying IPv6 has re­quired sub­stan­tial tech­ni­cal ef­fort and sig­nif­i­cant cap­i­tal in­vest­ment. It’s there­fore en­tirely ex­pected that progress has var­ied across re­gions and economies, as in­di­vid­ual ISPs and economies make their own de­ci­sions about how best to bal­ance net­work growth, user ex­pec­ta­tions, and the prac­ti­cal re­al­i­ties of op­er­at­ing Internet in­fra­struc­ture.

The global Internet is not a ‘command economy’; it evolves through collaboration and cooperation within market-driven conditions. Many providers made substantial capital investments in IPv4 in earlier periods and have naturally sought to maximize the return on those investments. In doing so, they built sustainable and commercially viable IPv4-based networks within their existing footprints.

By con­trast, for newer mar­ket en­trants, it has of­ten been more ra­tio­nal to adopt IPv6 as the pri­mary pro­to­col, as it can demon­stra­bly re­duce the to­tal cost of own­er­ship. This pat­tern is par­tic­u­larly ev­i­dent in the mo­bile sec­tor, most no­tably in large-scale IPv6 de­ploy­ments such as Reliance Jio’s net­work in India.

Is the global Internet functioning in a ‘two‑protocol world’?

Yes, but it could be sim­pler.

Certainly, it would be eas­ier lo­gis­ti­cally to run a global in­ter­net un­der a sin­gle pro­to­col. However, that is not the en­vi­ron­ment we have ended up with. Instead, the Internet to­day op­er­ates across a mix of di­rect IPv4 con­nec­tiv­ity, IPv4 me­di­ated through Network Address Translation (NAT), ei­ther in home net­works or at the car­rier level via Carrier‑Grade NAT (CGNAT), and IPv6.

Managing address translation through NAT is not materially less complex than alternatives such as protocol translation, IPv4 encapsulation over IPv6, or other transition and proxy mechanisms. As a result, claims that ‘IPv4 is working fine’ often overlook the underlying reality: modern IPv4 networks already rely on layers of operational complexity, and there is no inherently lower‑cost or simpler approach available within IPv4 alone.
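To make "protocol translation" a little more concrete, here is a small sketch of one widely used form, NAT64 address synthesis, which the article does not name explicitly. It assumes the RFC 6052 well-known prefix `64:ff9b::/96`; real deployments may use a network-specific prefix instead, and the function names are hypothetical.

```python
import ipaddress

# RFC 6052 well-known prefix (an assumption; operators may choose their own).
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize(ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of the /96 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))


def extract(ipv6: str) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address from a synthesized IPv6 address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in NAT64_PREFIX:
        raise ValueError(f"{v6} is not inside {NAT64_PREFIX}")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


mapped = synthesize("192.0.2.33")
print(mapped)                # 64:ff9b::c000:221
print(extract(str(mapped)))  # 192.0.2.33
```

The mapping itself is trivial; the operational cost sits in the stateful translator that has to track every flow, which is comparable to what a CGNAT box already does for IPv4.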

From the out­set, it was un­der­stood that the lack of di­rect in­ter­op­er­abil­ity be­tween IPv4 and IPv6 would be a chal­lenge that needed to be ad­dressed. Early ef­forts ex­plored the idea of pro­to­cols that could sub­sume IPv4 un­changed and en­able di­rect con­nec­tiv­ity across both worlds, but these ap­proaches did not prove vi­able.

Instead, in­ter­op­er­abil­ity has emerged at higher lay­ers, with trans­port pro­to­cols such as TCP, UDP, and QUIC op­er­at­ing in­de­pen­dently of the un­der­ly­ing IP ver­sion. This model nec­es­sar­ily re­lies on some form of in­ter­me­di­ary. This is vis­i­ble in the way large con­tent and caching providers, such as Cloudflare, are able to of­fer dual‑stack ser­vices re­gard­less of whether the back­end sys­tems them­selves sup­port both pro­to­cols.

The ab­sence of na­tive dual‑stack ca­pa­bil­ity at some ser­vices, for ex­am­ple, cer­tain Git plat­forms or na­tional tele­vi­sion broad­cast­ers, is of­ten per­ceived as a ma­jor bar­rier to IPv6 progress. However, this may re­flect prag­matic con­straints, such as op­er­a­tional com­plex­ity, or, in the case of na­tional broad­cast­ers, le­gal and reg­u­la­tory re­quire­ments around data ac­cess and ge­olo­ca­tion, rather than re­sis­tance.

Let’s rec­og­nize the 50% mile­stone, even as the jour­ney con­tin­ues

Whatever one’s view on the de­ci­sion to in­tro­duce a sec­ond ad­dress­ing and pro­to­col model be­neath to­day’s Internet ser­vices, the re­al­ity is clear: IPv6 is now de­ployed on a global scale. Around half of the Internet users vis­i­ble to Google al­ready reach its ser­vices over IPv6. IPv6 is used every day, every hour, across de­vel­oped and de­vel­op­ing economies alike, on fixed and mo­bile net­works, on small per­sonal de­vices, and within vast data‑cen­tre‑backed ser­vices. It is no longer ex­per­i­men­tal or mar­ginal; it is part of the Internet’s day‑to‑day op­er­a­tion.

That achieve­ment re­flects the col­lec­tive ef­fort of those work­ing to build, op­er­ate, and grow the Internet world­wide, and it is some­thing worth rec­og­niz­ing and tak­ing pride in.

Did my old job only exist because of fraud? – It's not about code

david.newgas.net

Early in my soft­ware en­gi­neer­ing ca­reer, the UK-based startup I worked at, GenieDB, was taken over by a US Venture Capital fund, Frost VP, owned by Stuart Frost. I was func­tion­ally the only piece that came to the US. The code was re­built, the rest of the team even­tu­ally ro­tated out, even the core strat­egy was re­placed. But I was early in my ca­reer and ex­cited to be work­ing in the VC tech-startup world. I ended up mak­ing my life in the US, so this was a pretty piv­otal phase in my life.

For a while I lived the start-up life: building rapidly and playing Foosball [1]. GenieDB actively rejected revenue opportunities (à la Silicon Valley) with the aim of getting acquired for pioneering technology. We limped along for years with never more than 3 customers, even when big tech and eventually open source did what we tried to do better than us. I left with mixed feelings and eventually came to realize we never had the serious footing it would have taken to actually develop significant technology.

A decade later I heard from a for­mer col­league that Frost was be­ing sued by the SEC for fraud. Still not sure what to make of my time at GenieDB, I was cu­ri­ous enough to skim the com­plaint, where I saw this line:

Was GenieDB in­volved in this fraud some­how? I chewed on this ques­tion and it evolved to a form that be­gan to haunt me:

Did my old job — the one that brought me to the USA and changed the course of my en­tire life — only ex­ist be­cause of fraud?

With this burn­ing ques­tion I dove into the record of the case. The al­leged fraud was sim­ple: Frost VP op­er­ated as an in­cu­ba­tor, pro­vid­ing ser­vices to the port­fo­lio com­pa­nies, and the in­vestors say these fees charged were ex­ces­sive.

The case went to binding arbitration [2] and the investors won, after which the SEC sued to bar Frost from managing funds in the future. There are some great elements, like claiming a personal chef and cleaner as expenses, telling investors no salary would be paid by the fees (it was) and starting a marketing company just to sponsor someone’s visa.

But what about GenieDB? Both the ar­bi­tra­tion and SEC suit elab­o­rate on this idea of cre­at­ing com­pa­nies just to charge them fees:

This allegation was never litigated, though: neither court nor arbitrator was kind enough to rule on why GenieDB was in the portfolio. I had to dive into the evidence and decide for myself. First of all, my old CEO testified that GenieDB was paying excessive fees:

I was only really shaken when I saw what the insiders at the VC fund were saying to each other:

To me, this email shows that the investments were motivated by the fees [3] and GenieDB was being used to siphon investor money away. I felt like an ant. My career, my family, my citizenship would all be completely different except for this fraud. This story plays out between investors, multi-millionaire VCs, judges, and my entire life isn’t even a footnote.

I took a deep breath. GenieDB did have a con­cept at the core, which pre-dated Frost turn­ing his eyes on us. A con­cept which turned out to be quite use­ful when more se­ri­ous play­ers took it on. My col­leagues and I were re­ally try­ing to build it, even if the fund man­ager was eat­ing up our run­way with spu­ri­ous fees for his per­sonal gain. I was­n’t just some in­stru­ment of fraud work­ing on noth­ing.

Besides, tur­bu­lence from chance events, light­hearted de­ci­sions and even crime al­ters the course of every life. Stories of chance meet­ings of a spouse are dime-a-dozen. I just looked in my wake and was sur­prised by a dark cur­rent.

[1] Amusingly, Foosball ability reflected the corporate hierarchy. Stuart Frost was hands down the best. His second in command was a close second, and GenieDB’s CEO could beat any of us programmers.

[2] In fact, Frost actually initiated the arbitration because he alleged the investors were conspiring against him. The counterclaim laid out the fraud.

[3] “With Genie coming out” here refers to GenieDB dissolving and not paying fees to the incubator any more. GenieDB isn’t one of the companies they are proposing here.

Fully Open Foundation Model for Sovereign AI

apertvs.ai

Developed by the Swiss AI Initiative as a col­lab­o­ra­tive ef­fort be­tween EPFL, ETH Zurich, and CSCS. Open weights, open data, open sci­ence.

Fully Open

Training data, code, weights, meth­ods, and align­ment prin­ci­ples — all doc­u­mented and re­pro­ducible. Apertus is to AI as Open is to Source.

Compliant at Scale

Built to meet EU AI Act re­quire­ments: the model re­spects opt-outs, re­moves PII, pre­vents mem­o­riza­tion. A global foun­da­tion to build on.

Run for Performance

Competitive with top open mod­els at an equiv­a­lent scale of 8B and 70B pa­ra­me­ters. Multilingual from day one — trained on 1000+ lan­guages.

Swisscom is a Strategic Partner of the Swiss AI Initiative.

HPV jabs cut risk of dying from cervical cancer before 30 to almost zero

www.theguardian.com

Women who re­ceived an HPV vac­cine in early ado­les­cence have vir­tu­ally zero risk of dy­ing from cer­vi­cal can­cer be­fore the age of 30, ac­cord­ing to a ground­break­ing study, but falling vac­ci­na­tion rates could see a rise in avoid­able deaths.

Cervical can­cer is the fourth most com­mon can­cer in women, ac­cord­ing to the World Health Organization, and high-risk hu­man pa­pil­lo­maviruses (HPV) cause 99% of cases. About 3,300 women in England are di­ag­nosed with the dis­ease every year.

While the HPV vac­cine pre­vents about 90% of cer­vi­cal can­cers, un­til now the im­pact on sur­vival has been un­known. In new analy­sis, re­searchers from Queen Mary University of London (QMUL) used of­fi­cial can­cer mor­tal­ity and vac­ci­na­tion data for women aged 20 to 34 to cal­cu­late the im­pact of vac­ci­na­tion on cer­vi­cal can­cer sur­vival.

While the study, funded by Cancer Research UK and pub­lished in the Lancet, saw lit­tle change in cer­vi­cal can­cer mor­tal­ity in those who were never of­fered HPV vac­ci­na­tion, there were sub­stan­tial falls in those who were of­fered vac­ci­na­tion af­ter the HPV jab was in­tro­duced in 2008.

The impact on mortality has been so great that the authors estimate that the likelihood of girls who are inoculated when they are 12 or 13 dying from cervical cancer before the age of 30 is almost zero. For vaccinated women aged 30–34, the relative risk of death from the disease is 63% lower.

And for the first time in recorded his­tory, no women aged 20 to 24 died from cer­vi­cal can­cer in England be­tween 2020 and 2024. In all, the HPV vac­cine has saved hun­dreds of lives, the au­thors con­clude.

Peter Sasieni, professor of cancer epidemiology at QMUL and lead author of the study, said: “We estimate that since its introduction, HPV vaccination has prevented nearly 200 young women from dying from cervical cancer in England.”

The jab, which also pro­tects against cer­tain can­cers of the anus, pe­nis, vagina, vulva, mouth and throat, as well as gen­i­tal warts, is given to girls and boys in year 8, with catchup vac­ci­na­tions of­fered in some ar­eas in years 9 and 10.

WHO’s global strategy on cervical cancer states that by 2030, all countries should vaccinate 90% of girls with the HPV vaccine by the age of 15, screen 70% of women and treat 90% of those with cervical disease.

Until the pandemic, vaccination rates were close to WHO’s target, but have fallen significantly since then.

“With close to 90% HPV vaccine uptake in women born between 1995 and 2004, we expect to see thousands of cervical cancer deaths prevented in those women over the coming years,” said Sasieni.

“HPV vaccination combined with cervical screening could reduce cervical cancer rates to the point where almost no one develops it.”

But he said deaths and cases could rise again be­cause of fewer teenagers get­ting vac­ci­nated.

“The falling HPV vaccine uptake (now just 75% nationally and 60% in London) means that without swift and concerted efforts to increase HPV vaccine uptake, we could see a reversal of these trends.

“There could be another 15–25 avoidable deaths each year in young women and eventually about 200 deaths from cervical cancer each year that could be prevented if we can increase vaccine uptake to pre-Covid levels.”

Michelle Mitchell, chief executive of Cancer Research UK, said: “It’s essential that the UK government and health systems urgently address this with targeted action to reach communities where uptake is the lowest.”

Helen Hyndman, lead nurse at gy­nae­co­log­i­cal can­cer char­ity The Eve Appeal, said cer­vi­cal can­cer will not be elim­i­nated un­less vac­ci­na­tion and screen­ing rates im­prove and those who need it get timely treat­ment.

“We need urgent action: we are lagging behind on our plans to eliminate cervical cancer by 2040 and at our current rate it will be 2050 before this is achieved.”

Dr Alison Wright, president of the Royal College of Obstetricians and Gynaecologists, said the “exciting and powerful data” showed the vaccine’s use could see “fewer women with a cervical cancer diagnosis, and fewer lives lost to a largely preventable disease”.

While im­proved ac­cess to vac­cines through lo­cal com­mu­nity phar­ma­cies for those who missed school vac­ci­na­tions was wel­come, fur­ther progress now de­pends on en­cour­ag­ing vac­cine up­take at all lev­els, rais­ing aware­ness of the vac­ci­na­tion pro­gramme, and en­sur­ing that every­one who is el­i­gi­ble can get timely, eq­ui­table ac­cess, she added.

Caroline Temmink, the NHS Director of Vaccination, said: “This hugely encouraging news shows the life-saving impact of the HPV vaccine and it’s incredibly exciting to be able to say to this whole generation, cervical cancer and some other cancers shouldn’t be a risk for you.

“Alongside cervical screening, HPV vaccination is central to the NHS ambition to eliminate cervical cancer by 2040. It’s a safe and effective vaccine and we urge everyone eligible to take up the offer when invited.”

A Department of Health and Social Care spokesperson said: “We are boosting vaccine uptake so that more young people benefit from this life-saving protection, including rolling out catchup HPV vaccination campaigns via community pharmacies, and making it easier to access cervical screening. HPV self-testing kits are now being sent to those who do not come forward for screening, to ensure we catch more cancers at their earliest, most treatable stages.”

Running microVMs in Proxmox VE, The Easy Way

taoofmac.com

Jun 18th 2026 · 15 min read · #containers #homelab #kvm #microvm #proxmox #qemu #virtualization

I’ve been running a mixed Proxmox cluster for years: four nodes of wildly different capability, from an Atom x5-Z8350 with 2 GB of RAM (a z83ii, currently offline after years of faithful service as a baseline torture device) up to an i7-12700 with 128 GB (borg, my main homelab server).

This year, somewhere along the way between writing agentbox and all the hype around agentic sandboxes, I got tired of the eternal compromise between LXC containers and full virtual machines, and ended up building pve-microvm, a Debian package that adds QEMU’s microvm machine type as a first-class managed guest in Proxmox VE.

This is­n’t a quick hack. Well, the first ver­sion was, ac­tu­ally, but it’s gone quite a bit far­ther than that, and cer­tainly far­ther than I ex­pected.

It now ships a cus­tom ker­nel, patches the Perl in­ter­nals to pro­vide Proxmox web UI in­te­gra­tion, and, due to my usual fas­ci­na­tion with off­beat op­er­at­ing sys­tems, ended up sup­port­ing (as of this writ­ing) 21 guest OS types from Debian to NetBSD to Plan9.

Yes, I com­pletely brought it upon my­self to run Plan9 in a mi­croVM, and yes, it works.

Finding the Right Balance

After a few rounds of clus­ter cleanups and mi­gra­tions, it’s now my daily dri­ver for run­ning Gitea, Caddy re­verse prox­ies, mini-fire­walls, and the AI agent that’s help­ing me clean up this post.

Proxmox gives you two main op­tions out of the box:

LXC con­tain­ers start in­stantly, share the host ker­nel, and are spec­tac­u­larly ef­fi­cient. But they’re not iso­lated — a ker­nel ex­ploit in one con­tainer com­pro­mises every­thing. You can’t run a dif­fer­ent OS. You can’t eas­ily nest Docker in­side them with­out end­ing up (eventually) wrestling with fuse-over­layfs gym­nas­tics. And cer­tain work­loads (anything need­ing cus­tom ker­nel mod­ules, or CAP_SYS_ADMIN in anger) sim­ply don’t fit.

Full VMs give you hardware isolation via KVM/VT-x, but they boot SeaBIOS or OVMF, sedately walk through GRUB as they yawn their way out of bed, probe a forest of emulated legacy devices (IDE controllers, VGA, USB hubs, PCI bridges), and typically take 5–10 seconds to reach a login prompt. Each one carries the overhead of that entire emulated chipset sitting in memory.

What I wanted was the security boundary of a VM with the startup characteristics of a container. QEMU’s microvm machine type (originally developed for Firecracker-style workloads) strips all of that away. No BIOS, no GRUB, no legacy devices. Direct kernel boot into a minimal virtio-only environment. The result: sub-300ms boot to a fully networked guest with a QEMU agent, running inside its own KVM hardware isolation boundary.

Now, let me be clear: I’m not spawn­ing hun­dreds of these things. I have Azure for that — but I do want to run Gitea Actions work­ers, have a very lim­ited set of hard­ware re­sources, and got fed up with the time it took for one par­tic­u­lar VM to boot re­peat­edly…

What It Actually Does

pve-microvm is a single .deb that patches Proxmox’s qemu-server Perl modules at install time. When you set machine: microvm on a VM config, the standard config_to_command function delegates to my MicroVM.pm, which builds an (almost) completely different QEMU command line:

    qemu-system-x86_64 \
      -M microvm,x-option-roms=off,pit=off,pic=off,isa-serial=on,rtc=on,acpi=on,pcie=on \
      -kernel /usr/share/pve-microvm/vmlinuz \
      -initrd /usr/share/pve-microvm/initrd \
      -append "console=ttyS0 root=/dev/vda rw quiet" \
      -device virtio-blk-pci-non-transitional,drive=drive-scsi0 \
      -device virtio-net-pci-non-transitional,netdev=net0 \
      …

No chipset emulation. No PCI bridges. No VGA. The guest gets a single serial console (which PVE’s xterm.js connects to natively), virtio block devices, and a virtio network interface. Everything rides PCIe transport with non-transitional (modern-only) virtio devices rather than the MMIO transport microvm was originally designed around, for reasons I’ll come to in a moment.
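For readers who want to see the shape of that command assembly, here is a rough Python sketch. The real logic lives in the package's Perl MicroVM.pm, which is not reproduced here; the function name and parameters below are hypothetical, and the list deliberately stops where the quoted command is elided (no `-drive` or `-netdev` backends), so it is not a complete, bootable invocation.

```python
import shlex


def build_microvm_args(kernel, initrd, cmdline,
                       drive_id="drive-scsi0", netdev_id="net0"):
    """Assemble a QEMU argv list mirroring the flags quoted above (hypothetical helper)."""
    machine_opts = [
        "microvm",
        "x-option-roms=off", "pit=off", "pic=off",
        "isa-serial=on", "rtc=on", "acpi=on", "pcie=on",
    ]
    return [
        "qemu-system-x86_64",
        "-M", ",".join(machine_opts),
        "-kernel", kernel,
        "-initrd", initrd,
        "-append", cmdline,  # passed as one argv element, so no shell quoting issues
        "-device", f"virtio-blk-pci-non-transitional,drive={drive_id}",
        "-device", f"virtio-net-pci-non-transitional,netdev={netdev_id}",
    ]


args = build_microvm_args(
    "/usr/share/pve-microvm/vmlinuz",
    "/usr/share/pve-microvm/initrd",
    "console=ttyS0 root=/dev/vda rw quiet",
)
print(shlex.join(args))  # shell-safe rendering, for inspection only
```

Building an argv list rather than a single string keeps the kernel command line intact as one argument, which matters because it contains spaces.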

The package ships:

A tiny (12 MB) pre-built Linux 6.12.22 kernel compiled from x86_64_defconfig with a minimal overlay: virtio, vsock, virtiofs, 9p, and the modules Docker needs (overlay, veth, bridge, netfilter, BPF), because, well, I’m pragmatic.

A 1 MB initrd that probes virtio devices, finds the root filesystem by label or device path, and does a switch_root in ~150ms

pve-microvm-template: builds root filesystems from any of 12 supported OCI base images, with optional SSH, Docker, and guest agent

pve-oci-import: pulls an OCI image directly into a PVE-managed disk

Web UI extensions: a “Create µVM” button, machine type dropdown, conditional panel hiding for irrelevant settings, and an amber bolt icon in the resource tree

A systemd service (pve-microvm-early.service) that ensures patches are applied before pvedaemon starts on boot, which is critical for onboot=1 VMs

The Boot Sequence

Like in aero­dy­nam­ics, most speed comes from elim­i­nat­ing every­thing that is­n’t strictly nec­es­sary. A stan­dard VM spends most of its boot time in firmware and boot­loader, so a mi­croVM skips all of that.

SmolBSD (a NetBSD guest us­ing vir­tio-mmio trans­port) boots in 31ms. A full Debian with Docker and the QEMU agent is ready in un­der 8 sec­onds — and most of that time is apt pack­age in­stal­la­tion dur­ing first boot. Subsequent boots hit the 300ms mark con­sis­tently, even on my hum­ble hard­ware.

A fun rab­bit hole I went into when some­one asked me to add SmolBSD sup­port:

There’s a rea­son SmolBSD gets to use vir­tio-mmio and the Linux guests don’t. A QEMU mi­crovm ma­chine type can carry its vir­tio de­vices over two trans­ports: the bare-bones MMIO in­ter­face it was orig­i­nally built for, or PCIe. MMIO is the lighter of the two — no PCIe host bridge, no ACPI — which is how a NetBSD guest shaves it­self down to 31 ms.

But on QEMU 10.x the MMIO path has (as far as I can tell) a de­vice-prob­ing bug for Linux guests: only vir­tio-blk binds, and the net­work, se­r­ial and bal­loon de­vices are never claimed by their dri­vers for some rea­son. NetBSD probes MMIO cor­rectly and is per­fectly happy; Linux (at least the ker­nel I am us­ing) is not.

For every Linux guest I there­fore fall back to PCIe with non-tran­si­tional (modern-only) vir­tio de­vices, which binds all of them re­li­ably. The cost is about 50 ms of ex­tra bring-up — which, against a 300 ms boot, I’ll take with­out com­plaint.
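The per-guest transport choice described here boils down to a small lookup. The sketch below is only an illustration of that decision, not code from pve-microvm: the guest-type strings, the `MMIO_GUESTS` set, and both function names are hypothetical, and defaulting every non-NetBSD guest to PCIe is an assumption. The device names follow QEMU's usual naming for virtio-mmio (`virtio-*-device`) and modern-only PCI (`virtio-*-pci-non-transitional`) devices.

```python
# Guests reported above to probe virtio-mmio correctly (hypothetical structure).
MMIO_GUESTS = {"netbsd"}


def pick_transport(guest_os: str) -> str:
    """Return "mmio" for guests known to work with it, otherwise fall back to "pcie"."""
    return "mmio" if guest_os.lower() in MMIO_GUESTS else "pcie"


def virtio_device(kind: str, guest_os: str) -> str:
    """Map a device kind ("blk" or "net") to a QEMU device name for this guest."""
    if kind not in ("blk", "net"):
        raise ValueError(f"unsupported device kind: {kind}")
    if pick_transport(guest_os) == "mmio":
        return f"virtio-{kind}-device"
    return f"virtio-{kind}-pci-non-transitional"


print(virtio_device("net", "netbsd"))  # virtio-net-device
print(virtio_device("net", "linux"))   # virtio-net-pci-non-transitional
```

Keeping the exception list explicit means that if the Linux MMIO probing issue turns out to be a kernel configuration problem, flipping a guest back to MMIO is a one-line change.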

I think the above is ac­tu­ally a bug in my ker­nel con­fig­u­ra­tion, but haven’t re­ally had time (or maybe even the right hard­ware) to tackle it — this is some­thing I’d love more peo­ple to look at and con­tribute patches.

One Kernel, Many Guests

There’s a de­lib­er­ate con­se­quence of this di­rect ker­nel boot trick that’s easy to miss: the ker­nel does­n’t live in­side the guest. It sits on the Proxmox host at /usr/share/pve-microvm/vmlinuz, and the guest disk holds noth­ing but a root filesys­tem — user­land, no /boot, no GRUB, no per-guest ker­nel pack­age, no initramfs of its own.

That also means there’s no “boot the installer ISO and click through it” path, so instead the rootfs gets built straight from an OCI image with pve-microvm-template (Debian, Alpine, Fedora, Rocky, Amazon Linux and friends). In the weird cases, we import a prepared ext4/raw disk with qm importdisk. You don’t install an OS: you assemble a root filesystem.

Decoupling the kernel from the rootfs is what makes this interesting to run at scale. Every Linux microVM on the node boots the same vmlinuz (one kernel, built once from a stock x86_64_defconfig with a microvm overlay), so you can audit and update it in exactly one place: drop a new vmlinuz on the host, restart the guests, done. No guest ever pulls a broken kernel from an apt upgrade, because no guest has a kernel to upgrade, and the rootfs images stay tiny and completely kernel-agnostic.

Container-style ker­nel con­sis­tency, VM-style iso­la­tion.

What I’m Running

On my clus­ter right now, I have a fair smat­ter­ing of these al­ready. Four off the top of my head are:

gitea (VM 114, on an Intel N5105) — Bare-metal Gitea with SQLite, Caddy HTTPS, lo­cal ac­tions run­ner, Avahi dis­cov­ery. 2 cores, 2 GB RAM, 32 GB disk. Boots in ~3s, mostly be­cause Gitea does a lot of house­keep­ing.

smith (VM 9022, on my i7) — the main pi­claw agent, the sys­tem that man­ages the clus­ter, re­leases pi­claw and gen­er­ally keeps tabs on every­thing. 2 cores, 6 GB RAM, 48 GB disk. Runs Docker in­ter­nally, and has 3 smaller, volatile sib­lings scat­tered through­out the clus­ter that don’t run Docker but have dif­fer­ent roles (CI/CD, wipe-and-re­in­stall agent in­stance for test­ing up­grades, etc.)

exo (VM 9021, on my i7) — Distributed in­fer­ence co­or­di­na­tor for run­ning LLMs across mul­ti­ple ma­chines. CPU-only, 2 GB root.

vir­tu­aldsm (VM 300, tnas) — Synology DSM run­ning in­side a mi­croVM with Docker, in­side Terramaster NAS hard­ware. Yeah, I know I’m weird, but it was needed when my Synology went side­ways and I haven’t nuked it yet. Uses the stock Debian ker­nel rather than my cus­tom one, be­cause DSM needs spe­cific mod­ule paths.

I also have a dor­mant 9Front (Plan9) one, as well as a lit­tle menagerie of OpenWrt, OPNsense, OSv uniker­nels, gokrazy Go ap­pli­ances, and var­i­ous Alpine/Fedora/Rocky/Amazon Linux con­fig­u­ra­tions filed away as stan­dard Proxmox back­ups. The 21 guest OS types aren’t the­o­ret­i­cal — each one has been booted and val­i­dated, and some­times smith will go and thaw one out to do re­gres­sion tests.

The z83ii (that ancient Atom x5-Z8350 with 2 GB RAM) was invaluable as a baseline test platform, because if a microVM can boot and run usefully on a fanless 2016-era Atom with 2 GB of total system memory, it’ll work anywhere. And it could run six before it started slowing down…

What a Config Looks Like

There’s no magic to a mi­croVM con­fig — it’s an or­di­nary qm guest with a par­tic­u­lar ma­chine type and a ker­nel com­mand line. Here’s gitea (the VM 114 above) as it sits in /etc/pve/qemu-server/114.conf:

agent: 1
args: -kernel /usr/share/pve-microvm/vmlinuz -append "console=ttyS0 root=/dev/vda rw quiet"
boot: order=scsi0
cores: 2
machine: microvm
memory: 2048
name: gitea
net0: virtio=BC:24:11:00:6E:01,bridge=vmbr0
onboot: 1
scsi0: local-lvm:vm-114-disk-0,size=32G
serial0: socket
tags: microvm
vga: serial0

The only mi­crovm-spe­cific lines are ma­chine: mi­crovm, the args car­ry­ing the ker­nel and its cmd­line, and se­ri­al0: socket / vga: se­ri­al0 wiring the con­sole through to xterm.js. Everything else — cores, mem­ory, the vir­tio NIC on vm­br0, the sc­si0 disk on lo­cal-lvm, on­boot, the guest agent — is ex­actly what you’d write for any Proxmox VM.

Note there’s no -initrd in args: MicroVM.pm in­jects it au­to­mat­i­cally when it sees the shipped ker­nel, along with the bal­loon, vsock and (when con­fig­ured) vir­tiofs de­vices. That’s the whole point — a mi­croVM is a nor­mal guest that hap­pens to boot a host-pro­vided ker­nel, not a spe­cial ob­ject you have to learn a new tool to man­age.
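To make the "ordinary guest plus a few lines" point concrete, here is a minimal Python sketch that parses a config in the `key: value` shape shown above and picks out the microvm-specific settings. The sample text mirrors the gitea config from the post. The helper names (`parse_conf`, `microvm_specific`) are hypothetical and not part of the package, and the parser deliberately ignores things like snapshot sections.

```python
# Minimal sketch: parse a Proxmox-style "key: value" guest config and pick out
# the settings the post calls microvm-specific. Helper names are hypothetical.

SAMPLE_CONF = """\
agent: 1
args: -kernel /usr/share/pve-microvm/vmlinuz -append "console=ttyS0 root=/dev/vda rw quiet"
boot: order=scsi0
cores: 2
machine: microvm
memory: 2048
name: gitea
net0: virtio=BC:24:11:00:6E:01,bridge=vmbr0
onboot: 1
scsi0: local-lvm:vm-114-disk-0,size=32G
serial0: socket
tags: microvm
vga: serial0
"""


def parse_conf(text: str) -> dict[str, str]:
    """Split each non-empty, non-comment line on the first ': '."""
    conf: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(": ")
        conf[key] = value
    return conf


def microvm_specific(conf: dict[str, str]) -> dict[str, str]:
    """Return only the keys the post identifies as microvm-specific."""
    keys = ("machine", "args", "serial0", "vga")
    return {k: conf[k] for k in keys if k in conf}


conf = parse_conf(SAMPLE_CONF)
specific = microvm_specific(conf)
is_microvm = conf.get("machine") == "microvm"

for key, value in specific.items():
    print(f"{key}: {value}")
```

Everything not printed by the loop (cores, memory, net0, scsi0 and so on) is what any Proxmox VM would carry.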

Networking

A mi­croVM at­taches to the net­work ex­actly like any other Proxmox guest. The in­ter­face spec is a bit of a mouth­ful (virtio-net-pci-non-transitional de­vice on the PCIe bus), and it lands on what­ever Linux bridge and VLAN you point it at:

qm set 900 --net0 virtio,bridge=vmbr0          # single NIC
qm set 900 --net0 virtio,bridge=vmbr0,tag=100  # tagged onto VLAN 100

Because it’s an or­di­nary KVM guest, the stan­dard PVE fire­wall ap­plies — per-VM nfta­bles rules work the way they do for any VM, which is the part that ac­tu­ally mat­ters when you’re run­ning un­trusted code.

Inside the guest, net­work­ing is han­dled by sys­temd-net­workd rather than cloud-init: DHCP by de­fault (matched on Type=ether, so it sur­vives cloning with no MAC pin­ning), or a one-line /etc/microvm-static-net for a sta­tic ad­dress. Earlier ver­sions leaned on cloud-init for this, and I found it too brit­tle; mov­ing it to sys­temd-net­workd made cloning re­li­able and I stopped hav­ing to de­bug tem­plates half the time.

Network iso­la­tion, an­other main­stay of the cur­rent hype around agent iso­la­tion, is a solved prob­lem I have no in­ter­est in re-solv­ing in­side the pack­age be­cause I think Proxmox’s own SDN al­ready does it prop­erly — a sim­ple VLAN zone with a sep­a­rate VNet per trust do­main keeps un­trusted guests on their own seg­ment with no path to the LAN, and if I ever both­ered with that on a home LAN (well I have thought about it… but got no time), a VXLAN zone with a des­ig­nated exit node would let me fun­nel egress through a sin­gle fire­walled choke point.

The mi­croVM just lands on what­ever VNet I point net0 at, so the pol­icy lives in Proxmox, not in a one-off rule­set I’d have to babysit.

There’s also a non-net­worked path for host/​guest plumb­ing: each guest gets a vsock CID (VMID + 1000), which I use for SSH-agent for­ward­ing (host keys into the guest with­out ever ex­pos­ing them on the wire) and for vir­tiofs/​9p di­rec­tory shar­ing, be­cause… I for­get. I know I needed it at one point, even if right now most of my mi­crovm in­stances just do SMB mounts (which were a pain to do un­der Docker and LXC)
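The CID convention is simple enough to pin down in a few lines. This is only a sketch of the arithmetic described above (CID = VMID + 1000); the constant and function names are hypothetical, and the real assignment lives in the package's Perl code.

```python
# Sketch of the vsock CID convention from the post: CID = VMID + 1000.
# Names here are illustrative assumptions, not the package's own identifiers.

VSOCK_CID_OFFSET = 1000


def vsock_cid(vmid: int) -> int:
    """Map a Proxmox VMID to the guest's vsock context ID."""
    if vmid < 0:
        raise ValueError("VMID must be non-negative")
    return vmid + VSOCK_CID_OFFSET


def vmid_from_cid(cid: int) -> int:
    """Inverse mapping, handy when reading a CID out of a log line."""
    return cid - VSOCK_CID_OFFSET


for vmid in (114, 9021, 9022):
    print(vmid, "->", vsock_cid(vmid))
```

One side effect of the offset is that guest CIDs can never collide with the low reserved values (0 to 2) that vsock sets aside for the hypervisor, loopback and host.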

The NIC count is… mod­er­ately sane. I cur­rently al­low six vir­tio in­ter­faces per guest (net0-net5) on the spu­ri­ous grounds that they’re twice the max­i­mum num­ber of phys­i­cal in­ter­faces across all of my host ma­chines. It’s just de­fined in MicroVM.pm, and the very few peo­ple who need more than that are wel­come to tweak it (and if you think you want a dozen you al­most cer­tainly want VLANs in­stead).

Storage and Migration

Storage was… trivial? Every PVE backend works, because the disk is just a virtio-blk-pci device and Proxmox hands it the same path it hands any VM. LVM and LVM-thin, ZFS, Ceph/RBD, NFS and CIFS, plain directory storage: all fine, with snapshots, linked and full clones, vzdump backups and qm importdisk behaving exactly as they do elsewhere. A microVM is a normal PVE guest that happens to boot differently.

Migration is the only place things are iffy. a) I can’t do live mi­gra­tion on any of my hard­ware (none of it is en­ter­prise grade), and b) true live mi­gra­tion is­n’t vi­able with the cur­rent QEMU mi­crovm ma­chine type any­way — it sim­ply does­n’t im­ple­ment it.

Offline migration (i.e., reshuffling instances around) works fine, though, and because microVMs boot in well under a second, you can run a quick HA relocate cycle (stop, migrate, start), provided your disks live on shared storage.

I’ve mea­sured it at around two sec­onds mov­ing a small guest be­tween nodes on shared CIFS. Not seam­less VM mo­tion, but per­haps good enough for most uses, and as far as I can fig­ure out, ha-man­ager dri­ves it the same way it would any guest.

Vs. LXC: When To Choose What

Mind you, I still run LXC con­tain­ers. They’re the right choice when:

You trust the work­load (or it’s your own code)

You need zero boot over­head

You want di­rect filesys­tem ac­cess from the host

You want slightly less mem­ory al­lo­ca­tion over­head (oh, and there’s shared page cache if you’re re­ally lucky)

MicroVMs win when:

You need ac­tual ker­nel iso­la­tion — run­ning un­trusted code, dif­fer­ent ker­nel ver­sions, nested Docker, any­thing with ag­gres­sive CAP_ re­quire­ments

You want a re­pro­ducible im­age you can snap­shot, back up (vzdump works), of­fline-mi­grate, and clone (linked clones too)

The work­load does some­thing un­usual to the ker­nel (BPF pro­grams, cus­tom net­fil­ter rules, ker­nel mod­ules) and you don’t want that leak­ing into your host (which is why most of my tun­nels now land in a mi­crovm)

You’re run­ning non-Linux guests — NetBSD, Plan 9, FreeBSD-based fire­walls — which sim­ply can’t run as LXC

You want a VM that’s up be­fore you’ve fin­ished the com­mand.

The overhead difference between LXC and a microVM on modern hardware is surprisingly small. On borg (that i7-12700), the idle memory footprint of a minimal microVM is ~40 MB, and the CPU overhead of the hypervisor is unmeasurable for most workloads.

The mem­ory fig­ure you set at boot is gen­uinely just a ceil­ing: As far as I can tell KVM still only backs the pages a guest ac­tu­ally touches, and the host’s same-page merg­ing dedupes iden­ti­cal pages — ker­nels, libc, shared base-im­age lay­ers — across every mi­crovm on the node, so like on grown-up cloud hy­per­vi­sors the real cost tracks the work­ing set, not the nom­i­nal al­lo­ca­tion.

Much to my embarrassment, I didn’t even bother with live resizing for a long while, but I added balloon support to the kernel recently, with free-page-reporting and deflate-on-oom. PVE’s balloon target drives proper auto-ballooning, and a virtio-mem device gives genuine fine-grained hot-add and hot-remove, so, again, this works like a normal VM.

The Awkward Parts

Now for the catches…

The first one is pretty obvious: I’m patching someone else’s product, and even though I survived the Perl 4 to Perl 5 transition decades ago and I have Codex to help these days, patching the Perl internals is fragile: every qemu-server upgrade can break my setup.

I mit­i­gate this by a) not blindly run­ning up­grades and b) us­ing dpkg trig­gers (the pack­age re-patches au­to­mat­i­cally af­ter up­grades), but it’s still a race con­di­tion wait­ing to hap­pen — I’ve had par­tial PVE up­grades leave the sys­tem in a state where pvedae­mon could­n’t even com­pile the patched mod­ule.

And then come the weird fail­ure modes–one up­grade had a nasty lit­tle side ef­fect re­gard­ing root de­vices:

Root is found at root=/​dev/​vda be­cause the guest boots un­der the mi­crovm ma­chine type with vir­tio-blk on PCIe

but if a guest ever comes up un­der the stan­dard chipset in­stead (a half-ap­plied patch, or an on­boot=1 VM start­ing be­fore the early-boot ser­vice has re-patched af­ter a host re­boot), the same disk enu­mer­ates as /dev/sda, root is­n’t found, and the VM looks like it has lost its filesys­tem.

The data is always there, just at the wrong path. The dpkg trigger and the early-boot service ordering try to prevent that situation, and I’ve also patched the initrd so it now falls back to /dev/sda when /dev/vda is missing. But until (if ever) Proxmox supports microVMs natively, expect the occasional papercut.

Package ver­sion mis­matches can cas­cade. I learned this the hard way — a par­tial apt up­grade on one of my nodes left libpve-clus­ter-perl and libpve-net­work-perl at in­com­pat­i­ble ver­sions, which broke all Perl mod­ule load­ing, which meant pvedae­mon could­n’t start, which meant VMs could­n’t be man­aged.

So… al­ways do full dist-up­grade, never par­tial up­grades on PVE nodes with this.

Another catch (for some) is that there is no VGA and no graph­i­cal con­sole — the se­r­ial con­sole is your only in­ter­face. If some­thing goes wrong dur­ing boot, you’re read­ing ker­nel pan­ics on a ter­mi­nal. This is fine for servers, less fine for desk­top-ori­ented guests, and Plan9 re­ally does­n’t like it.

The next one is that there’s no USB at all — no con­trollers, no passthrough, noth­ing on the bus — so I had to get the web UI to hide those op­tions the mo­ment you flip a guest to mi­crovm. This might be a deal-breaker if you want to, say, run a Zigbee con­troller (and is why my home au­toma­tion stuff is still in an LXC).

The ker­nel is opin­ion­ated. My cus­tom 6.12.22 ker­nel in­cludes ex­actly what I need and noth­ing else. If your work­load needs a mod­ule I haven’t in­cluded, you’ll need to re­build it or use the stock Debian ker­nel (which works fine but is 3x larger and boots slower).

Finally, GPU and PCI passthrough are disabled, not impossible. And this is another thing I would love people with more hardware to really dive into, since I actually had an RTX 3060 passed through and working in early testing.

In the end, I de­cided to tem­porar­ily dis­able GPU sup­port for sim­plic­ity — the pack­age strips host­pci* to­day and there’s no vIOMMU plumb­ing wired up, so it’s off by de­fault.

There’s noth­ing stop­ping any­one from adding it back (save some QEMU mi­crovm caveats around the min­i­mal ECAM PCIe bus and IOMMU setup), and I’d gen­uinely love to be able to do proper PCI and vIOMMU test­ing — I just don’t have the spare hard­ware for it, and I chose to fo­cus this on run­ning agents rather than chas­ing ac­cel­er­a­tor passthrough.

Under The Hood: The Patch Strategy

The pack­age mod­i­fies three files in the PVE Perl stack:

Machine.pm — ex­tends the ma­chine type regex to ac­cept mi­crovm as a valid value

JSON-LD Explained for Personal Websites

hawksley.dev

JSON-LD, also known as JSON Linked Data, is a for­mat for adding struc­tured data to web­pages. It can aid web crawlers in un­der­stand­ing the se­man­tic struc­ture of your site, qual­i­fy­ing you for richer link pre­views, and even po­ten­tially im­prov­ing your search rank­ing.

It’s been 4 months since my first post where I de­scribed build­ing this site, and Wakatime es­ti­mates I’ve spent ~100 hours cod­ing now, not in­clud­ing time spent re­search­ing and test­ing. Since then, this site has been re­ceiv­ing plenty of pol­ish, in­clud­ing the ad­di­tion of JSON-LD on each page.

JSON-LD Fundamentals

To add JSON-LD to a page, add the fol­low­ing some­where in your <head> sec­tion:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "https://hawksley.dev/#website",
      "url": "https://hawksley.dev/",
      "name": "Ethan Hawksley"
    },
    // Insert more nodes here.
  ]
}
</script>

Let’s break down what each part does.

<script type="application/ld+json">

This de­clares a new script with MIME type ap­pli­ca­tion/​ld+json. Since it has this type spec­i­fied, the browser’s JS en­gine won’t run it. Specialised crawlers like Googlebot look out for these el­e­ments and parse the con­tents.

{ "@context": "https://schema.org" }

Here, a JSON object is initialised and the property @context is set to https://schema.org. In JSON-LD, the structure of data is determined by assigning the appropriate context. Web crawlers are standardised on Schema.org, which defines all the valid key-value pairs for the JSON.

Now that we’ve de­fined the schema our JSON-LD is fol­low­ing, we can de­scribe our web­page!

{
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "https://hawksley.dev/#website",
      "url": "https://hawksley.dev/",
      "name": "Ethan Hawksley"
    }
    // Insert more nodes here.
  ]
}

A JSON-LD doc­u­ment can be thought of as a la­belled, di­rected graph, stored un­der @graph. The graph con­tains mul­ti­ple nodes, con­nected to each other with di­rected arcs.

Nodes have:

@type - Describes what the node is, e.g. WebSite or SoftwareApplication

@id - A unique iden­ti­fier for the node, typ­i­cally a URL with a unique hash value at the end

Properties - Key/Value pairs that de­scribe the at­trib­utes of the node

In the ex­am­ple above, the type is WebSite, the ID is https://​hawk­sley.dev/#​web­site, and it has two prop­er­ties, url and name.
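If your pages are generated by a script, the wrapper can be produced programmatically instead of by hand. The following is a minimal Python sketch, not how this site is built: `jsonld_script` is a hypothetical helper that wraps a list of nodes in the `@context`/`@graph` envelope described above and emits the `<script>` element.

```python
import json


def jsonld_script(nodes: list[dict]) -> str:
    """Wrap nodes in a Schema.org @graph and return the <script> element."""
    doc = {"@context": "https://schema.org", "@graph": nodes}
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so this keeps the JSON intact while
    # making sure a string value can never close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


website = {
    "@type": "WebSite",
    "@id": "https://hawksley.dev/#website",
    "url": "https://hawksley.dev/",
    "name": "Ethan Hawksley",
}

html = jsonld_script([website])
print(html)
```

Using `json.dumps` also sidesteps the most common hand-editing mistakes, such as trailing commas and unbalanced quotes.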

Web crawlers can merge the prop­er­ties of a node across mul­ti­ple pages, as long as they share an ID. However, scrap­ers that only read one page - such as LLMs - will not merge the prop­er­ties. When JSON-LD is reused across pages, strik­ing this bal­ance is im­por­tant to keep in mind. It is best prac­tice for the ID to be a URL fol­lowed by a hash, such as #website, that uniquely iden­ti­fies the node.
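To illustrate what merging by ID means, here is a rough Python sketch. It is an assumption about the general idea, not a description of how any particular crawler works: nodes from several pages are grouped by `@id`, and their properties are combined, with later pages overwriting earlier ones on conflict.

```python
def merge_nodes(pages: list[list[dict]]) -> dict[str, dict]:
    """Combine nodes from several pages' graphs, keyed by their @id."""
    merged: dict[str, dict] = {}
    for graph in pages:
        for node in graph:
            node_id = node.get("@id")
            if node_id is None:
                continue  # nodes without an @id cannot be matched across pages
            merged.setdefault(node_id, {}).update(node)
    return merged


WEBSITE_ID = "https://hawksley.dev/#website"
PERSON_ID = "https://hawksley.dev/#person"

# The root page carries the detailed node; a blog post carries a slim one.
home_page = [
    {"@type": "WebSite", "@id": WEBSITE_ID, "url": "https://hawksley.dev/",
     "name": "Ethan Hawksley", "inLanguage": "en-GB"},
]
blog_post = [
    {"@type": "WebSite", "@id": WEBSITE_ID, "url": "https://hawksley.dev/",
     "name": "Ethan Hawksley"},
    {"@type": "Person", "@id": PERSON_ID, "name": "Ethan Hawksley"},
]

merged = merge_nodes([home_page, blog_post])
print(merged[WEBSITE_ID])
```

A single-page reader only ever sees one of the inner lists, which is why each page still needs enough properties to make sense on its own.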

Although the Schema.org con­text de­fines many types of nodes, this guide will only be cov­er­ing nodes that have no­tice­able SEO im­pact. If you’re in­ter­ested in more, look up the se­man­tic web - it’s a fun rab­bit hole.

Let’s move on to which nodes each page on our site should in­clude. For each type, I’ve in­cluded the JSON-LD from this site, so you can copy-paste and edit it to fit your own.

WebSite

You’ve seen an ex­tract of WebSite ear­lier! Now here’s the full ver­sion:

{
  "@type": "WebSite",
  "@id": "https://hawksley.dev/#website",
  "url": "https://hawksley.dev/",
  "name": "Ethan Hawksley",
  "alternateName": ["hawksley.dev", "Hawksley"],
  "description": "The personal site and technical blog of Ethan Hawksley, a UK-based CS student with a focus on systems programming, low-level computing, and cybersecurity.",
  "inLanguage": "en-GB",
  "publisher": { "@id": "https://hawksley.dev/#person" },
  "image": {
    "@type": "ImageObject",
    "@id": "https://hawksley.dev/#website-image",
    "url": "https://hawksley.dev/logo-square.png",
    "caption": "Ethan Hawksley Logo"
  }
}

WebSite ex­plains the meta­data about the site. It gives crawlers hints on how to dis­play your site.

Here, you can see that Google has in­ter­preted the name field as rep­re­sen­ta­tive of the do­main and is la­belling the re­sult ap­pro­pri­ately.

Although WebSite ap­plies to every page, you don’t need to in­clude the full ver­sion of it on every page. The root page of the do­main should be fully de­tailed, but it is per­fectly ac­cept­able for other pages to have a slimmed-down ver­sion:

{
  "@type": "WebSite",
  "@id": "https://hawksley.dev/#website",
  "url": "https://hawksley.dev/",
  "name": "Ethan Hawksley"
}

This gives suf­fi­cient con­text to sin­gle-page crawlers so they cor­rectly name the site, but they don’t need the full de­tails.

WebPage

WebPage de­scribes the cur­rent page, but it’s im­por­tant to dis­tin­guish it from other types like BlogPosting (covered later). WebPage rep­re­sents the phys­i­cal page it­self, the HTML. It con­tains the con­tent of the page.

{
  "@type": "WebPage",
  "@id": "https://hawksley.dev/blog/hack-club-campfire/#webpage",
  "url": "https://hawksley.dev/blog/hack-club-campfire/",
  "isPartOf": { "@id": "https://hawksley.dev/#website" },
  "name": "Winning the Hack Club Campfire Hackathon",
  "inLanguage": "en-GB",
  "breadcrumb": { "@id": "https://hawksley.dev/blog/hack-club-campfire/#breadcrumb" }
}

There are more specific subtypes of WebPage. In this post, I’ll cover ProfilePage and CollectionPage. You can find less common ones at the bottom of Schema.org’s definition for WebPage.

Person

Another node that every page on a per­sonal web­site should have is Person. It de­scribes who you are, which Google uses as part of their con­tent qual­ity met­ric. Increasingly, LLM crawlers are also us­ing it to de­cide who to cite in their an­swers.

Unlike WebSite, it is im­por­tant enough con­text that you should in­clude it on all of your site’s pages.

{
  "@type": "Person",
  "@id": "https://hawksley.dev/#person",
  "url": "https://hawksley.dev/",
  "name": "Ethan Hawksley",
  "alternateName": "ethanhawksley",
  "givenName": "Ethan",
  "familyName": "Hawksley",
  "description": "Long Description",
  "disambiguatingDescription": "Shorter Description",
  "jobTitle": "Computer Science Student",
  "knowsLanguage": "en-GB",
  "knowsAbout": [
    // Keywords
  ],
  "nationality": { "@type": "Country", "name": "United Kingdom" },
  "homeLocation": {
    "@type": "Place",
    "address": { "@type": "PostalAddress", "addressCountry": "GB" }
  },
  "affiliation": {
    "@type": "HighSchool",
    "url": "https://www.alcestergs.co.uk",
    "name": "Alcester Grammar School",
    "sameAs": [
      "https://www.wikidata.org/wiki/Q4713005",
      "https://en.wikipedia.org/wiki/Alcester_Grammar_School"
    ]
  },
  "alumniOf": [
    {
      "@type": "HighSchool",
      "url": "https://www.brookeweston.org",
      "name": "Brooke Weston Academy",
      "sameAs": [
        "https://www.wikidata.org/wiki/Q4974495",
        "https://en.wikipedia.org/wiki/Brooke_Weston_Academy"
      ]
    }
  ],
  "image": {
    "@type": "ImageObject",
    "@id": "https://hawksley.dev/#person-image",
    "url": "https://hawksley.dev/ethan-hawksley.png",
    "caption": "Ethan Hawksley",
    "width": 1200,
    "height": 1200
  },
  "sameAs": [
    "https://github.com/ethan-hawksley",
    "https://www.linkedin.com/in/ethanhawksley",
    "https://lobste.rs/~ethanhawksley",
    "https://news.ycombinator.com/user?id=ethanhawksley"
    // etc. etc.
  ]
}

Phew! There’s plenty of prop­er­ties for Person. I find that it helps to be more de­scrip­tive, rather than less, when it comes to fill­ing it out. Let’s look at the most im­por­tant prop­er­ties:

url - Points to your root page, an­chor­ing the node.

name, given­Name, fam­i­ly­Name - Clearly de­scribes your name.

im­age - Preferably a photo of you, or a logo you are af­fil­i­ated with. Connects you to a canon­i­cal im­age of you.

sameAs - Immensely useful for disambiguation, especially if you have a common name. It cleanly informs crawlers what your other profiles are, letting them build a knowledge graph representation of you across multiple pages. At the time of writing, /g/11m62cgdtf is my Google knowledge graph ID.

The other prop­er­ties of Person are use­ful for adding more de­tail, but aren’t strictly nec­es­sary. You can trim them if you wish with only mi­nor im­pact.

ProfilePage

A ProfilePage, as you may ex­pect, de­scribes a page on the site about a per­son. For in­stance, I use this node on my home page, as that’s where I talk about my­self. On your site, putting it on an about page could be more ap­pro­pri­ate.

{
  "@type": "ProfilePage",
  "@id": "https://hawksley.dev/#webpage",
  "url": "https://hawksley.dev/",
  "isPartOf": { "@id": "https://hawksley.dev/#website" },
  "name": "About Ethan Hawksley",
  "inLanguage": "en-GB",
  "dateCreated": "2024-09-10T00:00:00.000Z",
  "dateModified": "2026-05-17T00:00:00.000Z",
  "mainEntity": { "@id": "https://hawksley.dev/#person" }
}

It’s im­por­tant to use is­PartOf to link it to your broader WebSite node, to cre­ate a re­la­tion­ship be­tween the two nodes. The same ap­plies for mainEn­tity, it lets crawlers know who the page is about. Including date­Cre­ated and date­Mod­i­fied is a good fresh­ness sig­nal for crawlers, but if your site does­n’t have them read­ily avail­able, don’t worry too much about it.

SoftwareApplication

If you are show­cas­ing any soft­ware on your page, it’s a good idea to in­clude a SoftwareApplication node to de­scribe the meta­data about it.

{
  "@type": "SoftwareApplication",
  "@id": "https://hawksley.dev/#project-yt-play",
  "url": "https://crates.io/crates/yt-play",
  "name": "yt-play",
  "description": "A CLI utility written in Rust that synchronises YouTube playlists to local directories.",
  "applicationCategory": "MultimediaApplication",
  "operatingSystem": "All",
  "creator": { "@id": "https://hawksley.dev/#person" },
  "sameAs": ["https://github.com/ethan-hawksley/yt-play"],
  "offers": { "@type": "Offer", "price": 0, "priceCurrency": "GBP" }
}

If you want to be more spe­cific than SoftwareApplication, other valid types for this node are MobileApplication, WebApplication, and VideoGame.

The url prop­erty should be a link to where the pro­ject is de­ployed, e.g. crates.io. sameAs is for any other pages as­so­ci­ated with the pro­ject, like its source code repos­i­tory.

There are lots of valid values for applicationCategory; you can find a list in Google’s definition for SoftwareApplication.

Even if your pro­ject is FOSS, in­clude of­fers but make sure to set the price to 0.

BreadcrumbList

BreadcrumbList is widely use­ful and should be in­cluded on all pages aside from the root page. It is used to de­scribe the cat­e­gori­sa­tion of a page, which is­n’t nec­es­sar­ily the ac­tual path to the page.

{
  "@type": "BreadcrumbList",
  "@id": "https://hawksley.dev/blog/hack-club-shipwrecked/#breadcrumb",
  "itemListElement": [
    {
      "@type": "ListItem",
      "item": "https://hawksley.dev/",
      "position": 1,
      "name": "Home"
    },
    {
      "@type": "ListItem",
      "item": "https://hawksley.dev/blog/",
      "position": 2,
      "name": "Blog"
    },
    {
      "@type": "ListItem",
      "item": "https://hawksley.dev/blog/hack-club-shipwrecked/",
      "position": 3,
      "name": "The Shipwrecked Hackathon by Hack Club"
    }
  ]
}

BreadcrumbList de­scribes the path of a page. By in­clud­ing one, you can con­trol how search en­gines rep­re­sent the path of a spe­cific page.

Here, the search re­sult for my blog post con­tains the path https://​hawk­sley.dev › Blog. If your site al­ready uses short paths, this node is a mi­nor gain and can be omit­ted. However, if your paths are longer, BreadcrumbList is use­ful for short­en­ing them.
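Because the list is so regular, it is a natural candidate for generation. Below is a small Python sketch under that assumption: `breadcrumb_list` is a hypothetical helper that takes the page URL and an ordered trail of (name, URL) pairs and numbers the positions from 1.

```python
def breadcrumb_list(page_url: str, trail: list[tuple[str, str]]) -> dict:
    """Build a BreadcrumbList node from an ordered (name, url) trail."""
    return {
        "@type": "BreadcrumbList",
        "@id": f"{page_url}#breadcrumb",
        "itemListElement": [
            {"@type": "ListItem", "item": url, "position": position, "name": name}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


post_url = "https://hawksley.dev/blog/hack-club-shipwrecked/"
crumbs = breadcrumb_list(
    post_url,
    [
        ("Home", "https://hawksley.dev/"),
        ("Blog", "https://hawksley.dev/blog/"),
        ("The Shipwrecked Hackathon by Hack Club", post_url),
    ],
)
print(crumbs["@id"])
```

The trail is passed in explicitly rather than derived from the URL, since the categorisation of a page is not necessarily its actual path.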

CollectionPage

The CollectionPage node is a sub­type of WebPage, us­able for pages that pri­mar­ily con­tain lists. For ex­am­ple, my /elsewhere/ page lists all my other pro­files, and /blog/ lists all my blog posts.

{
  "@type": "CollectionPage",
  "@id": "https://hawksley.dev/elsewhere/#webpage",
  "url": "https://hawksley.dev/elsewhere/",
  "isPartOf": { "@id": "https://hawksley.dev/#website" },
  "name": "Elsewhere",
  "description": "Online profiles of Ethan Hawksley, a UK-based CS student. Links to his development, social media, technical writing, and security accounts.",
  "inLanguage": "en-GB",
  "about": { "@id": "https://hawksley.dev/#person" },
  "breadcrumb": { "@id": "https://hawksley.dev/elsewhere/#breadcrumb" }
}

You’ve met most of these prop­er­ties al­ready, so they are largely self-ex­plana­tory. Make sure you link bread­crumb to the cor­rect BreadcrumbList! It needs to be the one on the cur­rent page for it to make any sense.
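A mismatched breadcrumb link is easy to catch mechanically. The sketch below is one possible check, with the assumption that an object containing only `@id` is a reference and any object with `@id` plus other keys is a definition; `dangling_refs` is a hypothetical helper, not part of any library.

```python
def dangling_refs(graph: list[dict]) -> set[str]:
    """Return every referenced @id that no node on this page defines."""
    defined: set[str] = set()
    referenced: set[str] = set()

    def walk(value) -> None:
        if isinstance(value, dict):
            if set(value) == {"@id"}:
                referenced.add(value["@id"])  # bare {"@id": ...} is a reference
                return
            if "@id" in value:
                defined.add(value["@id"])  # a node with properties defines its @id
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    walk(graph)
    return referenced - defined


graph = [
    {"@type": "WebSite", "@id": "https://hawksley.dev/#website",
     "name": "Ethan Hawksley"},
    {"@type": "Person", "@id": "https://hawksley.dev/#person",
     "name": "Ethan Hawksley"},
    {"@type": "CollectionPage", "@id": "https://hawksley.dev/elsewhere/#webpage",
     "isPartOf": {"@id": "https://hawksley.dev/#website"},
     "about": {"@id": "https://hawksley.dev/#person"},
     "breadcrumb": {"@id": "https://hawksley.dev/elsewhere/#breadcrumb"}},
    # Deliberate mistake: this breadcrumb belongs to a different page.
    {"@type": "BreadcrumbList", "@id": "https://hawksley.dev/blog/#breadcrumb",
     "itemListElement": []},
]

missing = dangling_refs(graph)
print(missing)
```

Here the check reports the `/elsewhere/` breadcrumb ID, because the only BreadcrumbList on the page was copied from `/blog/`.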

Blog

You should add the Blog node to your blog’s in­dex or home page. It acts as a step­ping stone be­tween your WebSite and the in­di­vid­ual blog posts you pub­lish.

{
  "@type": "Blog",
  "@id": "https://hawksley.dev/blog/#blog",
  "isPartOf": { "@id": "https://hawksley.dev/#website" },
  "mainEntityOfPage": { "@id": "https://hawksley.dev/blog/#webpage" },
  "name": "Ethan Hawksley’s Blog",
  "description": "Technical blog of Ethan Hawksley, a UK-based CS student. Articles on systems programming, low-level computing, cybersecurity, and computer science.",
  "inLanguage": "en-GB",
  "dateModified": "2026-05-17T00:00:00.000Z",
  "publisher": { "@id": "https://hawksley.dev/#person" },
  "license": "https://creativecommons.org/licenses/by/4.0/"
}

date­Mod­i­fied is a good fresh­ness sig­nal, but if you don’t have it handy, don’t worry. Including li­cense lets crawlers know un­der what cir­cum­stances they can use your prose.

If you’ve done prior re­search into JSON-LD, you may be sur­prised that the pub­lisher prop­erty is set to be a Person, not an Organization. Although it used to re­quire one, Google’s doc­u­men­ta­tion has since been re­laxed and Person is en­tirely valid too, and ar­guably more ac­cu­rate for a per­sonal web­site.

BlogPosting

The last node we’ll be cov­er­ing is BlogPosting. It should be in­cluded on all pub­lished blog posts, pro­vid­ing added in­for­ma­tion to crawlers so they can more ac­cu­rately rep­re­sent them, in­clud­ing both more ac­cu­rate place­ment and richer de­tails in search re­sults.

{
  "@type": "BlogPosting",
  "@id": "https://hawksley.dev/blog/need-for-post-quantum-cryptography/#blogposting",
  "url": "https://hawksley.dev/blog/need-for-post-quantum-cryptography/",
  "mainEntityOfPage": { "@id": "https://hawksley.dev/blog/need-for-post-quantum-cryptography/#webpage" },
  "isPartOf": { "@id": "https://hawksley.dev/blog/#blog" },
  "headline": "The Need for Post-Quantum Cryptography",
  "description": "Quantum computers are closer to breaking RSA and ECC than we thought. Learn what post-quantum cryptography is and how to start migrating.",
  "articleSection": "cybersecurity",
  "keywords": "cybersecurity, quantum",
  "inLanguage": "en-GB",
  "datePublished": "2026-04-13T00:00:00.000Z",
  "dateModified": "2026-04-17T00:00:00.000Z",
  "author": { "@id": "https://hawksley.dev/#person" },
  "publisher": { "@id": "https://hawksley.dev/#person" },
  "image": {
    "@type": "ImageObject",
    "@id": "https://hawksley.dev/blog/need-for-post-quantum-cryptography/#blogposting-image",
    "url": "https://hawksley.dev/og/blog/need-for-post-quantum-cryptography.png",
    "width": 1200,
    "height": 630
  },
  "license": "https://creativecommons.org/licenses/by/4.0/"
}

Since this is a per­sonal site, it is al­right for au­thor and pub­lisher to both point to­wards the same Person node. The im­age prop­erty should mir­ror the OG im­age that the post al­ready uses for link pre­views.

Conclusion

Congrats, that’s all the JSON-LD a per­sonal site needs! I’ve struc­tured this post to make it as easy as pos­si­ble for you to copy-paste and im­ple­ment into your own per­sonal site. Even if you run a sta­tic site with­out a build step, you can still ben­e­fit from adding at a min­i­mum WebSite, ProfilePage, and Person to the root page of your site.

If you have any ques­tions, you can get in touch via email, and I’ll do my best to help.

The 100,000 whys of AI

lcamtuf.substack.com

One of the most painful ar­gu­ments I keep hav­ing with fel­low techies is the ques­tion of whether you can dis­tin­guish be­tween hu­man-writ­ten and AI-generated text.

Their skep­ti­cism is rooted in rea­son: at their core, LLMs are state-of-the-art sta­tis­ti­cal mod­els of how hu­mans talk. If so, the out­put from the model should be al­most by de­f­i­n­i­tion in­dis­tin­guish­able from hu­man lan­guage un­der any sta­tis­ti­cal test.

I don’t think this is al­ways ar­gued in good faith; at least some of the de­bates are started by folks who wish to main­tain de­ni­a­bil­ity for their own un­der­handed use of the tech. But if you sin­cerely hold this be­lief, I pre­sent you the fol­low­ing col­lage:

The image shows about 150 Amazon book covers that appear if you search the site for “100000 whys” (link). Some of these books are category bestsellers in children’s literature. You can view a zoomable, full-resolution version here.

There’s nothing inhuman about any of these titles or covers. At the same time, I probably don’t need to convince you that you’re staring at the purest form of AI slop that now fills up many nonfiction book categories on Amazon. More specifically, what we’re seeing here is the artifact of the tools being quasi-deterministic: if a hundred “authors” give their favorite AI tool a similar prompt (say, “generate a reference book for children”), the model will produce functionally identical output perhaps 80% of the time.

The sim­i­lar­i­ties in the col­lage go far be­yond the choice of ti­tles: for ex­am­ple, all the cov­ers in the top row fea­ture a roar­ing di­nosaur in the top left cor­ner of the de­sign. There are many other clus­ters in the data, too. Look for a re­cur­ring red-and-white car­toon rocket, a golden re­triever, a lion, and so forth. The sim­i­lar­i­ties ex­tend even to au­thor names: Ethan Bright, Nolan Bright, Pamela Bright, Daniel Bright, Thomas Bright, Andrew W. Bright, Mayan Bright, Mary Bright, Levi Bright — the Brights must be a big and ex­cep­tion­ally tal­ented fam­ily.

This is precisely what makes LLM writing distinctive: it’s not that the models’ individual mannerisms are different from ours. It’s that they resort to the same, complex set of mannerisms in response to almost any normal prompt. This is a fuzzy signal, so you shouldn’t fire your intern when they say “it’s not this — it’s that”. But in more casual settings, it’s OK to trust your gut. In fact, these instincts are becoming increasingly important because traditional models of online interactions fall apart if it takes much less effort to produce content than to engage with it.

PS. If you’re using an LLM to automate blogging: yes, the tech is amazing, but chances are, your publication could be renamed to “100,000 Whys”.

Everything Is Logarithms

alexkritchevsky.com

Some con­nec­tions be­tween things, which I have not seen else­where. Maybe they mean some­thing?

1. The Baseless Logarithm

Normally one writes a log­a­rithm with a base, \(\log_b (x)\), to mean

\[y = \log_b (x) \Lra b^y = x\]

And then you can change the base of the log­a­rithm with

\[\log_b (x) = \frac{\log_a (x)}{\log_a(b)}\]

Which fol­lows from re­ar­rang­ing \(\log_a (x) = \log_a (b^{\log_b x}) = \log_b (x) \times \log_a (b)\).

One way of thinking about what this formula does is that it is a change of units, similar to writing \(2 \text{ km} = 2000 \text{ m} / \frac{1000 \text{ m}}{1 \text{ km}}\) or \(5 \text{ bytes} = 40 \text{ bits} / \frac{8 \text{ bits}}{1 \text{ byte}}\). It says: how many copies of \(b\) are in \(x\)? It’s the number of copies of \(a\) in \(x\), divided by the number of copies of \(a\) that are in \(b\).
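As a quick numerical sanity check of the units reading, here is a minimal Python sketch. The helper name `change_base` is made up for illustration; it only uses the standard library's two-argument `math.log`.

```python
import math

def change_base(x: float, b: float, a: float) -> float:
    """log_b(x), computed as (copies of a in x) / (copies of a in b)."""
    return math.log(x, a) / math.log(b, a)

# The intermediate base a is only a choice of units, so it should not matter.
for a in (2.0, math.e, 10.0, 7.5):
    print(a, change_base(1000.0, 10.0, a))  # each value is close to 3.0
```

Whatever intermediate base is chosen, the ratio comes out the same up to floating-point rounding, which is the sense in which the base acts like a unit.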

This is perfectly simple, but for some reason it’s hard to think about logarithms that way. The notation kind of… obfuscates things? Specifically it is hard to read \(\log_b x\) as “how many copies of \(b\) are in \(x\)”, because that English expression should correspond to the notation \(x/b\), not \(\log_b x\).

I found a way of thinking about logarithms which I think makes this clearer, but you have to allow a sort of odd object that I am calling the baseless logarithm. It is simply a logarithm without a base:

\[\log N\]

which we regard as an abstract object, not a number. Then we write our normal “based” logarithm as a ratio of two of these baseless logarithms:

\[\log_2 N = \frac{\log N}{\log 2}\]

Note, this is al­ready sort of a thing peo­ple col­lo­qui­ally do, e.g. leav­ing out the base of log­a­rithms in as­ymp­totic for­mu­las. But I do not mean it as a short­hand. It is use­ful to re­gard it as an ac­tual al­ge­braic ob­ject.

We interpret \(\log 2\) as being the unit “bits”. To write \(\log N\) in bits is to factor it as a multiple of \(\log 2\):

\[\log N = \frac{\log N}{\log 2} \log 2 = \log_2 (N) \log 2 = \log_2 (N) \text{ bits}\]

Then the change-of-base for logarithms follows from just writing the same geometric quantity in different units. For example \(\log e\) as a unit is sometimes called “nats”:

\[\begin{aligned} \log N = \frac{\log N}{\log 2} \log 2 = \log_2 (N) \text{ bits} = \frac{\log N}{\log e} \log e = \ln (N) \text{ nats} \end{aligned}\]
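One way to make the "abstract object, not a number" idea concrete is a small Python sketch. The class `BaselessLog` and the constants `BITS` and `NATS` are hypothetical names invented here. The object has to store something internally, and this sketch stores a natural log, but that hidden choice cancels in every ratio and is never exposed.

```python
import math

class BaselessLog:
    """Hypothetical stand-in for the abstract object log N.

    It exposes no numeric value of its own; only the ratio of two
    such objects is an ordinary number.
    """

    def __init__(self, n: float):
        if n <= 0:
            raise ValueError("log N needs N > 0")
        # Internal representation only: any fixed base would do,
        # because it cancels whenever two BaselessLogs are divided.
        self._hidden = math.log(n)

    def __truediv__(self, unit: "BaselessLog") -> float:
        # log N / log b = log_b N
        return self._hidden / unit._hidden

BITS = BaselessLog(2)        # the unit log 2
NATS = BaselessLog(math.e)   # the unit log e

log_n = BaselessLog(1024)
in_bits = log_n / BITS       # log_2 1024
in_nats = log_n / NATS       # ln 1024
# Converting units: a count in nats times (nats / bits) is a count in bits.
converted = in_nats * (NATS / BITS)
```

Here `converted` agrees with `in_bits` up to rounding, which is the change-of-base formula written as a unit conversion.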

The baseless \(\log N\) is sort of the multiplicative version of an object that might be familiar from discussions of vectors. It is common with vectors to distinguish between points and displacements: a displacement vector \(\b{v}\) is given by the difference of two points \(\b{v} = (b) - (a)\). When we think of points as having coordinates, this involves an explicit choice of origin \(\O\), such that \(\b{a} \equiv (a) - \O\) and \(\b{b} \equiv (b) - \O\). Then a displacement vector is constructed by subtracting off the factors of \(\O\), \(\b{v} = \b{b} - \b{a} = ((b) - \O) - ((a) - \O) = (b) - (a)\). The baseless logarithm implements the same thing but with multiplication: the value \(\log N\) may be thought of as \(\log N / \log \O\) for an unspecified choice of origin; turning it into an actual numeric value involves dividing two such logarithms to cancel out the origin, \(\log_M N = \log N / \log M = (\log N / \log \O) / (\log M / \log \O)\). I think of \(\log N\) as the point corresponding to \(N\) and \(\log N / \log \O\) as its corresponding displacement vector once you pick a coordinate system. I prefer to think of the point as more fundamental.

You might ask: if we have a baseless logarithm \(\log N\), do we also have a “baseless exponential”? Normally \(b^{\log_b N}\) can be written as something like \(b^{\log_b N} = b^{\ln N / \ln b} = e^{\ln N} = N\); is there any way to do this without actually choosing a base? I think the answer has to be “no”. All we can say is that we have split the one object, a logarithm \(\log_b N\) which is the solution of \(b^y = N\), into two objects, \(\log N\) and \(\log b\), each of which on its own is “without units” and so has no numerical meaning. It is just like points in space: a point on its own has no operation of addition and does not have a length. We can subtract points to produce vectors (relative to a symmetry group) but not add them, and the usual operations in coordinates all require a choice of origin.

In fact there are many sur­pris­ing sim­i­lar­i­ties be­tween log­a­rithms and vec­tors.

2. Logarithms are Vectors

When doing vector algebra and differential geometry in a properly covariant way, we distinguish between abstract vectors and vectors in a particular coordinate system. My personal convention for this is to refer to the abstract vectors as “geometric” vectors and always write them in bold, \(\v\), whereas “coordinate” vectors, tuples of their values in coordinates, are written with an arrow over them like \(\vec{v} = (v_x, v_y, v_z)\). Boldface geometric vectors are always coordinate-free, whereas coordinate vectors are just collections of numbers or other objects. The geometric vector \(\b{v}\) can be written as a dot product of its coordinates with a ‘frame’ \(X = (\x, \y, \z)\) of basis vectors

\[\b{v} = \vec{v} \cdot X = (v_x, v_y, v_z) \cdot (\x, \y, \z) = v_x \x + v_y \y + v_z \z\]

The projection of \(\v\) onto a basis vector \(\x\) is then given by ‘measuring’ the vector against the basis vector (which does not have to be of unit length). I like to write this as division because it acts a lot like division (although it’s technically pseudodivision instead):

\[\frac{\v}{\x} = v_x\]

That is my own very nonstandard notation for vector division. The more common way to write this is to project a component of a differential \(df = f_x dx + f_y dy + f_z dz\) with a partial derivative, which is also the pseudodivision operation (which is incidentally the sense in which partial derivatives kinda work like division but not really):

\[\frac{\p f}{\p x} = f_x\]

I will write things in both forms to make it easy to trans­late be­tween them; I do pre­fer my vec­tor-di­vi­sion ver­sion be­cause it avoids bring­ing in the ir­rel­e­vant no­ta­tions of dif­fer­en­tial cal­cu­lus, but since the lat­ter is ac­tu­ally stan­dard I ought to in­clude it for com­par­i­son.

Suppose \(\b{v}\) is one-dimensional, \(\b{v} = v_x \x\). Then the projection onto a ‘measuring stick’ \(\b{m} = m \x\) measures its length in terms of multiples of \(m\):

\[\frac{\v}{\b{m}} = \frac{v_x \x}{m \x} = \frac{v_x}{m}\]

Multiplying by \(\b{m}\) again is what we mean by writing \(\b{v}\) in units of \(\b{m}\):

\[\frac{\b{v}}{\b{m}} \b{m} = (\frac{v_x}{m}) \text{m} \x\]

(In differentials, this is the differential of \(f\) restricted to its \(dx\) component: \(\frac{\p f}{\p x} dx = f_x dx = df \mid_{x}\), which is a perfectly interesting object (a covariant derivative) that one does not see written in this way very often. By the way, it’s not really important here, but it is possible to view all measurements of the length of vectors in this way by thinking first of rewriting an arbitrary vector \(\v = v_x \x + v_y \y + v_z \z\) in a polar form \(\v = v_r \r + v_{\theta} \theta\) and then projecting onto \(\r\), \(\| \v \| = \v/\r\). This tends to be a good way of looking at things.)

The base­less log­a­rithm is per­form­ing the same op­er­a­tion on log­a­rithms, where \(\log N\) is fill­ing the role of the geo­met­ric vec­tor \(\v\) and \(\log 2 = \text{bits}\) is the unit vec­tor or mea­sur­ing stick, which takes the role of \(\x\).

\[\begin{aligned} \frac{\log N}{\log 2} &= \log_2 N \\ \frac{\log N}{\log 2} \log 2 &= \log_2 N \text{ bits} \end{aligned}\]

In this sense base­less log­a­rithms write num­bers in co­or­di­nates in ex­actly the same way that mea­sur­ing sticks write vec­tors in co­or­di­nates.

The equiv­a­lence of log­a­rithms in dif­fer­ent units

\[\begin{aligned} \log N &= \frac{\log N}{\log 2} \log 2 = \log_2 (N) \text{ bits} \\ &= \frac{\log N}{\log e} \log e = \ln (N) \text{ nats} \end{aligned}\]

is the same as the equiv­a­lence of geo­met­ric vec­tors in dif­fer­ent units

\[\begin{aligned} \v &= \frac{\v}{\x} \x = v_x \x \\[1em] &= \frac{\v}{\x'} \x' = v_{x'} \x' \end{aligned}\]

or

\[\begin{aligned} df &= \frac{\p f}{\p x} dx = f_x dx \\ &= \frac{\p f}{\p x'} dx' = f_{x'} dx' \end{aligned}\]

And the change of base for­mula that com­putes a ra­tio of log­a­rithms in dif­fer­ent bases

\[\begin{aligned} \log_2 N \text{ bits}&= \ln N \text{ nats} \\ \log_2 N &= \frac{\text{nats}}{\text{bits}} \ln N\\ &= \frac{\log e}{\log 2} \ln N \\ &= \log_2 (e) \ln N \end{aligned}\]

is exactly like the change of coordinates for a vector, where \(\x\) and \(\x'\) are two units for the same quantity.

\[\begin{aligned} v_x \x &= v_{x'} \x' \\ v_x &= \frac{\x'}{\x} v_{x'} \end{aligned}\]

or

\[\begin{aligned} f_x dx &= f_{x'} dx' \\ f_x &= \frac{dx'}{dx} f_{x'} \end{aligned}\]

What logarithms don’t allow you to do, and what partial derivatives and vector division do allow, is to actually talk about a partial derivative operation in isolation. For example, if \(N = 2^a 3^b\), you can only talk about the ratio with respect to a single unit \(\log 2\)

\[\frac{\log N}{\log 2} = a \frac{\log 2}{\log 2} + b \frac{\log 3}{\log 2} = a + b \log_2 3\]

which is equiv­a­lent to writ­ing a vec­tor as a mul­ti­ple of a sin­gle ba­sis vec­tor (like in Clifford/geometric al­ge­bra)

\[\frac{\v}{\x} = v_x + v_y \frac{\y}{\x}\]

or to a to­tal de­riv­a­tive

\[\frac{df}{dx} = f_x + f_y \frac{dy}{dx}\]

But there is no di­rect equiv­a­lent of the op­er­a­tion of par­tial dif­fer­en­ti­a­tion—there’s noth­ing that acts like \(N \? (\log_2 N) \log 2 + (\log_3 N) \log 3\).

However, I keep find­ing that peo­ple have gone and in­vented the pro­jec­tion / par­tial de­riv­a­tive op­er­a­tion on log­a­rithms any­way. For ex­am­ple, the p-adic val­u­a­tion in num­ber the­ory

\[\nu_p (n) = \max \{ k \in \bb{N} \mid p^k \mid n \}\]

corresponds to extracting the coefficient of \(\log p\) of a natural number in a logarithmic basis

\[\begin{aligned} \log n &= \log 2^{n_2} 3^{n_3} 5^{n_5} \cdots \\ &= n_2 \log 2 + n_3 \log 3 + n_5 \log 5 + \ldots \\ \nu_p (n) &= n_p \end{aligned}\]

Each coefficient is a non-negative integer, and \(\nu_p\) just takes the component corresponding to \(\log p\). Clearly \(\log n\) acts like a vector (although since the coefficients are in \(\bb{N}\) it is technically a commutative monoid instead of a vector space… nevertheless, it has the familiar structure of a vector). Since \(\nu_p\) is a ‘projection’ out of this logarithm, it still obeys logarithmic identities like \(\nu_p(m/n) = \nu_p(m) - \nu_p(n)\). But there is not really a good notation for actually expressing it as a projection, so sadly it gets a whole separate nomenclature that you have to learn.
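To see the projection reading in action, here is a minimal Python sketch of \(\nu_p\) as "read off the coefficient of \(\log p\)". The function names `nu` and `log_coordinates` are illustrative, and accepting a `Fraction` as input is an assumption of this sketch so that the quotient identity can be checked directly.

```python
from fractions import Fraction

def nu(p: int, q) -> int:
    """p-adic valuation of a nonzero integer or Fraction q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("nu_p(0) is not finite")
    num, den, k = q.numerator, q.denominator, 0
    while num % p == 0:      # each factor of p upstairs adds one copy of log p
        num //= p
        k += 1
    while den % p == 0:      # each factor of p downstairs removes one copy
        den //= p
        k -= 1
    return k

def log_coordinates(q, primes=(2, 3, 5, 7)):
    """Coefficients of log q in the 'basis' {log p}, for the listed primes only."""
    return {p: nu(p, q) for p in primes}

coords = log_coordinates(360)   # 360 = 2**3 * 3**2 * 5
```

With these definitions `nu(3, 360 * 12) == nu(3, 360) + nu(3, 12)` and `nu(2, Fraction(360, 16)) == nu(2, 360) - nu(2, 16)`, which are the logarithmic identities quoted above.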

The same thing also works for ra­tio­nal \(n\) or rad­i­cal \(n\) (meaning it is the prod­uct of rad­i­cals of prime fac­tors), in which case the co­ef­fi­cients be­come in­te­gers or ra­tio­nals. (As a bonus the re­sult­ing ob­jects live in an ac­tual vec­tor space.)

Another example of these logarithmic projections: in complex analysis the “order of vanishing” \(\text{ord}_a f(z)\) of a meromorphic function \(f(z)\) at a point \(z=a\) is the order of the pole or zero at a point (where zeroes are like negative poles). That is, it is the degree of the lowest-degree term in the Laurent series of the function around the point \(z=a\), written as \(-n\) here,

\[f(z) = f_{-n} (z-a)^{-n} + f_{-n+1} (z-a)^{-n+1} + \cdots + f_{-1} (z-a)^{-1} + f_0 + f_1 (z-a) + \cdots\]

(that is, \(n\) is the smallest integer such that \((z-a)^n f(z)\) is holomorphic around \(a\), and the order is \(-n\)). This is extracted with a logarithm:

\[\text{ord}_a f(z) = \lim_{z \ra a} \frac{\log f(z)}{\log (z-a)} = -n\]

since for \(z \approx a\), \(f(z) \sim f_{-n} (z-a)^{-n}\) which dom­i­nates the other terms that blow up less quickly. If we write \(g(z)\) for the rest of \(f(z)\) which has \(\text{ord}_a (g(z)) > -n\):

\[\begin{aligned} \lim_{z \ra a} \frac{\log f(z)}{\log (z-a)} &= \lim_{z \ra a} \frac{\log (f_{-n} (z-a)^{-n} + g(z))}{\log (z-a)}\\ &= \lim_{z \ra a} \frac{\log f_{-n} (z-a)^{-n} (1 + \frac{g(z)}{f_{-n}} (z-a)^n)}{\log (z-a)} \\ &= \lim_{z \ra a} \frac{\log f_{-n}}{\log (z-a)} -n \frac{\log (z-a)}{\log (z-a)} + \frac{\log (1 + c (z-a))}{\log (z-a)} \\ &= -n \end{aligned}\]

So this is a very similar operation: the limit \(\lim_{z \ra a} \log (z-b)/\log(z-a) = 1_{a=b}\) serves to cancel out the rest of the terms, like how \(\p_j dx^i \sim (\p x^i)/(\p x^j) = 1_{i=j}\) serves to cancel out the terms in a partial derivative, extracting the \(dx\) component of \(df = f_x dx + f_y dy + \ldots\).

(I’m not very good at com­plex analy­sis so that’s all I’m go­ing to say about that. Still, it seems clear that this is ba­si­cally the same op­er­a­tion.)
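A rough numerical illustration of that limit, as a Python sketch under some simplifying assumptions: the point is fixed at \(a = 0\) (so that `z - a` does not suffer floating-point cancellation), \(z\) is a small positive real number, and absolute values are used inside the logarithms to avoid choosing a branch. The function name `order_at_zero` is made up for this example.

```python
import math

def order_at_zero(f, eps: float = 1e-100) -> float:
    """Estimate log|f(z)| / log|z| at the small real point z = eps."""
    return math.log(abs(f(eps))) / math.log(eps)

def pole(z: float) -> float:
    return 3.0 / z**2 + 5.0       # double pole at 0

def zero(z: float) -> float:
    return math.sin(z) ** 3       # triple zero at 0

pole_estimate = order_at_zero(pole)   # close to -2
zero_estimate = order_at_zero(zero)   # close to 3
```

The pole estimate is only near \(-2\), not equal to it: the leading coefficient contributes a term \(\log f_{-n} / \log (z-a)\), which is the first term in the third line of the derivation above and decays only logarithmically as \(z \ra a\).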

We see that the baseless logarithm \(\log n\) works a lot like a vector \(\v\) or differential \(df\), and then expressing a logarithm in a base like \(\log_2 n = \log n / \log 2\) is a lot like a total derivative \(df/dx\) or Clifford division \(\v \ast \b{x}^{-1}\). What is missing is some equivalent of the partial derivative / projection operator that projects only onto that component… but various fields have gone and found a way to invent that anyway, either in the form of a partial derivative \(\p f/\p x\), or just by making up the \(p\)-adic valuation \(\nu_p\), or by the limits \(\lim_{z\ra a} \log f(z) / \log (z-a)\) in complex analysis. The similarities are all suspicious, though, and I can’t help but think there is some unifying theory here that ties all this together… but I can’t see what it is yet.

One thing that we might try in order to invent a \(\log_2 N\) that acts like \(\p_x f\) or \(\b{v}/\x\) is to somehow restrict the values of the logarithms to certain spaces, e.g. integers or rationals. Since the \(\{\log p_i\}\) are linearly independent over \(\bb{Q}\) (which is essentially equivalent to prime factorizations being unique), you would end up with objects like \(\log_2 3 = \log 3/\log 2\) which have no value in \(\bb{Q}\); “zeroing” those out then gives something that acts like a partial derivative. But I don’t know if that’s useful. Certainly it doesn’t help in any numeric context.

Anyway, onto more things that are log­a­rithms.

3. Vectors are also Logarithms?

In differential geometry one interprets vectors like \(\v = v_x \x + v_y \y\) as being written in a basis of partial derivative operators, \(\v = v_x \p_x + v_y \p_y\). These can then be used to create discrete translations which move around in the various coordinates,

\[T^{\v} = e^{\v} = e^{v_x \p_x + v_y \p_y }\]

The par­tial de­riv­a­tives are here in or­der to make it op­er­ate on func­tions

\[e^{v_x \p_x + v_y \p_y} f(x,y) = f(x + v_x, y + v_y)\]

which is true at the level of Taylor expansions as well. I often find it easier to dispense with the partial derivatives and just think of these as translation operators on the space \((x,y)\) directly

\[e^{v_x \p_x + v_y \p_y} (x, y) = (x + v_x, y + v_y)\]

(You can also think of this as acting on the function \(f(x) = x\), but that feels like overkill.)
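The Taylor-expansion claim can be checked exactly on polynomials, where the exponential series terminates. The following Python sketch works in one variable only and represents a polynomial as a list of coefficients; the helper names `derivative`, `evaluate` and `translate` are invented for this illustration.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    # coeffs[k] is the coefficient of x**k
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

def translate(coeffs, v):
    """Apply exp(v d/dx) = sum_k v**k / k! * (d/dx)**k to a polynomial."""
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]    # k-th derivative, starting at k = 0
    k = 0
    while term:                             # derivatives of a polynomial run out
        scale = Fraction(v) ** k / factorial(k)
        for i, c in enumerate(term):
            result[i] += scale * c
        term = derivative(term)
        k += 1
    return result

f = [1, 0, -2, 1]              # f(x) = 1 - 2x^2 + x^3
v = Fraction(3, 2)
shifted = translate(f, v)      # coefficients of f(x + v)
```

Evaluating `shifted` at any rational `x` gives the same value as evaluating `f` at `x + v`, with no rounding because everything is done in `Fraction` arithmetic.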

In any case, all this is really doing (in flat space, at least) is changing the additive vector \(\b{v}\) into a multiplicative form \(T^{\b{v}}\) which corresponds to the same operation, but whose terms are multiplied instead of added, and whose scalar coefficients are applied via exponentiation instead of multiplication. The basis is now translation operators in each coordinate:

\[T^{\v} = e^{v_x \p_x} e^{v_y \p_y} = T_x^{v_x} T_y^{v_y}\]

(In non-flat space this is not so sim­ple be­cause the trans­la­tions in dif­fer­ent co­or­di­nates may not com­mute; you can still write it in this form but it’s a lot more com­pli­cated.)

What this means for us is: look, vec­tors are log­a­rithms too

\[\begin{aligned} \ln T^{\v} &= \ln T_x^{v_x} T_y^{v_y} \\ &= v_x \ln T_x + v_y \ln T_y \\ &= v_x \p_x + v_y \p_y \end{aligned}\]

I can’t exactly say why, but it seems preferable to have this written in terms of baseless logarithms also. We do this by realizing that \(T_x = e^{\p_x} = T^{\p_x}\) and thinking of this symbol \(T\) as a sort of ‘generic’ base for translations, absent the numeric meaning of the symbol \(e\), which has \(\log T_x = \log T^{\p_x} = \p_x \log T\). Then

\[\log T^{\v} = \v \log T = v_x \p_x \log T + v_y \p_y \log T\]

And then we can write \(\v = \log_T T^{\v} = \log T^{\v} / \log T\). This is equivalent to the natural log version but it avoids explicitly depending on the numeric value of \(e\): any choice of base for the logarithm \(T\) gives the same concept of a vector, written in terms of the exponentiation of \(T\), but now we make explicit that the ‘units’ on \(\v\) come in part from the units on \(\log T\) itself.

So vec­tors in dif­fer­en­tial geom­e­try may also be thought of as log­a­rithms, specif­i­cally, the log­a­rithms of trans­la­tion op­er­a­tors.

Regular multiplication can even be viewed as an example of this. A product like \(xa\) can be rewritten as “translation” in the \(\ln a\) coordinate:

\[xa = e^{\ln x} e^{\ln a} = e^{(\ln x) \p_{\, \ln a}} a = x^{\p_{\, \ln a}} a\]

I’m not sure how that would ever be useful but maybe it’s a bit interesting?

4. Logarithms are Derivatives?

This part does­n’t re­ally mat­ter, I just thought I would men­tion it so that this ar­ti­cle con­tains every fun fact about log­a­rithms that I know.

One way of defin­ing the nat­ural log­a­rithm is

\[\ln x = \lim_{a \ra 0} \frac{x^a - 1}{a}\]

I find this formula neat for a few reasons. Mostly it explains where a lot of the behaviors of \(\ln\) in calculus come from.

It fol­lows from sub­sti­tut­ing \(x^a = e^{a \ln x}\) and then Taylor ex­pand­ing:

\[\frac{x^a - 1}{a} = \frac{e^{a \ln x} - 1}{a} = \frac{(1 + a \ln x + \ldots) - 1}{a} \stackrel{a \ra 0}{=} \ln x\]
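The limit is easy to probe numerically. This is a small Python sketch, with `ln_via_limit` as a made-up helper name; it evaluates the quotient at a fixed small \(a\) rather than taking a true limit, so it is only accurate to a few digits.

```python
import math

def ln_via_limit(x: float, a: float = 1e-8) -> float:
    """Approximate ln x by (x**a - 1) / a for a small but nonzero a."""
    return (x**a - 1.0) / a

# The gap to math.log shrinks roughly in proportion to a,
# matching the next term a * (ln x)**2 / 2 of the Taylor expansion.
for a in (1e-2, 1e-4, 1e-6):
    print(a, ln_via_limit(10.0, a) - math.log(10.0))
```

Making \(a\) much smaller than about `1e-8` stops helping in double precision, because the subtraction `x**a - 1.0` loses digits to cancellation.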
