10 interesting stories served every morning and every evening.




1 636 shares, 25 trendiness

It’s Time to Stop Taking Sam Altman at His Word

OpenAI an­nounced this week that it has raised $6.6 bil­lion in new fund­ing and that the com­pany is now val­ued at $157 bil­lion over­all. This is quite a feat for an or­ga­ni­za­tion that re­port­edly burns through $7 bil­lion a year—far more cash than it brings in—but it makes sense when you re­al­ize that OpenAI’s pri­mary prod­uct is­n’t tech­nol­ogy. It’s sto­ries.

Case in point: Last week, CEO Sam Altman published an online manifesto titled “The Intelligence Age.” In it, he declares that the AI revolution is on the verge of unleashing boundless prosperity and radically improving human life. “We’ll soon be able to work with AI that helps us accomplish much more than we ever could without AI,” he writes. Altman expects that his technology will fix the climate, help humankind establish space colonies, and discover all of physics. He predicts that we may have an all-powerful superintelligence “in a few thousand days.” All we have to do is feed his technology enough energy, enough data, and enough chips.

Maybe some­day Altman’s ideas about AI will prove out, but for now, his ap­proach is text­book Silicon Valley myth­mak­ing. In these nar­ra­tives, hu­mankind is for­ever on the cusp of a tech­no­log­i­cal break­through that will trans­form so­ci­ety for the bet­ter. The hard tech­ni­cal prob­lems have ba­si­cally been solved—all that’s left now are the de­tails, which will surely be worked out through mar­ket com­pe­ti­tion and old-fash­ioned en­tre­pre­neur­ship. Spend bil­lions now; make tril­lions later! This was the story of the dot-com boom in the 1990s, and of nan­otech­nol­ogy in the 2000s. It was the story of cryp­tocur­rency and ro­bot­ics in the 2010s. The tech­nolo­gies never quite work out like the Altmans of the world promise, but the sto­ries keep reg­u­la­tors and reg­u­lar peo­ple side­lined while the en­tre­pre­neurs, en­gi­neers, and in­vestors build em­pires. (The Atlantic re­cently en­tered a cor­po­rate part­ner­ship with OpenAI.)

Despite the rhetoric, Altman’s prod­ucts cur­rently feel less like a glimpse of the fu­ture and more like the mun­dane, buggy pre­sent. ChatGPT and DALL-E were cut­ting-edge tech­nol­ogy in 2022. People tried the chat­bot and im­age gen­er­a­tor for the first time and were as­ton­ished. Altman and his ilk spent the fol­low­ing year speak­ing in stage whis­pers about the awe­some tech­no­log­i­cal force that had just been un­leashed upon the world. Prominent AI fig­ures were among the thou­sands of peo­ple who signed an open let­ter in March 2023 to urge a six-month pause in the de­vel­op­ment of large lan­guage mod­els ( LLMs) so that hu­man­ity would have time to ad­dress the so­cial con­se­quences of the im­pend­ing rev­o­lu­tion. Those six months came and went. OpenAI and its com­peti­tors have re­leased other mod­els since then, and al­though tech wonks have dug into their pur­ported ad­vance­ments, for most peo­ple, the tech­nol­ogy ap­pears to have plateaued. GPT-4 now looks less like the pre­cur­sor to an all-pow­er­ful su­per­in­tel­li­gence and more like … well, any other chat­bot.

The tech­nol­ogy it­self seems much smaller once the nov­elty wears off. You can use a large lan­guage model to com­pose an email or a story—but not a par­tic­u­larly orig­i­nal one. The tools still hal­lu­ci­nate (meaning they con­fi­dently as­sert false in­for­ma­tion). They still fail in em­bar­rass­ing and un­ex­pected ways. Meanwhile, the web is fill­ing up with use­less AI slop,” LLM-generated trash that costs prac­ti­cally noth­ing to pro­duce and gen­er­ates pen­nies of ad­ver­tis­ing rev­enue for the cre­ator. We’re in a race to the bot­tom that every­one saw com­ing and no one is happy with. Meanwhile, the search for prod­uct-mar­ket fit at a scale that would jus­tify all the in­flated tech-com­pany val­u­a­tions keeps com­ing up short. Even OpenAI’s lat­est re­lease, o1, was ac­com­pa­nied by a caveat from Altman that it still seems more im­pres­sive on first use than it does af­ter you spend more time with it.”

In Altman’s ren­der­ing, this mo­ment in time is just a way­point, the doorstep of the next leap in pros­per­ity.” He still ar­gues that the deep-learn­ing tech­nique that pow­ers ChatGPT will ef­fec­tively be able to solve any prob­lem, at any scale, so long as it has enough en­ergy, enough com­pu­ta­tional power, and enough data. Many com­puter sci­en­tists are skep­ti­cal of this claim, main­tain­ing that mul­ti­ple sig­nif­i­cant sci­en­tific break­throughs stand be­tween us and ar­ti­fi­cial gen­eral in­tel­li­gence. But Altman pro­jects con­fi­dence that his com­pany has it all well in hand, that sci­ence fic­tion will soon be­come re­al­ity. He may need $7 tril­lion or so to re­al­ize his ul­ti­mate vi­sion—not to men­tion un­proven fu­sion-en­ergy tech­nol­ogy—but that’s peanuts when com­pared with all the ad­vances he is promis­ing.

There’s just one tiny prob­lem, though: Altman is no physi­cist. He is a se­r­ial en­tre­pre­neur, and quite clearly a tal­ented one. He is one of Silicon Valley’s most revered tal­ent scouts. If you look at Altman’s break­through suc­cesses, they all pretty much re­volve around con­nect­ing early start-ups with piles of in­vestor cash, not any par­tic­u­lar tech­ni­cal in­no­va­tion.

It’s re­mark­able how sim­i­lar Altman’s rhetoric sounds to that of his fel­low bil­lion­aire techno-op­ti­mists. The pro­ject of techno-op­ti­mism, for decades now, has been to in­sist that if we just have faith in tech­no­log­i­cal progress and free the in­ven­tors and in­vestors from pesky reg­u­la­tions such as copy­right law and de­cep­tive mar­ket­ing, then the mar­ket­place will work its magic and every­one will be bet­ter off. Altman has made nice with law­mak­ers, in­sist­ing that ar­ti­fi­cial in­tel­li­gence re­quires re­spon­si­ble reg­u­la­tion. But the com­pa­ny’s re­sponse to pro­posed reg­u­la­tion seems to be no, not like that.” Lord, grant us reg­u­la­tory clar­ity—but not just yet.

At a high enough level of ab­strac­tion, Altman’s en­tire job is to keep us all fix­ated on an imag­ined AI fu­ture so we don’t get too caught up in the un­der­whelm­ing de­tails of the pre­sent. Why fo­cus on how AI is be­ing used to ha­rass and ex­ploit chil­dren when you can imag­ine the ways it will make your life eas­ier? It’s much more pleas­ant fan­ta­siz­ing about a benev­o­lent fu­ture AI, one that fixes the prob­lems wrought by cli­mate change, than dwelling upon the phe­nom­e­nal en­ergy and wa­ter con­sump­tion of ac­tu­ally ex­ist­ing AI to­day.

Remember, these tech­nolo­gies al­ready have a track record. The world can and should eval­u­ate them, and the peo­ple build­ing them, based on their re­sults and their ef­fects, not solely on their sup­posed po­ten­tial.

...

Read the original on www.theatlantic.com »

2 357 shares, 18 trendiness

Cloudflare beats patent troll so badly it basically gives up

Cloudflare on Thursday celebrated a victory over Sable Networks, which the former described as a “patent troll.”

That’s a term for an individual or organization that exists solely to make patent infringement claims in the hope of winning a settlement from defendants concerned about costly patent litigation.

“Sable sued Cloudflare back in March 2021,” wrote Emily Terrell and Patrick Nemeroff, respectively Cloudflare’s senior counsel for litigation and senior associate general counsel, in a write-up Wednesday.

“Sable is a patent troll. It doesn’t make, develop, innovate, or sell anything. Sable IP is merely a shell entity formed to monetize (make money from) an ancient patent portfolio acquired by Sable Networks from Caspian Networks in 2006.”

Patent trolls have vexed the tech­nol­ogy in­dus­try for years, some­times even draw­ing reg­u­la­tory re­sponses as hap­pened over a decade ago when op­por­tunis­tic lit­i­gants fo­cused on patents per­ti­nent to the emerg­ing smart­phone mar­ket. The Obama ad­min­is­tra­tion re­sponded by is­su­ing a set of ex­ec­u­tive ac­tions to curb abuses.

Lately, these patent prof­i­teers have tar­geted the open source com­mu­nity. The Cloud Native Computing Foundation and Linux Foundation last month strength­ened ties with United Patents, a com­pany fo­cused on de­fend­ing against preda­tory patent claims.

Five other com­pa­nies that we know of sued by Sable — Cisco, Fortinet, Check Point, SonicWall, and Juniper Networks — set­tled out of court. Splunk, mean­while, fought back and man­aged last year to con­vince Sable to dis­miss its claim against the op­er­a­tion prior to its takeover by, fun­nily enough, Cisco.

Internet ser­vices gi­ant Cloudflare has notched up an even greater vic­tory. Facing an ini­tial in­fringe­ment law­suit re­gard­ing around a hun­dred claims re­lated to four patents, the cor­po­ra­tion ended up hav­ing to deal with just a sin­gle patent vi­o­la­tion claim.

In February, Cloudflare prevailed when a Texas jury found it did not infringe Sable’s “micro-flow label switching” patent.

The biz both convinced the jury that it did not use the “micro-flow” technology described in US patent 7,012,919 and that the patent was invalid because of prior art.

The existence of two earlier US patents, 6,584,071 and 6,680,933, for router technology developed by Nortel Networks and Lucent in the 1990s, convinced the jury that Sable’s ’919 patent should never have been granted.

The mat­ter also saw Sable pledge not to pur­sue fur­ther ac­tions of this sort.

“In the end, Sable agreed to pay Cloudflare $225,000, grant Cloudflare a royalty-free license to its entire patent portfolio, and to dedicate its patents to the public, ensuring that Sable can never again assert them against another company,” said Terrell and Nemeroff.

The agree­ment means that Sable will tell the US Patent and Trademark Office that it is aban­don­ing its patent rights and no fur­ther claims based on those patents will be pos­si­ble.

The Register called Sable’s listed phone num­ber, and it is no longer in ser­vice. The com­pa­ny’s web­site is un­re­spon­sive and an at­tor­ney for the firm did not im­me­di­ately re­spond to a re­quest for com­ment.

Terrell and Nemeroff said that prior art sub­mis­sions for the Sable case re­lated to Project Jengo, a crowd-sourced patent in­val­i­da­tion ini­tia­tive, will be ac­cepted un­til November 2, 2024. At some point there­after, Cloudflare, which has al­ready given out $70,000 in awards for the case, will se­lect fi­nal award win­ners.

“We’re proud of our work fighting patent trolls and believe that the outcome in this case sends a strong message that Cloudflare will fight back against meritless patent cases and we will win,” a spokesperson for Cloudflare told The Register. ®

...

Read the original on www.theregister.com »

3 265 shares, 17 trendiness

Search for Charts by Data Visualization Functions

What do you want to show?

Here you can find a list of charts categorised by their data visualization functions or by what you want a chart to communicate to an audience. While the allocation of each chart into specific functions isn’t a perfect system, it still works as a useful guide for selecting a chart based on your analysis or communication needs.

...

Read the original on datavizcatalogue.com »

4 263 shares, 13 trendiness

We only learnt of our son’s secret online life after he died at 20

It is the sum­mer of 1989 and a Norwegian cou­ple re­turn from hos­pi­tal cradling their new­born son. Their names are Robert and Trude Steen, this is their first child and they have named him Mats. The early weeks pass in the deliri­ous, hor­mone-flooded fug of new par­ent­hood, Robert doc­u­ment­ing his son’s wrig­gles and cries on his cam­corder with a new-found pa­ter­nal pride that has left him dumb­founded. And as the months pass, the cam­era keeps rolling. Mats grows. He sprouts bright blond Nordic hair. He drags him­self to his feet and be­gins to tod­dle. His fa­ther films him wad­dling across their liv­ing room in a Tom and Jerry T-shirt. He was, Robert re­mem­bers, the most beau­ti­ful, per­fect child”.

From about the age of two, though, some­thing changes. Robert and Trude can­not put their fin­ger on it at first, but a con­cern for their child be­gins to gnaw at them. He strug­gles to get back to his feet when he falls. Playground ob­sta­cles be­come in­sur­mount­able. Robert films Mats as he stum­bles and plonks to his bot­tom. But he just sits on the ground in his dun­ga­rees, cry­ing and help­less. As a par­ent, you know when some­thing is wrong with your child,” Trude says. They just did­n’t know what.

For two years they worry, watching purse-lipped as their son’s physical development slows then stalls. They push and persuade doctors to take their fears seriously. There are tests and examinations and then, finally, they receive the news. “It was 1pm on May 18, 1993,” Robert says. “I remember the hospital office where we were given the message. I remember everything.”

Mats aged five or six

Their son, the doctor explains, has a condition known as Duchenne muscular dystrophy. It is a genetic disorder that causes progressive muscle wastage and will, over time, deprive Mats of all his remaining strength and mobility. Walking will become harder and harder for him. In a few years he will need a wheelchair and then, eventually, a team of round-the-clock carers as his condition gradually renders him physically helpless. There will be feeding tubes and machines to help him clear his lungs, because swallowing and coughing will be beyond him. He may, his parents are told, live to be 20. “Our world broke apart,” Trude says gently. “The message was devastating, brutal. It was the day we understood it wouldn’t go away. It would rule our lives and, in the end, also take Mats away from us.”

Amid the shock and hor­ror, there is also the quiet aban­don­ment of the fu­ture they had hoped for their son: of ski­ing and foot­ball, of uni­ver­sity, a suc­cess­ful ca­reer and per­haps one day a fam­ily of his own. Instead, they learn to fo­cus on each day as it comes and to find and pro­vide what hap­pi­ness they can in each mo­ment. Mats grows into an in­tel­li­gent school­boy with a droll sense of hu­mour. By the age of eight he is in a wheel­chair, but he bonds with his school­mates in Oslo and shares in the com­mu­nal pas­sion for video games. He loves his lit­tle sis­ter, Mia, and their pet dog. Everyone ar­gues over who gets to be on Mats’ team dur­ing fam­ily quiz nights.

His teenage years are harder. His friends, in­evitably, are drawn into a world of house par­ties and late-night cin­ema trips and early ro­man­tic re­la­tion­ships. It is a world Mats can­not en­ter. No friends came knock­ing at the door any more,” Robert says, with­out bit­ter­ness. So Mats spends more and more time alone, play­ing video games. He dis­cov­ers an on­line role-play­ing game called World of Warcraft, in which thou­sands of play­ers can ex­plore a vast, three-di­men­sional, Dungeons & Dragons-style fan­tasy world and work to­gether to com­plete quests and de­feat mon­sters. He sinks hour af­ter hour into the game and his par­ents al­low him to: he is al­ready de­nied so many of life’s plea­sures, it would seem churl­ish to deny him this.

At 18, he grad­u­ates from high school with ex­cel­lent grades but is un­em­ploy­able. He moves into an an­nexe, is looked af­ter by a ro­tat­ing team of car­ers and spends much of his time deeply ab­sorbed in World of Warcraft, his right hand rest­ing awk­wardly on a cus­tom-built key­board, his head lolling to one side as he nav­i­gates an epic world. Robert and Trude some­times sit with him while he plays, but af­ter half an hour they find their at­ten­tion drift­ing. It was bor­ing, just sit­ting there watch­ing some­thing on screen,” Robert ad­mits. We did­n’t know what was go­ing on.”

The years pass. Mats’ 20th birthday comes and goes. He begins to write a blog about his life and condition as his body seems to become smaller and more fragile by the day. In November 2014, he is admitted to hospital with respiratory problems. It is not the first time this has happened and, though these episodes are fraught, he has always returned home after several days of intensive care. This time, however, the Steens are roused by a telephone call not long after they have gone to bed for the night.

“It was the hospital,” Robert says. “They said, ‘We think you should get in your car and hurry here now.’ So we threw ourselves in the car. We had never driven so fast through Oslo, but it was at night, so the streets were quiet.” Robert and Trude flew into the hospital at 12.14am, desperate to see their child. “We were exactly 14 minutes late. He had died at midnight.” The feeling, he says, was “complete emptiness. What had filled our lives, for better or worse, physically, mentally, practically, was now over. It had come to an end.”

Their grief is deep­ened by the knowl­edge that their son had lived a small, dis­creet life of lit­tle real con­se­quence. He had made no mark on the world or on the lives of any­one out­side his im­me­di­ate fam­ily. Mats had never known ro­man­tic love or last­ing friend­ship, or the feel­ing of hav­ing made a mean­ing­ful con­tri­bu­tion to so­ci­ety. They log in to his blog so they can post a mes­sage let­ting his fol­low­ers know that he has died. And then they sit to­gether on the sofa, un­able to sleep, un­able to do any­thing.

Then some­thing rouses them. It is an email from a stranger, ex­press­ing their sor­row at Mats’ death. It is quickly fol­lowed by an­other email from an­other stranger, eu­lo­gis­ing their son. The mes­sages con­tinue, a trickle be­com­ing a flood as peo­ple con­vey their con­do­lences and write para­graph af­ter para­graph about Mats. He had a warm heart, peo­ple write. He was funny and imag­i­na­tive, a good lis­tener and gen­er­ous. You should be proud of him, every­one stresses. A pri­mary school teacher from Denmark writes that af­ter hear­ing of Mats’ death, she broke down in class and had to re­turn home. A 65-year-old psy­chol­o­gist from England says some­thing sim­i­lar. Mats was a real friend to me,” writes an­other stranger. He was an in­cur­able ro­man­tic and had con­sid­er­able suc­cess with women.” Someone else writes to them de­scrib­ing Mats’ em­pa­thy. I don’t think,” they say, he was aware of how big an im­pact he had on a lot of peo­ple.”

Robert and Trude cannot make sense of this. “Who are these people? Are they crazy or what?” Robert asks, frowning. Slowly, though, with each new email, the truth begins to reveal itself.

Their son had lived by an­other name. To his fam­ily he had sim­ply been Mats. But within World of Warcraft he had ex­isted for years as a charis­matic ad­ven­turer named Ibelin”, a strap­ping swash­buck­ler with auburn hair tied back in a pony­tail and a butch goa­tee beard. And it was as this dig­i­tal al­ter ego that Mats had thrived in a way his fam­ily had never ap­pre­ci­ated. They had mis­un­der­stood what World of Warcraft re­ally was. It had seemed to them like a fre­netic ac­tion game of mon­ster-bash­ing and point-scor­ing. To Mats and the many peo­ple he played with — the peo­ple now email­ing Robert and Trude — it was some­thing far more pro­found: an im­mer­sive world built on so­cial in­ter­ac­tions, friend­ships and shared sto­ry­telling. Robert smiles. This win­dow started to open up to us that let us see he had an­other life be­sides his phys­i­cal life. And that it had been so rich, so big and so full of con­tent­ment.”

The story of Mats’ dou­ble life, and of the emo­tional im­pact its dis­cov­ery has had on his par­ents, is told in a new doc­u­men­tary called The Remarkable Life of Ibelin. It is a mov­ing and of­ten deeply philo­soph­i­cal work that tack­les ques­tions around the na­ture of re­al­ity and re­la­tion­ships in an in­creas­ingly on­line world. In some ways it is also in­cred­i­bly am­bi­tious. How do you show the in­ter­nal world of a ter­mi­nally ill young man who has now been dead for ten years? How do you recre­ate the words and deeds that made such an im­pact on oth­ers but which took place within an old on­line role-play­ing game? These were per­haps the two great­est chal­lenges fac­ing the Norwegian di­rec­tor Benjamin Ree.

I’ve spent the past four years try­ing to find out what kind of per­son Mats was,” says Ree, who is bearded, cheer­ful and, as he later dis­cov­ered, was born within a few days of his sub­ject. I al­most feel like I’ve done a doc­tor­ate de­gree in him, in try­ing to un­der­stand him bet­ter.”

In mak­ing Ibelin, how­ever, Ree dis­cov­ered he had a num­ber of unique re­sources at his dis­posal. Mats had been a mem­ber of a World of Warcraft guild”, a sort of for­mal club or fra­ter­nity that play­ers can join, of­ten by in­vi­ta­tion only. Mats’ guild was called Starlight and had its own on­line fo­rum on which he was a pro­lific poster, in­ter­act­ing with other guild mem­bers and swap­ping thou­sands of mes­sages. The guild also kept a dig­i­tal log of their mem­bers’ every ac­tion within the game it­self: the tran­scripted text of every typed con­ver­sa­tion their char­ac­ter had, every ac­tion and emote they com­manded their avatar to carry out — laugh­ing, curt­sey­ing, cry­ing, danc­ing, eat­ing, drink­ing, hug­ging — and the time­stamped co­or­di­nates of every lo­ca­tion they vis­ited.

Mats spent al­most 20,000 hours in that world. He ba­si­cally grew up in World of Warcraft,” Ree says. And what I saw in all the logs and fo­rums and tran­scrip­tions was that com­ing of age in­side a game had a lot of sim­i­lar­i­ties to com­ing of age in the real world.”

What makes Ree’s film so af­fect­ing is the way in which view­ers are able to feel as though we are be­side Mats as he goes through this com­ing-of-age process. Using an­i­ma­tion in the style of World of Warcraft graph­ics, we fol­low Ibelin as he runs through a fan­tasy land­scape of moun­tain peaks and deep forests. He sits be­side a pond and is ap­proached by a beau­ti­ful dark-haired young woman named Rumour, who be­gins to tease him play­fully be­fore snatch­ing his hat and run­ning off into the for­est. Mats is 17 at this point and, like all teenage boys, he can­not quite work out that the woman is flirt­ing with him. But he even­tu­ally twigs that he is sup­posed to give chase and the pair strike up a con­ver­sa­tion. The young woman is, in fact, con­trolled by a teenage girl from the Netherlands. She is named Lisette and, like her avatar, she has long dark hair. Over the fol­low­ing weeks and months Ibelin and Rumour, con­trolled by Mats and Lisette, fall into the kind of in­tense but ro­man­ti­cally am­bigu­ous re­la­tion­ship that will be ag­o­nis­ingly fa­mil­iar to any­one who has ever been a teenager. At one point, Rumour gives Ibelin a peck on the cheek. It was just a vir­tual kiss,” Mats re­mem­bers years later on his blog. But boy, I could al­most feel it.”

Rumour and Ibelin in World of Warcraft (World of Warcraft and Blizzard Entertainment © 2024/Courtesy of Netflix)

Lisette and Mats swap ad­dresses and send each other to­kens: mix CDs and, in Lisette’s case, sketches of Rumour and Ibelin em­brac­ing. But Mats will not video-chat with her or at­tend any of the real-life Starlight meet-ups that take place. He does not want her — or any­one else — to know of his con­di­tion. In World of Warcraft every­one is phys­i­cally per­fect, so it is a play­er’s abil­ity to pro­ject their per­son­al­ity and charisma that makes them at­trac­tive and pop­u­lar. And Ibelin is at­trac­tive and pop­u­lar. Why risk that by re­veal­ing his true na­ture? In this other world, a girl would­n’t see a wheel­chair or any­thing dif­fer­ent,” Mats will later re­flect. They would see my soul, heart and mind, con­ve­niently placed in some strong body.”

But though Mats’ true iden­tity re­mains hid­den, what’s strik­ing is how much he is able to af­fect the lives of the peo­ple he games with. Lisette’s par­ents con­fis­cate her com­puter when her school grades dip, and don’t be­lieve her when she tells them she will be cut off from so many of her friends with­out it. Isolated and lonely, she falls into a se­vere de­pres­sion. I could­n’t think of rea­sons to get out of bed,” she says. Mats in­ter­venes. He writes a heart­felt but mea­sured let­ter to her par­ents, in­tro­duc­ing him­self as an on­line friend of their daugh­ter’s, ex­press­ing his ad­mi­ra­tion for Lisette, ex­plain­ing how im­por­tant World of Warcraft is to her and urg­ing them all to work to­gether to find a so­lu­tion that will al­low her some ac­cess to her com­puter.

Lisette’s par­ents are taken aback. But they re­assess and Rumour re­turns, and she and Ibelin are re­united in the game once more. He helps Lisette un­pack and un­der­stand her trou­bles. Ibelin was a re­ally big sup­port pil­lar. He was a friend I could be open with about all the things that were go­ing on,” she says. It’s one of the things that got me out of the de­pres­sion I was in.”

She is not the only person he helps. A man named Kristian plays the game as a blue-haired gnome and is overcome with feelings of worthlessness. Over time he admits all this to Ibelin. “I told him everything,” Kristian says. “I told him how terrible I had felt. Perhaps it doesn’t seem like much, but it meant the world to me.”

Having spo­ken to dozens of the peo­ple he knew on­line, Ree says it’s clear that Mats was a very good lis­tener. Which might seem a strange thing to say when they did­n’t ac­tu­ally talk, be­cause every­thing was writ­ten down,” he says. But he would re­mem­ber every­thing that his friends told him. He would ask them ques­tions about it months later. Even to­wards the end of his life, when he was as sick as it’s pos­si­ble to get, you could see how he would pri­ori­tise his friends and find the en­ergy to be there for them. It’s quite ex­tra­or­di­nary.”

Perhaps most mov­ing of all is when Mats learns that one of his Starlight friends is, in real life, mother to an autis­tic teenage son named Mikkel. He is un­able to leave their apart­ment, she tells him. He is un­able to show her any phys­i­cal af­fec­tion. She feels like a ter­ri­ble mother. Mats lis­tens and then sug­gests that she in­vite her son into the game it­self. The son agrees to this and the three of them be­gin to spend time to­gether within the game, with Mats gen­tly en­cour­ag­ing Mikkel to take so­cial risks, to in­tro­duce him­self to other avatars or at least re­spond when spo­ken to. Under Mats’ guid­ance, Mikkel be­gins to find a mea­sure of con­fi­dence. One day, Mikkel gives his moth­er’s char­ac­ter a vir­tual hug, a wa­ter­shed mo­ment for them both. It was the first time in my life that I could feel love, and started to un­der­stand love,” Mikkel says in Ibelin. The heav­ens opened up,” his mother says. This was what I had been wait­ing for.” Today, Mikkel is able to hug his mother in real life. He is able to leave the house. I went from the most neg­a­tive per­son in the world to a per­son who could tol­er­ate peo­ple,” he says, laugh­ing.

The Remarkable Life of Ibelin is not a ha­giog­ra­phy, how­ever. In sift­ing through Mats’ life, Ree also un­cov­ers con­flict. If you’ve ever been part of an on­line com­mu­nity, you will know that ri­val­ries and dis­agree­ments fes­ter eas­ily. Empathetic as he was, Mats could also be scathing, sar­cas­tic and tem­pera­men­tal. He finds him­self es­tranged from Lisette, who dis­cov­ers that Ibelin has been ro­manc­ing other women within the game. He falls out with the mother of Mikkel, who be­gins to sus­pect that he may be suf­fer­ing from a chronic ill­ness in real life. We see that Mats gets in a lot of trou­ble be­cause he keeps it a se­cret,” Ree says. The in­ner demons build up and he gets into a lot of prob­lems be­cause he is­n’t hon­est about his ill­ness.” It is only to­wards the end of his life, when he fi­nally opens up about his con­di­tion to every­one in World of Warcraft and starts writ­ing his blog, that he is able to mend all his bridges.

But the fact that he broke them in the first place makes him all the more re­lat­able, Ree says. Who has­n’t made mis­takes as a young guy? Pissed peo­ple off, been a prick, lost friends and then had to get them back again?” He grins. There were a lot of sim­i­lar­i­ties with me grow­ing up, ac­tu­ally.”

In the days immediately following Mats’ death, Robert and Trude found the volume of information they received from strangers about their son both wondrous and surreal. They had simply not known. “I knew he was caring and kind,” Trude says. “But I never saw the extent to which he helped and supported so many people.”

Robert nods. “It’s when I realised that the emotions and relationships we create online can be stronger than we realise.”

A del­e­ga­tion from the Starlight guild at­tend his fu­neral, in­clud­ing Lisette. During the ser­vice, his cof­fin is draped with their ban­ner, a sil­ver star set against mid­night blue. Robert de­liv­ers a eu­logy for his son in which he speaks of the sor­row he and Trude had felt, be­liev­ing that his short life had been one void of mean­ing, friend­ship, love and be­long­ing. But, he con­tin­ues, over the past few days they have come to un­der­stand that this was not the case, and that he had ex­pe­ri­enced all these things. You proved us wrong. You proved us so wrong,” he says from the lectern, be­fore ex­plain­ing how they had come to learn of Ibelin’s ex­ploits. Mats was, at times, ac­cused of be­ing a wom­an­iser,” he tells the fu­neral con­gre­ga­tion, tight-throated and smil­ing. And I must ad­mit, be­ing a fa­ther, I’m a bit proud of that.”

Ten years on, Robert and Trude are still dis­cov­er­ing things about Mats. This, along with work­ing on the doc­u­men­tary, has helped pre­vent him from fad­ing in their hearts and minds. We’ve still not got to the bot­tom of every­thing he did,” Robert says. In a way, he still lives. He’s 35 and we’re still learn­ing about his life. He’s very pre­sent. He’s very close to us.”

Trude and Robert Steen. “I knew he was kind,” says Trude. “But I never saw how much he helped people” (Tim Jobling for The Times Magazine)

Every year, the members of the Starlight guild meet at a certain spot within World of Warcraft to commemorate Ibelin. They stand shoulder to shoulder, flawless elves, gallant knights and mighty sorcerers. But really they are just normal people with normal problems. Many of them had their lives improved, in some way or another, by a kind and funny Norwegian boy who died too soon but still left a mark on the world. His father draws a breath and smiles helplessly. “The tears that I have in my eyes now are not sad tears,” he says. “They are happy tears.”

The Remarkable Life of Ibelin is on Netflix from October 25

...

Read the original on www.thetimes.com »

5 200 shares, 21 trendiness

Solution vector for Sudoku problem ⍵.

...

Read the original on dfns.dyalog.com »

6 193 shares, 8 trendiness

I stayed.

My in­sight into cor­po­rate le­gal dis­putes is as mean­ing­ful as my opin­ion on Quantum Mechanics. What I do know is that, when given the chance this week to leave my job with half a year’s salary paid in ad­vance, I chose to stay at Automattic.

Listen, I’m struggling with medical debts and financial obligations incurred by the closing of my conference and publishing businesses. Six months’ salary in advance would have wiped the slate clean. From a fiduciary point of view, if nothing else, I had to at least consider my CEO’s offer to walk out the door with a big bag of dollars.

But even as I made my­self think about what six months’ salary in a lump sum could do to help my fam­ily and calm my cred­i­tors, I knew in my soul there was no way I’d leave this com­pany. Not by my own choice, any­way.

I re­spect the courage and con­vic­tion of my de­parted col­leagues. I al­ready miss them, and most only quit yes­ter­day. I feel their de­par­ture as a per­sonal loss, and my grief is real. The sad­ness is like a cold fog on a dark, wet night.

The next weeks will be chal­leng­ing. My re­main­ing cowork­ers and I will work twice as hard to cover tem­po­rary em­ployee short­falls and re­cruit new team­mates, while also nav­i­gat­ing the com­plex per­sonal feel­ings these two weeks of sud­den, sur­pris­ing change have brought on. Who needs the ag­gra­va­tion, right? But I stayed.

I stayed be­cause I be­lieve in the work we do. I be­lieve in the open web and own­ing your own con­tent. I’ve de­voted nearly three decades of work to this cause, and when I chose to move in-house, I knew there was only one house that would suit me. In nearly six years at Automattic, I’ve been able to do work that mat­tered to me and helped oth­ers, and I know that the best is yet to come.

I also know that the Maker-Taker prob­lem is an is­sue in open source, just as I know that a friend you buy lunch for every day, and who earns as much money as you do, is sup­posed to re­turn the fa­vor now and then. If a friend takes ad­van­tage, you’re sup­posed to say or do some­thing about it. Addressing these im­bal­ances is rarely pretty. Doing it in pub­lic takes its own kind of courage. Now it’s for the lawyers to sort out.

On May 1, 1992, a man who’d been horribly beaten by the L.A. police called for calm in five heartfelt, memorable words: “Can’t we all get along?” We couldn’t then, and we aren’t now, but my job at Automattic is about helping people, and that remains my focus at the conclusion of this strange and stressful week. I’m grateful that making the tough business decisions isn’t my responsibility. In that light, my decision to stay at Automattic was easy.

...

Read the original on zeldman.com »

7 152 shares, 9 trendiness

An Intuitive Explanation of Black–Scholes

I ex­plain the Black–Scholes for­mula us­ing only ba­sic prob­a­bil­ity the­ory and cal­cu­lus, with a fo­cus on the big pic­ture and in­tu­ition over tech­ni­cal de­tails.

The Black–Scholes for­mula is the crown jewel of quan­ti­ta­tive fi­nance. The for­mula gives the fair price of a European-style op­tion, and its suc­cess can ul­ti­mately be mea­sured by its im­pact on op­tion mar­kets. Before the for­mu­la’s pub­li­ca­tion in 1973 (Black & Scholes, 1973; Merton, 1973), op­tion mar­kets were rel­a­tively small and illiq­uid, and op­tions were not traded in stan­dard­ized con­tracts. But af­ter the for­mu­la’s pub­li­ca­tion, op­tion mar­kets grew rapidly. The first ex­change to list stan­dard­ized stock op­tions, the Chicago Board Options Exchange, was founded the same year that Black–Scholes was pub­lished. And to­day, op­tions are a highly liq­uid, ma­ture, and global as­set class, with many dif­fer­ent tenors, ex­er­cise rights, and un­der­ly­ing as­sets.

The fi­nan­cial and math­e­mat­i­cal the­ory un­der­pin­ning Black–Scholes is rich, and one could eas­ily spend months learn­ing the foun­da­tional ideas: con­tin­u­ous-time mar­tin­gales, Brownian mo­tion, sto­chas­tic in­te­gra­tion, val­u­a­tion through repli­ca­tion, and risk-neu­tral­ity to name just a few key con­cepts. But prop­erly con­tex­tu­al­ized, the for­mula can be sur­pris­ingly in­evitable. It can al­most feel like a law of na­ture rather than a fi­nan­cial model. My goal here is to jus­tify this claim.

To begin, let’s set up the problem and then state the formula. Recall that a call option ($C$) is a contract that gives the holder the right but not the obligation to buy the underlying asset ($S$) at an agreed-upon strike price ($K$). A put option ($P$) is the right but not the obligation to sell the underlying short, but since calls and puts are fungible through put–call parity, we will only concern ourselves with call options in this post. If we can price one, we can price the other. We say the holder exercises the option if they choose to buy or sell the underlying. A European-style option can only be exercised at a fixed time in the future, called expiry ($T$).
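For reference, the put–call parity relation alluded to above can be written, for European options on a non-dividend-paying stock (a standard identity, stated here for completeness rather than taken from the post), as

$$C - P = S_t - K e^{-r(T - t)},$$

where $r$ is the risk-free interest rate that appears in the formula below.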

Clearly, the payoff at expiry of a European-style option is just a piecewise linear ramp function,

$$\max(S_T - K, 0),$$

where $S_T$ denotes the value of the stock at expiry (black line in the figure). The single most important characteristic of an option is this asymmetric payoff. For a call option, our downside is limited, but our upside is unlimited. In finance, this kind of asymmetric behavior is called “convexity”.

Given this, we might guess that the price of the call before expiry, so $C_t$ where $t < T$, should look like a smooth approximation of the ramp function (colored lines in the figure), at least to a first approximation. Why? Imagine we hold a call before expiry, but the underlying $S_t$ is currently less than the strike $K$. Our position is not worthless precisely because it’s still possible that the stock price will rise before expiry. So we should still be able to sell the contract for more than zero. The closer the stock $S_t$ is to the strike $K$, the more we should be able to re-sell our option for. Put differently, and this is the key point here, the price of the option will change nonlinearly with the price of the stock, and as time passes, the smooth approximation should look more and more like the payoff function, because the option is losing its optionality.

This tension between time decay and convexity is the central dynamic of an option, and the Black–Scholes formula encapsulates this tension beautifully. According to the Black–Scholes model, the fair price of a European-style call option is the following:

$$
C(S_t, t) = S_t N(d_1) - K e^{-r(T - t)} N(d_2),
\qquad
d_1 = \frac{\ln(S_t / K) + (r + \sigma^2/2)(T - t)}{\sigma \sqrt{T - t}},
\qquad
d_2 = d_1 - \sigma \sqrt{T - t}.
$$

Here, $N(\cdot)$ is the cumulative distribution function (CDF) of the standard normal distribution, $\sigma$ is the standard deviation or volatility of the underlying asset, and $r$ is the risk-free interest rate. Black–Scholes assumes that the volatility and risk-free rate are both constant.

At a high level, this formula is just the weighted difference between the stock and strike prices. But the terms $d_1$ and $d_2$ are not obviously interpretable. So can we make headway here? Can we say something more precise without quickly getting bogged down in mathematical details? Let’s try.

Understanding how to derive the Black–Scholes formula from first principles requires a complex set of mathematical and financial ideas. Perhaps the trickiest part for most people is the use of stochastic calculus (Itô, 1944; Itô, 1951; Bru & Yor, 2002). Stochastic calculus is required because we assume the underlying stock price follows this stochastic differential equation (SDE), which is a geometric Brownian motion:

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t.$$

Here, $\mu$ is the drift of the stock, and $\sigma$ is the volatility of the stock. This is the same $\sigma$ as in the Black–Scholes formula above. Finally, $dW_t$ is an infinitesimal change in a Brownian motion.

The important but subtle point here is that this SDE is mathematically imprecise to most people, even those with strong technical backgrounds. In standard calculus, we cannot take the derivative of a random function. Informally, it breaks the required assumption that the function is smooth enough and can therefore be locally approximated by a tangent line. Thus, the notation $dW_t$ has no meaning in standard calculus. So it appears that to even understand the generative model above, one must understand stochastic calculus.

As an aside, it’s worth mentioning that this technical difficulty is made even more subtle in the original paper (Black & Scholes, 1973) because it’s only a technical detail. The main content of the paper is the derivation of the famous Black–Scholes partial differential equation (PDE), and this is done by reasoning about the dynamics of an option. But since an option’s price is a function of its underlying stock’s price, describing its dynamics via a PDE requires partial derivatives such as $\partial C / \partial S$. And of course, computing this term requires stochastic calculus since $S_t$ is a random function. So without stochastic calculus, any derivation of the Black–Scholes PDE is high-level at best. My approach in this post is to circumvent the PDE entirely and to just make sense of the pricing formula directly. The PDE is beautiful—and perhaps I’ll write a separate post about it in the future—but we can still make progress by just thinking about the generative model of the stock price in simple, probabilistic terms.

So let’s side-step the challenge of stochastic calculus implicit in the SDE. The key point is this: a differential equation is an equation that expresses the relationship between a function and its derivatives. And a solution here is the closed-form expression of $S_t$ such that we could plug $S_t$ and its derivative(s) into the SDE and it would hold. A standard result is that the solution to this SDE is the following definition of $S_t$:

$$S_t = S_0 \exp\!\left( \left(\mu - \tfrac{\sigma^2}{2}\right) t + \sigma W_t \right).$$

If you want more detail, read about solving the SDE of geometric Brownian motion. But the main idea for us is that the Black–Scholes modeling assumption can be expressed in a form that does not require stochastic calculus. The SDE can be reframed as the assumption that stock prices are lognormally distributed or, put differently, that log returns are normally distributed:

$$\ln\!\left(\frac{S_t}{S_0}\right) \sim \mathcal{N}\!\left( \left(\mu - \tfrac{\sigma^2}{2}\right) t, \; \sigma^2 t \right).$$

So as a pedagogical trick, let’s just assume these two distributional statements rather than the SDE. This is our new starting point.

For some intuition for the leap between the two, recall that the derivative of $\ln S$ is $dS / S$, and this looks a bit like the left-hand side of the SDE after dividing through by $S_t$. So think of the left-hand side as a log return. And think of the right-hand side as a normally distributed random variable, since it is an incremental (additive) change in Brownian motion plus a constant drift. The extra term, $-\sigma^2/2$, can only be properly understood with stochastic calculus—it’s from the quadratic variation of Brownian motion—but it provides no real intuition for us here, and we will take it as a given.

Let’s look at some examples. In the figure, I have plotted the stock price over varying drifts $\mu$ and volatilities $\sigma$. This looks like random noise with drift because that’s precisely what it is. The normal distribution is as random as it gets, in the sense that the Central Limit Theorem states that the appropriately scaled sum of independent random variables converges to the normal distribution. Things that aren’t initially normal can still become normal over time.

Now if $\ln(S_t / S_0)$ is normally distributed, then $S_t$ is lognormally distributed. So as time passes, the stock price is always lognormally distributed with a variance that increases linearly with time and thus a volatility that increases with the square root of time. To visualize this assumption, I’ve plotted the appropriate lognormal distribution at various time slices along with many empirical samples. In my mind, this figure captures the geometric essence of the two equations above. Black–Scholes assumes that the stock price is unpredictable modulo the drift, and that “infinitesimal change in Brownian motion”, mathematically vague for the uninitiated, amounts to an uncertainty about the stock price that grows with the square root of time.
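To make this generative model concrete, here is a minimal simulation sketch (my own illustration, not code from the post; the parameter values are assumptions for the example) of geometric Brownian motion paths:

```python
import random
from math import exp, sqrt

def gbm_path(s0, mu, sigma, T=1.0, steps=252, seed=None):
    """Simulate one geometric Brownian motion path S_0, S_1, ..., S_steps."""
    rng = random.Random(seed)
    dt = T / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal increment, scaled by sqrt(dt) below
        path.append(path[-1] * exp((mu - 0.5 * sigma**2) * dt + sigma * sqrt(dt) * z))
    return path

# Same drift, increasing volatility: terminal prices fan out more as sigma grows.
for sigma in (0.1, 0.2, 0.4):
    print(sigma, round(gbm_path(100.0, mu=0.05, sigma=sigma, seed=1)[-1], 2))
```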

Of course, we could make the same figure but using log returns. In that case, the lines indicating the lognormal distributions would change to indicate normal distributions. In either case, the generative model of Black–Scholes is that prices and therefore returns are unpredictable.

Now pause. We’re going to make a subtle but critical tweak to our assumptions so far. We’re going to replace the stock-specific drift $\mu$ with the risk-free interest rate $r$. Concretely, rather than assuming the log-return distribution above, we’re going to assume a stock’s log returns follow this normal distribution:

$$\ln\!\left(\frac{S_t}{S_0}\right) \sim \mathcal{N}\!\left( \left(r - \tfrac{\sigma^2}{2}\right) t, \; \sigma^2 t \right).$$

We haven’t discussed $r$ in detail yet, but this is just an idealized number representing the time-value of money. To a first order, think of $r$ as the interest rate from a very secure or reliable asset, such as a short-term US government bond. It is the risk-free rate and thus the lower bound on what you can earn without risk.

Of course, dif­fer­ent stocks might be mod­eled with dif­fer­ent drifts. The drift of a blue-chip com­pany might not be the same as the drift of a penny stock. So in essence, this as­sump­tion is a claim that Black–Scholes is mak­ing: the drift of the stock does­n’t mat­ter when pric­ing an op­tion! This is a sur­pris­ing and deep claim. Let’s un­der­stand it.

In the original paper, Black and Scholes assume that the market has no arbitrage. Here, an “arbitrage” is an opportunity to make a risk-free profit starting at zero wealth. In other words, Black–Scholes assumes the market is perfectly efficient, and the formula represents the fair price of the option in this idealized world.

Now when I first learned this stuff, I found this assumption confusing, because I thought it was analogous to assuming “no friction” in a high-school physics problem. In physics, we might simplify the world by assuming that a box slides on a plane without friction. This makes calculations easier for students, but the consequence is that predictions are systematically wrong. At some point, we have to add friction back into our model to make it realistic.

But friction is a poor analogy here, and I propose a different one: assuming no arbitrage is analogous to assuming no wind or air resistance when modeling projectile motion. This assumption also simplifies calculations, but really it helps us understand the essence of the phenomenon: the parabolic arc of projectile motion. Later, we can add air resistance or wind depending on our particular circumstances, but the underlying parabolic arc is a kind of platonic ideal. It is the signal without the noise.

This is the sense in which Black–Scholes as­sumes no ar­bi­trage. The as­sump­tion is not an im­plau­si­ble claim that real fi­nan­cial mar­kets are per­fectly ef­fi­cient. Rather, as­sum­ing no ar­bi­trage is as­sum­ing a per­fectly co­her­ent, noise-free sys­tem where prices are con­sis­tent and make sense rel­a­tive to each other. So this is­n’t about sim­pli­fy­ing cal­cu­la­tions. It’s about find­ing the pla­tonic price. And thus it’s not an as­sump­tion that we re­move in the fu­ture to get a more re­al­is­tic price. Quite the op­po­site! Adding ar­bi­trage would make the prob­lem im­pos­si­ble to solve be­cause prices would then be in­con­sis­tent. There would be no uni­ver­sally agreed-upon mar­ket price.

Now that we understand this, let’s retrace the main line of argument of the original paper (Black & Scholes, 1973), but let’s do so in a simpler context. Imagine we sold a hypothetical derivative contract: a redeemable certificate on a stock, which can be exercised only at time $T$. An investor pays us the fair price of the redeemable at inception, and in return we give them a certificate that is redeemable for the value of the stock at time $T$. Our goal is to solve for the fair price of the redeemable at any given moment in time.

As deal­ers, the risk to us is that the price of the stock goes up. So what could we do? Well, the mo­ment we sell a re­deemable, we could buy the un­der­ly­ing stock. Now we are per­fectly hedged. We don’t care if the stock goes up or down in value, be­cause we own the stock. Whenever a cus­tomer comes to re­deem the value of the stock, we sim­ply sell the ap­pro­pri­ate share, and we’re done. We nei­ther make nor lose money on av­er­age, and thus the price is fair.

The key insight of Black–Scholes is to realize that our perfectly hedged portfolio, short a redeemable and long a stock, is risk-less. And thus, it must grow or drift at the risk-free rate! Formally, we can say that the value of our portfolio at expiry is

This is remarkable because $S_T$ is random! But the left-hand side is not an expectation, because we are always hedged! Do you see the trick? This is the big idea of the original paper. We’ve neutralized the randomness in $S_T$ by assuming we can perfectly hedge it out!

Now a ter­mi­nal con­di­tion is that our re­deemable is worth the value of the stock at ex­piry, so . This is fair in the sense that nei­ther we nor the in­vestor makes more money than the other. Under this con­di­tion, we can write Equation as

But if our port­fo­lio is per­fectly hedged and risk­less, then clearly must be a risk­less de­riv­a­tive. And so it must drift at the risk-free rate, giv­ing us

And of course, we can replace big $T$ with little $t$, in general. Doesn’t this make sense? The fair price of the redeemable is simply the price of the stock at contract inception, adjusted for the time-value of money! Again, despite $S_t$ being a random process, we have no expectations. We have neutralized randomness through perfect hedging in a world without arbitrage.

Now let’s extend this line of reasoning to options. The challenge here is that an option has convexity. Its price changes nonlinearly with changes in the underlying. So unlike with the redeemable, we would not want to buy exactly one share of stock for one option contract. Instead, at each moment, we want precisely the amount of stock such that, if the stock price moved an infinitesimally small amount, our hedge would move an infinitesimally small amount that perfectly matched the price of the option. This sounds like a derivative from calculus because it is! The amount we want to hedge at an instant in time is simply the derivative of the call price with respect to the stock price, called the delta:

$$\Delta = \frac{\partial C}{\partial S}.$$

So if the stock changes by one dollar, our option price changes by approximately $\Delta$ dollars.
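As a quick sanity check on the formula and the delta, here is a small sketch (my own, with assumed example parameters, not code from the post) that computes the Black–Scholes call price and approximates the delta by a finite difference:

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF

def call_price(S, K, T, r, sigma):
    """Black-Scholes price of a European call with constant r and sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

def call_delta(S, K, T, r, sigma, eps=1e-4):
    """Finite-difference approximation of dC/dS; analytically this equals N(d1)."""
    return (call_price(S + eps, K, T, r, sigma) - call_price(S - eps, K, T, r, sigma)) / (2 * eps)

print(round(call_price(100, 100, 1.0, 0.05, 0.2), 2))  # about 10.45
print(round(call_delta(100, 100, 1.0, 0.05, 0.2), 3))  # about 0.637, i.e. N(d1)
```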

Now imagine a world without arbitrage or transaction fees and with continuous trading. In this world, we (an options dealer) can always be perfectly delta hedged. Our portfolio is always just

In the original paper, the authors repeat the argument made above for redeemables but in the context of options. The calculations become more complicated because working with $C$ requires stochastic calculus, since $C$ is a random variable. But the argument is essentially the same. We assume a world without arbitrage and then just model the dynamics of a perfectly hedged and thus risk-less portfolio. We use these dynamics (Black–Scholes PDE) and a terminal condition to solve for the fair price $C$.

That’s the orig­i­nal ar­gu­ment. It’s an ar­gu­ment about no ar­bi­trage and per­fect hedg­ing. But Black–Scholes feels like a law of na­ture be­cause it’s the so­lu­tion to a noise-free sys­tem and be­cause it can be de­rived in many ways. Another way to de­rive Black–Scholes is by mod­el­ing a port­fo­lio which per­fectly repli­cates the price of a call at each mo­ment. This idea of val­u­a­tion through repli­ca­tion is far-reach­ing in fi­nance. To quote Emanuel Derman (Derman, 2002):

If you want to know the value of a se­cu­rity, use the price of an­other se­cu­rity that’s as sim­i­lar to it as pos­si­ble. All the rest is mod­el­ing.

The discrete-time version of this argument is the binomial options-pricing model. Yet another way is to connect the idea of no arbitrage to the idea of risk-neutrality. This relationship is called the fundamental theorem of asset pricing, and it’s the relationship we want to explore here.

Let’s understand this connection between the original argument of no arbitrage and the modern argument of risk-neutral pricing. Consider this: in a Black–Scholes world, do we care about the drift of the underlying stock? As with a redeemable, we are always perfectly and continuously delta hedged. We have no price exposure. In this world, all options dealers would perfectly hedge. And all options investors would become risk-neutral, because it would not pay to take risk. So one trick that makes our logic and calculations easier is to just assume the drift of the stock is $r$! This is equivalent to assuming the world has no arbitrage. And what’s the fair price of an option in this world? It’s a time-discounted expected value:

$$C_0 = e^{-rT}\, \mathbb{E}_{\mathbb{Q}}\!\left[ \max(S_T - K, 0) \right].$$

Here, the subscript $\mathbb{Q}$ is called the risk-neutral measure, and this notation is used to make it explicit that the expectation is computed in this imaginary world, not in the real risky world. In this imaginary world, the (discounted) stock price is a martingale, which is a stochastic process with an expected value that is always equal to the current value. Intuitively and importantly, martingales have no drift. So in the pricing equation above, the only drift is from the discount factor $e^{-rT}$. The stock itself is a drift-less martingale.

So in one telling, we as­sume the world has no risk, and thus we can work with­out ex­pec­ta­tions. Random processes can be forced to be­come non-ran­dom. In an­other telling, we as­sume the world is risk-neu­tral, and ran­dom processes be­come mar­tin­gales. In both tellings, the only mean­ing­ful drift is the risk-free rate.
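One way to see the risk-neutral story in action is a Monte Carlo check (again my own sketch with assumed parameters, not from the post): simulate $S_T$ with drift $r$, average the discounted payoff, and compare with the closed-form price above.

```python
import random
from math import exp, sqrt

def mc_call_price(s0, K, T, r, sigma, n=200_000, seed=0):
    """Estimate e^{-rT} * E_Q[max(S_T - K, 0)] by simulating S_T under the risk-neutral drift r."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * z)
        total += max(s_T - K, 0.0)
    return exp(-r * T) * total / n

print(round(mc_call_price(100, 100, 1.0, 0.05, 0.2), 2))  # close to the ~10.45 closed-form value
```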

There is a lot of theory one could get into here. For example, the Girsanov theorem is a result from probability theory on how a stochastic process changes under a change in measure (here, from the true “physical” measure to the risk-neutral measure). And you might read things like, “Under the risk-neutral measure, the stock price after discounting by the risk-free rate follows a martingale”. You can easily get lost in the technical details. But in my mind, at a high level, the concept is fairly simple, if a bit non-obvious. In a world in which all investors and traders can perfectly hedge their risk, there is no risk premium; everyone is forced to become risk neutral. And thus, the price of everything is its risk-neutral expected value, or its value if there were no premium to risk.

And again, this simplifying assumption is not like assuming “no friction”. We do not need to re-add arbitrage or drift in order to compute a more realistic option price later. Instead, this assumption strips out all the noise, and the resulting price represents a platonic price in a coherent and consistent market.

Armed with this understanding, let’s revisit the generative model for a stock. In a risk-neutral world, the drift of every stock is the same: it’s just the risk-free rate $r$. This is the line of reasoning which converts the real-world log-return assumption into the risk-neutral one above.

We’re now ready to try to directly make sense of the Black–Scholes formula. At this point, we’ll need a bit of tedious algebra, but nothing in this section requires more than basic probability. And the conceptual work is mostly done. As promised, I’ve tried to side-step as much stochastic calculus as possible in order to get to this point.

Let’s also repeat our modeling assumption, but in terms of $S_T$ and $T$:

As a final preliminary, two observations for notational clarity. First, note that at time $t = 0$, the price $S_0$ is non-random. So we can push that into the mean if we would like, giving us:

Second, here I’ve used notation $d_1$ and $d_2$ to match what I have commonly seen in the literature. But I think it’s extremely useful to rewrite $d_2$ as

Now at a high level, my claim is that we can think of the call’s value as decomposable into two terms, one representing what we make (stock price) and one representing what we pay (strike price), both contingent on the call ending in-the-money ($S_T \gt K$). This idea is not original to me; it is from (Nielsen, 1992). We have:

\begin{aligned}
C_1 &= \text{contingent value of stock} &&=
\begin{cases}
S_T & \text{if $S_T \gt K$,} \\
0 & \text{else,}
\end{cases} \\
C_2 &= \text{contingent value of strike} &&=
\begin{cases}
-K & \text{if $S_T \gt K$,} \\
0 & \text{else.}
\end{cases}
\end{aligned} \tag{18}

Furthermore, both of these terms will have a clear, sim­ple, prob­a­bilis­tic in­ter­pre­ta­tion that will di­rectly map onto Equation . Let’s see this.

First, what is the expected value of $C_2$? This is an expectation, and the value of the claim is zero when the call is out-of-the-money (when $S_T \le K$). By definition:

And it is easy to see that this expectation is equal to

Here, the quantity inside the normal CDF is the $z$-score of the log move from $S_t$ to $K$, i.e.
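A sketch of that standardized quantity, using the conventional $d_2$ label (the article's exact rewriting may differ):

$$\mathbb{Q}(S_T \gt K) = \Phi(d_2), \qquad d_2 = \frac{\ln(S_t / K) + \left(r - \tfrac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}},$$

where $\Phi$ denotes the CDF of the standard normal.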

So in words, we can see that this term is just the probability that the call ends up in-the-money, or the probability that $S_T \gt K$. All those extra variables embedded in it just represent normalizing the log move from $S_t$ to $K$, such that we can represent the equation using the CDF of the standard normal rather than the CDF of $S_T$ itself. We could, if we wanted to, represent all of this using the CDF of the un-standardized lognormal distribution. But it's cleaner and conventional to work in a standardized space.

To sum­ma­rize, we have shown:

This rep­re­sents the ex­pected value we must pay to ex­er­cise a call op­tion, con­tin­gent on the op­tion be­ing ex­er­cised. Of course, this is an ex­pected value, but the Black–Scholes price is the price in to­day’s terms. So we need a dis­count fac­tor, giv­ing us:
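A sketch of the resulting term, keeping the sign convention from Equation 18 (where $C_2 = -K$ when the option finishes in the money) and the notation above:

$$e^{-r(T - t)}\, \mathbb{E}_{\mathbb{Q}}[C_2] = -e^{-r(T - t)}\, K\, \Phi(d_2)$$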

Now for $C_1$, we again have an expectation where the contingent value is zero when the option ends out-of-the-money. This is a bit more complicated than the derivation for $C_2$, since $S_T$ is random while $K$ is fixed. By the law of total expectation, we have:

...

Read the original on gregorygundersen.com »

8 139 shares, 7 trendiness

What P vs NP is actually about

We re­cently made a Polylog video about the P vs NP prob­lem. As usual, our goal was to pre­sent an un­der­rated topic in a broadly un­der­stand­able way, while be­ing slightly im­pre­cise and leav­ing out the messy tech­ni­cal de­tails. This post is where I ex­plain those de­tails so that I can sleep well at night.

EDIT: At the bot­tom, I added replies to some more com­mon ques­tions peo­ple asked in the YouTube chat.

The main point of the video

The main point of the video was to present P vs NP from a somewhat unusual perspective. The most common way to frame the question is: "If you can efficiently verify a solution to some problem, does that mean you can efficiently solve it?" Our video explored a different framing: "If you can efficiently compute a function, is there an efficient way to compute its inverse?" If you formalize both of these questions, they're mathematically equivalent, and they're also equivalent to the question "Can we efficiently solve the Satisfiability problem?" (as proven in a later section).

I think that the fram­ing with in­vert­ing a func­tion is quite un­der­rated. It’s ex­tremely clean from a math­e­mat­i­cal per­spec­tive and high­lights the fun­da­men­tal na­ture of the ques­tion. We can also eas­ily view the more com­mon ver­i­fier-for­mu­la­tion of P vs NP as a spe­cial case of this one, once we re­al­ize that in­vert­ing a checker al­go­rithm and running it back­ward from YES solves the prob­lem that the checker ver­i­fies.

If we man­aged to con­vey some of these ideas to you, then the video suc­ceeded! However, a deep un­der­stand­ing re­quires grap­pling with all the nitty-gritty de­tails, which I’ll go through in this post. I’ll also touch on some ad­di­tional top­ics, like a fun con­nec­tion be­tween P vs NP and deep learn­ing.

The main hero of the video, sat­is­fi­a­bil­ity, comes in sev­eral fla­vors:

Satisfiability (SAT): In the most ba­sic ver­sion of the prob­lem, we are given a log­i­cal for­mula (without quan­ti­fiers) and must find an as­sign­ment to its vari­ables that makes it true; or de­ter­mine that no such as­sign­ment ex­ists. This is the clean­est for­mu­la­tion of the sat­is­fi­a­bil­ity prob­lem if you’re fa­mil­iar with log­i­cal for­mu­las. We did­n’t opt for this choice since it raises ques­tions like What kinds of log­i­cal con­nec­tives are al­lowed?” or How do you quickly eval­u­ate for­mu­las with many nested paren­the­ses?”.

Conjunctive-Normal-Form Satisfiability (CNF-SAT): This is the version of satisfiability we used. "Conjunctive" means that we require our formula to be a large conjunction (AND) of clauses, where each clause is a disjunction (OR) of literals (either a variable or its negation). This is the classic input format often required by SAT solvers.

3-SAT: This is CNF-SAT where we ad­di­tion­ally re­quire that each clause has at most three lit­er­als. If you look care­fully at our con­ver­sion of a cir­cuit to an in­stance of CNF-SAT, you’ll no­tice that if all the gates in the cir­cuit are one of AND, OR, NOT, and take at most two in­puts (which can al­ways be achieved), then the in­stance of CNF-SAT we cre­ate is, in fact, an in­stance of 3-SAT. So, our ap­proach proves that even 3-SAT is NP-complete.

Circuit-SAT: In this prob­lem, you are given a cir­cuit that out­puts a sin­gle bit, and the ques­tion is whether there is an in­put that makes the cir­cuit out­put True. In our video, we showed how to re­duce this prob­lem to CNF-SAT by en­cod­ing the gates of the cir­cuit as con­straints (and adding one more con­straint say­ing that its out­put is True). This trans­for­ma­tion is also called the Tseytin trans­for­ma­tion.
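To make the gate-by-gate encoding concrete, here is a toy Python sketch (my own example, not code from the video), using the DIMACS convention where a positive integer denotes a variable and a negative integer its negation:

```python
def tseytin_and(a, b, out):
    # Clauses encoding out == (a AND b).
    return [[-out, a], [-out, b], [-a, -b, out]]

def tseytin_or(a, b, out):
    # Clauses encoding out == (a OR b).
    return [[out, -a], [out, -b], [-out, a, b]]

def tseytin_not(a, out):
    # Clauses encoding out == (NOT a).
    return [[-out, -a], [out, a]]

# A tiny circuit: out = (x1 AND x2) OR (NOT x3), with wires numbered 1..6.
x1, x2, x3, g1, g2, out = 1, 2, 3, 4, 5, 6
clauses = (
    tseytin_and(x1, x2, g1)    # gate g1 = x1 AND x2
    + tseytin_not(x3, g2)      # gate g2 = NOT x3
    + tseytin_or(g1, g2, out)  # gate out = g1 OR g2
    + [[out]]                  # require the circuit to output True
)
print(clauses)
```

Note that every clause here has at most three literals, which is why the same construction also yields 3-SAT instances, as mentioned above.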

Any Algorithm Can Be Viewed as a Circuit

In our video, we did­n’t want to dive into how any al­go­rithm can be con­verted into a cir­cuit — I feel that it’s quite in­tu­itive once you see a bunch of ex­am­ples like the mul­ti­pli­ca­tion cir­cuit or if you have some idea of how a CPU looks in­side. But there is an im­por­tant sub­tlety: real-world cir­cuits con­tain loops.

More con­cretely, our im­plicit de­f­i­n­i­tion of a cir­cuit (corresponding to what the­o­reti­cians call a cir­cuit) is that the un­der­ly­ing graph of a cir­cuit has to be acyclic so that run­ning the cir­cuit re­sults in a sin­gle pass from the in­put to the out­put wires.

On the other hand, a de­f­i­n­i­tion that closely cor­re­sponds to how CPUs work would al­low the un­der­ly­ing graph to have cy­cles. In that de­f­i­n­i­tion, run­ning the cir­cuit means sim­u­lat­ing it for some pre­de­ter­mined num­ber of steps and then read­ing the out­put from the out­put wires. I’ll call this de­f­i­n­i­tion a real-world cir­cuit.”

Fortunately, we can convert any real-world circuit into an acyclic circuit by "unwrapping it in time". Specifically, given any real-world circuit simulated for some number of steps, we make one copy of every gate per step. Then, whenever there was a wire between two gates, we create wires from each copy of the first gate to the copy of the second gate in the following step. This way, we get an acyclic circuit. Running this circuit corresponds to simulating the original circuit for that many steps.
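A toy Python sketch of that unrolling (my own illustration; the gate names and the wiring format are made up):

```python
def unroll(wires_into, steps):
    """Unroll a possibly-cyclic circuit into an acyclic one.

    `wires_into[g]` lists the gates feeding gate g. We make one copy of every
    gate per simulation step; a wire src -> dst becomes a wire from copy i of
    src to copy i + 1 of dst, so the result contains no cycles.
    """
    acyclic_wires = []
    for i in range(steps - 1):
        for dst, sources in wires_into.items():
            for src in sources:
                acyclic_wires.append(((src, i), (dst, i + 1)))
    return acyclic_wires

# A two-gate "real-world" circuit with a feedback loop: A feeds B and B feeds A.
print(unroll({"A": ["B"], "B": ["A"]}, steps=3))
```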

The most com­mon for­mal model of al­go­rithms is not a real-world cir­cuit, but a Turing ma­chine. Converting any Turing ma­chine to our acyclic cir­cuit can be done in a sim­i­lar way to how you unwrap” a real-world cir­cuit. However, this con­ver­sion is more messy if you want to un­der­stand it in full de­tail.

Decision Problems and NP vs coNP vs

When we talk about a "problem" in computer science, we usually mean something like "sorting," where we are given some input (a list of numbers) and are supposed to produce an output (the same numbers in sorted order). But one subtlety of the formal definitions of P and NP is that they describe classes of so-called decision problems. These are problems like "Is this sequence sorted?" where the input can still be anything, but the output is a single bit: yes or no.

So, when we say that graph coloring is in NP, the problem we talk about is "whether it's possible to properly color a given graph with a given number of colors." The reason we focus on decision problems in formal definitions is that it makes it easier to build a clean theory. Unfortunately, that's pretty hard to appreciate if you're encountering these terms for the first time, which is why we try to avoid these kinds of issues in our videos as much as possible.

There’s one more nu­ance. In our video, we im­plic­itly de­fined that a prob­lem is NP-hard if any prob­lem in NP can be re­duced to it. However, we did­n’t ex­plain what a reduction” is.

Intuitively, saying "a problem can be reduced to SAT" should mean something like "if there is a polynomial-time algorithm for SAT, there is also a polynomial-time algorithm for that problem." However, this isn't how classical reductions are defined. Saying "a problem can be reduced to SAT" formally means that there is an algorithm for solving it that works by first running a polynomial-time procedure that transforms an input to the problem into an input to SAT and then determining whether that SAT instance is satisfiable.

So, for example, if you can solve some problem by running a SAT solver ten times, this doesn't mean that you have reduced that problem to SAT — in a reduction, you can only run the SAT solver once. Moreover, if you solve a problem by running the SAT solver and then doing some postprocessing of its answer, this is also not a reduction.

Let's look at an example. Consider a problem called Tautology, where the input is some logical formula, as in the Satisfiability problem. However, the output is 1 if all possible assignments of values make the formula true, and 0 otherwise. Notice that any formula is a tautology if and only if its negation is not satisfiable. In particular, if you can solve Satisfiability and want to find out whether some formula is a tautology, just ask the SAT solver whether the formula's negation is satisfiable and negate its answer. But notice that this "reduction" is not allowed because, after running the SAT solver, there is a postprocessing step where we flip its answer.
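To make that relationship concrete, here is a tiny brute-force Python sketch (my own illustration, exponential in the number of variables, and emphatically not a polynomial-time reduction): a formula is a tautology exactly when its negation is unsatisfiable, and checking that requires flipping the answer.

```python
from itertools import product

def satisfiable(formula, num_vars):
    # Brute force: does any of the 2^n assignments make the formula true?
    return any(formula(bits) for bits in product([False, True], repeat=num_vars))

def tautology(formula, num_vars):
    # A formula is a tautology iff its negation is unsatisfiable -
    # note that the answer of the "SAT solver" gets negated (postprocessing!).
    return not satisfiable(lambda bits: not formula(bits), num_vars)

print(tautology(lambda b: b[0] or not b[0], num_vars=1))     # True
print(satisfiable(lambda b: b[0] and not b[0], num_vars=1))  # False
```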

Although SAT solvers can solve Tautology, the problem is (probably) not even in the class NP: If someone claims that a formula is a tautology, how should they persuade us that it is? Tautology happens to belong to the class coNP (the complement of NP), which is a kind of mirror image of NP.

Finally, the class of problems that we can solve in polynomial time if we could also solve SAT in polynomial time is called P with a SAT oracle. In general, having an oracle means you have polynomial time but can also solve any polynomial-sized instance of the oracle problem in one step. So, both Satisfiability and Tautology are in this class. When I first learned about P vs NP, for quite some time I didn't know about decision problems, mixed these classes up with one another, and couldn't understand what the hell the oracle notation even meant.

In our video, we didn't say that the problem Inversion, defined as "given a function described as a circuit, return a circuit for its inverse," is NP-complete. This is because Inversion is not even a decision problem, so the statement is not true. The more correct statement would be something like the equivalence proven below.

In our video, we hinted that the ques­tion Can we in­vert func­tions ef­fi­ciently?” is equiv­a­lent to the P vs NP prob­lem. However, we have not proven this equiv­a­lence for­mally, so let’s be more pre­cise now. The claim is that the fol­low­ing three state­ments are equiv­a­lent:

There is a poly­no­mial-time al­go­rithm for Satisfiability.

Given any function described as a circuit, there is a polynomial-time algorithm to compute its inverse (i.e., given the circuit and some target output as input, the algorithm in polynomial time outputs some input value that the function maps to that output, if such an input exists).

There is a poly­no­mial-time al­go­rithm for any NP-complete prob­lem.

All the ideas of the proof are in the video, but let’s prove this a bit more for­mally.

1 → 2: Given a polynomial-time algorithm for Satisfiability, we can invert any function, as we demonstrated in the video: We convert the logic of the function's circuit into a satisfiability problem, use a few more constraints to fix the output to the desired value, and use the assumed algorithm for Satisfiability to find a solution.

2 → 3: Recall that any problem in NP has, by definition, a fast verifier: an algorithm that takes as input an instance of the problem (e.g., a graph if the problem is graph coloring), a proposed solution (e.g., a coloring), and determines whether this solution is correct. To solve any input instance of the problem, we proceed as follows. First, we represent the verifier as a circuit (as explained earlier, this is always possible) with two inputs: the instance and the proposed solution. Then, we fix the first input to the specific instance we want to solve. This way, we obtain a circuit that maps proposed solutions to whether they are correct for our instance. Using our assumption, we can invert this circuit, thereby determining whether our instance admits a solution.

3 → 1: By definition, a problem is NP-complete if we can reduce any problem in NP to it. Satisfiability is in NP, so we can reduce any instance of Satisfiability to an instance of the NP-complete problem, which we can solve in polynomial time by our assumption, as we wanted to prove. As a small detail, this way we are only solving satisfiability as a decision problem, i.e., whether a solution exists or not. However, once we solve the decision problem, we can also find an actual solution. To do this, we will repeatedly solve the decision problem, each time adding an additional constraint that fixes one more variable. If the formula is still satisfiable with the constraint "the first variable is True" added, we know that there exists a solution with the first variable being True, so we keep that constraint and continue with the second variable. Otherwise, we know that there is a solution with the first variable being False, so we add that condition to our instance and again continue with the second variable. After one step per variable, we recover an assignment of variables that satisfies the input formula.
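A Python sketch of that last argument (my own illustration; `sat_decide` stands in for the hypothetical polynomial-time decision procedure that would exist if P = NP, and clauses use the signed-integer convention from the earlier sketch):

```python
def find_assignment(clauses, num_vars, sat_decide):
    """Recover a satisfying assignment using only a yes/no satisfiability oracle."""
    if not sat_decide(clauses):
        return None  # the formula has no solution at all
    assignment = {}
    for var in range(1, num_vars + 1):
        if sat_decide(clauses + [[var]]):
            # Some solution sets this variable to True; commit to that.
            assignment[var] = True
            clauses = clauses + [[var]]
        else:
            # Otherwise every solution sets it to False; commit to that instead.
            assignment[var] = False
            clauses = clauses + [[-var]]
    return assignment
```

Each round adds a single unit clause, so the whole recovery uses only linearly many calls to the decision oracle.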

One of the biggest mys­ter­ies of the­o­ret­i­cal com­puter sci­ence is that most prob­lems we come across in prac­tice are ei­ther in P or are NP-complete.

More specifically, the mystery is why there are only a few interesting problems that have the potential to be NP-intermediate, where NP-intermediate problems are those in NP that are neither in P nor NP-complete. Funnily enough, the most prominent NP-intermediate candidate problem is factoring, the running example in our video. Besides factoring and the so-called discrete logarithm problem, it's really hard to come up with a good example of problems that look like potential NP-intermediate problems.

There are also only a few interesting” prob­lems that are even harder than NP. I would­n’t call this a mys­tery: such prob­lems have the prop­erty that we can’t even ver­ify pro­posed so­lu­tions. This makes them in­tu­itively so much harder than what we usu­ally deal with that we don’t en­counter those prob­lems of­ten in al­go­rith­mic prac­tice and thus we mostly don’t think of them as interesting”.

One ex­am­ple of a prob­lem that’s even harder than NP is de­ter­min­ing win­ning strate­gies in games. For ex­am­ple, think of a spe­cific game, like chess, and ask the ques­tion, Does white have a win­ning strat­egy in this po­si­tion?” Even if you claim that white is win­ning in some po­si­tion, how do you con­vince me? I can try to play black against you, but even if I lose every time, maybe it just means I’m not a good enough player. We could go through the en­tire game tree to­gether, but that takes ex­po­nen­tial time (NP re­quires that we can ver­ify in poly­no­mial time).

In fact, if you generalize chess so that it is played on a chessboard of size n × n, the problem of playing chess is either PSPACE-complete or EXP-complete. The generalization to an n × n board is necessary since otherwise chess can be solved in constant time.

PSPACE is the class of problems we can solve if we have access to polynomial space. If we say that our generalized game of chess can last for, say, at most polynomially many rounds, and that if checkmate did not occur until then it ends in a draw, the problem of finding winning strategies can be solved in PSPACE: we can recursively walk through the entire game tree of that polynomial depth to compute whether any given position is winning for some player. In fact, the problem would be PSPACE-complete.

EXP is the class of prob­lems we can solve if we can use ex­po­nen­tial time. If we don’t im­pose any limit on how long our gen­er­al­ized chess game can last, the prob­lem is no longer in PSPACE, but it’s still in EXP. This is be­cause the game has at most ex­po­nen­tially many dif­fer­ent states, which means that if we ex­plore the game tree and re­mem­ber states we’ve al­ready seen, we can fin­ish in ex­po­nen­tial time.

Let's return to the framing of the P vs NP question as the question of whether we can efficiently invert functions. How can this framing be useful? In the video, we showed how this view makes it clear that if P=NP, hash functions cannot exist, because their entire shtick is to be easy to compute but hard to invert.

Here’s an­other rea­son why I find this fram­ing help­ful. It makes it clear that be­ing able to in­vert al­go­rithms brings a lot of power and makes you won­der whether we can run al­go­rithms back­ward” at least in some re­stricted sense. So, what kinds of func­tions can we ef­fi­ciently in­vert or even run back­ward”?

One example could be linear functions. That is, we can solve the linear equation Ax = b and write x = A⁻¹b (assuming the solution exists). Importantly, the inverse matrix A⁻¹ can be computed from A in polynomial time.
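A minimal Python illustration (my own example) of "running a linear function backward" - given the matrix and an output, recover the input:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)         # solve A x = b, i.e. invert the linear map at b
print(x, np.allclose(A @ x, b))   # the recovered input reproduces the output
```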

Let's be more ambitious and talk about continuous functions. Concretely, let's recall our acyclic circuits and modify them as follows: The wires will no longer carry zeros and ones but arbitrary real numbers. The gates will no longer compute logical functions like AND, OR, NOT, but simple algebraic functions like addition, multiplication by some parameter, and even more complicated functions like sigmoid or ReLU. These kinds of circuits are, of course, called neural networks.

Now, we can't literally invert neural networks—that's still NP-complete. But we can do something similar. Let's say we have a network that computes some function f and we run it on some input vector x to get an output vector y, which we write as y = f(x). Now, let's say we'd like to nudge the output from y to some y' very close to y. The question is, how do we compute the vector x' that has the property that f(x') = y'? This is analogous to the problem of inverting functions, but this problem is easier. Since we're only talking about nudging and we assume that f is a nice continuous function, we can approximate it by a linear function in the vicinity of x and write y' − y ≈ J(x' − x), where J is the matrix of partial derivatives. Since we know how to invert linear functions, we can now solve for x', i.e., find out how to nudge x to get the appropriate nudge at y.
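A small Python sketch of that local inversion (my own toy example; the two-dimensional function and the finite-difference Jacobian are just for illustration):

```python
import numpy as np

def f(x):
    # A tiny "continuous circuit": a linear layer followed by a smooth nonlinearity.
    W = np.array([[1.0, 2.0], [0.5, -1.0]])
    return np.tanh(W @ x)

def jacobian(f, x, eps=1e-6):
    # Finite-difference approximation of the matrix of partial derivatives.
    y = f(x)
    J = np.zeros((y.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - y) / eps
    return J

x = np.array([0.3, -0.2])
y = f(x)
y_target = y + np.array([0.001, -0.0005])  # a small nudge of the output

J = jacobian(f, x)
dx = np.linalg.solve(J, y_target - y)      # locally invert: J dx ~ dy
print(np.allclose(f(x + dx), y_target, atol=1e-5))
```

The point is only the local, linearized inversion; a real training loop would of course compute these derivatives with backpropagation rather than finite differences.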

The algorithm that can compute these derivatives in linear time for neural networks is called backpropagation. This algorithm nicely fits our P vs NP dream of "running algorithms backward": not only when it comes to the task that it solves but also in how it works: The algorithm begins at the end of the neural network and works its way back through the wires while computing the derivatives. I find it very satisfying how you can view this algorithm as the best answer we currently have to the question "given a circuit, how can we invert it and run it backward?" (everybody keeps telling me this is a stretch, though).

In practice, when we train the neural network, we think of the weights of the net as the "input" that we want to nudge. The setup where we keep the weights of the net fixed and optimize the actual input is also interesting, though — this is how you create so-called adversarial examples.

In general, the most striking difference between deep learning and classical algorithmics is how declaratively deep learning researchers think. That is, they think hard about what the right loss function to optimize is or which part of the net to keep fixed and which part to optimize during an experiment. But they think less about how to actually achieve the goal of minimizing the loss function. This is often done by including a few tiny lines in the code, like network.train() or network.backward(). To me, the essence of deep learning has nothing to do with trying to mimic biological systems or something in that sense; it's the observation that if your circuits are continuous, there's a clear algorithmic way of inverting/optimizing them using backpropagation.

From the perspective of someone used to algorithms like Dijkstra's algorithm, quicksort, and so on, this declarative approach of thinking in terms of loss functions and architectures, rather than how the net is actually optimized, sounds very alien. But this is how the whole algorithmic world would look if P equaled NP! In that world, we'd all program declaratively in Prolog and use some kind of .solve() function at the end that would internally run a fast SAT solver to solve the problem defined by our declarations.

Some people asked how this connects to reversible computing. Its idea is as follows: When we are using a gate like the XOR gate that maps two inputs a, b to one output a ⊕ b, we are losing information about the inputs. So, we can replace the XOR gate by the so-called CNOT gate that has two outputs: a ⊕ b and a. From these two outputs, we can reconstruct the input. A more complicated Toffoli gate is even universal in the sense that any circuit can be converted to a reversible circuit built just from Toffoli gates. A reversible circuit looks a bit like a music staff: the number of wires does not change throughout the circuit; we just keep applying Toffoli or other reversible gates on small subsets of the wires.
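A quick Python illustration of these gates (my own sketch):

```python
def cnot(a, b):
    # CNOT: (a, b) -> (a XOR b, a). Reversible - the inputs can be reconstructed.
    return a ^ b, a

def toffoli(a, b, c):
    # Toffoli (CCNOT): flips c exactly when a and b are both 1. It is its own inverse.
    return a, b, c ^ (a & b)

# Reversible AND: feed in an extra wire set to 0, so the last output is a AND b.
a, b = 1, 1
print(toffoli(a, b, 0))            # (1, 1, 1) - keeps a and b around as "junk"
# Applying the Toffoli gate twice gives back the original wires:
print(toffoli(*toffoli(1, 0, 1)))  # (1, 0, 1)
```

Keeping the copies of a and b around is exactly the "junk" discussed below; once we throw them away, the AND output alone no longer determines the inputs.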

So, it seems that we can get re­versible al­go­rithms for free. But we are say­ing that be­ing able to re­verse al­go­rithms is equiv­a­lent to P=NP. Where is the prob­lem?

To understand why reversibility is not buying you that much, you need to look closely at the final reversible circuit. First, such a circuit has the same number of "input" and "output" wires, so if the output of the original function has strictly fewer or strictly more bits than its input, how would we even define the reversible circuit's output?

What happens is that in a reversible circuit computing a function with a given number of input bits and output bits, we define that at the start of the circuit the wires contain the input bits followed by zeros. We require that at the end of the circuit the first few wires contain the required output, and the rest of the wires can be arbitrary junk. At the end of the algorithm, we look at those first wires and forget the junk.

But for­get­ting the junk is where the process stops be­ing re­versible! For ex­am­ple, if a re­versible cir­cuit is com­put­ing a hash func­tion, we are able to map the pair (hash, junk) back to the orig­i­nal in­put, but once we for­get the junk, we are screwed! So, the only thing that re­versible cir­cuits show is that we can al­ways cre­ate cir­cuits where the only non­re­versibil­ity is forgetting the junk”.

This is a nice observation, but it does not change the reality on the ground: if we are given an output, finding some consistent input is still hard, whether we are talking about the hard task of going back through an irreversible circuit, or the hard task of finding the missing junk.

Why most cryp­tog­ra­phy breaks if P=NP

In the video, we showed that P=NP implies that RSA would be broken and we could break hash functions in the sense that given any hash function and its output, we can find an input that the function maps to that output. However, how would we break the most basic cryptographic task, symmetric encryption?

In the symmetric encryption setup, we have two parties, A and B, that share a short key of some small number of bits. Moreover, A wants to send a plain text of n bits to B. The solution is that A uses some encryption function that maps (plain text, key) to encrypted text, and B uses a decryption function that maps (encrypted text, key) back to the plain text.

The strategy of how to break symmetric encryption in the case when P=NP is straightforward: we formulate the question "Find the pair (plain text, key) that the encryption function maps to the encrypted text" and use a fast SAT solver to answer it. The problem is that if both plain text and encrypted text have n bits, there are many pairs (plain text, key) mapping to any given encrypted text. This approach only gives us one such pair, which is probably not the one we are after.

But look, even if the plain text has length n bits, its entropy is typically much smaller. For example, English text can often be compressed to about a fifth of its size using standard compression algorithms. The keys that are used to encrypt are random but typically much shorter than n. So, the entropy of the pair (plain text, key) is typically much less than n bits. In that case, given any encrypted text, there is just one possible plain text that maps to it, i.e., it is possible to recover the plain text, at least from the perspective of information theory.

We can recover this plain text efficiently as follows. We will create another algorithm A that, given a string, tries to output how much that string looks like a message. For example, the algorithm can check whether the string looks like English text, an .exe file, etc. Now, we can use a SAT solver to answer the question "Out of all pairs (plain text, key) that map to the encrypted text, return the one that looks the most like a plain text according to the algorithm A". This way, we manage to select the actual plain text.

Notice that this approach requires that plain text + key have together at most n bits of entropy. In other words, if you are either sending random or well-compressed data, or if you encrypt your data with a one-time pad, you survive P=NP. So, a little bit of cryptography can survive P=NP, but only a little bit.

...

Read the original on vasekrozhon.wordpress.com »

9 137 shares, 8 trendiness

A corn starch based building material

Starch is a nat­ural poly­mer which is com­monly used as a cook­ing in­gre­di­ent. The re­newa­bil­ity and bio-degrad­abil­ity of starch has made it an in­ter­est­ing ma­te­r­ial for in­dus­trial ap­pli­ca­tions, such as pro­duc­tion of bio­plas­tic. This pa­per in­tro­duces the ap­pli­ca­tion of corn starch in the pro­duc­tion of a novel con­struc­tion ma­te­r­ial, named CoRncrete. CoRncrete is formed by mix­ing corn starch with sand and wa­ter. The mix­ture ap­pears to be self-com­pact­ing when wet. The mix­ture is poured in a mould and then heated in a mi­crowave or an oven. This heat­ing causes a gela­tin­i­sa­tion process which re­sults in a hard­ened ma­te­r­ial hav­ing com­pres­sive strength up to 26 MPa. The fac­tors af­fect­ing the strength of hard­ened CoRncrete such as wa­ter con­tent, sand ag­gre­gate size and heat­ing pro­ce­dure have been stud­ied. The degra­da­tion and sus­tain­abil­ity as­pects of CoRncrete are elu­ci­dated and lim­i­ta­tions in the po­ten­tial ap­pli­ca­tion of this ma­te­r­ial are dis­cussed.


Kulshreshtha, Y., Schlangen, E., Jonkers, H. M., Vardon, P. J., & van Paassen, L. A. (2017). Vol. 154, 15.11.2017, pp. 411-423.

...

Read the original on research.tudelft.nl »

10 129 shares, 8 trendiness

Playing with BOLT and Postgres

A couple days ago I had a bit of free time in the evening, and I was bored, so I decided to play with BOLT a little bit. No, not the dog from a Disney movie, the BOLT tool from the LLVM project, aimed at optimizing binaries. It took me a while to get it working, but the results are unexpectedly good, in some cases up to 40%. So let me share my notes and benchmark results, and maybe there's something we can learn from it. We'll start by going through a couple rabbit holes first, though.

I do a fair amount of bench­mark­ing dur­ing de­vel­op­ment, to as­sess im­pact of patches, com­pare pos­si­ble ap­proaches, etc. Often the im­pact is very clear - the through­put dou­bles, query that took 1000 mil­lisec­onds sud­denly takes only 10 mil­lisec­onds, and so on. But some­times the change is tiny, or maybe you even need to prove there’s no change at all.

That sounds triv­ial, right? You just run the bench­mark enough times to get rid of ran­dom noise, and then com­pare the re­sults. Sadly, it’s not that sim­ple, and it gets harder the closer the re­sults are. So in a way, prov­ing a patch does not af­fect per­for­mance (and cause re­gres­sion) is the hard­est bench­mark­ing task.

It's hard because of "binary layout" - layout of data structures, variables and functions etc. in the executable binary. We imagine the executable gets loaded into memory, and that memory is uniformly fast. And it's not, we just live in the illusion of virtual address space. But it's actually backed by a hierarchy of memory types with vastly different performance (throughput, latency, energy costs, …). There's a wonderful paper by Ulrich Drepper from 2007, discussing all this. I highly recommend reading it.

This means the struc­ture of the com­piled bi­nary mat­ters, and maybe the patch ac­ci­den­tally changes it. Maybe the patch adds a lo­cal vari­able that shifts some­thing just enough to not fit in the same cache line. Maybe it adds just enough in­struc­tions or data to push some­thing use­ful from iTLB/​dTLB caches on the CPU, forc­ing ac­cess to DRAM later. Maybe it even af­fects branch pre­dic­tion, or stuff like that.

These ran­dom changes to bi­nary lay­out have a tiny im­pact - usu­ally less than 1% or so, per­haps a bit more (say 5%?). I’m sure it’s pos­si­ble to con­struct ar­ti­fi­cial ex­am­ples with much big­ger im­pact. But I’m talk­ing about im­pact ex­pected on normal” patches.

To fur­ther com­pli­cate things, these lay­out ef­fects are not ad­di­tive. If you have two patches caus­ing 1% re­gres­sion” each be­cause of lay­out, it does not mean ap­ply­ing both patches will regress by 2%. It might be 0% if the patches can­cel out, for ex­am­ple.

When you bench­mark a patch, and the dif­fer­ence is less than ~1%, it’s hard to say if it’s due to the patch or a small ac­ci­den­tal change to the bi­nary lay­out.

But we would like to know! 1% re­gres­sion seems small, but if we hap­pen to ac­cept mul­ti­ple of those, the to­tal re­gres­sion could be much worse.

What can we do about it?

There's a great "Performance Matters" talk about this very issue, by Emery Berger, presented at StrangeLoop 2019. It starts by explaining the issue - and it does a much better job than I did here. And then presents the Stabilizer profiler, randomizing the binary layout to get rid of the differences.

The ba­sic idea is very sim­ple - the bi­nary lay­out ef­fects are ran­dom and should can­cel out in the long run. Instead of do­ing many runs with a sin­gle fixed bi­nary lay­out for a given ex­e­cutable, we can ran­dom­ize the lay­out be­tween runs. If we do that in a smart way, the ef­fects will can­cel out and dis­ap­pear - magic.

Sadly, the Stabilizer project seems mostly inactive. The last commit touching code is from 2013, and it only supports LLVM 3.1 and GCC 4.6.2. Those are ancient versions. I don't even know if you can build Postgres with them anymore, or how different the binary would be, compared to current LLVM/GCC versions.

Note: I won­der if it would be pos­si­ble to do poor man’s Stabilizer” by ran­domly adding lo­cal vari­ables to func­tions, to change the size of the stacks. AFAIK that’s es­sen­tially one of the things Stabilizer does, al­though it does it in a nice way at run­time, with­out re­builds.

While look­ing for tools that might re­place Stabilizer, I re­al­ized that ran­dom­iz­ing the lay­out may not be the only op­tion. Maybe it would be pos­si­ble to elim­i­nate the ran­dom ef­fects by en­sur­ing the bi­nary lay­out is optimal” in some way (hopefully the same for both builds).

I don't recall how exactly, but this eventually led me to BOLT, which started as a research project at Meta. There's a nice paper explaining the details, of course.

Dealing with bi­nary lay­out dif­fer­ences for bench­mark­ing is not the goal of BOLT, it’s meant to op­ti­mize the bi­nary lay­out based on a pro­file. But my hope was that if I op­ti­mize the builds (unpatched and patched) the same way, the dif­fer­ences will not mat­ter any­more.

So I de­cided to give it a try, and do some quick test­ing …

The first thing I tried was sim­ply in­stalling bolt-16 (my ma­chines are run­ning Debian 12.7), and fol­lowed the in­struc­tions from the README. That seemed to work at first, but I quickly started to run into var­i­ous prob­lems.

BOLT re­quires builds with re­lo­ca­tions en­abled, so that it can re­or­ga­nize the bi­nary. So make sure you build Postgres with

Collecting the profile is pretty simple, but that's just regular perf (the $PID is a Postgres backend running some queries):

But then turn­ing that into BOLT pro­file started to com­plain:

I'm just running the command the README tells me to, so I'm not sure why it's complaining about "reading perf data directly" or recommending that I run the tool I'm actually running (maybe it's checking the name, and the "-16" suffix confuses that check somehow?).

It does pro­duce the bolt.data file with BOLT pro­file, though. So let’s try op­ti­miz­ing the bi­nary us­ing it:

I have no idea what's wrong here. The perf2bolt-16 command clearly produced a file, it's a valid ELF file (readelf can dump it), but it just doesn't work for some reason.

Maybe there’s some prob­lem with per­f2bolt-16 af­ter all? The README does men­tion it’s pos­si­ble to in­stru­ment the bi­nary to col­lect the pro­file di­rectly, with­out us­ing perf, so let’s try that:

Well, that did­n’t work all that well :-( After a while I re­al­ized the li­brary ex­ists, but is in a dif­fer­ent di­rec­tory, so let’s cre­ate a sym­link and try again:

Now the in­stru­men­ta­tion should work - run the -instrument com­mand again, and it’ll pro­duce bi­nary post­gres.in­stru­mented. Copy it over the orig­i­nal post­gres bi­nary (but keep the orig­i­nal build, you’ll need it for the ac­tual op­ti­miza­tion), start it, and run some queries. It will cre­ate a pro­file in /tmp/prof.fdata, which you can use to op­ti­mize the orig­i­nal bi­nary:

And this mostly works. I occasionally got some strange segfault crashes, and sometimes what seemed like an infinite loop. It seemed quite fragile (you look at it wrong, and it crashes). Maybe I did something wrong, or maybe the multi-version packages are confused a bit.

Issues with older LLVM builds are not a new thing, es­pe­cially for rel­a­tively new pro­jects like BOLT. The Debian ver­sion is from 16.0, while the git repos­i­tory is on 20.0, so I de­cided to try a cus­tom build, hop­ing it will fix the is­sues. It might also im­prove the op­ti­miza­tion, of course.

First, clone the LLVM pro­ject repos­i­tory, then build the three pro­jects needed by BOLT (this may take a cou­ple hours), and then in­stall it into the CMAKE_INSTALL_PREFIX di­rec­tory (you’ll need to ad­just the path).

This cus­tom build seems to work much bet­ter. I’m yet to see seg­faults, the prob­lems with miss­ing li­brary and in­put/​out­put er­rors when pro­cess­ing perf data went away too.

At some point I ran into a problem when optimizing the binary, when llvm-bolt failed with an error:

I don’t know what this is about ex­actly, but adding this op­tion seems to have fixed it:

I’m not sure this is a good so­lu­tion, though. This func­tion is for the ex­pres­sion in­ter­preter, and that’s likely one of the hottest func­tions in the ex­ecu­tor. So not op­ti­miz­ing it may limit the pos­si­ble ben­e­fits of the op­ti­miza­tion for com­plex (analytical) queries.

To mea­sure the im­pact of BOLT op­ti­miza­tion, I ran a cou­ple tra­di­tional bench­marks - pg­bench for OLTP, and the TPC-H queries for OLAP. I ex­pect the op­ti­miza­tions to help es­pe­cially CPU in­ten­sive work­loads, so I ran the bench­marks on small data sets that fit into mem­ory. That means scale 1 for pg­bench, 10GB for TPC-H.

I al­ways com­pared a clean” build from the mas­ter branch, with a build op­ti­mized us­ing BOLT. The pro­file used by BOLT was col­lected in var­i­ous ways - how im­por­tant the spe­cific pro­file mat­ters is one of the ques­tions. I as­sume it mat­ters quite a bit, be­cause op­ti­miz­ing based on a pro­file is the main idea in BOLT. If it did­n’t make a dif­fer­ence, why bother with a pro­file at all, right? We could just use a plain LTO.

First, let's look at simple read-only pgbench with a single client - on a regular build (labeled "master"), and then on builds optimized using profiles collected for various pgbench workloads:

The re­sults (throughput in trans­ac­tions per sec­ond, so higher val­ues are bet­ter) look like this:

Or rel­a­tive to master” you get this:

Those are pretty massive improvements. Read-only pgbench is a very simple workload that we've already optimized a lot; it's hard to improve it significantly. So seeing 30-40% improvements is simply astonishing.

There's also the first sign that the actual profile matters. Running a test with -M simple on a build optimized using the -M prepared profile improves much less than with the -M simple profile.

Interestingly enough, for -M prepared there's no such gap, likely because the -M prepared profile is a "subset" of the profile collected for -M simple.

Let’s look at more com­plex queries too. I only took the 22 queries from the TPC-H bench­mark, and ran those on a 10GB data set. For each query I mea­sured the du­ra­tion for a clean master” build, and then also du­ra­tion for a build op­ti­mized us­ing a pro­file for that par­tic­u­lar query.

The 22 queries take very dif­fer­ent amounts of time, so I’m not go­ing to com­pare the raw tim­ings, just a com­par­i­son rel­a­tive to a master” build:

Most queries improved by 5-10%, except for queries 8 and 18, which improved by ~50% and ~15%. That's very nice, but I had expected to see bigger improvements, considering how CPU intensive these analytical queries are.

I suspect this might be related to the -skip-funcs=ExecInterpExpr.* thing. Complex queries with expressions are likely spending quite a bit of time in the expression interpreter. If the optimization skips all that, that doesn't seem great.

Even so, 5-10% across the board seems like a nice im­prove­ment.

The natural question is how important the optimization profile is, and how it affects other workloads. I already touched on this in the OLTP section, when talking about using the -M prepared profile for the -M simple workload.

It might be a zero-sum game” where a pro­file im­proves work­load A, but then also re­gresses some other work­load B by the same amount. If you only do work­load A that might still be a win, but if the in­stance han­dles a mix of work­loads, you prob­a­bly don’t want this.

I did a cou­ple more bench­marks, us­ing pro­files com­bined from the ear­lier specific” pro­files and also a generic installcheck” pro­file:

* tpch-all - com­bines all the per-query pro­files from TPC-H

* all - com­bines tpch-all and pg­bench-both (so everything”)

The re­sults for OLTP look like this:

The all” pro­file com­bin­ing pro­files for the work­loads works great, pretty much the same as the best work­load-spe­cific pro­file. The pro­file de­rived from make in­stallcheck is a bit worse, but still pretty good (25-30% gain would be won­der­ful).

Interestingly, none of the pro­files makes it slower.

For TPC-H, I'll only show one chart with the relative speedup for the tpch-all and all profiles.

The improvements remain quite consistent for the "tpch-all" and "all" profiles, although query 8 gets worse as the profile gets less specific. Unfortunately, the "installcheck" profile loses about half of the improvements for most queries, except for query #8. The ~5% speedup is still nice, of course.

It would be in­ter­est­ing to see if op­ti­miz­ing the in­ter­preter (i.e. get­ting rid of -skip-funcs=ExecInterpExpr.*) makes the op­ti­miza­tion more ef­fec­tive. I don’t know what ex­actly the is­sue is or how to make it work.

There's also the question of correctness. There were some recent discussions about possibly supporting link-time optimization (LTO), in which some people suggested that we may be relying on files being "optimization barriers" in a couple places. And that maybe enabling LTO would break this, possibly leading to subtle hard-to-reproduce bugs.

The optimizations done by BOLT seem very similar to what link-time optimization (LTO) does, except that BOLT leverages a workload profile to decide how to optimize for that particular workload. But if LTO can be incorrect, then BOLT probably can be too.

I'm no expert in this area, but per the discussion in those threads it seems this may not be quite accurate. The "optimization barrier" only affects compilers, and CPUs can reorder stuff anyway. The proper way to deal with this is "compiler/memory barrier" instructions.

And some distributions apparently enabled LTO some time back, like Ubuntu in 22.04. And while it's not a definitive proof of anything, we didn't observe a massive influx of strange issues from them.

I started look­ing at BOLT as a way to elim­i­nate the im­pact of ran­dom changes to bi­nary lay­out dur­ing bench­mark­ing. But I got dis­tracted by ex­per­i­ment­ing with BOLT on dif­fer­ent work­loads etc. I still think it might be pos­si­ble to op­ti­mize the builds the same way, and thus get rid of the bi­nary lay­out im­pact.

It’s clear ad­just­ing the bi­nary lay­out (and other op­ti­miza­tions) can yield sig­nif­i­cant speedups, on top of the ex­ist­ing op­ti­miza­tions al­ready per­formed by the com­pil­ers. We don’t see 30-40% speedups in pg­bench every day, that’s for sure.

But there’s also a lot of open ques­tions. The pro­file used for the op­ti­miza­tion mat­ters a lot, so how would we col­lect a good pro­file to use for builds?

The nice thing is that I haven’t re­ally seen any re­gres­sions - none of the cases got slower even if op­ti­miz­ing us­ing a wrong” pro­file. That’s nice, as it seems re­gres­sions are not very com­mon.

FWIW I doubt we would start us­ing BOLT di­rectly, at least not by de­fault. It’s more likely we’ll use it to learn how to ad­just the code and builds to gen­er­ate a bet­ter ex­e­cutable. Is there a way to reverse en­gi­neer” the trans­for­ma­tions per­formed by BOLT, and de­duce how to ad­just the code?

...

Read the original on vondra.me »
