10 interesting stories served every morning and every evening.

1 636 shares, 25 trendiness

OpenAI announced this week that it has raised $6.6 billion in new funding and that the company is now valued at $157 billion overall. This is quite a feat for an organization that reportedly burns through $7 billion a year—far more cash than it brings in—but it makes sense when you realize that OpenAI’s primary product isn’t technology. It’s stories.

Case in point: Last week, CEO Sam Altman published an online manifesto titled “The Intelligence Age.” In it, he declares that the AI revolution is on the verge of unleashing boundless prosperity and radically improving human life. “We’ll soon be able to work with AI that helps us accomplish much more than we ever could without AI,” he writes. Altman expects that his technology will ﬁx the climate, help humankind establish space colonies, and discover all of physics. He predicts that we may have an all-powerful superintelligence “in a few thousand days.” All we have to do is feed his technology enough energy, enough data, and enough chips.

Maybe someday Altman’s ideas about AI will prove out, but for now, his approach is textbook Silicon Valley mythmaking. In these narratives, humankind is forever on the cusp of a technological breakthrough that will transform society for the better. The hard technical problems have basically been solved—all that’s left now are the details, which will surely be worked out through market competition and old-fashioned entrepreneurship. Spend billions now; make trillions later! This was the story of the dot-com boom in the 1990s, and of nanotechnology in the 2000s. It was the story of cryptocurrency and robotics in the 2010s. The technologies never quite work out like the Altmans of the world promise, but the stories keep regulators and regular people sidelined while the entrepreneurs, engineers, and investors build empires. (The Atlantic recently entered a corporate partnership with OpenAI.)

Despite the rhetoric, Altman’s products currently feel less like a glimpse of the future and more like the mundane, buggy present. ChatGPT and DALL-E were cutting-edge technology in 2022. People tried the chatbot and image generator for the ﬁrst time and were astonished. Altman and his ilk spent the following year speaking in stage whispers about the awesome technological force that had just been unleashed upon the world. Prominent AI ﬁgures were among the thousands of people who signed an open letter in March 2023 to urge a six-month pause in the development of large language models (LLMs) so that humanity would have time to address the social consequences of the impending revolution. Those six months came and went. OpenAI and its competitors have released other models since then, and although tech wonks have dug into their purported advancements, for most people, the technology appears to have plateaued. GPT-4 now looks less like the precursor to an all-powerful superintelligence and more like … well, any other chatbot.

The technology itself seems much smaller once the novelty wears off. You can use a large language model to compose an email or a story—but not a particularly original one. The tools still hallucinate (meaning they conﬁdently assert false information). They still fail in embarrassing and unexpected ways. Meanwhile, the web is ﬁlling up with useless “AI slop,” LLM-generated trash that costs practically nothing to produce and generates pennies of advertising revenue for the creator. We’re in a race to the bottom that everyone saw coming and no one is happy with. At the same time, the search for product-market ﬁt at a scale that would justify all the inﬂated tech-company valuations keeps coming up short. Even OpenAI’s latest release, o1, was accompanied by a caveat from Altman that “it still seems more impressive on ﬁrst use than it does after you spend more time with it.”

In Altman’s rendering, this moment in time is just a waypoint, “the doorstep of the next leap in prosperity.” He still argues that the deep-learning technique that powers ChatGPT will effectively be able to solve any problem, at any scale, so long as it has enough energy, enough computational power, and enough data. Many computer scientists are skeptical of this claim, maintaining that multiple significant scientiﬁc breakthroughs stand between us and artiﬁcial general intelligence. But Altman projects conﬁdence that his company has it all well in hand, that science ﬁction will soon become reality. He may need $7 trillion or so to realize his ultimate vision—not to mention unproven fusion-energy technology—but that’s peanuts when compared with all the advances he is promising.

There’s just one tiny problem, though: Altman is no physicist. He is a serial entrepreneur, and quite clearly a talented one. He is one of Silicon Valley’s most revered talent scouts. If you look at Altman’s breakthrough successes, they all pretty much revolve around connecting early start-ups with piles of investor cash, not any particular technical innovation.

It’s remarkable how similar Altman’s rhetoric sounds to that of his fellow billionaire techno-optimists. The project of techno-optimism, for decades now, has been to insist that if we just have faith in technological progress and free the inventors and investors from pesky regulations such as copyright law and deceptive marketing, then the marketplace will work its magic and everyone will be better off. Altman has made nice with lawmakers, insisting that artiﬁcial intelligence requires responsible regulation. But the company’s response to proposed regulation seems to be “no, not like that.” Lord, grant us regulatory clarity—but not just yet.

At a high enough level of abstraction, Altman’s entire job is to keep us all ﬁxated on an imagined AI future so we don’t get too caught up in the underwhelming details of the present. Why focus on how AI is being used to harass and exploit children when you can imagine the ways it will make your life easier? It’s much more pleasant fantasizing about a benevolent future AI, one that ﬁxes the problems wrought by climate change, than dwelling upon the phenomenal energy and water consumption of actually existing AI today.

Remember, these technologies already have a track record. The world can and should evaluate them, and the people building them, based on their results and their effects, not solely on their supposed potential.

...

Read the original on www.theatlantic.com »

2 357 shares, 18 trendiness

Cloudﬂare on Thursday celebrated a victory over Sable Networks, which the former described as a “patent troll.”

That’s a term for an individual or organization that exists solely to make patent infringement claims in the hope of winning a settlement from defendants concerned about costly patent litigation.

“Sable sued Cloudﬂare back in March 2021,” wrote Emily Terrell and Patrick Nemeroff, respectively Cloudﬂare’s senior counsel for litigation and senior associate general counsel, in a write-up Wednesday.

“Sable is a patent troll. It doesn’t make, develop, innovate, or sell anything. Sable IP is merely a shell entity formed to monetize (make money from) an ancient patent portfolio acquired by Sable Networks from Caspian Networks in 2006.”

Patent trolls have vexed the technology industry for years, sometimes even drawing regulatory responses as happened over a decade ago when opportunistic litigants focused on patents pertinent to the emerging smartphone market. The Obama administration responded by issuing a set of executive actions to curb abuses.

Lately, these patent profiteers have targeted the open source community. The Cloud Native Computing Foundation and Linux Foundation last month strengthened ties with Unified Patents, a company focused on defending against predatory patent claims.

Five other companies known to have been sued by Sable — Cisco, Fortinet, Check Point, SonicWall, and Juniper Networks — settled out of court. Splunk, meanwhile, fought back and managed last year to convince Sable to dismiss its claim against the operation prior to its takeover by, funnily enough, Cisco.

Internet services giant Cloudﬂare has notched up an even greater victory. Facing an initial infringement lawsuit regarding around a hundred claims related to four patents, the corporation ended up having to deal with just a single patent violation claim.

In February, Cloudﬂare prevailed when a Texas jury found it did not infringe Sable’s “micro-ﬂow label switching” patent.

The biz convinced the jury both that it did not use the “micro-ﬂow” technology described in US patent 7,012,919 and that the patent was invalid because of prior art.

The existence of two earlier US patents, 6,584,071 and 6,680,933, covering router technology developed by Nortel Networks and Lucent in the 1990s, convinced the jury that Sable’s ’919 patent should never have been granted.

The matter also saw Sable pledge not to pursue further actions of this sort.

“In the end, Sable agreed to pay Cloudﬂare $225,000, grant Cloudﬂare a royalty-free license to its entire patent portfolio, and to dedicate its patents to the public, ensuring that Sable can never again assert them against another company,” said Terrell and Nemeroff.

The agreement means that Sable will tell the US Patent and Trademark Ofﬁce that it is abandoning its patent rights and no further claims based on those patents will be possible.

The Register called Sable’s listed phone number and found it is no longer in service. The company’s website is unresponsive, and an attorney for the ﬁrm did not immediately respond to a request for comment.

Terrell and Nemeroff said that prior art submissions for the Sable case related to Project Jengo, a crowd-sourced patent invalidation initiative, will be accepted until November 2, 2024. At some point thereafter, Cloudﬂare, which has already given out $70,000 in awards for the case, will select ﬁnal award winners.

“We’re proud of our work ﬁghting patent trolls and believe that the outcome in this case sends a strong message that Cloudﬂare will ﬁght back against meritless patent cases and we will win,” a spokesperson for Cloudﬂare told The Register. ®

...

Read the original on www.theregister.com »

3 265 shares, 17 trendiness

What do you want to show?

Here you can ﬁnd a list of charts categorised by their data visualization functions or by what you want a chart to communicate to an audience. While the allocation of each chart into speciﬁc functions isn’t a perfect system, it still works as a useful guide for selecting a chart based on your analysis or communication needs.

...

Read the original on datavizcatalogue.com »

4 263 shares, 13 trendiness

It is the summer of 1989 and a Norwegian couple return from hospital cradling their newborn son. Their names are Robert and Trude Steen, this is their ﬁrst child and they have named him Mats. The early weeks pass in the delirious, hormone-ﬂooded fug of new parenthood, Robert documenting his son’s wriggles and cries on his camcorder with a new-found paternal pride that has left him dumbfounded. And as the months pass, the camera keeps rolling. Mats grows. He sprouts bright blond Nordic hair. He drags himself to his feet and begins to toddle. His father ﬁlms him waddling across their living room in a Tom and Jerry T-shirt. He was, Robert remembers, “the most beautiful, perfect child”.

From about the age of two, though, something changes. Robert and Trude cannot put their ﬁnger on it at ﬁrst, but a concern for their child begins to gnaw at them. He struggles to get back to his feet when he falls. Playground obstacles become insurmountable. Robert ﬁlms Mats as he stumbles and plonks to his bottom. But he just sits on the ground in his dungarees, crying and helpless. “As a parent, you know when something is wrong with your child,” Trude says. They just didn’t know what.

For two years they worry, watching purse-lipped as their son’s physical development slows then stalls. They push and persuade doctors to take their fears seriously. There are tests and examinations and then, ﬁnally, they receive the news. “It was 1pm on May 18, 1993,” Robert says. “I remember the hospital ofﬁce where we were given the message. I remember everything.”

Mats aged ﬁve or six

Their son, the doctor explains, has a condition known as Duchenne muscular dystrophy. It is a genetic disorder that causes progressive muscle wastage and will, over time, deprive Mats of all his remaining strength and mobility. Walking will become harder and harder for him. In a few years he will need a wheelchair and then, eventually, a team of round-the-clock carers as his condition gradually renders him physically helpless. There will be feeding tubes and machines to help him clear his lungs, because swallowing and coughing will be beyond him. He may, his parents are told, live to be 20. “Our world broke apart,” Trude says gently. “The message was devastating, brutal. It was the day we understood it wouldn’t go away. It would rule our lives and, in the end, also take Mats away from us.”

Amid the shock and horror, there is also the quiet abandonment of the future they had hoped for their son: of skiing and football, of university, a successful career and perhaps one day a family of his own. Instead, they learn to focus on each day as it comes and to ﬁnd and provide what happiness they can in each moment. Mats grows into an intelligent schoolboy with a droll sense of humour. By the age of eight he is in a wheelchair, but he bonds with his schoolmates in Oslo and shares in the communal passion for video games. He loves his little sister, Mia, and their pet dog. Everyone argues over who gets to be on Mats’ team during family quiz nights.

His teenage years are harder. His friends, inevitably, are drawn into a world of house parties and late-night cinema trips and early romantic relationships. It is a world Mats cannot enter. “No friends came knocking at the door any more,” Robert says, without bitterness. So Mats spends more and more time alone, playing video games. He discovers an online role-playing game called World of Warcraft, in which thousands of players can explore a vast, three-dimensional, Dungeons & Dragons-style fantasy world and work together to complete quests and defeat monsters. He sinks hour after hour into the game and his parents allow him to: he is already denied so many of life’s pleasures, it would seem churlish to deny him this.

At 18, he graduates from high school with excellent grades but is unemployable. He moves into an annexe, is looked after by a rotating team of carers and spends much of his time deeply absorbed in World of Warcraft, his right hand resting awkwardly on a custom-built keyboard, his head lolling to one side as he navigates an epic world. Robert and Trude sometimes sit with him while he plays, but after half an hour they ﬁnd their attention drifting. “It was boring, just sitting there watching something on screen,” Robert admits. “We didn’t know what was going on.”

The years pass. Mats’ 20th birthday comes and goes. He begins to write a blog about his life and condition as his body seems to become smaller and more fragile by the day. In November 2014, he is admitted to hospital with respiratory problems. It is not the ﬁrst time this has happened and, though these episodes are fraught, he has always returned home after several days of intensive care. This time, however, the Steens are roused by a telephone call not long after they have gone to bed for the night.

“It was the hospital,” Robert says. “They said, ‘We think you should get in your car and hurry here now.’ So we threw ourselves in the car. We had never driven so fast through Oslo, but it was at night, so the streets were quiet.” Robert and Trude ﬂew into the hospital at 12.14am, desperate to see their child. “We were exactly 14 minutes late. He had died at midnight.” The feeling, he says, was “complete emptiness. What had ﬁlled our lives, for better or worse, physically, mentally, practically, was now over. It had come to an end.”

Their grief is deepened by the knowledge that their son had lived a small, discreet life of little real consequence. He had made no mark on the world or on the lives of anyone outside his immediate family. Mats had never known romantic love or lasting friendship, or the feeling of having made a meaningful contribution to society. They log in to his blog so they can post a message letting his followers know that he has died. And then they sit together on the sofa, unable to sleep, unable to do anything.

Then something rouses them. It is an email from a stranger, expressing their sorrow at Mats’ death. It is quickly followed by another email from another stranger, eulogising their son. The messages continue, a trickle becoming a ﬂood as people convey their condolences and write paragraph after paragraph about Mats. He had a warm heart, people write. He was funny and imaginative, a good listener and generous. You should be proud of him, everyone stresses. A primary school teacher from Denmark writes that after hearing of Mats’ death, she broke down in class and had to return home. A 65-year-old psychologist from England says something similar. “Mats was a real friend to me,” writes another stranger. “He was an incurable romantic and had considerable success with women.” Someone else writes to them describing Mats’ empathy. “I don’t think,” they say, “he was aware of how big an impact he had on a lot of people.”

Robert and Trude cannot make sense of this. “Who are these people? Are they crazy or what?” Robert asks, frowning. Slowly, though, with each new email, the truth begins to reveal itself.

Their son had lived by another name. To his family he had simply been Mats. But within World of Warcraft he had existed for years as a charismatic adventurer named “Ibelin”, a strapping swashbuckler with auburn hair tied back in a ponytail and a butch goatee beard. And it was as this digital alter ego that Mats had thrived in a way his family had never appreciated. They had misunderstood what World of Warcraft really was. It had seemed to them like a frenetic action game of monster-bashing and point-scoring. To Mats and the many people he played with — the people now emailing Robert and Trude — it was something far more profound: an immersive world built on social interactions, friendships and shared storytelling. Robert smiles. “This window started to open up to us that let us see he had another life besides his physical life. And that it had been so rich, so big and so full of contentment.”

The story of Mats’ double life, and of the emotional impact its discovery has had on his parents, is told in a new documentary called The Remarkable Life of Ibelin. It is a moving and often deeply philosophical work that tackles questions around the nature of reality and relationships in an increasingly online world. In some ways it is also incredibly ambitious. How do you show the internal world of a terminally ill young man who has now been dead for ten years? How do you recreate the words and deeds that made such an impact on others but which took place within an old online role-playing game? These were perhaps the two greatest challenges facing the Norwegian director Benjamin Ree.

“I’ve spent the past four years trying to ﬁnd out what kind of person Mats was,” says Ree, who is bearded, cheerful and, as he later discovered, was born within a few days of his subject. “I almost feel like I’ve done a doctorate degree in him, in trying to understand him better.”

In making Ibelin, however, Ree discovered he had a number of unique resources at his disposal. Mats had been a member of a World of Warcraft “guild”, a sort of formal club or fraternity that players can join, often by invitation only. Mats’ guild was called Starlight and had its own online forum on which he was a proliﬁc poster, interacting with other guild members and swapping thousands of messages. The guild also kept a digital log of their members’ every action within the game itself: the transcribed text of every typed conversation their character had, every action and emote they commanded their avatar to carry out — laughing, curtseying, crying, dancing, eating, drinking, hugging — and the timestamped coordinates of every location they visited.

“Mats spent almost 20,000 hours in that world. He basically grew up in World of Warcraft,” Ree says. “And what I saw in all the logs and forums and transcriptions was that coming of age inside a game had a lot of similarities to coming of age in the real world.”

What makes Ree’s ﬁlm so affecting is the way in which viewers are able to feel as though we are beside Mats as he goes through this coming-of-age process. Using animation in the style of World of Warcraft graphics, we follow Ibelin as he runs through a fantasy landscape of mountain peaks and deep forests. He sits beside a pond and is approached by a beautiful dark-haired young woman named Rumour, who begins to tease him playfully before snatching his hat and running off into the forest. Mats is 17 at this point and, like all teenage boys, he cannot quite work out that the woman is ﬂirting with him. But he eventually twigs that he is supposed to give chase and the pair strike up a conversation. The young woman is, in fact, controlled by a teenage girl from the Netherlands. She is named Lisette and, like her avatar, she has long dark hair. Over the following weeks and months Ibelin and Rumour, controlled by Mats and Lisette, fall into the kind of intense but romantically ambiguous relationship that will be agonisingly familiar to anyone who has ever been a teenager. At one point, Rumour gives Ibelin a peck on the cheek. “It was just a virtual kiss,” Mats remembers years later on his blog. “But boy, I could almost feel it.”

Rumour and Ibelin in World of Warcraft (World of Warcraft and Blizzard Entertainment © 2024/courtesy of Netﬂix)

Lisette and Mats swap addresses and send each other tokens: mix CDs and, in Lisette’s case, sketches of Rumour and Ibelin embracing. But Mats will not video-chat with her or attend any of the real-life Starlight meet-ups that take place. He does not want her — or anyone else — to know of his condition. In World of Warcraft everyone is physically perfect, so it is a player’s ability to project their personality and charisma that makes them attractive and popular. And Ibelin is attractive and popular. Why risk that by revealing his true nature? “In this other world, a girl wouldn’t see a wheelchair or anything different,” Mats will later reﬂect. “They would see my soul, heart and mind, conveniently placed in some strong body.”

But though Mats’ true identity remains hidden, what’s striking is how much he is able to affect the lives of the people he games with. Lisette’s parents conﬁscate her computer when her school grades dip, and don’t believe her when she tells them she will be cut off from so many of her friends without it. Isolated and lonely, she falls into a severe depression. “I couldn’t think of reasons to get out of bed,” she says. Mats intervenes. He writes a heartfelt but measured letter to her parents, introducing himself as an online friend of their daughter’s, expressing his admiration for Lisette, explaining how important World of Warcraft is to her and urging them all to work together to ﬁnd a solution that will allow her some access to her computer.

Lisette’s parents are taken aback. But they reassess and Rumour returns, and she and Ibelin are reunited in the game once more. He helps Lisette unpack and understand her troubles. “Ibelin was a really big support pillar. He was a friend I could be open with about all the things that were going on,” she says. “It’s one of the things that got me out of the depression I was in.”

She is not the only person he helps. A man named Kristian plays the game as a blue-haired gnome and is overcome with feelings of worthlessness. Over time he admits all this to Ibelin. “I told him everything,” Kristian says. “I told him how terrible I had felt. Perhaps it doesn’t seem like much, but it meant the world to me.”

Having spoken to dozens of the people he knew online, Ree says it’s clear that Mats was a very good listener. “Which might seem a strange thing to say when they didn’t actually talk, because everything was written down,” he says. “But he would remember everything that his friends told him. He would ask them questions about it months later. Even towards the end of his life, when he was as sick as it’s possible to get, you could see how he would prioritise his friends and ﬁnd the energy to be there for them. It’s quite extraordinary.”

Perhaps most moving of all is when Mats learns that one of his Starlight friends is, in real life, mother to an autistic teenage son named Mikkel. He is unable to leave their apartment, she tells him. He is unable to show her any physical affection. She feels like a terrible mother. Mats listens and then suggests that she invite her son into the game itself. The son agrees to this and the three of them begin to spend time together within the game, with Mats gently encouraging Mikkel to take social risks, to introduce himself to other avatars or at least respond when spoken to. Under Mats’ guidance, Mikkel begins to ﬁnd a measure of conﬁdence. One day, Mikkel gives his mother’s character a virtual hug, a watershed moment for them both. “It was the ﬁrst time in my life that I could feel love, and started to understand love,” Mikkel says in Ibelin. “The heavens opened up,” his mother says. “This was what I had been waiting for.” Today, Mikkel is able to hug his mother in real life. He is able to leave the house. “I went from the most negative person in the world to a person who could tolerate people,” he says, laughing.

The Remarkable Life of Ibelin is not a hagiography, however. In sifting through Mats’ life, Ree also uncovers conﬂict. If you’ve ever been part of an online community, you will know that rivalries and disagreements fester easily. Empathetic as he was, Mats could also be scathing, sarcastic and temperamental. He ﬁnds himself estranged from Lisette, who discovers that Ibelin has been romancing other women within the game. He falls out with the mother of Mikkel, who begins to suspect that he may be suffering from a chronic illness in real life. “We see that Mats gets in a lot of trouble because he keeps it a secret,” Ree says. “The inner demons build up and he gets into a lot of problems because he isn’t honest about his illness.” It is only towards the end of his life, when he ﬁnally opens up about his condition to everyone in World of Warcraft and starts writing his blog, that he is able to mend all his bridges.

But the fact that he broke them in the ﬁrst place makes him all the more relatable, Ree says. “Who hasn’t made mistakes as a young guy? Pissed people off, been a prick, lost friends and then had to get them back again?” He grins. “There were a lot of similarities with me growing up, actually.”

In the days immediately following Mats’ death, Robert and Trude found the volume of information they received from strangers about their son both wondrous and surreal. They had simply not known. “I knew he was caring and kind,” Trude says. “But I never saw the extent to which he helped and supported so many people.”

Robert nods. “It’s when I realised that the emotions and relationships we create online can be stronger than we realise.”

A delegation from the Starlight guild attend his funeral, including Lisette. During the service, his cofﬁn is draped with their banner, a silver star set against midnight blue. Robert delivers a eulogy for his son in which he speaks of the sorrow he and Trude had felt, believing that his short life had been one void of meaning, friendship, love and belonging. But, he continues, over the past few days they have come to understand that this was not the case, and that he had experienced all these things. “You proved us wrong. You proved us so wrong,” he says from the lectern, before explaining how they had come to learn of Ibelin’s exploits. “Mats was, at times, accused of being a womaniser,” he tells the funeral congregation, tight-throated and smiling. “And I must admit, being a father, I’m a bit proud of that.”

Ten years on, Robert and Trude are still discovering things about Mats. This, along with working on the documentary, has helped prevent him from fading in their hearts and minds. “We’ve still not got to the bottom of everything he did,” Robert says. “In a way, he still lives. He’s 35 and we’re still learning about his life. He’s very present. He’s very close to us.”

Trude and Robert Steen. “I knew he was kind,” says Trude. “But I never saw how much he helped people” (Tim Jobling for The Times Magazine)

Every year, the members of the Starlight guild meet at a certain spot within World of Warcraft to commemorate Ibelin. They stand shoulder to shoulder, ﬂawless elves, gallant knights and mighty sorcerers. But really they are just normal people with normal problems. Many of them had their lives improved, in some way or another, by a kind and funny Norwegian boy who died too soon but still left a mark on the world. His father draws a breath and smiles helplessly. “The tears that I have in my eyes now are not sad tears,” he says. “They are happy tears.”

The Remarkable Life of Ibelin is on Netﬂix from October 25

...

Read the original on www.thetimes.com »

5 200 shares, 21 trendiness

...

Read the original on dfns.dyalog.com »

6 193 shares, 8 trendiness

My insight into corporate legal disputes is as meaningful as my opinion on Quantum Mechanics. What I do know is that, when given the chance this week to leave my job with half a year’s salary paid in advance, I chose to stay at Automattic.

Listen, I’m struggling with medical debts and ﬁnancial obligations incurred by the closing of my conference and publishing businesses. Six months’ salary in advance would have wiped the slate clean. From a ﬁduciary point of view, if nothing else, I had to at least consider my CEO’s offer to walk out the door with a big bag of dollars.

But even as I made myself think about what six months’ salary in a lump sum could do to help my family and calm my creditors, I knew in my soul there was no way I’d leave this company. Not by my own choice, anyway.

I respect the courage and conviction of my departed colleagues. I already miss them, and most only quit yesterday. I feel their departure as a personal loss, and my grief is real. The sadness is like a cold fog on a dark, wet night.

The next weeks will be challenging. My remaining coworkers and I will work twice as hard to cover temporary employee shortfalls and recruit new teammates, while also navigating the complex personal feelings these two weeks of sudden, surprising change have brought on. Who needs the aggravation, right? But I stayed.

I stayed because I believe in the work we do. I believe in the open web and owning your own content. I’ve devoted nearly three decades of work to this cause, and when I chose to move in-house, I knew there was only one house that would suit me. In nearly six years at Automattic, I’ve been able to do work that mattered to me and helped others, and I know that the best is yet to come.

I also know that the Maker-Taker problem is an issue in open source, just as I know that a friend you buy lunch for every day, and who earns as much money as you do, is supposed to return the favor now and then. If a friend takes advantage, you’re supposed to say or do something about it. Addressing these imbalances is rarely pretty. Doing it in public takes its own kind of courage. Now it’s for the lawyers to sort out.

On May 1, 1992, a man who’d been horribly beaten by the L.A. police called for calm in five heartfelt, memorable words: “Can’t we all get along?” We couldn’t then, and we aren’t now, but my job at Automattic is about helping people, and that remains my focus at the conclusion of this strange and stressful week. I’m grateful that making the tough business decisions isn’t my responsibility. In that light, my decision to stay at Automattic was easy.

...

Read the original on zeldman.com »

7 152 shares, 9 trendiness

I explain the Black–Scholes formula using only basic probability theory and calculus, with a focus on the big picture and intuition over technical details.

The Black–Scholes formula is the crown jewel of quantitative ﬁnance. The formula gives the fair price of a European-style option, and its success can ultimately be measured by its impact on option markets. Before the formula’s publication in 1973 (Black & Scholes, 1973; Merton, 1973), option markets were relatively small and illiquid, and options were not traded in standardized contracts. But after the formula’s publication, option markets grew rapidly. The ﬁrst exchange to list standardized stock options, the Chicago Board Options Exchange, was founded the same year that Black–Scholes was published. And today, options are a highly liquid, mature, and global asset class, with many different tenors, exercise rights, and underlying assets.

The ﬁnancial and mathematical theory underpinning Black–Scholes is rich, and one could easily spend months learning the foundational ideas: continuous-time martingales, Brownian motion, stochastic integration, valuation through replication, and risk-neutrality to name just a few key concepts. But properly contextualized, the formula can be surprisingly inevitable. It can almost feel like a law of nature rather than a ﬁnancial model. My goal here is to justify this claim.

To begin, let's set up the problem and then state the formula. Recall that a call option ($C$) is a contract that gives the holder the right but not the obligation to buy the underlying asset ($S$) at an agreed-upon strike price ($K$). A put option ($P$) is the right but not the obligation to sell the underlying short, but since calls and puts are fungible through put–call parity, we will only concern ourselves with call options in this post. If we can price one, we can price the other. We say the holder exercises the option if they choose to buy or sell the underlying. A European-style option can only be exercised at a fixed time in the future, called expiry ($T$).
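For reference, the parity relation itself is a standard identity rather than something derived in this post. For a non-dividend-paying stock under no arbitrage, with $r$ the risk-free rate and $T - t$ the time to expiry, it reads:

$$C - P = S - K e^{-r(T-t)}.$$

Given a call price, the put price follows immediately, and vice versa.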

Clearly, the payoff at expiry of a European-style option is just a piecewise linear ramp function,

$$C_T = \max(S_T - K, 0),$$

where $S_T$ denotes the value of the stock at expiry (black line, Figure ). The single most important characteristic of an option is this asymmetric payoff. For a call option, our downside is limited, but our upside is unlimited. In finance, this kind of asymmetric behavior is called "convexity".

Given this, we might guess that the price of the call before expiry, $C(t)$ where $t < T$, should look like a smooth approximation of the ramp function (colored lines, Figure ), at least to a first approximation. Why? Imagine we hold a call before expiry, but the underlying is currently less than the strike $K$. Our position is not worthless precisely because it's still possible that the stock price will rise before expiry. So we should still be able to sell the contract for more than zero. The closer the stock $S$ is to the strike $K$, the more we should be able to re-sell our option for. Put differently, and this is the key point here, the price of the option will change nonlinearly with the price of the stock, and as time passes, the smooth approximation should look more and more like the payoff function, because the option is losing its optionality.

This tension between time decay and convexity is the central dynamic of an option, and the Black–Scholes formula encapsulates this tension beautifully. According to the Black–Scholes model, the fair price of a European-style call option is the following:

$$C = N(d_1) \, S - N(d_2) \, K e^{-r(T-t)},
\qquad
d_1 = \frac{\log(S/K) + \left( r + \frac{\sigma^2}{2} \right)(T-t)}{\sigma \sqrt{T-t}},
\qquad
d_2 = d_1 - \sigma \sqrt{T-t}.$$

Here, $N$ is the cumulative distribution function (CDF) of the standard normal distribution, $\sigma$ is the standard deviation or volatility of the underlying asset, and $r$ is the risk-free interest rate. Black–Scholes assumes that the volatility and risk-free rate are both constant.

At a high level, this formula is just the weighted difference between the stock and strike prices. But the terms $N(d_1)$ and $N(d_2)$ are not obviously interpretable. So can we make headway here? Can we say something more precise without quickly getting bogged down in mathematical details? Let's try.
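As a concrete reference point, here is the Black–Scholes call price as a short Python function. This is my own sketch using the standard parameterization (spot $S$, strike $K$, rate $r$, volatility $\sigma$, and time to expiry), not code from the original post:

```python
import math

def norm_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, tau):
    """Black-Scholes price of a European call; tau is time to expiry T - t."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    # Weighted difference between the stock price and the discounted strike.
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

# Example: an at-the-money call, one year from expiry.
price = black_scholes_call(S=100.0, K=100.0, r=0.05, sigma=0.2, tau=1.0)
```

With these inputs the price comes out to roughly 10.45, a standard textbook value.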

Understanding how to derive Equation  from first principles requires a complex set of mathematical and financial ideas. Perhaps the trickiest part for most people is the use of stochastic calculus (Itô, 1944; Itô, 1951; Bru & Yor, 2002). Stochastic calculus is required because we assume the underlying stock price follows this stochastic differential equation (SDE), which is a geometric Brownian motion:

$$dS = \mu S \, dt + \sigma S \, dW.$$

Here, $\mu$ is the drift of the stock, and $\sigma$ is the volatility of the stock. This is the same variable $\sigma$ as in Equation . Finally, $dW$ is an infinitesimal change in a Brownian motion.

The important but subtle point here is that Equation is mathematically imprecise to most people, even those with strong technical backgrounds. In standard calculus, we cannot take the derivative of a random function. Informally, it breaks the required assumption that the function is smooth enough and can therefore be locally approximated by a tangent line. Thus, the notation has no meaning in standard calculus. So it appears that to even understand the generative model above, one must understand stochastic calculus.

As an aside, it's worth mentioning that this technical difficulty is made even more subtle in the original paper (Black & Scholes, 1973) because it's only a technical detail. The main content of the paper is the derivation of the famous Black–Scholes partial differential equation (PDE), and this is done by reasoning about the dynamics of an option. But since an option's price is a function of its underlying stock's price, describing its dynamics via a PDE requires partial derivatives such as

$$\frac{\partial C}{\partial S}.$$

And of course, computing this term requires stochastic calculus since $C$ is a random function. So without stochastic calculus, any derivation of the Black–Scholes PDE is high-level at best. My approach in this post is to circumvent the PDE entirely and to just make sense of Equation  directly. The PDE is beautiful—and perhaps I'll write a separate post about it in the future—but we can still make progress by just thinking about the generative model of the stock price in simple, probabilistic terms.

So let's side-step the challenge of stochastic calculus implicit in Equation . The key point is this: a differential equation is an equation that expresses the relationship between a function and its derivatives. And a solution here is the closed-form expression of $S_t$ such that we could plug it and its derivative(s) into Equation  and it would hold. A standard result is that the solution to the SDE in Equation  is the following definition of $S_t$:

$$S_t = S_0 \exp\!\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right).$$

If you want more detail, read about solving the SDE of geometric Brownian motion. But the main idea for us is that the Black–Scholes modeling assumption can be expressed in a form that does not require stochastic calculus. Equation  can be reframed as the assumption that stock prices are lognormally distributed or, put differently, that log returns are normally distributed:

$$\log\!\left( \frac{S_t}{S_0} \right) \sim \mathcal{N}\!\left( \left( \mu - \frac{\sigma^2}{2} \right) t,\; \sigma^2 t \right).$$

So as a pedagogical trick, let's just assume Equations  and  rather than Equation . This is our new starting point.

For some intuition for the leap between the two, recall that the derivative of $\log(S)$ is $1/S$, so $\mathrm{d}\log(S) = \mathrm{d}S / S$, and this looks a bit like the left-hand side of Equation . So think of the left-hand side of Equation  as a log return. And think of the right-hand side of Equation  as a normally distributed random variable, since it is an incremental (additive) change in Brownian motion plus a constant drift. The extra term, $-\sigma^2 t / 2$, can only be properly understood with stochastic calculus—it's from the quadratic variation of Brownian motion—but it provides no real intuition for us here, and we will take it as a given.

Let's look at some examples. In Figure , I have plotted the stock price over varying drifts and volatilities. This looks like random noise with drift because that's precisely what it is. The normal distribution is as random as it gets, in the sense that the Central Limit Theorem states that the appropriately scaled sum of independent random variables converges to the normal distribution. Things that aren't initially normal can still become normal over time.

Now if $\log(S_t / S_0)$ is normally distributed, then $S_t / S_0$ is lognormally distributed. So as time passes, the stock price is always lognormally distributed with a variance that increases linearly with time and thus a volatility that increases with the square root of time. To visualize this assumption, I've plotted the appropriate lognormal distribution at various time slices along with many empirical samples (Figure ). In my mind, this figure captures the geometric essence of Equations  and . Black–Scholes assumes that the stock price is unpredictable modulo the drift, and that "infinitesimal change in Brownian motion", mathematically vague for the uninitiated, amounts to an uncertainty about the stock price that grows with the square root of time.
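The lognormal assumption is easy to check numerically. Below is a minimal sketch (my own illustration, not code from the original post) that samples terminal prices from the closed-form GBM solution and verifies that the mean log return matches $(\mu - \sigma^2/2)\,T$:

```python
import math
import random

def sample_terminal_prices(S0, mu, sigma, T, n, seed=0):
    """Draw n samples of S_T = S0 * exp((mu - sigma^2/2) T + sigma sqrt(T) Z)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma**2) * T
    scale = sigma * math.sqrt(T)
    return [S0 * math.exp(drift + scale * rng.gauss(0.0, 1.0)) for _ in range(n)]

prices = sample_terminal_prices(S0=100.0, mu=0.08, sigma=0.2, T=1.0, n=100_000)
mean_log_return = sum(math.log(p / 100.0) for p in prices) / len(prices)
# Should be close to (mu - sigma^2/2) * T = 0.08 - 0.02 = 0.06.
```

The sample mean lands within a fraction of a percent of the theoretical value, and a histogram of `prices` would trace out the lognormal densities in the figure.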

Of course, we could make the same figure but using log returns. In that case, the lines indicating the lognormal distributions in Figure  would change to indicate normal distributions. In either case, the generative model of Black–Scholes is that prices and therefore returns are unpredictable.

Now pause. We're going to make a subtle but critical tweak to our assumptions so far. We're going to replace the stock-specific drift $\mu$ with the risk-free interest rate $r$. Concretely, rather than assuming Equation , we're going to assume a stock's log returns follow this normal distribution:

$$\log\!\left( \frac{S_t}{S_0} \right) \sim \mathcal{N}\!\left( \left( r - \frac{\sigma^2}{2} \right) t,\; \sigma^2 t \right).$$

We haven't discussed $r$ in detail yet, but this is just an idealized number representing the time-value of money. To a first order, think of $r$ as the interest rate from a very secure or reliable asset, such as a short-term US government bond. It is the risk-free rate and thus the lower bound on what you can earn without risk.

Of course, different stocks might be modeled with different drifts. The drift of a blue-chip company might not be the same as the drift of a penny stock. So in essence, this assumption is a claim that Black–Scholes is making: the drift of the stock doesn’t matter when pricing an option! This is a surprising and deep claim. Let’s understand it.

In the original paper, Black and Scholes assume that the market has no arbitrage. Here, an “arbitrage” is an opportunity to make a risk-free proﬁt starting at zero wealth. In other words, Black–Scholes assumes the market is perfectly efﬁcient, and the formula (Equation ) represents the fair price of the option in this idealized world.

Now when I first learned this stuff, I found this assumption confusing, because I thought it was analogous to assuming "no friction" in a high-school physics problem. In physics, we might simplify the world by assuming that a box slides on a plane without friction. This makes calculations easier for students, but the consequence is that predictions are systematically wrong. At some point, we have to add friction back into our model to make it realistic.

But friction is a poor analogy here, and I propose a different one: assuming no arbitrage is analogous to assuming no wind or air resistance when modeling projectile motion. This assumption also simplifies calculations, but really it helps us understand the essence of the phenomenon: the parabolic arc of projectile motion. Later, we can add air resistance or wind depending on our particular circumstances, but the underlying parabolic arc is a kind of platonic ideal. It is the signal without the noise.

This is the sense in which Black–Scholes assumes no arbitrage. The assumption is not an implausible claim that real ﬁnancial markets are perfectly efﬁcient. Rather, assuming no arbitrage is assuming a perfectly coherent, noise-free system where prices are consistent and make sense relative to each other. So this isn’t about simplifying calculations. It’s about ﬁnding the platonic price. And thus it’s not an assumption that we remove in the future to get a more realistic price. Quite the opposite! Adding arbitrage would make the problem impossible to solve because prices would then be inconsistent. There would be no universally agreed-upon market price.

Now that we understand this, let's retrace the main line of argument of the original paper (Black & Scholes, 1973), but let's do so in a simpler context. Imagine we sold a hypothetical derivative contract: a redeemable certificate on a stock, which can be exercised only at time $T$. An investor pays us the fair price of the redeemable at inception, and in return we give them a certificate that is redeemable for the value of the stock at time $T$. Our goal is to solve for the fair price of the redeemable at any given moment in time.

As dealers, the risk to us is that the price of the stock goes up. So what could we do? Well, the moment we sell a redeemable, we could buy the underlying stock. Now we are perfectly hedged. We don’t care if the stock goes up or down in value, because we own the stock. Whenever a customer comes to redeem the value of the stock, we simply sell the appropriate share, and we’re done. We neither make nor lose money on average, and thus the price is fair.

The key insight of Black–Scholes is to realize that our perfectly hedged portfolio, short a redeemable and long a stock, is risk-less. And thus, it must grow or drift at the risk-free rate! Formally, we can say that the value of our portfolio at expiry is

This is remarkable because $S_T$ is random! But the left-hand side is not an expectation, because we are always hedged! Do you see the trick? This is the big idea of the original paper. We've neutralized the randomness in $S_T$ by assuming we can perfectly hedge it out!

Now a terminal condition is that our redeemable is worth the value of the stock at expiry, so . This is fair in the sense that neither we nor the investor makes more money than the other. Under this condition, we can write Equation as

But if our portfolio is perfectly hedged and riskless, then clearly must be a riskless derivative. And so it must drift at the risk-free rate, giving us

And of course, we can replace big $T$ with little $t$, in general. Doesn't this make sense? The fair price of the redeemable is simply the price of the stock at contract inception, adjusted for the time-value of money! Again, despite $S_t$ being a random process, we have no expectations. We have neutralized randomness through perfect hedging in a world without arbitrage.

Now let's extend this line of reasoning to options. The challenge here is that an option has convexity. Its price changes nonlinearly with changes in the underlying. So unlike with the redeemable, we would not want to buy exactly one share of stock for one option contract. Instead, at each moment, we want precisely the amount of stock such that, if the stock price moved an infinitesimally small amount, our hedge would move an infinitesimally small amount that perfectly matched the price of the option. This sounds like a derivative from calculus because it is! The amount we want to hedge at an instant in time is simply the derivative of the call price with respect to the stock price, called the delta:

$$\Delta = \frac{\partial C}{\partial S}.$$

So if the stock changes by one dollar, our option price changes by approximately $\Delta$ dollars.
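To make this concrete: under Black–Scholes, the analytic delta of a call is $N(d_1)$, and we can check it against a finite-difference estimate. This is my own sanity-check sketch under the standard parameterization, not code from the original post:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def call_delta(S, K, r, sigma, tau):
    """Analytic delta of a call: N(d1)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)

# Central finite difference: bump the stock price slightly and reprice.
S, K, r, sigma, tau, h = 100.0, 100.0, 0.05, 0.2, 1.0, 1e-4
fd_delta = (call_price(S + h, K, r, sigma, tau)
            - call_price(S - h, K, r, sigma, tau)) / (2 * h)
```

The two numbers agree to many decimal places, which is exactly the "one dollar in the stock moves the option by about delta dollars" statement above.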

Now imagine a world without arbitrage or transaction fees and with continuous trading. In this world, we (an options dealer) can always be perfectly delta hedged. Our portfolio is always just

$$\Pi = \Delta \cdot S - C.$$

In the original paper, the authors repeat the argument made above for redeemables but in the context of options. The calculations become more complicated because working with $C$ requires stochastic calculus, since $C$ is a random variable. But the argument is essentially the same. We assume a world without arbitrage and then just model the dynamics of a perfectly hedged and thus risk-less portfolio. We use these dynamics (the Black–Scholes PDE) and a terminal condition to solve for the fair price $C$.

That’s the original argument. It’s an argument about no arbitrage and perfect hedging. But Black–Scholes feels like a law of nature because it’s the solution to a noise-free system and because it can be derived in many ways. Another way to derive Black–Scholes is by modeling a portfolio which perfectly replicates the price of a call at each moment. This idea of valuation through replication is far-reaching in ﬁnance. To quote Emanuel Derman (Derman, 2002):

If you want to know the value of a security, use the price of another security that’s as similar to it as possible. All the rest is modeling.

The discrete-time version of this argument is the binomial options-pricing model. Yet another way is to connect the idea of no arbitrage to the idea of risk-neutrality. This relationship is called the fundamental theorem of asset pricing, and it's the relationship we want to explore here.

Let's understand this connection between the original argument of no arbitrage and the modern argument of risk-neutral pricing. Consider this: in a Black–Scholes world, do we care about the drift of the underlying stock? As with a redeemable, we are always perfectly and continuously delta hedged. We have no price exposure. In this world, all options dealers would perfectly hedge. And all options investors would become risk-neutral, because it would not pay to take risk. So one trick that makes our logic and calculations easier is to just assume the drift of the stock is $r$! This is equivalent to assuming the world has no arbitrage. And what's the fair price of an option in this world? It's a time-discounted expected value:

$$C = e^{-r(T-t)} \, \mathbb{E}_{\mathbb{Q}}\!\left[ \max(S_T - K, 0) \right].$$

Here, the subscript $\mathbb{Q}$ is called the risk-neutral measure, and this notation is used to make it explicit that the expectation is computed in this imaginary world, not in the real risky world. In this imaginary world, $e^{-rt} S_t$ is a martingale, which is a stochastic process with an expected value that is always equal to the current value. Intuitively and importantly, martingales have no drift. So in Equation , the only drift is from the discount factor $e^{-r(T-t)}$. The stock itself is a drift-less martingale.
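This risk-neutral expectation can be evaluated directly by Monte Carlo: simulate $S_T$ with drift $r$, average the discounted payoff, and compare against the closed form. A minimal sketch (my own illustration, not code from the original post):

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Closed-form Black-Scholes call price."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return (S * norm_cdf(d1)
            - K * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau)))

def mc_call(S, K, r, sigma, tau, n=200_000, seed=0):
    """Discounted expected payoff under the risk-neutral measure (drift = r)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * tau
    scale = sigma * math.sqrt(tau)
    payoff_sum = 0.0
    for _ in range(n):
        ST = S * math.exp(drift + scale * rng.gauss(0.0, 1.0))
        payoff_sum += max(ST - K, 0.0)
    return math.exp(-r * tau) * payoff_sum / n

closed = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
estimate = mc_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

The Monte Carlo estimate converges to the formula's value as the number of paths grows, with error shrinking like one over the square root of the path count.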

So in one telling, we assume the world has no risk, and thus we can work without expectations. Random processes can be forced to become non-random. In another telling, we assume the world is risk-neutral, and random processes become martingales. In both tellings, the only meaningful drift is the risk-free rate.

There is a lot of theory one could get into here. For example, the Girsanov theorem is a result from probability theory on how a stochastic process changes under a change in measure (here, from the true "physical" measure to the risk-neutral measure). And you might read things like, "Under the risk-neutral measure, the stock price after discounting by the risk-free rate follows a martingale". You can easily get lost in the technical details. But in my mind, at a high level, the concept is fairly simple, if a bit non-obvious. In a world in which all investors and traders can perfectly hedge their risk, there are no risk premia; everyone is forced to become risk-neutral. And thus, the price of everything is its risk-neutral expected value, or its value if there were no premium on risk.

And again, this simplifying assumption is not like assuming "no friction". We do not need to re-add arbitrage or drift in order to compute a more realistic option price later. Instead, this assumption strips out all the noise, and the result represents a platonic price in a coherent and consistent market.

Armed with this understanding, let's revisit the generative model for a stock. In a risk-neutral world, the drift of every stock is the same: it's just the risk-free rate $r$. This is the line of reasoning which converts Equation  to Equation .

We're now ready to try to directly make sense of Equation . At this point, we'll need a bit of tedious algebra, but nothing in this section requires more than basic probability. And the conceptual work is mostly done. As promised, I've tried to side-step as much stochastic calculus as possible in order to get to this point.

Let's also repeat our modeling assumption (Equation ) but in terms of $S_T$ and $S_t$:

$$\log\!\left( \frac{S_T}{S_t} \right) \sim \mathcal{N}\!\left( \left( r - \frac{\sigma^2}{2} \right)(T - t),\; \sigma^2 (T - t) \right).$$

As a final preliminary, two observations for notational clarity. First, note that at time $t$, the price $S_t$ is non-random. So we can push that into the mean if we would like, giving us:

$$\log S_T \sim \mathcal{N}\!\left( \log S_t + \left( r - \frac{\sigma^2}{2} \right)(T - t),\; \sigma^2 (T - t) \right).$$

Second, here I've used the notation $d_1$ and $d_2$ to match what I have commonly seen in the literature. But I think it's extremely useful to rewrite $d_2$ as

Now at a high level, my claim is that we can think of $C$ as decomposable into two terms, one representing what we make (stock price) and one representing what we pay (strike price), both contingent on the call ending in-the-money ($S_T \gt K$). This idea is not original to me; it is from (Nielsen, 1992). We have:

\begin{aligned}
C_1 &= \text{contingent value of stock} &&=
\begin{cases}
S_T & \text{if $S_T \gt K$,} \\
0 & \text{else,}
\end{cases} \\
C_2 &= \text{contingent value of strike} &&=
\begin{cases}
-K & \text{if $S_T \gt K$,} \\
0 & \text{else.}
\end{cases}
\end{aligned} \tag{18}

Furthermore, both of these terms will have a clear, simple, probabilistic interpretation that will directly map onto Equation . Let’s see this.

First, what is $\mathbb{E}[C_2]$? This is an expectation, and the value of the claim is zero when the call is out-of-the-money (when $S_T \leq K$). By definition:

$$\mathbb{E}[C_2] = -K \cdot \mathbb{P}(S_T \gt K).$$

And it is easy to see that $\mathbb{P}(S_T \gt K)$ is equal to

$$\mathbb{P}(S_T \gt K) = N(d_2).$$

Here, $-d_2$ denotes the $z$-score of the log move from $S_t$ to $K$, i.e.

$$d_2 = \frac{\log(S_t / K) + \left( r - \frac{\sigma^2}{2} \right)(T - t)}{\sigma \sqrt{T - t}}.$$

So in words, we can see that $N(d_2)$ is just the probability that the call ends up in-the-money, or the probability that $S_T \gt K$. All those extra variables embedded in $d_2$ just represent normalizing the log move from $S_t$ to $K$, such that we can represent the equation using the CDF of the standard normal rather than the CDF of $S_T$. We could, if we wanted to, represent all of this using the CDF of the un-standardized lognormal distribution. But it's cleaner and conventional to work in a standardized space.
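We can sanity-check the interpretation of $N(d_2)$ as the probability of finishing in-the-money by simulation: draw terminal prices under the risk-free drift and compare the fraction that end above the strike with the closed form. A sketch of my own, not from the original post:

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_itm_closed_form(S, K, r, sigma, tau):
    """N(d2): the risk-neutral probability that S_T > K."""
    d2 = (math.log(S / K) + (r - 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d2)

def prob_itm_simulated(S, K, r, sigma, tau, n=200_000, seed=0):
    """Empirical fraction of risk-neutral GBM paths ending in-the-money."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * tau
    scale = sigma * math.sqrt(tau)
    hits = sum(1 for _ in range(n)
               if S * math.exp(drift + scale * rng.gauss(0.0, 1.0)) > K)
    return hits / n

closed = prob_itm_closed_form(100.0, 100.0, 0.05, 0.2, 1.0)
simulated = prob_itm_simulated(100.0, 100.0, 0.05, 0.2, 1.0)
```

For this at-the-money example both numbers land near 0.56, i.e. the drift makes finishing above the strike slightly more likely than a coin flip.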

To summarize, we have shown:

$$\mathbb{E}[C_2] = -K \cdot N(d_2).$$

This represents the expected value we must pay to exercise a call option, contingent on the option being exercised. Of course, this is an expected value, but the Black–Scholes price is the price in today's terms. So we need a discount factor, giving us:

$$e^{-r(T-t)} \, \mathbb{E}[C_2] = -K e^{-r(T-t)} N(d_2).$$

Now for $C_1$, we again have an expectation where the contingent value is zero when the option ends out-of-the-money. This is a bit more complicated than the derivation for $C_2$, since $S_T$ is random while $K$ is fixed. By the law of total expectation, we have:

...

Read the original on gregorygundersen.com »

8 139 shares, 7 trendiness

We recently made a Polylog video about the P vs NP problem. As usual, our goal was to present an underrated topic in a broadly understandable way, while being slightly imprecise and leaving out the messy technical details. This post is where I explain those details so that I can sleep well at night.

EDIT: At the bottom, I added replies to some more common questions people asked in the YouTube chat.

The main point of the video

The main point of the video was to present P vs NP from a somewhat unusual perspective. The most common way to frame the question is: "If you can efficiently verify a solution to some problem, does that mean you can efficiently solve it?" Our video explored a different framing: "If you can efficiently compute a function $f$, is there an efficient way to compute $f^{-1}$?" If you formalize both of these questions, they're mathematically equivalent, and they're also equivalent to the question "Can we efficiently solve the Satisfiability problem?" (as proven in a later section).

I think that the framing with inverting a function is quite underrated. It’s extremely clean from a mathematical perspective and highlights the fundamental nature of the question. We can also easily view the more common veriﬁer-formulation of P vs NP as a special case of this one, once we realize that inverting a checker algorithm and “running it backward from YES” solves the problem that the checker veriﬁes.

If we managed to convey some of these ideas to you, then the video succeeded! However, a deep understanding requires grappling with all the nitty-gritty details, which I’ll go through in this post. I’ll also touch on some additional topics, like a fun connection between P vs NP and deep learning.

The main hero of the video, satisﬁability, comes in several ﬂavors:

Satisﬁability (SAT): In the most basic version of the problem, we are given a logical formula (without quantiﬁers) and must ﬁnd an assignment to its variables that makes it true; or determine that no such assignment exists. This is the cleanest formulation of the satisﬁability problem if you’re familiar with logical formulas. We didn’t opt for this choice since it raises questions like “What kinds of logical connectives are allowed?” or “How do you quickly evaluate formulas with many nested parentheses?”.

Conjunctive-Normal-Form Satisfiability (CNF-SAT): This is the version of satisfiability we used. "Conjunctive" means that we require our formula to be a large conjunction (AND) of clauses, where each clause is a disjunction (OR) of literals (either a variable $x_i$ or its negation $\neg x_i$). This is the classic input format often required by SAT solvers.

3-SAT: This is CNF-SAT where we additionally require that each clause has at most three literals. If you look carefully at our conversion of a circuit to an instance of CNF-SAT, you’ll notice that if all the gates in the circuit are one of AND, OR, NOT, and take at most two inputs (which can always be achieved), then the instance of CNF-SAT we create is, in fact, an instance of 3-SAT. So, our approach proves that even 3-SAT is NP-complete.

Circuit-SAT: In this problem, you are given a circuit that outputs a single bit, and the question is whether there is an input that makes the circuit output True. In our video, we showed how to reduce this problem to CNF-SAT by encoding the gates of the circuit as constraints (and adding one more constraint saying that its output is True). This transformation is also called the Tseytin transformation.
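To make the gate-to-clause encoding concrete, here is a toy sketch in Python (my own illustration; the helper names are invented, and a brute-force loop stands in for a real SAT solver). Clauses are lists of nonzero integers in the DIMACS convention, where variable $i$ is the literal `i` and its negation is `-i`:

```python
from itertools import product

def gate_and(out, a, b):
    """Clauses asserting out = a AND b."""
    return [[-out, a], [-out, b], [out, -a, -b]]

def gate_or(out, a, b):
    """Clauses asserting out = a OR b."""
    return [[out, -a], [out, -b], [-out, a, b]]

def gate_not(out, a):
    """Clauses asserting out = NOT a."""
    return [[out, a], [-out, -a]]

def satisfiable(clauses, n_vars):
    """Brute-force check that stands in for a real SAT solver."""
    for bits in product([False, True], repeat=n_vars):
        def value(lit):
            v = bits[abs(lit) - 1]
            return v if lit > 0 else not v
        if all(any(value(lit) for lit in clause) for clause in clauses):
            return True
    return False

# Example circuit: out = (x1 OR x2) AND (NOT x1).
# Wires: x1 = 1, x2 = 2, OR gate = 3, NOT gate = 4, out = 5.
clauses = (gate_or(3, 1, 2) + gate_not(4, 1) + gate_and(5, 3, 4)
           + [[5]])  # final constraint: the output wire must be True
```

The instance is satisfiable (take x1 = False, x2 = True), and adding the unit clause `[1]` to force x1 = True makes it unsatisfiable, matching how the circuit itself behaves.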

Any Algorithm Can Be Viewed as a Circuit

In our video, we didn’t want to dive into how any algorithm can be converted into a circuit — I feel that it’s quite intuitive once you see a bunch of examples like the multiplication circuit or if you have some idea of how a CPU looks inside. But there is an important subtlety: real-world circuits contain loops.

More concretely, our implicit definition of a circuit (corresponding to what theoreticians call a circuit) is that the underlying graph of a circuit has to be acyclic so that running the circuit results in a single pass from the input to the output wires.

On the other hand, a definition that closely corresponds to how CPUs work would allow the underlying graph to have cycles. In that definition, running the circuit means simulating it for some predetermined number of steps and then reading the output from the output wires. I’ll call this definition a “real-world circuit.”

Fortunately, we can convert any real-world circuit into an acyclic circuit by "unwrapping it in time". Specifically, given any real-world circuit simulated for $T$ steps, for any of its gates $g$, we make $T$ copies $g^{(1)}, \dots, g^{(T)}$ of that gate. Then, whenever there was a wire between two gates $g$ and $h$, we create wires between $g^{(1)}$ and $h^{(2)}$, between $g^{(2)}$ and $h^{(3)}$, and so on, up to $g^{(T-1)}$ and $h^{(T)}$. This way, we get an acyclic circuit. Running this circuit corresponds to simulating the original circuit for $T$ steps.

The most common formal model of algorithms is not a real-world circuit, but a Turing machine. Converting any Turing machine to our acyclic circuit can be done in a similar way to how you “unwrap” a real-world circuit. However, this conversion is more messy if you want to understand it in full detail.

Decision Problems and NP vs coNP vs

When we talk about a "problem" in computer science, we usually mean something like "sorting," where we are given some input ($n$ numbers) and are supposed to produce an output (the same $n$ numbers, sorted). But one subtlety of the formal definitions of P and NP is that they describe classes of so-called decision problems. These are problems like "Is this sequence sorted?" where the input can still be anything, but the output is a single bit: yes or no.

So, when we say that graph coloring is in NP, the problem we talk about is "whether it's possible to color a given graph properly with $k$ colors." The reason we focus on decision problems in formal definitions is that it makes it easier to build a clean theory. Unfortunately, that's pretty hard to appreciate if you're encountering these terms for the first time, which is why we try to avoid these kinds of issues in our videos as much as possible.

There’s one more nuance. In our video, we implicitly deﬁned that a problem is NP-hard if any problem in NP can be reduced to it. However, we didn’t explain what a “reduction” is.

Intuitively, saying "a problem $A$ can be reduced to SAT" should mean something like "if there is a polynomial-time algorithm for SAT, there is also a polynomial-time algorithm for $A$." However, this isn't how classical reductions are defined. Saying "a problem $A$ can be reduced to SAT" formally means that there is an algorithm for solving $A$ that works by first running a polynomial-time procedure that transforms an input to $A$ into an input to SAT and then determining whether that SAT instance is satisfiable.

So, for example, if you can solve some problem by running a SAT solver ten times, this doesn’t mean that you have reduced that problem to SAT— in reduction, you can only run the SAT solver once. Moreover, if you solve a problem by running the SAT solver and then doing some postprocessing of its answer, this is also not a reduction.

Let's look at an example. Consider a problem called Tautology, where the input is some logical formula $\varphi$, as in the Satisfiability problem. However, the output is 1 if all possible assignments of values make the formula true, and 0 otherwise. Notice that a formula $\varphi$ is a tautology if and only if the formula $\neg\varphi$ is not satisfiable. In particular, if you can solve Satisfiability and want to find out whether some formula $\varphi$ is a tautology, just ask the SAT solver whether $\neg\varphi$ is satisfiable and negate its answer. But notice that this "reduction" is not allowed because, after running the SAT solver, there is a postprocessing step where we flip its answer.
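In code, the relationship between Tautology and Satisfiability is a one-liner, which also makes it easy to point at the forbidden postprocessing step. A brute-force sketch of my own (not from the post), representing a formula as a Python predicate over boolean variables:

```python
from itertools import product

def satisfiable(formula, n_vars):
    """Brute-force SAT: does some assignment make the formula true?"""
    return any(formula(*bits) for bits in product([False, True], repeat=n_vars))

def tautology(formula, n_vars):
    # A formula is a tautology iff its negation is unsatisfiable.
    # The final `not` is the postprocessing step a classical reduction forbids.
    return not satisfiable(lambda *bits: not formula(*bits), n_vars)
```

For example, `a or not a` is a tautology, while `a or b` is merely satisfiable.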

Although SAT solvers can solve Tautology, the problem is (probably) not even in the class NP: if someone claims that a formula is a tautology, how should they persuade us that it is? Tautology happens to belong to the class coNP (the complement of NP), which is a kind of mirror image of NP.

Finally, the class of problems that we can solve in polynomial time if we could solve SAT in polynomial time is called $P^{\text{SAT}}$. In general, $P^A$ means you have polynomial time but can also solve any polynomial-sized instance of the problem $A$ in one step. So, both Satisfiability and Tautology are in $P^{\text{SAT}}$. When I first learned about P vs NP, for quite some time I didn't know about decision problems, thought that $\text{NP} = P^{\text{SAT}}$, and couldn't understand what the hell $P^{\text{SAT}}$ even meant.

In our video, we didn't say that the problem Inversion, defined as "given a function $f$ described as a circuit, return a circuit for $f^{-1}$," is NP-complete. This is because Inversion is not even a decision problem, so the statement is not true. The more correct statement would be something like "Inversion can be solved in polynomial time if and only if P = NP."

In our video, we hinted that the question “Can we invert functions efﬁciently?” is equivalent to the P vs NP problem. However, we have not proven this equivalence formally, so let’s be more precise now. The claim is that the following three statements are equivalent:

There is a polynomial-time algorithm for Satisﬁability.

Given any function f described as a circuit, there is a polynomial-time algorithm to compute f⁻¹ (i.e., given the circuit for f and some y as input, the algorithm in polynomial time outputs some x such that f(x) = y, if such an x exists).

There is a polynomial-time algorithm for any NP-complete problem.

All the ideas of the proof are in the video, but let’s prove this a bit more formally.

1 → 2: Given a polynomial-time algorithm for Satisfiability, we can invert any function f, as we demonstrated in the video: We convert the logic of f into a satisfiability problem, use a few more constraints to fix the output to be y, and use the assumed algorithm for Satisfiability to find a solution.

2 → 3: Recall that any problem A in NP has, by definition, a fast verifier: an algorithm that takes as input an instance of A (e.g., a graph if A is the graph coloring problem), a proposed solution (e.g., a coloring), and determines whether this solution is correct. To solve any input instance of A, we proceed as follows. First, we represent the verifier as a circuit (as explained earlier, this is always possible) with two inputs: the instance and the proposed solution. Then, we fix the first input to the specific instance we want to solve. This way, we obtain a circuit that maps proposed solutions to whether they are correct for our instance of A. Using our assumption, we can invert this circuit on the output "correct," thereby determining whether our instance admits a solution.

3 → 1: By definition, a problem A is NP-complete if we can reduce any problem in NP to it. Satisfiability is in NP, so we can reduce any instance of Satisfiability to an instance of A, which we can solve in polynomial time by our assumption, as we wanted to prove. As a small detail, this way we are only solving satisfiability as a decision problem, i.e., whether a solution exists or not. However, once we solve the decision problem, we can also find an actual solution. To do this, we repeatedly solve the decision problem, each time adding an additional constraint like x₁ = True. If the formula is still satisfiable with this constraint added, we know that there exists a solution with x₁ being True, so we add x₁ = True to our instance and continue with x₂. Otherwise, we know that there is a solution with x₁ = False, so we add this condition to our instance and again continue with x₂. After n steps, we recover an assignment of variables that satisfies the input formula.
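The decision-to-search loop above can be sketched in a few lines of Python. The brute-force `sat_decide` below is a hypothetical stand-in for the assumed polynomial-time decision algorithm; everything else follows the text:

```python
from itertools import product

def sat_decide(clauses, fixed):
    # Stand-in decision oracle: does some assignment extending `fixed`
    # (a dict var -> bool) satisfy all clauses? Clauses are lists of
    # signed ints, e.g. [1, -2] means (x1 or not x2).
    variables = sorted({abs(l) for c in clauses for l in c})
    free = [v for v in variables if v not in fixed]
    for bits in product([False, True], repeat=len(free)):
        a = {**fixed, **dict(zip(free, bits))}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def sat_search(clauses):
    # Recover a satisfying assignment with one decision call per variable.
    if not sat_decide(clauses, {}):
        return None
    fixed = {}
    for v in sorted({abs(l) for c in clauses for l in c}):
        # pin v = True if a solution still exists, otherwise v must be False
        fixed[v] = sat_decide(clauses, {**fixed, v: True})
    return fixed

# (x1 or x2) and (not x1 or x3) and (not x3)
print(sat_search([[1, 2], [-1, 3], [-3]]))  # {1: False, 2: True, 3: False}
```

If the decision oracle ran in polynomial time, the whole search would too, since it makes only one oracle call per variable.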

One of the biggest mysteries of theoretical computer science is that most problems we come across in practice are either in P or are NP-complete.

More specifically, the mystery is why there are only a few interesting problems with the potential to be NP-intermediate, where NP-intermediate problems are those in NP that are neither in P nor NP-complete. Funnily enough, the most prominent NP-intermediate candidate is factoring, the running example in our video. Besides factoring and the so-called discrete logarithm problem, it's really hard to come up with good examples of potential NP-intermediate problems.

There are also only a few “interesting” problems that are even harder than NP. I wouldn’t call this a mystery: such problems have the property that we can’t even verify proposed solutions. This makes them intuitively so much harder than what we usually deal with that we don’t encounter those problems often in algorithmic practice and thus we mostly don’t think of them as “interesting”.

One example of a problem that’s even harder than NP is determining winning strategies in games. For example, think of a speciﬁc game, like chess, and ask the question, “Does white have a winning strategy in this position?” Even if you claim that white is winning in some position, how do you convince me? I can try to play black against you, but even if I lose every time, maybe it just means I’m not a good enough player. We could go through the entire game tree together, but that takes exponential time (NP requires that we can verify in polynomial time).

In fact, if you generalize chess so that it is played on a chessboard of size n × n, the problem of playing chess is either PSPACE-complete or EXP-complete. The generalization to an n × n board is necessary since otherwise chess can be solved in constant time.

PSPACE is the class of problems we can solve if we have access to polynomial space. If we say that our generalized game of chess can last for, say, at most polynomially many rounds, and if checkmate did not occur until then it ends in a draw, the problem of finding winning strategies can be solved in PSPACE: we can recursively walk through the entire game tree of polynomial depth to compute whether any given position is winning for some player. In fact, the problem would be PSPACE-complete.

EXP is the class of problems we can solve if we can use exponential time. If we don’t impose any limit on how long our generalized chess game can last, the problem is no longer in PSPACE, but it’s still in EXP. This is because the game has at most exponentially many different states, which means that if we explore the game tree and remember states we’ve already seen, we can ﬁnish in exponential time.

Let's return to the framing of the P vs NP question as "Can we invert functions efficiently?" How can this framing be useful? In the video, we showed how this view makes it clear that if P = NP, hash functions cannot exist, because their entire shtick is to be easy to compute but hard to invert.

Here’s another reason why I ﬁnd this framing helpful. It makes it clear that being able to invert algorithms brings a lot of power and makes you wonder whether we can “run algorithms backward” at least in some restricted sense. So, what kinds of functions can we efﬁciently invert or even “run backward”?

One example could be linear functions. That is, we can solve the linear equation Ax = b and write x = A⁻¹b (assuming the solution exists). Importantly, the matrix A⁻¹ can be computed from A in polynomial time.

Let's be more ambitious and talk about continuous functions. Concretely, let's recall our acyclic circuits and modify them as follows: The wires will no longer carry zeros and ones but arbitrary real numbers. The gates will no longer compute logical functions like AND, OR, NOT, but simple algebraic functions like +, multiplication by a parameter w, and even more complicated functions like sigmoid or ReLU. These kinds of circuits are, of course, called neural networks.

Now, we can't literally invert neural networks (that's still NP-complete). But we can do something similar. Let's say we have a network that computes some function f and we run it on some input vector x to get an output vector y, which we write as y = f(x). Now, let's say we'd like to nudge the output from y to some y′ very close to y. The question is, how do we compute the vector x′ that has the property that f(x′) = y′? This is analogous to the problem of inverting functions, but this problem is easier. Since we're only talking about nudging and we assume that f is a nice continuous function, we can approximate it by a linear function in the vicinity of x and write y′ − y ≈ J(x′ − x), where J is the matrix of partial derivatives. Since we know how to invert linear functions, we can now solve for x′, i.e., find out how to nudge x to get the appropriate nudge at y.
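Here is a numerical sketch of that linearization step. The toy function and the finite-difference Jacobian are my own choices for illustration (backpropagation would compute the same derivatives exactly and more cheaply):

```python
import numpy as np

def f(x):
    # a tiny smooth "network": R^2 -> R^2
    return np.array([np.tanh(x[0] + 2 * x[1]), x[0] * x[1]])

def jacobian(f, x, eps=1e-6):
    # finite-difference matrix of partial derivatives J[i, j] = d f_i / d x_j
    y = f(x)
    J = np.zeros((len(y), len(x)))
    for j in range(len(x)):
        dx = np.zeros(len(x))
        dx[j] = eps
        J[:, j] = (f(x + dx) - y) / eps
    return J

x = np.array([0.5, -0.3])
y = f(x)
y_target = y + np.array([0.01, -0.02])        # the nudged output y'
J = jacobian(f, x)
x_new = x + np.linalg.solve(J, y_target - y)  # invert the linearization
print(np.allclose(f(x_new), y_target, atol=1e-2))  # True
```

Because the nudge is small, inverting the linear approximation J is enough to land very close to the target output.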

The algorithm that can compute these derivatives in linear time for neural networks is called backpropagation. This algorithm nicely fits our P vs NP dream of "running algorithms backward," not only in the task that it solves but also in how it works: the algorithm begins at the end of the neural network and works its way back through the wires while computing the derivatives. I find it very satisfying that you can view this algorithm as the best answer we currently have to the question "given a circuit, how can we invert it and run it backward?" (everybody keeps telling me this is a stretch, though).

In practice, when we train the neural network, we think of the weights of the net as the "input" that we want to change to minimize the loss. The setup where we keep the weights of the net fixed and optimize the actual input is also interesting, though; this is how you create so-called adversarial examples.

In general, the most striking difference between deep learning and classical algorithmics is how declaratively deep learning researchers think. That is, they think hard about what the right loss function to optimize is, or which part of the net to keep fixed and which part to optimize during an experiment. But they think less about how to actually achieve the goal of minimizing the loss function; this is often done by including a few tiny lines in the code, like network.train() or network.backward(). To me, the essence of deep learning has nothing to do with trying to mimic biological systems or anything in that sense; it's the observation that if your circuits are continuous, there's a clear algorithmic way of inverting/optimizing them using backpropagation.

From the perspective of someone used to algorithms like Dijkstra's algorithm, quicksort, and so on, this declarative approach of thinking in terms of loss functions and architectures, rather than how the net is actually optimized, sounds very alien. But this is how the whole algorithmic world would look if P equaled NP! In that world, we'd all program declaratively in Prolog and use some kind of .solve() function at the end that would internally run a fast SAT solver to solve the problem defined by our declarations.

Some people asked how this connects to reversible computing. The idea is as follows: when we use a gate like the XOR gate that maps two inputs a, b to one output a ⊕ b, we lose information about the inputs. So, we can replace the XOR gate by the so-called CNOT gate that has two outputs: a ⊕ b and a. From these two outputs, we can reconstruct the input. A more complicated Toffoli gate is even universal in the sense that any circuit can be converted to a reversible circuit built just from Toffoli gates. A reversible circuit looks a bit like a music staff: the number of wires throughout the circuit does not change; we just keep applying Toffoli or other reversible gates to small subsets of the wires.
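The two gates are small enough to check exhaustively; here is a quick sketch (my own, using truth tables directly):

```python
def cnot(a, b):
    # CNOT keeps the control bit, so nothing is lost: (a, b) -> (a, a XOR b)
    return a, a ^ b

def toffoli(a, b, c):
    # universal reversible gate: flips c only when a and b are both 1
    return a, b, c ^ (a & b)

# applying each gate twice recovers the input, so both are their own inverse
assert all(cnot(*cnot(a, b)) == (a, b) for a in (0, 1) for b in (0, 1))
assert all(toffoli(*toffoli(a, b, c)) == (a, b, c)
           for a in (0, 1) for b in (0, 1) for c in (0, 1))
print("CNOT and Toffoli are reversible on all inputs")
```

Contrast this with plain XOR, where the outputs 0 and 1 each have two preimages, so no inverse map exists.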

So, it seems that we can get reversible algorithms for free. But we are saying that being able to reverse algorithms is equivalent to P=NP. Where is the problem?

To understand why reversibility is not buying you that much, you need to look closely at the final reversible circuit. First, such a circuit has the same number of "input" and "output" wires, so if the output has strictly fewer or strictly more bits than the input, how would we even define the circuit's output?

What happens is that in a reversible circuit with m wires computing a function with an n-bit input and a k-bit output, we define that at the start of the circuit the wires contain the n bits of input followed by m − n zeros. We require that the first k wires at the end of the circuit contain the required output; the rest of the wires can be arbitrary junk. At the end of the algorithm, we look at the first k wires and forget the junk.

But forgetting the junk is where the process stops being reversible! For example, if a reversible circuit is computing a hash function, we are able to map the pair (hash, junk) back to the original input, but once we forget the junk, we are screwed! So, the only thing that reversible circuits show is that we can always create circuits where the only nonreversibility is “forgetting the junk”.

This is a nice observation, but it does not change the reality on the ground: if we are given an n-bit output, finding some consistent input is still hard, whether we are talking about the hard task of going back through an irreversible circuit or the hard task of finding the missing junk.

Why most cryptography breaks if P=NP

In the video, we showed that P=NP implies that RSA would be broken and we could break hash functions in the sense that given any hash function and its output, we can find an input that the function maps to that output. However, how would we break the most basic cryptographic task, symmetric encryption?

In the symmetric encryption setup, we have two parties, A and B, that share a short key of k bits. Moreover, A wants to send a plain text of n bits to B. The solution is that A uses some encryption function that maps (plain text, key) to the encrypted text, and B uses a decryption function that maps (encrypted text, key) back to the plain text.

The strategy for breaking symmetric encryption in the case when P=NP is straightforward: we formulate the question "find a pair (plain text, key) that the encryption function maps to the encrypted text" and use a fast SAT solver to answer it. The problem is that if both the plain text and the encrypted text have n bits, there are many pairs (plain text, key) mapping to any given encrypted text. This approach only gives us one such pair, which is probably not the one we are after.

But look, even if the plain text has length n bits, its entropy is typically much smaller. For example, English text can often be compressed to about a fifth of its size using standard compression algorithms. The keys used for encryption are random but typically much shorter than n. So, the entropy of the pair (plain text, key) is typically much less than n bits. In that case, given any encrypted text, there is typically just one plausible plain text that maps to it, i.e., it is possible to recover the plain text, at least from the perspective of information theory.

We can recover this plain text efficiently as follows. We will create another algorithm A that, given a string, tries to estimate how much that string looks like a real message. For example, the algorithm can check whether the string looks like English text, an .exe file, etc. Now, we can use the SAT solver to answer the question "Out of all pairs (plain text, key) that map to the encrypted text, return the one that looks most like a plain text according to algorithm A." This way, we manage to select the actual plain text.
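As a toy version of this selection step (entirely my own sketch: a 2-byte repeating XOR cipher, with brute-force enumeration over keys standing in for the SAT solver, and a crude letter-frequency score standing in for algorithm A):

```python
from itertools import product
import string

def xor_encrypt(plain: bytes, key: bytes) -> bytes:
    # toy symmetric cipher: XOR with a short repeating key (decrypting is identical)
    return bytes(p ^ key[i % len(key)] for i, p in enumerate(plain))

def looks_like_message(text: bytes) -> float:
    # crude stand-in for "algorithm A": fraction of letters and spaces
    good = set(bytes(string.ascii_letters + " ", "ascii"))
    return sum(b in good for b in text) / len(text)

secret = xor_encrypt(b"attack at dawn", bytes([0x13, 0x37]))

# Enumerate every 2-byte key; each yields a candidate plain text consistent
# with the ciphertext (the role a fast SAT solver would play if P = NP),
# then keep the candidate that scores best.
best = max(
    (xor_encrypt(secret, bytes(k)) for k in product(range(256), repeat=2)),
    key=looks_like_message,
)
print(best)
```

The message's redundancy, i.e., its entropy being far below its length, is exactly what makes this selection work.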

Notice that this approach requires that the plain text and key together have at most n bits of entropy. In other words, if you are sending random or well-compressed data, or if you encrypt your data with a one-time pad, you survive P=NP. So, a little bit of cryptography can survive P=NP, but only a little bit.

...

Read the original on vasekrozhon.wordpress.com »

9 137 shares, 8 trendiness

Starch is a natural polymer which is commonly used as a cooking ingredient. The renewability and bio-degradability of starch has made it an interesting material for industrial applications, such as production of bioplastic. This paper introduces the application of corn starch in the production of a novel construction material, named CoRncrete. CoRncrete is formed by mixing corn starch with sand and water. The mixture appears to be self-compacting when wet. The mixture is poured in a mould and then heated in a microwave or an oven. This heating causes a gelatinisation process which results in a hardened material having compressive strength up to 26 MPa. The factors affecting the strength of hardened CoRncrete such as water content, sand aggregate size and heating procedure have been studied. The degradation and sustainability aspects of CoRncrete are elucidated and limitations in the potential application of this material are discussed.


Kulshreshtha, Y. et al. In: Vol. 154, 15.11.2017, pp. 411-423.

Authors: Y. Kulshreshtha, E. Schlangen, H. M. Jonkers, P. J. Vardon, L. A. van Paassen.


...

Read the original on research.tudelft.nl »

10 129 shares, 8 trendiness

A couple days ago I had a bit of free time in the evening, and I was bored, so I decided to play with BOLT a little bit. No, not the dog from a Disney movie, but the BOLT tool from the LLVM project, aimed at optimizing binaries. It took me a while to get it working, but the results are unexpectedly good, in some cases up to 40%. So let me share my notes and benchmark results, and maybe there's something we can learn from it. We'll start by going through a couple rabbit holes first, though.

I do a fair amount of benchmarking during development, to assess impact of patches, compare possible approaches, etc. Often the impact is very clear - the throughput doubles, query that took 1000 milliseconds suddenly takes only 10 milliseconds, and so on. But sometimes the change is tiny, or maybe you even need to prove there’s no change at all.

That sounds trivial, right? You just run the benchmark enough times to get rid of random noise, and then compare the results. Sadly, it’s not that simple, and it gets harder the closer the results are. So in a way, proving a patch does not affect performance (and cause regression) is the hardest benchmarking task.

It's hard because of "binary layout" - layout of data structures, variables and functions etc. in the executable binary. We imagine the executable gets loaded into memory, and that memory is uniformly fast. And it's not, we just live in the illusion of virtual address space. But it's actually backed by a hierarchy of memory types with vastly different performance (throughput, latency, energy costs, ...). There's a wonderful paper by Ulrich Drepper from 2007, discussing all this. I highly recommend reading it.

This means the structure of the compiled binary matters, and maybe the patch accidentally changes it. Maybe the patch adds a local variable that shifts something just enough to not ﬁt in the same cache line. Maybe it adds just enough instructions or data to push something useful from iTLB/dTLB caches on the CPU, forcing access to DRAM later. Maybe it even affects branch prediction, or stuff like that.

These random changes to binary layout have a tiny impact - usually less than 1% or so, perhaps a bit more (say 5%?). I’m sure it’s possible to construct artiﬁcial examples with much bigger impact. But I’m talking about impact expected on “normal” patches.

To further complicate things, these layout effects are not additive. If you have two patches causing “1% regression” each because of layout, it does not mean applying both patches will regress by 2%. It might be 0% if the patches cancel out, for example.

When you benchmark a patch, and the difference is less than ~1%, it’s hard to say if it’s due to the patch or a small accidental change to the binary layout.

But we would like to know! 1% regression seems small, but if we happen to accept multiple of those, the total regression could be much worse.

What can we do about it?

There's a great "Performance Matters" talk about this very issue by Emery Berger, presented at StrangeLoop 2019. It starts by explaining the issue - and it does a much better job than I did here. And then it presents the Stabilizer profiler, randomizing the binary layout to get rid of the differences.

The basic idea is very simple - the binary layout effects are random and should cancel out in the long run. Instead of doing many runs with a single ﬁxed binary layout for a given executable, we can randomize the layout between runs. If we do that in a smart way, the effects will cancel out and disappear - magic.

Sadly, the Stabilizer project seems mostly inactive . The last commit touching code is from 2013, and it only supports LLVM 3.1 and GCC 4.6.2. Those are ancient versions. I don’t even know if you can build Postgres with them anymore, or how different the binary would be, compared to current LLVM/GCC versions.

Note: I wonder if it would be possible to do “poor man’s Stabilizer” by randomly adding local variables to functions, to change the size of the stacks. AFAIK that’s essentially one of the things Stabilizer does, although it does it in a nice way at runtime, without rebuilds.

While looking for tools that might replace Stabilizer, I realized that randomizing the layout may not be the only option. Maybe it would be possible to eliminate the random effects by ensuring the binary layout is “optimal” in some way (hopefully the same for both builds).

I don't recall how exactly, but this eventually led me to BOLT, which started as a research project at Meta. There's a nice paper explaining the details, of course.

Dealing with binary layout differences for benchmarking is not the goal of BOLT, it’s meant to optimize the binary layout based on a proﬁle. But my hope was that if I optimize the builds (unpatched and patched) the same way, the differences will not matter anymore.

So I decided to give it a try, and do some quick testing …

The ﬁrst thing I tried was simply installing bolt-16 (my machines are running Debian 12.7), and followed the instructions from the README. That seemed to work at ﬁrst, but I quickly started to run into various problems.

BOLT requires builds with relocations enabled, so that it can reorganize the binary. So make sure you build Postgres with relocations preserved at link time.
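The exact build flags aren't preserved in this copy of the post; with a stock BOLT setup the usual requirement is the GNU linker's --emit-relocs, so the build would look something like this (illustrative):

```shell
# hypothetical configure invocation: BOLT needs relocations kept in the
# final executable, which the GNU linker does with --emit-relocs
./configure LDFLAGS="-Wl,--emit-relocs"
make -j$(nproc)
```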

Collecting the profile is pretty simple, since it's just regular perf (the $PID is a Postgres backend running some queries):

But then turning that into BOLT proﬁle started to complain:

I'm just running the command the README tells me to, so I'm not sure why it's complaining about "reading perf data directly" or recommending that I run the tool I'm actually running (maybe it checks the tool name, and the "-16" suffix confuses that check?).

It does produce the bolt.data ﬁle with BOLT proﬁle, though. So let’s try optimizing the binary using it:

I have no idea what's wrong here. The perf2bolt-16 command clearly produced a file; it's a valid ELF file (readelf can dump it), but it just doesn't work for some reason.

Maybe there’s some problem with perf2bolt-16 after all? The README does mention it’s possible to instrument the binary to collect the proﬁle directly, without using perf, so let’s try that:

Well, that didn’t work all that well :-( After a while I realized the library exists, but is in a different directory, so let’s create a symlink and try again:

Now the instrumentation should work - run the -instrument command again, and it’ll produce binary postgres.instrumented. Copy it over the original postgres binary (but keep the original build, you’ll need it for the actual optimization), start it, and run some queries. It will create a proﬁle in /tmp/prof.fdata, which you can use to optimize the original binary:
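The exact commands aren't preserved in this copy; following the BOLT README, the instrument-then-optimize cycle looks roughly like this (paths and optimization flags are illustrative):

```shell
# produce an instrumented binary that writes /tmp/prof.fdata when run
llvm-bolt ./postgres -instrument -o postgres.instrumented

# ...run the instrumented server, execute some queries, stop it, then
# optimize the *original* binary using the collected profile
llvm-bolt ./postgres -o postgres.bolt -data=/tmp/prof.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort \
    -split-functions -split-all-cold
```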

And this mostly works. I occasionally got some strange segfault crashes that seemed like an inﬁnite loop. It seemed quite fragile (you look at it wrong, and it crashes). Maybe I did something wrong, or maybe the multi-version packages are confused a bit.

Issues with older LLVM builds are not a new thing, especially for relatively new projects like BOLT. The Debian version is from 16.0, while the git repository is on 20.0, so I decided to try a custom build, hoping it will ﬁx the issues. It might also improve the optimization, of course.

First, clone the LLVM project repository, then build the three projects needed by BOLT (this may take a couple hours), and then install it into the CMAKE_INSTALL_PREFIX directory (you’ll need to adjust the path).
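A sketch of such a build, per the upstream BOLT README (assuming the three projects are bolt, clang, and lld; adjust the install prefix to your setup):

```shell
git clone https://github.com/llvm/llvm-project.git
cmake -S llvm-project/llvm -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_ENABLE_PROJECTS="bolt;clang;lld" \
      -DCMAKE_INSTALL_PREFIX=$HOME/llvm-bolt
ninja -C build install
```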

This custom build seems to work much better. I’m yet to see segfaults, the problems with missing library and input/output errors when processing perf data went away too.

At some point I ran into a problem when optimizing the binary, when llvm-bolt fails with an error:

I don't know what this is about exactly, but adding the -skip-funcs=ExecInterpExpr.* option seems to have fixed it.

I’m not sure this is a good solution, though. This function is for the expression interpreter, and that’s likely one of the hottest functions in the executor. So not optimizing it may limit the possible beneﬁts of the optimization for complex (analytical) queries.

To measure the impact of BOLT optimization, I ran a couple traditional benchmarks - pgbench for OLTP, and the TPC-H queries for OLAP. I expect the optimizations to help especially CPU intensive workloads, so I ran the benchmarks on small data sets that ﬁt into memory. That means scale 1 for pgbench, 10GB for TPC-H.

I always compared a "clean" build from the master branch with a build optimized using BOLT. The profile used by BOLT was collected in various ways; how much the specific profile matters is one of the questions. I assume it matters quite a bit, because optimizing based on a profile is the main idea in BOLT. If it didn't make a difference, why bother with a profile at all, right? We could just use plain LTO.

First, let's look at simple read-only pgbench with a single client, run on a regular build (labeled "master") and then on builds optimized using profiles collected for various pgbench workloads:
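The exact pgbench invocation isn't preserved in this copy; a typical single-client read-only run at scale 1 would be something like:

```shell
pgbench -i -s 1 test                    # initialize the scale-1 data set
pgbench -S -M simple -c 1 -T 60 test    # read-only, single client
pgbench -S -M prepared -c 1 -T 60 test  # same, with prepared statements
```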

The results (throughput in transactions per second, so higher values are better) look like this:

Or relative to “master” you get this:

Those are pretty massive improvements. Read-only pgbench is a very simple workload, we’ve already optimized it a lot, it’s hard to improve it significantly. So seeing 30-40% improvements is simply astonishing.

There's also the first sign that the actual profile matters. Running a test with -M simple on a build optimized using the -M prepared profile improves much less than with the -M simple profile.

Interestingly enough, for -M prepared there’s no such gap, likely because the -M prepared proﬁle is a “subset” of proﬁle collected for -M simple.

Let’s look at more complex queries too. I only took the 22 queries from the TPC-H benchmark, and ran those on a 10GB data set. For each query I measured the duration for a clean “master” build, and then also duration for a build optimized using a proﬁle for that particular query.

The 22 queries take very different amounts of time, so I’m not going to compare the raw timings, just a comparison relative to a “master” build:

Most queries improved by 5-10%, except for queries 8 and 18, which improved by ~50% and ~15%. That’s very nice, but I have expected to see bigger improvements, considering how CPU intensive these analytical queries are.

I suspect this might be related to the -skip-funcs=ExecInterpExpr.* thing. Complex queries with expressions are likely spending quite a bit of time in the expression interpreter. If the optimization skips all that, that doesn't seem great.

Even so, 5-10% across the board seems like a nice improvement.

The natural question is how important the optimization profile is, and how it affects other workloads. I already touched on this in the OLTP section, when talking about using the -M prepared profile for the -M simple workload.

It might be a “zero-sum game” where a proﬁle improves workload A, but then also regresses some other workload B by the same amount. If you only do workload A that might still be a win, but if the instance handles a mix of workloads, you probably don’t want this.

I did a couple more benchmarks, using proﬁles combined from the earlier “speciﬁc” proﬁles and also a generic “installcheck” proﬁle:

* tpch-all - combines all the per-query proﬁles from TPC-H

* all - combines tpch-all and pgbench-both (so “everything”)

The results for OLTP look like this:

The "all" profile, combining the profiles for all the workloads, works great, pretty much the same as the best workload-specific profile. The profile derived from make installcheck is a bit worse, but still pretty good (a 25-30% gain would be wonderful on its own).

Interestingly, none of the proﬁles makes it slower.

For TPC-H, I'll only show one chart with the relative speedup for the tpch-all and all profiles.

The improvements remain quite consistent for the “tpch-all” and “all” proﬁles, although query 8 gets worse as the proﬁle gets less speciﬁc. Unfortunately the “installcheck” proﬁle loses about half of the improvements for most queries, except for query #8. The ~5% speedup is still nice, of course.

It would be interesting to see if optimizing the interpreter (i.e. getting rid of -skip-funcs=ExecInterpExpr.*) makes the optimization more effective. I don’t know what exactly the issue is or how to make it work.

There's also the question of correctness. There were some recent discussions about possibly supporting link-time optimization (LTO), in which some people suggested that we may be relying on files being "optimization barriers" in a couple places. And that maybe enabling LTO would break this, possibly leading to subtle hard-to-reproduce bugs.

The optimizations done by BOLT seem very similar to what link-time optimization (LTO) does, except that it leverages a workload proﬁle to decide how to optimize for that particular workload. But if LTO may be incorrect, so would BOLT probably.

I'm no expert in this area, but per the discussion in those threads it seems this may not be quite accurate. The "optimization barrier" only affects compilers, and CPUs can reorder stuff anyway. The proper way to deal with this is "compiler/memory barrier" instructions.

And some distributions apparently enabled LTO some time back, like Ubuntu in 22.04. And while it's not definitive proof of anything, we didn't observe a massive influx of strange issues from them.

I started looking at BOLT as a way to eliminate the impact of random changes to binary layout during benchmarking. But I got distracted by experimenting with BOLT on different workloads etc. I still think it might be possible to optimize the builds the same way, and thus get rid of the binary layout impact.

It’s clear adjusting the binary layout (and other optimizations) can yield significant speedups, on top of the existing optimizations already performed by the compilers. We don’t see 30-40% speedups in pgbench every day, that’s for sure.

But there’s also a lot of open questions. The proﬁle used for the optimization matters a lot, so how would we collect a good proﬁle to use for builds?

The nice thing is that I haven’t really seen any regressions - none of the cases got slower even if optimizing using a “wrong” proﬁle. That’s nice, as it seems regressions are not very common.

FWIW I doubt we would start using BOLT directly, at least not by default. It’s more likely we’ll use it to learn how to adjust the code and builds to generate a better executable. Is there a way to “reverse engineer” the transformations performed by BOLT, and deduce how to adjust the code?

...

Read the original on vondra.me »
