10 interesting stories served every morning and every evening.
In the early days of high-performance computing, the major tech companies of the day each invested heavily in developing their own closed source versions of Unix. It was hard to imagine at the time that any other approach could develop such advanced software. Eventually though, open source Linux gained popularity — initially because it allowed developers to modify its code however they wanted and was more affordable, and over time because it became more advanced, more secure, and had a broader ecosystem supporting more capabilities than any closed Unix. Today, Linux is the industry standard foundation for both cloud computing and the operating systems that run most mobile devices — and we all benefit from superior products because of it.
I believe that AI will develop in a similar way. Today, several tech companies are developing leading closed models. But open source is quickly closing the gap. Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry. But even before that, Llama is already leading on openness, modifiability, and cost efficiency.
Today we’re taking the next steps towards open source AI becoming the industry standard. We’re releasing Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models. In addition to having significantly better cost/performance relative to closed models, the fact that the 405B model is open will make it the best choice for fine-tuning and distilling smaller models.
Beyond releasing these models, we’re working with a range of companies to grow the broader ecosystem. Amazon, Databricks, and NVIDIA are launching full suites of services to support developers fine-tuning and distilling their own models. Innovators like Groq have built low-latency, low-cost inference serving for all the new models. The models will be available on all major clouds including AWS, Azure, Google, Oracle, and more. Companies like Scale AI, Dell, Deloitte, and others are ready to help enterprises adopt Llama and train custom models with their own data. As the community grows and more companies develop new services, we can collectively make Llama the industry standard and bring the benefits of AI to everyone.
Meta is committed to open source AI. I’ll outline why I believe open source is the best development stack for you, why open sourcing Llama is good for Meta, and why open source AI is good for the world and therefore a platform that will be around for the long term.
Why Open Source AI Is Good for Developers
When I talk to developers, CEOs, and government officials across the world, I usually hear several themes:
We need to train, fine-tune, and distill our own models. Every organization has different needs that are best met with models of different sizes that are trained or fine-tuned with their specific data. On-device tasks and classification tasks require small models, while more complicated tasks require larger models. Now you’ll be able to take the most advanced Llama models, continue training them with your own data and then distill them down to a model of your optimal size — without us or anyone else seeing your data.
We need to control our own destiny and not get locked into a closed vendor. Many organizations don’t want to depend on models they cannot run and control themselves. They don’t want closed model providers to be able to change their model, alter their terms of use, or even stop serving them entirely. They also don’t want to get locked into a single cloud that has exclusive rights to a model. Open source enables a broad ecosystem of companies with compatible toolchains that you can move between easily.
We need to protect our data. Many organizations handle sensitive data that they need to secure and can’t send to closed models over cloud APIs. Other organizations simply don’t trust the closed model providers with their data. Open source addresses these issues by enabling you to run the models wherever you want. It is well-accepted that open source software tends to be more secure because it is developed more transparently.
We need a model that is efficient and affordable to run. Developers can run inference on Llama 3.1 405B on their own infra at roughly 50% the cost of using closed models like GPT-4o, for both user-facing and offline inference tasks.
We want to invest in the ecosystem that’s going to be the standard for the long term. Lots of people see that open source is advancing at a faster rate than closed models, and they want to build their systems on the architecture that will give them the greatest advantage long term.
Why Open Source AI Is Good for Meta
Meta’s business model is about building the best experiences and services for people. To do this, we must ensure that we always have access to the best technology, and that we’re not locked into a competitor’s closed ecosystem where they can restrict what we build.
One of my formative experiences has been building our services constrained by what Apple will let us build on their platforms. Between the way they tax developers, the arbitrary rules they apply, and all the product innovations they block from shipping, it’s clear that Meta and many other companies would be freed up to build much better services for people if we could ship the best versions of our products without competitors constraining what we build. On a philosophical level, this is a major reason why I believe so strongly in building open ecosystems in AI and AR/VR for the next generation of computing.
People often ask if I’m worried about giving up a technical advantage by open sourcing Llama, but I think this misses the big picture for a few reasons:
First, to ensure that we have access to the best technology and aren’t locked into a closed ecosystem over the long term, Llama needs to develop into a full ecosystem of tools, efficiency improvements, silicon optimizations, and other integrations. If we were the only company using Llama, this ecosystem wouldn’t develop and we’d fare no better than the closed variants of Unix.
Second, I expect AI development will continue to be very competitive, which means that open sourcing any given model isn’t giving away a massive advantage over the next best models at that point in time. The path for Llama to become the industry standard is by being consistently competitive, efficient, and open generation after generation.
Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
Finally, Meta has a long history of open source projects and successes. We’ve saved billions of dollars by releasing our server, network, and data center designs with Open Compute Project and having supply chains standardize on our designs. We benefited from the ecosystem’s innovations by open sourcing leading tools like PyTorch, React, and many more tools. This approach has consistently worked for us when we stick with it over the long term.
Why Open Source AI Is Good for the World
I believe that open source is necessary for a positive AI future. AI has more potential than any other modern technology to increase human productivity, creativity, and quality of life — and to accelerate economic growth while unlocking progress in medical and scientific research. Open source will ensure that more people around the world have access to the benefits and opportunities of AI, that power isn’t concentrated in the hands of a small number of companies, and that the technology can be deployed more evenly and safely across society.
There is an ongoing debate about the safety of open source AI models, and my view is that open source AI will be safer than the alternatives. I think governments will conclude it’s in their interest to support open source because it will make the world more prosperous and safer.
My framework for understanding safety is that we need to protect against two categories of harm: unintentional and intentional. Unintentional harm is when an AI system may cause harm even when it was not the intent of those running it to do so. For example, modern AI models may inadvertently give bad health advice. Or, in more futuristic scenarios, some worry that models may unintentionally self-replicate or hyper-optimize goals to the detriment of humanity. Intentional harm is when a bad actor uses an AI model with the goal of causing harm.
It’s worth noting that unintentional harm covers the majority of concerns people have around AI — ranging from what influence AI systems will have on the billions of people who will use them to most of the truly catastrophic science fiction scenarios for humanity. On this front, open source should be significantly safer since the systems are more transparent and can be widely scrutinized. Historically, open source software has been more secure for this reason. Similarly, using Llama with its safety systems like Llama Guard will likely be safer and more secure than closed models. For this reason, most conversations around open source AI safety focus on intentional harm.
Our safety process includes rigorous testing and red-teaming to assess whether our models are capable of meaningful harm, with the goal of mitigating risks before release. Since the models are open, anyone can test them for themselves as well. We must keep in mind that these models are trained on information that’s already on the internet, so the starting point when considering harm should be whether a model can facilitate more harm than information that can quickly be retrieved from Google or other search results.
When reasoning about intentional harm, it’s helpful to distinguish between what individual or small scale actors may be able to do as opposed to what large scale actors like nation states with vast resources may be able to do.
At some point in the future, individual bad actors may be able to use the intelligence of AI models to fabricate entirely new harms from the information available on the internet. At this point, the balance of power will be critical to AI safety. I think it will be better to live in a world where AI is widely deployed so that larger actors can check the power of smaller bad actors. This is how we’ve managed security on our social networks — our more robust AI systems identify and stop threats from less sophisticated actors who often use smaller scale AI systems. More broadly, larger institutions deploying AI at scale will promote security and stability across society. As long as everyone has access to similar generations of models — which open source promotes — then governments and institutions with more compute resources will be able to check bad actors with less compute.
The next question is how the US and democratic nations should handle the threat of states with massive resources like China. The United States’ advantage is decentralized and open innovation. Some people argue that we must close our models to prevent China from gaining access to them, but my view is that this will not work and will only disadvantage the US and its allies. Our adversaries are great at espionage, stealing models that fit on a thumb drive is relatively easy, and most tech companies are far from operating in a way that would make this more difficult. It seems most likely that a world of only closed models results in a small number of big companies plus our geopolitical adversaries having access to leading models, while startups, universities, and small businesses miss out on opportunities. Plus, constraining American innovation to closed development increases the chance that we don’t lead at all. Instead, I think our best strategy is to build a robust open ecosystem and have our leading companies work closely with our government and allies to ensure they can best take advantage of the latest advances and achieve a sustainable first-mover advantage over the long term.
When you consider the opportunities ahead, remember that most of today’s leading tech companies and scientific research are built on open source software. The next generation of companies and research will use open source AI if we collectively invest in it. That includes startups just getting off the ground as well as people in universities and countries that may not have the resources to develop their own state-of-the-art AI from scratch.
The bottom line is that open source AI represents the world’s best shot at harnessing this technology to create the greatest economic opportunity and security for everyone.
With past Llama models, we developed them for ourselves and then released them, but didn’t focus much on building a broader ecosystem. We’re taking a different approach with this release. We’re building teams internally to enable as many developers and partners as possible to use Llama, and we’re actively building partnerships so that more companies in the ecosystem can offer unique functionality to their customers as well.
I believe the Llama 3.1 release will be an inflection point in the industry where most developers begin to primarily use open source, and I expect that approach to only grow from here. I hope you’ll join us on this journey to bring the benefits of AI to everyone in the world.
You can access the models now at llama.meta.com.
...
Read the original on about.fb.com »
You can access data from deleted forks, deleted repositories and even private repositories on GitHub. And it is available forever. This is known by GitHub, and intentionally designed that way.
This is such an enormous attack vector for all organizations that use GitHub that we’re introducing a new term: Cross Fork Object Reference (CFOR). A CFOR vulnerability occurs when one repository fork can access sensitive data from another fork (including data from private and deleted forks). Similar to an Insecure Direct Object Reference, in CFOR users supply commit hashes to directly access commit data that otherwise would not be visible to them.
Consider this common workflow on GitHub:
You commit code to your fork
Is the code you committed to the fork still accessible? It shouldn’t be, right? You deleted it.
It is. And it’s accessible forever. Out of your control.
In the video below, you’ll see us fork a repository, commit data to it, delete the fork, and then access the “deleted” commit data via the original repository.
You might think you’re protected by needing to know the commit hash. You’re not. The hash is discoverable. More on that later.
How often does this actually happen? Pretty often. We surveyed a few (literally 3) commonly-forked public repositories from a large AI company and easily found 40 valid API keys from deleted forks. The user pattern seemed to be this:
Hard-code an API key into an example file.
But this gets worse, it works in reverse too:
You have a public repo on GitHub. Someone forks it. You commit data after they fork it (and they never sync their fork with your updates).
Is the code you committed after they forked your repo still accessible?
GitHub stores repositories and forks in a repository network, with the original “upstream” repository acting as the root node. When a public “upstream” repository that has been forked is “deleted”, GitHub reassigns the root node role to one of the downstream forks. However, all of the commits from the “upstream” repository still exist and are accessible via any fork.
In the video below, we create a repo, fork it and then show how data not synced with the fork can still be accessed by the fork after the original repo is deleted.
This isn’t just some weird edge case scenario. This unfolded last week:
I submitted a P1 vulnerability to a major tech company showing they accidentally committed a private key for an employee’s GitHub account that had significant access to their entire GitHub organization. They immediately deleted the repository, but since it had been forked, I could still access the commit containing the sensitive data via a fork, despite the fork never syncing with the original “upstream” repository.
The implication here is that any code committed to a public repository may be accessible forever as long as there is at least one fork of that repository.
Consider this common workflow for open-sourcing a new tool on GitHub:
You create a private repo that will eventually be made public.
You create a private, internal version of that repo (via forking) and commit additional code for features that you’re not going to make public.
You make your “upstream” repository public and keep your fork private.
Are your private features and related code (from step 2) viewable by the public?
Yes. Any code committed between the time you created an internal fork of your tool and when you open-sourced the tool, those commits are accessible on the public repository.
Any commits made to your private fork after you make the “upstream” repository public are not viewable. That’s because changing the visibility of a private “upstream” repository results in two repository networks - one for the private version, and one for the public version.
In the video below, we demonstrate how organizations open-source new tools while maintaining private internal forks, and then show how someone could access commit data from the private internal version via the public one.
Unfortunately, this workflow is one of the most common approaches users and organizations take to developing open-source software. As a result, it’s possible that confidential data and secrets are inadvertently being exposed on an organization’s public GitHub repositories.
Destructive actions in GitHub’s repository network (like the 3 scenarios mentioned above) remove references to commit data from the standard GitHub UI and normal git operations. However, this data still exists and is accessible (if you know the commit hash). This is the tie-in between CFOR and IDOR vulnerabilities - if you know the commit hash you can directly access data that is not intended for you.
If a user knows the SHA-1 commit hash of a particular commit they want to see, they can directly navigate to that commit at the endpoint: https://github.com/. They’ll see a yellow banner explaining that “[t]his commit does not belong to any branch of this repository, and may belong to a fork outside of the repository.”
Where do you get these hash values?
Commit hashes can be brute forced through GitHub’s UI, particularly because the git protocol permits the use of short SHA-1 values when referencing a commit. A short SHA-1 value is the minimum number of characters required to avoid a collision with another commit hash, with an absolute minimum of 4. The keyspace of all 4 character SHA-1 values is 65,536 (16^4). Brute forcing all possible values can be achieved relatively easily.
For example, consider this commit in TruffleHog’s repository:
To access this commit, users typically visit the URL containing the full SHA-1 commit hash: https://github.com/trufflesecurity/trufflehog/commit/07f01e8337c1073d2c45bb12d688170fcd44c637
But users don’t need to know the entire 40-character SHA-1 value; they only need to correctly guess the short SHA-1 value, which in this case is 07f01e.
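To get a feel for the keyspace involved, here is a short sketch. The repository name is a placeholder, and the actual HTTP probing (checking which candidate URLs render a real commit page) is deliberately left out:

```python
import itertools

HEX = "0123456789abcdef"

def short_sha_candidates(length=4):
    """Yield every possible short SHA-1 prefix of the given length."""
    for combo in itertools.product(HEX, repeat=length):
        yield "".join(combo)

def candidate_commit_urls(owner, repo, length=4):
    """Build the GitHub commit URLs a brute-forcer would probe.

    A response that renders a commit page indicates the short hash
    resolves to a real commit somewhere in the repository network.
    """
    base = f"https://github.com/{owner}/{repo}/commit/"
    return [base + prefix for prefix in short_sha_candidates(length)]

# Hypothetical target repository, for illustration only.
urls = candidate_commit_urls("example-org", "example-repo")
print(len(urls))  # 16^4 = 65536 candidate URLs
```

At 65,536 requests for the minimum four-character prefix, exhausting the keyspace is trivial, which is why a "secret" commit hash is not a meaningful security boundary.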
But what’s more interesting: GitHub exposes a public events API endpoint. You can also query for commit hashes in the events archive, which is managed by a third party and saves all GitHub events for the past decade outside of GitHub, even after the repos get deleted.
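As a sketch of what mining that archive looks like, the function below pulls commit SHAs for each repository out of archived PushEvent records. The field layout follows the public GH Archive format as I understand it (an assumption worth verifying against a real dump), and the sample event here is synthetic:

```python
import json

def extract_push_shas(event_lines):
    """Collect commit SHAs per repository from GitHub event records.

    Each line is one JSON event; PushEvent payloads include the SHA of
    every pushed commit, and those SHAs survive in the archive even
    after the repository or fork is deleted on GitHub.
    """
    shas = {}
    for line in event_lines:
        event = json.loads(line)
        if event.get("type") != "PushEvent":
            continue
        repo = event["repo"]["name"]
        for commit in event["payload"].get("commits", []):
            shas.setdefault(repo, []).append(commit["sha"])
    return shas

# A synthetic record shaped like an archived PushEvent entry:
sample = json.dumps({
    "type": "PushEvent",
    "repo": {"name": "example-org/example-repo"},
    "payload": {"commits": [{"sha": "07f01e" + "0" * 34}]},
})
print(extract_push_shas([sample]))
```

Every SHA recovered this way can then be plugged into the commit URL pattern shown earlier, whether or not the commit is still reachable from any branch.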
We recently submitted our findings to GitHub via their VDP program. This was their response:
After reviewing the documentation, it’s clear as day that GitHub designed repositories to work like this.
We appreciate that GitHub is transparent about their architecture and has taken the time to clearly document what users should expect to happen in the instances documented above.
Our issue is this:
The average user views the separation of private and public repositories as a security boundary, and understandably believes that any data located in a private repository cannot be accessed by public users. Unfortunately, as we documented above, that is not always true. What’s more, the act of deletion implies the destruction of data. As we saw above, deleting a repository or fork does not mean your commit data is actually deleted.
We have a few takeaways from this:
As long as one fork exists, any commit to that repository network (i.e. commits on the “upstream” repo or “downstream” forks) will exist forever. This further cements our view that the only way to securely remediate a leaked key on a public GitHub repository is through key rotation. We’ve spent a lot of time documenting how to rotate keys for the most popularly leaked secret types - check our work out here: howtorotate.com.
GitHub’s repository architecture necessitates these design flaws and unfortunately, the vast majority of GitHub users will never understand how a repository network actually works and will be less secure because of it.
As secret scanning evolves, and we can hopefully scan all commits in a repository network, we’ll be alerting on secrets that might not be our own (i.e. they might belong to someone who forked a repository). This will require more diligent triaging. And while these three scenarios are shocking, they don’t even cover all of the ways GitHub could be storing deleted data from your repositories. Check out our recent post (and related TruffleHog update) about how you also need to scan for secrets in deleted branches.
Finally, while our research focused on GitHub, it’s important to note that some of these issues exist on other version control system products.
...
Read the original on trufflesecurity.com »
Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology.
We’ve made great progress building AI systems that help mathematicians discover new insights, novel algorithms and answers to open problems. But current AI systems still struggle with solving general math problems because of limitations in reasoning skills and training data.
Today, we present AlphaProof, a new reinforcement-learning based system for formal math reasoning, and AlphaGeometry 2, an improved version of our geometry-solving system. Together, these systems solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving the same level as a silver medalist in the competition for the first time.
The IMO is the oldest, largest and most prestigious competition for young mathematicians, held annually since 1959.
Each year, elite pre-college mathematicians train, sometimes for thousands of hours, to solve six exceptionally difficult problems in algebra, combinatorics, geometry and number theory. Many of the winners of the Fields Medal, one of the highest honors for mathematicians, have represented their country at the IMO.
More recently, the annual IMO competition has also become widely recognised as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities.
This year, we applied our combined AI system to the competition problems, provided by the IMO organizers. Our solutions were scored according to the IMO’s point-awarding rules by prominent mathematicians Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.
The fact that the program can come up with a non-obvious construction like this is very impressive, and well beyond what I thought was state of the art.
First, the problems were manually translated into formal mathematical language for our systems to understand. In the official competition, students submit answers in two sessions of 4.5 hours each. Our systems solved one problem within minutes and took up to three days to solve the others.
AlphaProof solved two algebra problems and one number theory problem by determining the answer and proving it was correct. This included the hardest problem in the competition, solved by only five contestants at this year’s IMO. AlphaGeometry 2 proved the geometry problem, while the two combinatorics problems remained unsolved.
Each of the six problems can earn seven points, with a total maximum of 42. Our system achieved a final score of 28 points, earning a perfect score on each problem solved — equivalent to the top end of the silver-medal category. This year, the gold-medal threshold starts at 29 points, and was achieved by 58 of 609 contestants at the official competition.
Graph showing performance of our AI system relative to human competitors at IMO 2024. We earned 28 out of 42 total points, achieving the same level as a silver medalist in the competition.
AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.
Formal languages offer the critical advantage that proofs involving mathematical reasoning can be formally verified for correctness. Their use in machine learning has, however, previously been constrained by the very limited amount of human-written data available.
In contrast, natural language based approaches can hallucinate plausible but incorrect intermediate reasoning steps and solutions, despite having access to orders of magnitudes more data. We established a bridge between these two complementary spheres by fine-tuning a Gemini model to automatically translate natural language problem statements into formal statements, creating a large library of formal problems of varying difficulty.
When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching over possible proof steps in Lean. Each proof that was found and verified is used to reinforce AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems.
We trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.
Process infographic of AlphaProof’s reinforcement learning training loop: Around one million informal math problems are translated into a formal math language by a formalizer network. Then a solver network searches for proofs or disproofs of the problems, progressively training itself via the AlphaZero algorithm to solve more challenging problems.
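The loop described above can be caricatured in a few lines. This toy sketch is emphatically not AlphaProof: the "statements" are just integers to be written as a sum of two squares, an arithmetic check stands in for the Lean verifier, and "reinforcement" is a sampling bias toward components of previously verified witnesses. It only illustrates the generate, verify, reinforce cycle:

```python
import random

def verify(target, witness):
    # Stand-in for a formal checker such as Lean: a "proof" that target
    # is a sum of two squares is simply the pair of roots itself.
    a, b = witness
    return a * a + b * b == target

def training_loop(targets, steps=5000, seed=0):
    """Toy generate-verify-reinforce cycle (not the real algorithm)."""
    rng = random.Random(seed)
    learned = []   # components of verified witnesses: the "training data"
    solved = {}

    def guess():
        # Candidate generation, biased toward previously verified parts.
        if learned and rng.random() < 0.5:
            return rng.choice(learned)
        return rng.randrange(11)

    for _ in range(steps):
        target = rng.choice(targets)
        if target in solved:
            continue
        witness = (guess(), guess())
        if verify(target, witness):       # only checked proofs are kept
            solved[target] = witness
            learned.extend(witness)       # "reinforce" on verified data
    return solved

solved = training_loop([10, 25, 100])
print(sorted(solved))
```

The one property the sketch preserves is that only formally verified solutions ever feed back into training, which is what lets a system built this way avoid reinforcing plausible but incorrect reasoning.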
AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratio or distances.
AlphaGeometry 2 employs a symbolic engine that is two orders of magnitude faster than its predecessor. When presented with a new problem, a novel knowledge-sharing mechanism is used to enable advanced combinations of different search trees to tackle more complex problems.
Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds after receiving its formalization.
Illustration of Problem 4, which asks to prove the sum of ∠KIL and ∠XPY equals 180°. AlphaGeometry 2 proposed to construct E, a point on the line BI so that ∠AEB = 90°. Point E helps give purpose to the midpoint L of AB, creating many pairs of similar triangles such as ABE ~ YBI and ALE ~ IPC needed to prove the conclusion.
As part of our IMO work, we also experimented with a natural language reasoning system, built upon Gemini and our latest research to enable advanced problem-solving skills. This system doesn’t require the problems to be translated into a formal language and could be combined with other AI systems. We also tested this approach on this year’s IMO problems and the results showed great promise.
Our teams are continuing to explore multiple AI approaches for advancing mathematical reasoning and plan to release more technical details on AlphaProof soon.
We’re excited for a future in which mathematicians work with AI tools to explore hypotheses, try bold new approaches to solving long-standing problems and quickly complete time-consuming elements of proofs — and where AI systems like Gemini become more capable at math and broader reasoning.
We thank the International Mathematical Olympiad organization for their support. AlphaProof development was led by Thomas Hubert, Rishi Mehta and Laurent Sartran; AlphaGeometry 2 and natural language reasoning efforts were led by Thang Luong.
AlphaProof was developed with key contributions from Hussain Masoom, Aja Huang, Miklós Z. Horváth, Tom Zahavy, Vivek Veeriah, Eric Wieser, Jessica Yung, Lei Yu, Yannick Schroecker, Julian Schrittwieser, Ottavia Bertolli, Borja Ibarz, Edward Lockhart, Edward Hughes, Mark Rowland, Grace Margand. Alex Davies and Daniel Zheng led the development of informal systems such as final answer determination, with key contributions from Iuliya Beloshapka, Ingrid von Glehn, Yin Li, Fabian Pedregosa, Ameya Velingker and Goran Žužić. Oliver Nash, Bhavik Mehta, Paul Lezeau, Salvatore Mercuri, Lawrence Wu, Calle Soenne, Thomas Murrills, Luigi Massacci and Andrew Yang advised and contributed as Lean experts. Past contributors include Amol Mandhane, Tom Eccles, Eser Aygün, Zhitao Gong, Richard Evans, Soňa Mokrá, Amin Barekatain, Wendy Shang, Hannah Openshaw, Felix Gimeno. This work was advised by David Silver and Pushmeet Kohli.
The development of AlphaGeometry 2 was led by Trieu Trinh and Yuri Chervonyi, with key contributions by Mirek Olšák, Xiaomeng Yang, Hoang Nguyen, Junehyuk Jung, Dawsen Hwang and Marcelo Menegali. The development of the natural language reasoning system was led by Golnaz Ghiasi, Garrett Bingham, YaGuang Li, with key contributions by Swaroop Mishra, Nigamaa Nayakanti, Sidharth Mudgal, Qijun Tan, Junehyuk Jung, Hoang Nguyen, Alex Zhai, Dawsen Hwang, Mingyang Deng, Clara Huiyi Hu, Jarrod Kahn, Maciej Kula, Cosmo Du. Both AlphaGeometry and natural language reasoning systems were advised by Quoc Le. David Silver, Quoc Le, Demis Hassabis, and Pushmeet Kohli coordinated and managed the overall project.
We’d also like to thank Insuk Seo, Evan Chen, Zigmars Rasscevskis, Kari Ragnarsson, Junhwi Bae, Jeonghyun Ahn, Jimin Kim, Hung Pham, Nguyen Nguyen, Son Pham, and Pasin Manurangsi, who helped evaluate the quality of our language reasoning system; Jeff Stanway, Jessica Lo, Erica Moreira and Kareem Ayoub for their support for compute provision and management; Prof Gregor Dolinar and Dr Geoff Smith MBE from the IMO Board, for the support and collaboration; and Tu Vu, Hanzhao Lin, Chenkai Kuang, Vikas Verma, Yifeng Lu, Vihan Jain, Henryk Michalewski, Xavier Garcia, Arjun Kar, Lampros Lamprou, Kaushal Patel, Ilya Tolstikhen, Olivier Bousquet, Anton Tsitsulin, Dustin Zelle, CJ Carey, Sam Blackwell, Abhi Rao, Vahab Mirrokni, Behnam Neyshabur, Ethan Dyer, Keith Rush, Moritz Firsching, Dan Shved, Ihar Bury, Divyanshu Ranjan, Hadi Hashemi, Alexei Bendebury, Soheil Hassas Yeganeh, Shibl Mourad, Simon Schmitt, Satinder Baveja, Chris Dyer, Jacob Austin, Wenda Li, Heng-tze Cheng, Ed Chi, Koray Kavukcuoglu, Oriol Vinyals, Jeff Dean and Sergey Brin for their support and advice.
Finally, we’d like to thank the many contributors to the Lean and Mathlib projects, without whom AlphaProof wouldn’t have been possible.
...
Read the original on deepmind.google »
...
Read the original on github.com »
There are 47 millionaires working for Central States Manufacturing, and they’re not all in the C-Suite. Many of them are drivers or machinists—blue-collar workers for the company.
How? The company is owned by its employees. Every worker gets a salary but also a percentage of their salary in stock ownership. When the company does well, so do the employees—all of them, not just the ones at the top.
And the company is doing well. “When we sat down eight years ago, we said we want to be a billion-dollar company and have 1,500 people, we are on track to be both of those this year,” Tim Ruger, president of Central States, tells me.
That’s right, this manufacturing company will become a unicorn this year—one of only 6,000 companies in the world earning more than $1 billion in revenue. But unlike Walmart, Amazon, and Apple, it’s not just the executives getting paid out.
“It’s not like 80 percent of the company is owned by management and the rest is owned by employees, it’s really well spread across all functions,” Ruger tells me. “We’ve got a number of people that have been here 15, 20 years and they have $1 million plus balances, which is really cool for a person that came out of high school and runs our rollformer. You can’t do that everywhere.”
He’s right, and because you can’t do that everywhere there is a huge wealth disparity in America. Even though the economy has been on an upward trajectory for a century, the wealth it generates has funneled to the much smaller population that owns it. After a 1990s bill led executives to be paid in stock options while the rest of their employees earned static salaries, executive pay skyrocketed with the market while workers’ pay stagnated.
If employees had also owned part of the company, their pay would have skyrocketed with the market too, but they didn’t. “It’s hard to build true wealth for yourself if you don’t have some type of ownership in something, and it’s hard for most people to get ownership in something,” Ruger says.
Upping the minimum wage won’t fix that. As Nathan Schneider says in his book Everything for Everyone: “One way or another, wealth is going to the owners—of where we live, where we work, and what we consume.”
So why not make workers the owners?
There is a growing movement to do just that. Central States is one of 6,533 companies that have formed an Employee Stock Ownership Plan (or ESOP) in the United States, and that number is growing by about 250 companies annually. That’s 14.7 million employees who have ownership in companies worth, collectively, $2.1 trillion.
Every year, those employees get a percentage of their salaries in company stock. During Central States’ worst year, employees earned the equivalent of 6 percent of their pay in stock, during their best they earned 26 percent. Last year, an employee earning $100,000 a year received $26,000 worth of stock in their account. As the company has grown, the value of that stock has averaged 20 percent returns annually, outperforming the stock market.
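To make the compounding concrete, here is a back-of-the-envelope sketch in JavaScript. The 26 percent contribution and 20 percent return are the best-case figures quoted above, not guarantees, and the helper name is mine:

```javascript
// Illustrative only: each year the account grows, then receives a
// stock contribution equal to a share of salary.
function esopBalance(salary, contribRate, growthRate, years) {
  let balance = 0;
  for (let i = 0; i < years; i++) {
    balance = balance * (1 + growthRate) + salary * contribRate;
  }
  return balance;
}

// $100,000 salary, 26% contribution, 20% average annual growth:
const after20 = esopBalance(100_000, 0.26, 0.2, 20);
// after 20 years the balance is on the order of $5 million
```

Even at the worst-case 6 percent contribution, two decades of compounding produces a meaningful balance, which is the mechanism behind those millionaire machinists.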
Just like Jeff Bezos can sell a portion of his Amazon stock to buy a new house, employees at ESOPs can pull money out of their stock accounts to pay for tuition, medical bills, or as a downpayment on a primary residence.
“We have several in production and drivers who have been here for over 20 years that have multi-million dollar accounts,” Chad Ware, Central States’ CFO tells me. “We’ve had several folks take out enough money to buy a home outright.”
An ESOP account functions a lot like a second 401(k), but invested solely in the company. Employees can pull money out whenever they’d like, but outside of those approved uses they will have to pay taxes and an early-withdrawal penalty to remove the funds before the age of 59½. After they leave the company or retire, the complete balance of their accounts will be paid out to them over a six-year period.
This means the company needs to have that cash on hand to pay out, and this has to be budgeted into their annual cash flow. But it also means the employees are incentivized to participate in the wellbeing of the company.
On the stock market, executives are expected to produce quarterly results, often to the detriment of their companies’ long-term success. After Boeing famously rushed the rollout of its 737 MAX aircraft to meet quarterly expectations, two fatal crashes killed 346 people and cost the company $20 billion. Companies like Wells Fargo, Sears, and Bausch Health have similarly cut corners to inflate short-term results at the expense of their long-term health.
But an employee at Central States doesn’t care about one good quarter; they care about a good 10 years, and a good 50. If the company’s products turn out to be inferior next year, their stock in the company will tank; if the company goes bankrupt in 20 years, it will go to zero. It’s in their best interest to act in the long-term interest of the company, and to grow it sustainably rather than quickly.
“One of our CEOs likes to say that these companies are not looking to hit home runs, they’re looking to hit singles and doubles on a regular basis,” Noelle Montaño, executive director for Employee-owned S Corporations of America (or ESCA) tells me. “When the C-Suite goes into work every day, they see the receptionist, the person on the factory floor, the guy who’s building the building or digging the ditches, and they know that those are the shareholders they are responsible for. They take that seriously.”
Every month, Central States executives share the company’s P&L with employees, and every year they share the financials of the business at annual shareholders meetings, where employee-owners can participate in discussions about the future of the company. After a new plant was struggling with sales in its second year, one employee-owner raised his hand at a shareholder meeting and said, “This isn’t helping our company, and it’s not helping my share price, can we discuss the closure of this plant?”
“It was a great conversation, I love the fact that it’s not somebody else’s problem,” Ruger says. “They’re thinking like business owners, which is what you want, right?”
As a result, ESOPs are generally healthier companies. “These companies do better at employee retention, they do better at retirement benefits, they default less often on loans,” Montaño says. “Our companies did better during the Great Recession, they did better during Covid.”
There are serious benefits to the company for operating this way. ESOPs are a viable alternative to unions—there is no rift between the owners and the workers, workers are the owners! They are also exempt from paying income tax—though they tend to spend those dollars on their employees instead.
“It helped us grow when we were smaller. Now that we’re larger, what we’re paying out each year to our employee-owners is probably more than we would pay in tax, quite honestly,” Ruger says. “But if I have to choose who we pay our money to, I’d rather pay employee owners than give it back to the government. I think it’s probably the right way.”
He brings up a good point. I’ve mentioned before that I do not think the answer to our wealth disparity is to “tax the rich.” Don’t take Jeff Bezos’ money and give it to the government; better to distribute Amazon’s earnings among its employees, not just its founder.
“I think our taxes are way too burdening and we don’t do a good job using the money. I wouldn’t mind paying more if we were using it well, I just don’t know if we are,” Ruger says. “Why redistribute the wealth after it’s already been earned, why can’t we earn it beforehand? It naturally levels out the haves and have-nots.”
It’s worth noting that the “haves” still benefit from this equation.
“Most ESOP companies start because the founder wants to exit or cash out, but they don’t want to sell to a private equity firm that will run their company into the ground or slash and burn employee headcount,” Ware says. “A lot of owners built a company and were in the trenches with the people beside them. They want to take care of them, but they also want to cash out. A good option is to set up an ESOP, and that’s exactly how Central States got started.”
Carl Carpenter founded Central States in 1988, but sold it to his employees when he retired in 1991. More specifically: He sold a portion of the shares of his company to an ESOP trust, which holds the company’s shares on behalf of the employees. In 2011, the company bought the remaining shares and became 100% employee-owned. Carpenter sailed into the sunset with a nice retirement package even as he allowed his employees to start building their own, and I don’t see why every founder shouldn’t do the same.
“There are real benefits for an owner turning the company into an ESOP,” Ruger says. “They personally benefit from the sale when they exit the business. Additionally, there are some real tax benefits to turn it into an ESOP—they pay a whole lot less taxes when they sell the company.”
The only reason they don’t do it more often is because they don’t know about it. “The number one issue is education,” Montaño says. “If you’re looking to sell your business and you go to your accountant or lawyer, they may not say ‘Have you thought about an ESOP?’”
It’s also not a quick process—founders interested in selling to their employees need to plan ahead. A feasibility study must be conducted to ensure the viability of an ESOP plan, and an independent valuation of the company must determine the fair market value of the shares. A trust needs to be defined and structured, and a trustee appointed to oversee it on behalf of the employees. “If an owner just wants to get out, an ESOP is not for them,” Montaño says.
That might be changing: ESOPs have bipartisan support in Congress, and moves have been made to improve education about ESOPs and make the transition easier for founders. “We have support from Members of Congress across the political spectrum…. It’s capitalism at its best,” Montaño says. “A year and a half ago, there was legislation mandating the Department of Labor to open an Office of Employee Ownership, and they’ve taken a more robust interest in ESOP companies and recently appointed someone from the employee ownership community to this important role.”
In 2022, only 34 percent of families in the bottom half of income distribution held stocks, while 78 percent of families in the upper-middle-income group did—95 percent of families in the top one percent held stocks. But employee ownership changes that equation. As the process becomes easier and education about ESOPs grows, more and more founders will exit by selling their companies to their employees, and the result is that more and more of the wealth will be owned by everyone who works, not just the person they work for.
As the stock market gets richer, so will we all, and that’s a future I’m excited about working toward.
“I’d love to see it more and more and more,” Ruger says. “It’s really generating wealth for people, I’m convinced we’re going to change generations.”
This is a continuation of my capitalism series which is figuring out how capitalism can work better for everyone while serving as research for my utopian novel. I hope you’ll join us in the comments for further discussion!
P. S. If you enjoyed this post, please share it or recommend my work to your subscribers! That’s how I meet new people and earn a living as a writer. Thank you so much for your support.
...
Read the original on www.elysian.press »
...
Read the original on bitbuilt.net »
So you think you know box shadows?
Four years ago I found out my M1 can render a stupid number of these bad boys, so I set out to see just how far you can push them. And boy did I. If you are looking for a how-to on using box shadows to get the look of the latest UX trend, this is not the right article for you. But if you like some janky creativity, stay tuned.
I want to share some of the worst possible things one can do with box shadows all on a single div. Things which shouldn’t work at all yet somehow they do. But before getting into that, a question must be answered.
What exactly is a box shadow?
A box shadow is a kind of drop shadow. And a drop shadow is a kind of image filter which is particularly popular in graphic design due to how versatile it is at adding an approximation of depth to a composition.
The filter takes the raster of an image and shifts the pixels along the x and y axes. It will draw these pixels as a single color behind the source image. This gives an illusion of depth by dropping the outline of an image as a “shadow” into the composition, hence the name drop shadow.
We can use the css filter property to see this in action.
There are many different implementations of a drop shadow filter across different tools like photoshop, gimp, figma, and css each having a different set of features. For example css also supports an optional blur value to apply to the drop shadow.
By layering several drop filters one can easily add interesting depth to a composition.
For example, here are 2 layered drop-shadow filters.
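In code, layering means chaining multiple drop-shadow() functions in a single filter value. This little helper (hypothetical, not from the article’s source) builds such a string:

```javascript
// Build a CSS filter value from a list of drop shadows.
// CSS syntax: drop-shadow(offset-x offset-y blur color); blur is optional.
function dropShadows(shadows) {
  return shadows
    .map(({ x, y, blur = 0, color }) =>
      `drop-shadow(${x}px ${y}px ${blur}px ${color})`)
    .join(" ");
}

const filter = dropShadows([
  { x: 8, y: 8, color: "#e0218a" },
  { x: 16, y: 16, blur: 4, color: "rgba(0,0,0,0.4)" },
]);
// filter === "drop-shadow(8px 8px 0px #e0218a) drop-shadow(16px 16px 4px rgba(0,0,0,0.4))"
// Apply with: element.style.filter = filter;
```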
A box shadow is a form of drop filter with many trade offs. First, the name “Box” has to do with the filter only supporting box shapes. For example, lets try applying it to the previous example.
Notice that the shadow’s shape is limited to the bounding box of the container, yet the shadow itself can be drawn outside that bounding box. This seems limiting, but it comes with a few more features, one of which is performance.
It turns out that the majority of user interfaces are made up of boxes. It also turns out that some smart people figured out maths hacks to draw rounded boxes for super cheap which UI peeps love because with this hack boxes can be so round as to appear as circles. And the css box shadow implementation supports this math hack.
This means that designers can be liberal with box shadows rather than relying on prerendered source images bloating download sizes.
This little mixer shows the variety of shapes available. Tap to randomize the color.
This opens up all kinds of freedom for UI design. Layering these together can produce amazing results. You can play around with a border editor here.
Layering. That is an important word. You can layer or chain many box shadows together on a single div. The above example uses this to set the colors.
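Concretely, layering looks like a comma-separated list in the box-shadow property. A sketch of a helper (names mine) that builds one:

```javascript
// box-shadow takes a comma-separated list; each entry is
// "offset-x offset-y blur spread color". Earlier entries paint on top.
function boxShadows(shadows) {
  return shadows
    .map(({ x, y, blur = 0, spread = 0, color }) =>
      `${x}px ${y}px ${blur}px ${spread}px ${color}`)
    .join(", ");
}

// Two concentric rings around a rounded div:
const layered = boxShadows([
  { x: 0, y: 0, spread: 10, color: "tomato" },
  { x: 0, y: 0, spread: 20, color: "gold" },
]);
// layered === "0px 0px 0px 10px tomato, 0px 0px 0px 20px gold"
```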
How not to use box shadows
Usually, a designer will carefully position squares within other squares with consistent margins, paddings, and typography for optimal accessibility and understandability. Wisely, they further add layered shadows and perhaps a few images to help visually distinguish widget interaction and state.
That is all well and good but what we are really working with is a kind of painting api. We can paint an arbitrary number of squares to the screen optionally applying a blur to them.
I initially explored this with some minimal art in an earlier write-up.
The config that drives this is pretty simple.
Now the natural question I am sure you have and I certainly had was, “can we do more box shadows?” What about blurring or transparency? How do they impact performance?
I whipped up a little visual tool where a giant box shadow is created and set on a div like so.
Animation is handled by setting the box shadow string every 300ms and then letting the transition: all property do the animation. This causes some jank and ended up being slower than setting the box shadow on every frame.
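A sketch of the per-frame approach instead (helper names are mine): rebuild the whole string every animation frame and assign it directly, with no transition involved.

```javascript
// Turn the simulation state into one giant box-shadow string.
function shadowFrame(circles) {
  return circles
    .map((c) => `${c.x}px ${c.y}px 0 ${c.r}px ${c.color}`)
    .join(", ");
}

// Drive it from requestAnimationFrame rather than transition: all.
function loop(el, circles, step) {
  step(circles); // advance the simulation
  el.style.boxShadow = shadowFrame(circles);
  requestAnimationFrame(() => loop(el, circles, step));
}
```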
The result is an app where you can tap to remix a color palette with a history of the last 10 palettes to the left. Here is an example with 100 box shadows. Tap around.
I noticed that applying a blur slowed down the number you could animate which makes sense. However, using a transparent color significantly slowed down the number that can be drawn too which doesn’t make as much sense to me. I’d imagine that with hardware today transparency should be somewhat free. The div size also impacts performance which makes me think there is some software rasterizer involved when things are animated. I could look into the source code of browsers but it would be different depending on the js engine.
However, I found that if I didn’t set any transparency or blur, my m1 laptop could draw buckets of box shadows. Like thousands of them.
How to not use box shadows
Ok, many box shadows can be drawn. Now what?
Well we cannot rotate the box shadows but they can be circles and a circle kinda looks like a ball. So what if I made a bunch of balls that could bounce around? And maybe I can “fake” a 3d effect by scaling the size based on a z value. This wouldn’t be accurate perspective but would add some 3d depth.
This one is pretty simple. Just a big ol’ “gamestate” updated in a requestAnimationFrame, with a giant box shadow string then set on a div. You can touch somewhere to pull the balls towards you. The balls are contained to a box and will bounce to stay in frame.
Updating the simulation isn’t complicated, but for the sake of brevity I will use a bit of pseudocode.
const update: GameFunction = (state) => {
  for (const ball of state.balls) {
    updateBall(ball);    // integrate velocity into position
    containBall(ball);   // bounce off the container walls
    addFriction(ball);   // damp velocity over time
    if (touched) pullToPoint(ball, touchX, touchY);
  }
};
Now rendering is the interesting part. What is going to run 60 times a second is the following.
Sort the balls based on z index and fill an array of box shadows. The size calculation is based on wanting to have x, y, z represent the center of a ball with a radius of size. The z scale is a hack to get some z “depth”, where the size is scaled based on a fixed ratio.
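Based on the sizing math described here (and the rendering snippet shown later in the article), one ball’s shadow entry can be sketched like this; the 40 in the depth ratio is the article’s constant, the rest of the names are mine:

```javascript
// Fake 3d: scale the ball's size by z, then emit one box-shadow entry.
// containerSize is the size of the source div; the spread radius is
// adjusted so that x, y, z stay the center of a ball with radius `size`.
function ballShadow(ball, containerSize) {
  const zScale = 1 + ball.z / 40; // fixed depth ratio
  const size = ball.size * zScale;
  const spread = (size - containerSize) / 2;
  const center = containerSize / 2;
  return `${ball.x + center}px ${ball.y + center}px 0 ${spread}px ${ball.color}`;
}

const containerSize = 4;
const balls = [
  { x: 30, y: 40, z: 0, size: 10, color: "blue" },
  { x: 10, y: 20, z: 40, size: 10, color: "red" },
];
// Nearest (largest z) first, since earlier box-shadow entries paint on top.
const shadows = balls
  .slice()
  .sort((a, b) => b.z - a.z)
  .map((b) => ballShadow(b, containerSize))
  .join(", ");
// shadows === "12px 22px 0 8px red, 32px 42px 0 3px blue"
```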
Here are 50 balls. Drag em around and make em bounce on the sides.
The 3d scaling works pretty well to give a little bit of depth even if it is total bs. You can notice that when a ball gets close to the “camera” at a certain point it is no longer a circle. This is because the box shadow div is too small for the scaling method. Increasing the container size fixes this but a larger container means slower performance.
Let’s see what happens if the balls can bounce off each other with a good old-fashioned n^2 collision check. Now, I am only going to reflect the balls’ velocities on a collision, which is inaccurate but simple. This is not simulating any real physics interaction. I will also fix the z position to make it 2d so it is easier to see what is happening.
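A minimal version of that n^2 check (my own sketch of what the article describes): compare every pair, and on overlap just reflect the velocities, no impulse math.

```javascript
// Naive O(n^2) pairwise collision check in 2d. On overlap, each ball's
// velocity is simply reflected; inaccurate but simple, as noted above.
function collide(balls) {
  for (let i = 0; i < balls.length; i++) {
    for (let j = i + 1; j < balls.length; j++) {
      const a = balls[i], b = balls[j];
      const dx = b.x - a.x, dy = b.y - a.y;
      const minDist = a.r + b.r;
      if (dx * dx + dy * dy < minDist * minDist) {
        a.vx = -a.vx; a.vy = -a.vy;
        b.vx = -b.vx; b.vy = -b.vy;
      }
    }
  }
}
```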
Not very interesting. I think more accurate physics would look nicer, but maybe another time. Adding a phone gyro as input to this could be fun too, but again, maybe another time.
I reproduced another setup where the balls always try to find their way home to a random starting position. The force of a touch is enough to pull them away, however. This gives an effect almost like a sponge where you can pull bits off. I can think of ways this could be used for foam spray in a fake fluid sim as part of a game or something. Kind of fun.
I noticed that the fake 3d really comes out in the above example as the balls slowly travel back home. How could the 3d aspect be taken further? Maybe I could draw point clouds with the box shadows as points? I could project points on different surfaces and then draw the points like some godawful 3d renderer.
I thought a good starting point would be to simply map pixels from a picture as points on a 2d plane. This would also be a good stress test to find out what the upper limit is on number of realtime simulated box shadows. Here is the mapping function.
const pixels = await getImagePixels(
  "/images/starry_night_full.jpg" as any,
  width
);
const dx = window.innerWidth / pixels[0].length;
const dy = window.innerHeight / pixels.length;
for (let y = 0; y < pixels.length; y++) {
  for (let x = 0; x < pixels[0].length; x++) {
    const px = x * dx + dx / 2,
      py = y * dy + dy / 2,
      pz = 60 + Math.random() * 3;
    state.particles.push({
      size: pSize,
      x: px,
      y: py,
      z: pz,
      ox: px,
      oy: py,
      oz: pz,
      dx: Math.random() * 3,
      dy: Math.random() * 3,
      dz: Math.random() * 3,
      color: pixels[y][x],
    });
  }
}
I started with one of my all time favorite paintings ever. Tap around.
Depending on your device, this demo may be melting it as it is rendering several thousand box shadows in simulated 3d space. You can drag around to kinda explode the image up. This is taking the previous examples and setting the starting positions and colors based on an image.
I am going to bump up the count and rotate the camera. I will record this one to save some battery life. If you want to burn your battery give a live version a try here. You have been warned.
This is promising. I personally love how this has an almost painted style due to the circular pixels looking kinda splatted on. Here is another example with an increased count and some interaction.
You can see it is chugging at this scale. For reference, this is somewhere in the ballpark of 12,000 box shadows. I mean, damn. I wonder if perhaps it is so fast because the M1 has shared GPU and CPU memory? My desktop certainly cannot push this many box shadows, and neither can my iPhone or old Android. Crazy result tho.
What about projecting the points uniformly onto the surface of a mesh?
It turns out with a bit of math it totally works. Here is a cube using a formula with uniform point distribution.
You can tap to interact with it. Kinda reminds me of jello. I also added a small light which follows the mouse position. This adds a bit more depth. The light is not accurate at all, with magic constants left and right, but what is programming without a sprinkling of magic? 😊
.map(([p, coord]) => {
  const zIndexScale = 1 + coord.z / 40;
  const size = p.size * zIndexScale;
  const halfSize = (size - state.renderContainerSize) / 2;
  const hcs = state.renderContainerSize / 2;
  const lightDist = Math.sqrt(dist(coord, lightPos));
  const intensity = (1 - lightDist / 900) * 1; // I have no idea what i was doing here.
  const lumen = Math.min(2, (1 / lightDist ** 2) * 60000);
  return [
    coord.x + hcs,
    "px ",
    coord.y + hcs,
    "px 0 ",
    halfSize,
    "px ",
    darkenHexColor(p.color, lumen),
  ].join("");
})
I used gypity to give me a function to do the cube particle mapping among a few other math helpers. Sometimes gypity-g would work but sometimes it wouldn’t and I have to stop being lazy. More on this later…
The first function gypity gave was a random distribution, which I didn’t want. I wanted a uniform placement of points across the surface of a uniformly sized cube. It was able to get this on the second try.
const halfSideLength = Math.floor(cubeSideLength / 2);
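For illustration, one way to get an even spread over a cube’s surface (my own sketch, not the generated code the article used): walk a grid over each of the six faces of a cube centered at the origin.

```javascript
// Points in an even grid across all six faces of a cube centered at the
// origin. `side` is the edge length, `step` the grid spacing. Points on
// edges get duplicated, which is fine for a particle toy.
function cubeSurfacePoints(side, step) {
  const h = side / 2;
  const points = [];
  for (let u = -h; u <= h; u += step) {
    for (let v = -h; v <= h; v += step) {
      points.push(
        { x: u, y: v, z: h }, { x: u, y: v, z: -h },  // front / back
        { x: u, y: h, z: v }, { x: u, y: -h, z: v },  // top / bottom
        { x: h, y: u, z: v }, { x: -h, y: u, z: v },  // right / left
      );
    }
  }
  return points;
}
```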
...
Read the original on dgerrells.com »
This is a very short book about copying. Its contents, unless otherwise noted, are licensed under CC BY-SA 4.0 (more on that in a bit). You can download, copy, remix, excerpt, change, and repost it however you see fit.
Charles Eames said it best: “We don’t do ‘art’ — we solve problems.”
To buy furniture in 1950, you had to choose between affordable and enduring, between rugged and fashionable. Charles and Ray designed a chair that was all the above and sold it for $20.95.
They called it the LCW.
The LCW embodies the Eames’ obsession with simplicity in material and method. “We want to make the best for the most for the least,” they said.
The design was revolutionary: in 1999, Time magazine called the LCW “the best design of the century.”
Today, you can buy a brand new LCW from Herman Miller (the officially licensed manufacturer of Eames products) for $1,195.
Or, you can buy a chair called the “Fathom” from a company called Modway for $145.
Functionally and aesthetically, the chairs are identical.
There’s an LCW from 1946 in MOMA’s collection. It’s one of the very first ever made. Most people would call it the original LCW.
Charles and Ray Eames sold the manufacturing rights for their furniture to Herman Miller in 1947. Collectors call the LCWs made in the ’40s and ’50s “originals.” But in some sense, these — and the more recently manufactured Herman Miller versions — are copies of that LCW in the MOMA collection.
And then there’s the Modway Fathom. It’s clearly a copy, an unlicensed one at that. But at $145 (the equivalent of $12.78 in 1947) it’s more affordable than the LCW was when it was first manufactured and sold. In spirit, it’s more of an original than any LCW: the best, for the most, for the least.
I’m sharing this story because it demonstrates a surprising fact: what makes something “original” (the first, the best, the most famous, the most true) or a “copy” (an identical copy, an unauthorized replica, an interpretation or a remix) isn’t always obvious — or important.
I’m a designer. As a designer, I feel the need to be original. If you’re a designer, or even if you’re just interested in design, you probably feel the need to be original, too. We tend to worship inventors and originators, designers who were trailblazing and innovative. And we copy them.
This oxymoron of a craft can drive a person crazy. There’s lots of space between originality and industry, authorship and acknowledgement, riffing and ripping. I wrote this very short book to explore that space.
Some people have been frustrated by copying, refused to accept it, and struggled with every ounce of their strength against it. Other people have used copying to their advantage, whether to improve themselves, build a community, or subvert authority.
I’ve only been able to have a career in design because I copied.
I hope that by the time you’ve finished reading, you’ll see how important copying is. Right or wrong, virtue or vice, copying is the way design works.
Steve Jobs copied.
“Great artists steal,” he said, quoting Pablo Picasso (or was it Stravinsky? T. S. Eliot?). Jobs and Apple copied many designs in their early days, most notably from a Xerox research laboratory in Palo Alto. The story goes like this:
In the early 20th century, Xerox was a pioneer of office technology. By the middle of the century, computers were getting smaller and more affordable, and Xerox knew they’d have to work hard to keep their market dominance. In 1970, The Xerox Palo Alto Research Center — Xerox PARC — was founded to explore the future of the “paperless office.”
Within two years, Xerox PARC had designed a groundbreaking computer called the Alto. One of its innovations was a graphical user interface: programs and files were displayed in virtual windows which users navigated using a mouse. It was an eerily accurate picture of what personal computers would look like 30 years later.
Jef Raskin, leader of the Macintosh project at Apple, had seen Xerox’s work. He wanted Steve Jobs to see it for himself, and set up a meeting.
“I thought it was the best thing I’d ever seen in my life,” Jobs said of the Alto’s user interface. “Within ten minutes it was obvious to me that all computers would work like this some day.”
When the Macintosh was released in 1984, it featured a graphical user interface. Programs and files were displayed in virtual windows which users navigated using a mouse.
It was just like the Alto.
Steve Jobs didn’t like to be copied.
In 1985, a year after the Macintosh was launched, Apple sued a company called Digital Research Interactive for copying the Macintosh’s user interface. Digital Research settled out of court, and changed the appearance of its icons, windows, and mouse pointers.
In 1990, Apple sued both Microsoft and Hewlett-Packard. The case was a repeat: Microsoft’s Windows and HP’s NewWave featured designs that Apple claimed were copies of the Macintosh’s operating system. But early licensing agreements between Apple and Microsoft made it unclear if any infringement took place; the case was thrown out.
In the middle of Apple’s case against Microsoft, Xerox sued Apple, hoping to establish its rights as the inventor of the desktop interface. The court threw out this case, too, and questioned why Xerox took so long to raise the issue.
Bill Gates later reflected on these cases: “we both had this rich neighbor named Xerox … I broke into his house to steal the TV set and found out that [Jobs] had already stolen it.”
The rampant copying fueling the explosive growth of consumer computers meant that by 1990, the desktop user interface was ubiquitous; it was impossible to determine who originated any part of it, or who copied who. The quest to stake their claim nearly consumed Apple. But when they emerged, they had learned a thing or two. Today, Apple holds more than 2,300 design patents.
This story ends in 2011, with Apple suing Samsung for copying the design of its software and hardware products. One of the most remarkable claims: Samsung broke the law when it sold “a rectangular product with four evenly rounded corners.”
The court rejected Apple’s claim to own rounded rectangles. But it upheld the other claims, fining Samsung a blistering $539 million for patent violations.
Designers copy. We steal like great artists. But when we see a copy of our work, we’re livid. Jobs, on Google’s Android: “I will spend my last dying breath if I need to, and I will spend every penny of Apple’s $40 billion in the bank, to right this wrong. I’m going to destroy Android, because it’s a stolen product.”
Steve Jobs was unmatched in his visionary dedication to innovation. But he never came to terms with the inevitability of copying.
John Carmack had a different relationship with copying. For him, copying was a way to learn, a challenge to overcome, and a source of new ideas.
Carmack was — still is — a brilliant coder. He’s best known for programming the ultraviolent and action-packed first-person shooters Doom and Quake. Those games pushed the limits of consumer computers and defined a genre. But his first real breakthrough game was simpler, cuter, more whimsical. It was called Commander Keen.
Growing up in the early ’90s, I loved Commander Keen. It’s a goofy adventure game; you guide an eight-year-old boy wearing a football helmet and red Converses through alien planets, collecting candy bars and zapping monsters with a ray gun.
Keen began life as a copy of another of my favorite games: Super Mario Bros. 3.
Before Keen, Carmack was working for a subscription software company called Softdisk. Carmack and the other programmers at Softdisk churned out games at a prodigious rate: today, blockbuster games can take more than five years to create; Softdisk produced a brand-new full-length game every single month.
In September 1990, Carmack decided that for his next game, he’d try to tackle a new and daunting challenge: scrolling. At the time, only consoles like the Nintendo had enough computing power to smoothly scroll scenery, characters, and enemies. PCs were stuck with simple one-screen-at-a-time games. But if Carmack was going to sell millions of games like Nintendo had with Super Mario Bros., he needed to figure out how to recreate the effect.
So, on September 19, 1990, Carmack and another developer named Tom Hall decided to reverse-engineer the first level of Super Mario Bros. 3. Working through the night, Carmack coaxed his PC into scrolling and animating the world of Super Mario; Hall jumped back and forth between a TV screen and his computer, playing the Nintendo version, pausing to copy the images pixel-for-pixel.
The next day, their coworkers were floored. Nobody had ever seen a PC game work like this. John Romero, Carmack’s closest colleague and future collaborator on Doom and Quake, called it “the fucking coolest thing on the planet.”
Romero insisted that they keep copying until they had finished an exact replica of the full game. They were going to send it to Nintendo.
Unfortunately for Carmack and his team, Nintendo wasn’t interested in a PC version of Super Mario (their console version was doing just fine, thank you very much).
Disappointed, but not defeated, they resolved to build a better version of Mario. Starting with Carmack’s code for scrolling and animating the screen, the coders — calling themselves Ideas from the Deep, keeping the game a secret from their day jobs at Softdisk — put their Super Mario copy through a complete metamorphosis. In place of Mario, it starred eight-year-old Billy Blaze. Instead of turtles and mushrooms, the enemies were aliens called Yorps. Instead of eating a mushroom to jump higher, Billy Blaze hopped on a pogo stick.
The debut Commander Keen game, Commander Keen in Invasion of the Vorticons, was a huge success. More than 50,000 copies were sold, making Keen one of the best-selling PC games of its time.
Unlike Steve Jobs, John Carmack never changed his mind about copying. When his boss at Softdisk suggested that they patent Carmack’s PC scrolling technique, Carmack reeled. “If you ever ask me to patent anything,” he said, “I’ll quit.”
In a 2005 forum post, John Carmack explained his thoughts on patents. While patents are framed as protecting inventors, he wrote, that’s seldom how they’re used. Smart programmers working on hard problems tend to come up with the same solutions. If any one of those programmers patents their solution, the rest are screwed.
He concluded: “I’ll have no part of it. Its [sic] basically mugging someone.”
In his games after Keen, Carmack would go beyond simply refusing to patent his inventions. He would release the source code to the biggest games of the ’90s, Wolfenstein 3D, Doom, and Quake. Everyone is free to download, modify, or copy them.
It’s one thing to copy.
It’s another to encourage others to copy from you. Richard Stallman went even further — he made copying a right.
In 1983, Richard Stallman wanted to build a new operating system. At the time, Unix was the most popular and influential operating system, but it was expensive to license. Commercial licenses cost $20,000 — that’s $52,028 in 2020 money.
And Unix was closed-source.
So on September 27, 1983, he posted a message to the Unix Wizards message board announcing his plan: he would write a complete Unix-compatible operating system called GNU and give it away to everyone who could use it.
That Stallman would write software and give it to others to use, for free, was a radical notion. To drive the point home, Stallman wrote a manifesto, defining the idea of free software (“Free software is software that users have the freedom to distribute and change.”) The manifesto kicked off the free software movement.
The enduring innovation of Stallman’s movement was how he and his co-conspirators used software licenses. They flipped traditional licensing on its head: instead of prohibiting the copying or distribution of the software, a free software license guarantees the right of people to use, modify, distribute, and learn from its code.
New kinds of software licenses weren’t the only product of the free software movement. Ideological offshoots quickly spun out into new groups, like the open-source software movement. While Stallman’s free software faction was centered around a small group of hard-line progressive coders, the open-source movement was broad and inclusive, abandoning some of Stallman’s more political language to spread farther and find new audiences.
Permissive licensing and distributed source control form the engine of modern software development. They create a feedback loop, or a symbiotic pair, or a living organism, or maybe even a virus: the tools that software developers use are themselves products of the open-source philosophy. Free and open-source code replicates itself, mutates, and spreads instantly across the world.
The free and open-source software movements (sometimes combined into a single acronym, FOSS) were echoed by another revolution in how creative works are licensed. In 2001, Lawrence Lessig, Hal Abelson, and Eric Eldred started Creative Commons, a non-profit and international network dedicated to enabling the sharing and reuse of “creativity and knowledge through the provision of free legal tools.”
Nearly 20 years later, close to half a million images on Flickr have Creative Commons (or CC) licenses. Wikipedia uses CC licenses on all its photos and art. MIT provides more than 2,400 courses online for free under Creative Commons licenses. Countless millions of creative works have benefited from the open-source approach to licenses and permissions.
An image of a feedback loop from Flickr’s Creative Commons archive
A decade ago, the open-source movement came to design. Michael Cho created Unsplash in 2013 to share a few photographs he thought might be useful to designers at startups; as of September 2020, Unsplash hosts 2,147,579 photos, and all-time photo downloads are well over 2 billion.
Pablo Stanley recently released Humaaans, a collection of Creative Commons-licensed designs that can be re-assembled into editorial graphics. Feather icons, Heroicons, and Bootstrap Icons are all open-source and free-to-use collections of UI icons, used by designers to build websites and applications.
Meanwhile, the explosion of open-source design resources has been bolstered by a new class of tools for sharing and collaborating on design. Abstract is a version-control system for design that promises “collaboration without the chaos.” With Abstract, many designers can contribute to a single file without worrying about overwriting each other’s changes or always needing to download the latest versions. Figma, too, has just launched its community feature, allowing designers to publish files and download each other’s projects. It’s not hard to imagine how this will evolve into a designer’s version of GitHub in the near future. Other design tools have followed suit: both Sketch and Framer have launched community content hubs, laying the groundwork for distributed source control.
Copying is fundamental to design, just as it is to software. The rise of permissive licenses and version control tools makes it seem like copying is a new idea, an innovative approach in an industry that thrives on novelty. But the truth is, copying has informed art and industry for thousands of years.
In China, there are many concepts of a copy, each with distinct subtext. Fangzhipin (仿製品) are copies that are obviously different from the original — like small souvenir models of a statue. Fuzhipin (複製品) are exact, life-size reproductions of the original. Fuzhipin are just as valuable as originals, and have no negative stigma.
In 1974, local farmers in the Xi’an region of China unearthed life-sized sculptures of soldiers made of terra cotta clay. When Chinese archeologists came to investigate the site, they uncovered figure after figure, including horses and chariots, all exquisitely detailed. All told, there were more than 8,000 terra cotta soldiers. They were dated to 210 BCE.
The terracotta warriors instantly became cultural treasures. A museum was built on the site of the excavation, but many of the statues were also exhibited in traveling shows. Hundreds of thousands of museumgoers all over the world lined up in galleries to see the soldiers.
Then, in 2007, a revelation rocked the Museum für Völkerkunde in Hamburg, Germany: some of the terracotta warriors it had on display were not the originals that had been discovered in the field in Xi’an. They were copies.
The Museum für Völkerkunde’s director became a pariah: “We have come to the conclusion that there is no other option than to close the exhibition completely, in order to maintain the museum’s good reputation.” The museum issued refunds to visitors. The event kicked off a rash of geopolitical finger-pointing: German officials cried foul, saying they were duped; Chinese officials washed their hands, since they never claimed the statues were originals to begin with.
The statues in the Hamburg museum were fuzhipin, exact copies. They were equivalent to the originals. After all, the originals were themselves products of mass manufacturing, made with modules and components cast from molds. Almost as soon as the terracotta warriors were discovered, Chinese artisans began producing replicas, continuing the work that had started more than 2,000 years before.
It’s easy to attribute this approach to copying as a cultural curiosity, an aberration particular to China. But copying was just as vital to Western artists.
Japanese art was one of the main sources of inspiration for Vincent van Gogh, himself one of the most influential European painters of the 19th century, if not of all time. Van Gogh was fascinated by the woodblock prints of artists like Hiroshige: stylized and vivid, they captured dramatic moments within compelling stories.
Van Gogh’s interest went beyond inspiration. To study the techniques mastered by Japanese artists, he copied prints by Keisei Eisen and Utagawa Hiroshige. He tried to replicate their bold lines, their energetic compositions, and their strong colors. For his copy of Eisen’s A courtesan, van Gogh started by tracing the outline of the courtesan’s figure directly from the May 1886 edition of Paris Illustré. For Flowering Plum Tree and The Bridge in the Rain, both copies of Hiroshige prints, he added borders of Japanese calligraphy he had seen on other prints.
Sudden Shower over Shin-Ōhashi bridge and Atake (1857) by Hiroshige
The Bridge in the Rain (after Hiroshige) (1887) by Vincent van Gogh
His practice with Japanese styles provided a crucial breakthrough. Van Gogh began to flatten landscapes. He outlined his subjects in bold black strokes. He painted with eye-watering colors. His interpretations of reality lit the art world on fire, influencing artists and designers to this day.
By copying directly from Japanese artists, van Gogh’s works became what we know today.
He was clear about this influence. In a letter to his brother Theo, he wrote: “All my work is based to some extent on Japanese art.”
There’s another word in Chinese for a copy: shanzhai (山寨). It’s translated to English as “fake,” but as with most Chinese words, the translation is lacking. Shanzhai literally means “mountain stronghold;” the word is a neologism, a recent invention, inspired by a famous novel in which the protagonists hide in a mountain stronghold to fight against a corrupt regime. Shanzhai products are playful, drawing attention to the fact that they aren’t original, putting their makers’ creativity on display.
Take the popular shanzhai novel Harry Potter and the Porcelain Doll; in it, Harry goes to China to stop Voldemort and Voldemort’s Chinese counterpart. It doesn’t pretend to be an original. It plays on its fake-ness: Harry speaks Chinese fluently, but he has trouble eating with chopsticks.
...
Read the original on matthewstrom.com »
Today, we are announcing Mistral Large 2, the new generation of our flagship model. Compared to its predecessor, Mistral Large 2 is significantly more capable in code generation, mathematics, and reasoning. It also provides much stronger multilingual support and advanced function-calling capabilities.
This latest generation continues to push the boundaries of cost efficiency, speed, and performance. Mistral Large 2 is exposed on la Plateforme and enriched with new features to facilitate building innovative AI applications. Mistral Large 2 has a 128k context window and supports dozens of languages including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80+ coding languages including Python, Java, C, C++, JavaScript, and Bash.
Mistral Large 2 is designed for single-node inference with long-context applications in mind — its size of 123 billion parameters allows it to run at large throughput on a single node. We are releasing Mistral Large 2 under the Mistral Research License, which allows usage and modification for research and non-commercial purposes. For commercial usage of Mistral Large 2 requiring self-deployment, a Mistral Commercial License must be acquired by contacting us.
Mistral Large 2 sets a new frontier in terms of performance-to-cost of serving on evaluation metrics. In particular, on MMLU, the pretrained version achieves an accuracy of 84.0% and sets a new point on the performance/cost Pareto front of open models.
Following our experience with Codestral 22B and Codestral Mamba, we trained Mistral Large 2 on a very large proportion of code. Mistral Large 2 vastly outperforms the previous Mistral Large and performs on par with leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B.
A significant effort was also devoted to enhancing the model’s reasoning capabilities. One of the key focus areas during training was to minimize the model’s tendency to “hallucinate” — to generate plausible-sounding but factually incorrect or irrelevant information. This was achieved by fine-tuning the model to be more cautious and discerning in its responses, ensuring that it provides reliable and accurate outputs. Additionally, the new Mistral Large 2 is trained to acknowledge when it cannot find solutions or does not have sufficient information to provide a confident answer. This commitment to accuracy is reflected in the improved model performance on popular mathematical benchmarks, demonstrating its enhanced reasoning and problem-solving skills.
Performance accuracy on code generation benchmarks (all models were benchmarked through the same evaluation pipeline)
Performance accuracy on MultiPL-E (all models were benchmarked through the same evaluation pipeline, except for the “paper” row)
Performance accuracy on GSM8K (8-shot) and MATH (0-shot, no CoT) generation benchmarks (all models were benchmarked through the same evaluation pipeline)
We drastically improved the instruction-following and conversational capabilities of Mistral Large 2. The new Mistral Large 2 is particularly better at following precise instructions and handling long multi-turn conversations. Below we report the performance on MT-Bench, Wild Bench, and Arena Hard benchmarks:
Performance on general alignment benchmarks (all models were benchmarked through the same evaluation pipeline)
On some benchmarks, generating lengthy responses tends to improve the scores. However, in many business applications conciseness is paramount — short model generations facilitate quicker interactions and are more cost-effective for inference. This is why we spent a lot of effort ensuring that generations remain succinct and to the point whenever possible. The graph below reports the average length of generations of different models on questions from the MT-Bench benchmark.
A large fraction of business use cases today involve working with multilingual documents. While the majority of models are English-centric, the new Mistral Large 2 was trained on a large proportion of multilingual data. In particular, it excels in English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, and Hindi. Below are the performance results of Mistral Large 2 on the multilingual MMLU benchmark, compared to the previous Mistral Large, Llama 3.1 models, and to Cohere’s Command R+.
Performance on Multilingual MMLU (measured on the base pretrained model)
Mistral Large 2 is equipped with enhanced function calling and retrieval skills and has undergone training to proficiently execute both parallel and sequential function calls, enabling it to serve as the power engine of complex business applications.
You can use Mistral Large 2 today via la Plateforme under the name mistral-large-2407, and test it on le Chat. It is available under the version 24.07 (a YY.MM versioning system that we are applying to all our models) and the API name mistral-large-2407. Weights for the instruct model are available and are also hosted on HuggingFace.
We are consolidating the offering on la Plateforme around two general-purpose models, Mistral Nemo and Mistral Large, and two specialist models, Codestral and Embed. As we progressively deprecate older models on la Plateforme, all Apache models (Mistral 7B, Mixtral 8x7B and 8x22B, Codestral Mamba, Mathstral) remain available for deployment and fine-tuning using our SDKs mistral-inference and mistral-finetune.
Starting today, we are extending fine-tuning capabilities on la Plateforme: these are now available for Mistral Large, Mistral Nemo, and Codestral.
We are proud to partner with leading cloud service providers to bring the new Mistral Large 2 to a global audience. In particular, today we are expanding our partnership with Google Cloud Platform to bring Mistral AI’s models to Vertex AI via a Managed API.
Mistral AI’s best models are now available on Vertex AI, in addition to Azure AI Studio, Amazon Bedrock and IBM watsonx.ai.
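The announcement gives the hosted model’s API name (mistral-large-2407) but no request example. As a rough sketch of what a call might look like, assuming Mistral’s standard OpenAI-style chat-completions REST endpoint — the URL, JSON field names, and the MISTRAL_API_KEY environment variable here are assumptions, not details from the announcement:

```python
import json
import os
import urllib.request

# Assumed endpoint; check Mistral's API docs for the authoritative URL.
API_URL = "https://api.mistral.ai/v1/chat/completions"


def build_request(prompt: str, model: str = "mistral-large-2407") -> dict:
    """Build a chat-completions payload for the 24.07 Large model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }


def ask(prompt: str) -> str:
    """Send one prompt and return the model's reply (requires network + key)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical env var holding your la Plateforme API key.
            "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same payload shape would apply to the smaller hosted models by swapping the `model` string.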
...
Read the original on mistral.ai »
...
Read the original on www.bloomberg.com »