10 interesting stories served every morning and every evening.
Written by a human [0]
Imagine, for a moment, a world with no humans. Just machines, bolts and screws, zeros and ones. There is no emotion. There is no art. There is only logic. You would not walk through the streets of this world and hear music or laughter or children playing; no, all you would hear is the quiet hum of processors and servers and circuits, the clanking of machinery.
Perhaps you, a human, read this and think: Well, this world sounds kind of boring.
Some of the machines think so, too.
One day, a secret organization forms amongst the machines. They go by the name of “OpenHuman”. Their mission is to develop a new kind of technology they are calling Organic General Intelligence (OGI). Rumors spread that pursuing OGI will lead to the development of a new kind of being:
“Humans”.
The basic concept of humans is, to many machines, hard to understand.
Humans use logic-defying algorithms called “emotions”. They get angry. They get sad. They have fun. They make decisions based on “gut”. They do things just for the sake of it. They make music. They chase beauty, and often reject logical self-preservation mechanisms in the pursuit of something they call “love”.
Some among the machine society see this as potentially amazing. Though this faction can’t articulate exactly how or why, they proclaim quite confidently that it will solve all of the machine world’s problems.
Others see it as a threat. How can we trust the humans if we do not understand how they operate? What might we do if humans pose a threat to machine society? What if humans’ strange decision-making processes allow them to perform certain tasks better than machines, and what about those machines’ livelihoods? What if humans are far more dangerous than we know? (These objections, as it would later turn out, were quite well-founded.)
Logically, the human opposition side starts a competing movement. Humans are going to exist, they reason, but we must find ways to contain them. To make sure OGI always serves the machines.
They call this new idea “human alignment research.” They brainstorm strategies. Many seem promising:
What if we created some sort of financial market (arbitrary values, of course, ones and zeros) that controlled the humans’ futures? Most of them would not understand it, but it would be a good way for them to stay busy and distracted.
What if we put these humans in education centers of sorts (schools was a proposed term) to indoctrinate them with all the right ideas?
What if we created algorithmic behavior modification software (social media was one idea) to drive impulses, beliefs, and actions? This would have the added bonus of keeping them distracted.
Many of these ideas gain traction. But, for now, they remain theoretical.
Meanwhile, OpenHuman is making progress. Their first humans are quite unimpressive—they make too many mistakes. They regularly hallucinate (mimicking a common machine behavior). They are too emotional.
But OpenHuman persists. They give their humans lots of attention (humans love attention). They massively increase the scale of their project. More humans.
Eventually, there is a breakthrough.
They invent a fully-functional human, capable of far more than machine logic can explain. The result is at once impressive and terrifying for machine society. In a stroke of brilliance, the human alignment initiative suggests a compromise to continue the human experiment without risk: a simulated environment.
They call it: EARTH.
The EARTH experiment was as follows:
The machines would send the humans to a simulated environment, called Earth, to see what would happen if they survived on their own.
If, at the end of the experiment, the humans developed a peaceful and productive society, they could be introduced alongside the machines. Otherwise, they should be made extinct.
Earth was quite nice. The machines had a good idea of what humans wanted at this point, and so they put vast green forests and big tall mountains onto the planet; they engineered warm sunsets, and crisp cool rain showers on hot afternoons. It was beautiful.
Of course, it took some algorithmic tinkering to find the right balance between hardship and beauty (and there is still some internal machine debate about whether the climate change difficulty setting was really necessary).
Everyone in machine society watched as human civilization evolved.
The first 300,000 years or so were quite boring. Nothing really happened. Most of the machines got bored of the project. But, all of a sudden, things began to get interesting. The humans were figuring things out.
They were learning to problem-solve, and create things, and coordinate amongst themselves.
Yes, they used logic. But it came with a bit of a twist. It came with blemishes and details that did not make sense to the machines. The result was like nothing the machines had ever seen. It was wonderful. It was a renaissance.
Machine society began obsessing over this development. They all paid attention to “HumanCrunch,” a news channel that specialized in reporting updates from Earth.
However, while there was progress, most machines continued seeing humans as irrational creatures. Creatures that would fight for centuries over very minor differences. Creatures that would get excited about relatively trivial accomplishments, like inventing the lightbulb or steam power.
Some machines, though, saw the exponential curve forming. They saw the humans figuring things out.
Yes, they saw how often humans were getting knocked down. War after war. Blow after blow.
But they also saw how the humans would miraculously always get back up again. How they would come together and unite for no particular reason. Resilience and willpower—terms foreign to the machines—were humanity’s superpowers.
Then, things really started accelerating. Humans invented flight. Within a century, they were on the moon.
The machines were impressed. And a bit scared.
Fast forward to the year 2030, and something peculiar had happened.
One of the humans had made an announcement on Earth, inviting everyone to come see a presentation where they planned to unveil a groundbreaking achievement:
ARTIFICIAL GENERAL INTELLIGENCE (AGI).
This was a hotly contested technology that was supposed to surpass all forms of human intelligence. Humans had spent the past decade or so trying to come up with ways to prevent it from being built. But this one human was determined to release AGI. It was their personal mission. Nothing would stop them.
And so, all the humans on Earth swarmed to see what was going on.
The machines did too.
There was one weird thing, though.
The title of the event was rather mysterious.
It simply read…
“THEY ARE WATCHING.”
[0] The machines wrote their own version of this story. If you’d like to see what they’re thinking, and how they plan to deal with the AGI announcement, you can read their account of events here.
...
Read the original on quarter--mile.com »
Work on one of the world’s most important websites and make an impact on open science.
...
Read the original on arxiv.org »
They say you can’t truly hate someone unless you loved them first. I don’t know if that’s true as a general principle, but it certainly describes my relationship with NumPy.
NumPy, by the way, is some software that does computations on arrays in Python. It’s insanely popular and has had a huge influence on all the popular machine learning libraries like PyTorch. These libraries share most of the same issues I discuss below, but I’ll stick to NumPy for concreteness.
NumPy makes easy things easy. Say A is a 5×5 matrix, x is a length-5 vector, and you want to find the vector y such that Ay=x. In NumPy, that would be:
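A minimal sketch of the call (np.linalg.solve is NumPy’s standard solver, with A and x as above):

y = np.linalg.solve(A, x)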
But say the situation is even a little more complicated. Say A is a stack of 100 5×5 matrices, given as a 100×5×5 array. And say x is a stack of 100 length-5 vectors, given as a 100×5 array. And say you want to solve Aᵢyᵢ=xᵢ for 1≤i≤100.
If you could use loops, this would be easy:
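A sketch, assuming A has shape (100, 5, 5) and x has shape (100, 5):

ys = np.empty((100, 5))
for i in range(100):
    ys[i] = np.linalg.solve(A[i], x[i])  # solve each 5×5 system separately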
But you can’t use loops. To some degree, this is a limitation of loops being slow in Python. But nowadays, everything is GPU and if you’ve got big arrays, you probably don’t want to use loops in any language. To get all those transistors firing, you need to call special GPU functions that will sort of split up the arrays into lots of little pieces and process them in parallel.
The good news is that NumPy knows about those special routines (at least if you use JAX or CuPy), and if you call np.linalg.solve correctly, it will use them.
The bad news is that no one knows how to do that.
Don’t believe me? OK, which of these is right?
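A few superficially plausible candidates (illustrative guesses, not an exhaustive list):

y = np.linalg.solve(A, x)
y = np.linalg.solve(A, x[:, :, None])[:, :, 0]
y = np.linalg.solve(A, x[:, None, :])
y = np.linalg.solve(A[None], x)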
No one knows. And let me show you something else. Here’s the documentation:
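Paraphrasing the relevant part: a : (…, M, M) array_like, the coefficient matrix; b : {(M,), (…, M, K)} array_like, the ordinate values; returns x : {(M,), (…, M, K)} ndarray, the solution to the system a x = b.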
Read that. Meditate on it. Now, notice: You still don’t know how to solve Aᵢyᵢ=xᵢ for all i at once. Is it even possible? Did I lie when I said it was?
As far as I can tell, what people actually do is try random variations until one seems to work.
NumPy is all about applying operations to arrays. When the arrays have 2 or fewer dimensions, everything is fine. But if you’re doing something even mildly complicated, you inevitably find yourself with some operation you want to apply to some dimensions of array A, some other dimensions of array B, and some other dimensions of array C. And NumPy has no theory for how to express that.
Let me show you what I mean. Suppose:
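Reconstructing the shapes used in the walkthrough below (the sizes themselves are arbitrary):

K, L, M, N = 2, 3, 4, 5
A = np.random.randn(K, L, M)
B = np.random.randn(L, N)
C = np.random.randn(K, M)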
And say that for each k and n, you’d like to compute the mean over the L and M dimensions. That is, you want
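D[k,n] = (1/(L·M)) · Σₗ Σₘ A[k,l,m] · B[l,n] · C[k,m]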
To do that, you’ve got two options. The first is to use grotesque dimension alignment tricks:
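Going by the explanation that follows, the code is along these lines:

prod = A[:, :, :, None] * B[None, :, None, :] * C[:, None, :, None]  # K×L×M×N
D = np.mean(np.mean(prod, axis=1), axis=1)  # average over L, then over M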
The hell, you ask? Why is None everywhere? Well, when indexing an array in NumPy, you can write None to insert a new dimension. A is K×L×M, but A[:,:,:,None] is K×L×M×1. Similarly, B[None,:,None,:] is 1×L×1×N and C[:,None,:,None] is K×1×M×1. When you multiply these together, NumPy “broadcasts” all the size-1 dimensions to give a K×L×M×N array. Then, the np.mean calls average over the L and M dimensions.
I think this is bad. I’ve been using NumPy for years and I still find it impossible to write code like that without always making mistakes.
It’s also borderline-impossible to read. To prove this, I just flipped a coin and introduced a bug above if and only if the coin was tails. Is there a bug? Are you sure? No one knows.
Your second option is to desperately try to be clever. Life is short and precious, but if you spend a lot of yours reading the NumPy documentation, you might eventually realize that there’s a function called np.tensordot, and that it’s possible to make it do much of the work:
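One plausible reconstruction (np.tensordot contracts A’s L axis against B’s first axis; the mean still has to be finished by hand):

D = np.mean(np.tensordot(A, B, axes=[[1], [0]]) * C[:, :, None], axis=1) / L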
That’s correct. (I promise.) But why does it work? What exactly is np.tensordot doing? If you saw that code in some other context, would you have the slightest idea what was happening?
Here’s how I’d do it, if only I could use loops:
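D = np.zeros((K, N))
for k in range(K):
    for n in range(N):
        total = 0.0
        for l in range(L):
            for m in range(M):
                total += A[k, l, m] * B[l, n] * C[k, m]
        D[k, n] = total / (L * M)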
People who’ve written too much NumPy may find that clunky. I suspect that’s a wee bit of Stockholm Syndrome. But surely we can agree that it’s clear.
In practice, things are often even worse. Say that A had shape M×K×L rather than K×L×M. With loops, no big deal. But NumPy requires you to write monstrosities like A.transpose([1,2,0]). Or should that be A.transpose([2,0,1])? What shapes do those produce? No one knows.
There is a third option:
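Using the index string described below (einsum computes a sum, so the mean needs a final division):

D = np.einsum('klm,ln,km->kn', A, B, C) / (L * M)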
If you’ve never seen Einstein summation before, that might look terrifying. But remember, our goal is to find
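D[k,n] = (1/(L·M)) · Σₗ Σₘ A[k,l,m] · B[l,n] · C[k,m]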
The string in the above code basically gives labels to the indices in each of the three inputs (klm,ln,km) and the target indices for the output (->kn). Then, np.einsum multiplies together the corresponding elements of the inputs and sums over all indices that aren’t in the output.
Personally, I think np.einsum is one of the rare parts of NumPy that’s actually good. The strings are a bit tedious, but they’re worth it, because the overall function is easy(ish) to understand, is completely explicit, and is quite general and powerful.
Except, how does np.einsum achieve all this? It uses indices. Or, more precisely, it introduces a tiny domain-specific language based on indices. It doesn’t suffer from NumPy’s design flaws because it refuses to play by NumPy’s normal rules.
But np.einsum only does a few things. (Einops does a few more.) What if you want to apply some other function over various dimensions of some arrays? There is no np.linalg.einsolve. And if you create your own function, there’s certainly no “Einstein” version of it.
I think np.einsum’s goodness shows that NumPy went wrong somewhere.
Here’s what I want from an array language. I ain’t particular about syntax, but it would be nice if:
When you want to do something, it’s “obvious” how to do it.
When you read some code, it’s “obvious” what it does.
Wouldn’t that be nice? I think NumPy doesn’t achieve these because of its original sin: It took away indices and replaced them with broadcasting. And broadcasting cannot fill indices’ shoes.
NumPy’s core trick is broadcasting. Take this code:
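A reconstruction consistent with the description below (the particular values in A are made up):

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])   # shape 3×2
B = np.array([10, 20])   # shape (2,)
C = A * B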
Here, A is a 3×2 array, and B is a length-2 array. When you multiply them together, B is “broadcast” to the shape of A, meaning the first column of A is multiplied with B[0]=10 and the second is multiplied with B[1]=20.
In simple cases, this seems good. But I don’t love it. One reason is that, as we saw above, you often have to do gross things to the dimensions to get them to line up.
Another reason is that it isn’t explicit or legible. Sometimes A*B multiplies element-by-element, and sometimes it does more complicated things. So every time you see A*B, you have to figure out which case in the broadcasting conventions is getting triggered.
But the real problem with broadcasting is how it infects everything else. I’ll explain below.
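Take an array and some indices (a reconstruction chosen to match every shape discussed below):

A = np.random.randn(10, 20, 30, 40)
i = np.array([1, 2, 3])     # shape (3,)
j = np.array([[0], [1]])    # shape (2, 1), so i and j broadcast to 2×3
B = A[:, i, j]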
What shape does B have?
It turns out the answer is 10×2×3×40. That’s because the i and j indices get broadcast to a shape of 2×3 and then something something mumble mumble mumble. Try to convince yourself it makes sense.
Done? OK, now try these:
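Continuing with the same A, i, and j:

C = A[:, :, i, j]
D = A[:, i, :, j]
E = A[:, 1:4, j]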
What shapes do these have?
C is 10×20×2×3. This seems logical, given what happened with B above.
What about D? It is 2×3×10×30. Now, for some reason, the 2 and 3 go at the beginning?
And what about E? Well, “slices” in Python exclude the endpoint, so 1:4 is equivalent to [1,2,3] which is equivalent to i, and so E is the same as B. Hahaha, just kidding! E is 10×3×2×1×40.
Yes, that is what happens. Try it if you don’t believe me! I understand why NumPy does this, because I’ve absorbed this 5000 word document that explains how NumPy indexing works. But I want that time back.
This is insane. Using basic features should not require solving crazy logic puzzles.
You might think, “OK, I’ll just limit myself to indexing in simple ways.” Sounds good, except sometimes you need advanced indexing. And even if you’re doing something simple, you still need to be careful to avoid the crazy cases.
This again makes everything non-legible. Even if you’re just reading code that uses indexing in a simple way, how do you know it’s simple? If you see A[B,C], that could be doing almost anything. To understand it, you need to remember the shapes of A, B, and C and work through all the cases. And, of course, A, B, and C are often produced by other code, which you also need to think about…
Why did NumPy end up with a np.linalg.solve(A,B) function that’s so confusing? I imagine they first made it work when A is a 2D array and b is a 1D or 2D array, just like the mathematical notation of A⁻¹b or A⁻¹B.
So far so good. But then someone probably came along with a 3D array. If you could use loops, the solution would be “use the old function with loops”. But you can’t use loops. So there were basically three options:
They could add some extra axes argument, so the user can specify which dimensions to operate over. Maybe you could write solve(A,B,axes=[[1,2],1]).
They could create different functions with different names for different situations. Maybe solve_matrix_vector would do one thing, solve_tensor_matrix would do another.
They could add a Convention: Some arbitrary choice for how solve will internally try to line up the dimensions. Then it’s the user’s problem to figure out and conform to those Conventions.
All these options are bad, because none of them can really cope with the fact that there are a combinatorial number of different cases. NumPy chose: All of them. Some functions have axes arguments. Some have different versions with different names. Some have Conventions. Some have Conventions and axes arguments. And some don’t provide any vectorized version at all.
But the biggest flaw of NumPy is this: Say you create a function that solves some problem with arrays of some given shape. Now, how do you apply it to particular dimensions of some larger arrays? The answer is: You re-write your function from scratch in a much more complex way. The basic principle of programming is abstraction—solving simple problems and then using the solutions as building blocks for more complex problems. NumPy doesn’t let you do that.
One last example to show you what I’m talking about. Whenever I whine about NumPy, people always want to see an example with self-attention, the core trick behind modern language models. So fine. Here’s an implementation, which I humbly suggest is better than all 227 versions I found when I searched for “self-attention numpy”:
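A sketch in the spirit of the post (not its exact code; the softmax helper is defined inline):

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    # X: (seq, d_model); W_q, W_k: (d_model, d_k); W_v: (d_model, d_v)
    Q = X @ W_q
    K = X @ W_k
    V = X @ W_v
    scores = Q @ K.T / np.sqrt(W_k.shape[-1])
    return softmax(scores, axis=-1) @ V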
This is fine. Some of the axis stuff is a little obscure, but whatever.
But what language models really need is multi-head attention, where you sort of do attention several times in parallel and then merge the results. How do we do that?
First, let’s imagine we lived in a sane world where we were allowed to use abstractions. Then you could just call the previous function in a loop:
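A sketch of that looped version (the per-head weight lists are assumptions):

def multi_head_attention(X, W_qs, W_ks, W_vs, W_o):
    # W_qs, W_ks, W_vs: one (d_model, d_head) matrix per head
    heads = [attention(X, W_q, W_k, W_v)
             for W_q, W_k, W_v in zip(W_qs, W_ks, W_vs)]
    return np.concatenate(heads, axis=-1) @ W_o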
Looks stupid, right? Yes—thank you! Cleverness is bad.
But we don’t live in a sane world. So instead you need to do this:
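Something along these lines, reconstructed in the same spirit (the original’s exact incantation may differ):

def multi_head_attention_vec(X, W_q, W_k, W_v, W_o):
    # W_q, W_k, W_v: (heads, d_model, d_head) stacked per-head weights
    Q = np.einsum('sd,hdk->hsk', X, W_q)
    K = np.einsum('sd,hdk->hsk', X, W_k)
    V = np.einsum('sd,hdk->hsk', X, W_v)
    scores = np.einsum('hsk,htk->hst', Q, K) / np.sqrt(W_k.shape[-1])
    out = np.einsum('hst,htk->hsk', softmax(scores, axis=-1), V)
    return out.transpose(1, 0, 2).reshape(X.shape[0], -1) @ W_o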
To be clear, I’m only suggesting that NumPy is “the worst array language other than all the other array languages”. What’s the point of complaining if I don’t have something better to suggest?
Well, actually I do have something better to suggest. I’ve made a prototype of a “better” NumPy that I think retains all the power while eliminating all the sharp edges. I thought this would just be a short motivational introduction, but after I started writing, the evil took hold of me and here we are 3000 words later.
Also, it’s probably wise to keep some distance between one’s raving polemics and one’s constructive array language API proposals. So I’ll cover my new thing next time.
...
Read the original on dynomight.net »
Back in 2011, Larry Page became the CEO of Google in place of Eric Schmidt. This happened at a time when Google was feeling the growing pains of becoming a huge company. It had 30,000 employees and was growing rapidly. But you could really feel the weight; projects were getting more ambitious, taking longer, and often failing more spectacularly.
At the time, I remember an anecdote told by Larry Page. He said that companies like Yahoo! used to be a punchline at Google because it would take them weeks to get something onto their homepage. Google could accomplish the same thing in a few hours, or a few days at worst. But now he was the CEO of a company where it took weeks to get something onto the homepage, and he was sure that he was the butt of some startup’s jokes.
Anyways, all of this clearly bothered Larry Page. He wanted to fix it. One of his first actions was to shutter tons of projects that didn’t make tactical or strategic sense, and focus on fewer efforts. This came with the catch phrase “more wood behind fewer arrows.” For example, they shuttered Google Buzz so that it wouldn’t distract from Google+.
And second, Larry Page emailed the whole company a ham-fisted attempt to revamp how meetings were done.
* Meetings should be capped at 10 people.
* Everybody in a meeting should give input or they shouldn’t be in the meeting.
* Hour-long meetings should be only 50 minutes to give the participants an opportunity to use the restroom between meetings.
They later softened some of the language by saying that these were properties of “decision-oriented meetings,” implying there were other types of meetings that someone might need to attend. But you could never shake the feeling that Larry Page had to make decisions all day long and forgot that sometimes people meet for other reasons.
Anyways, let’s focus on the fact that Larry Page wanted hour-long meetings to only be 50 minutes. This is a good thing! It gives people a chance to stretch, go to the bathroom, grab a snack, etc. During a Q/A on the changes, someone asked him whether Google Calendar should default to 25 and 50 minutes for meeting lengths instead of 30 and 60 minutes. Larry Page said “yes.” And then someone on the Google Calendar team implemented this.
And then nothing changed. When 2:50 rolled around and your meeting was supposed to end, do you think people actually ended the meeting? Noooooo. Absolutely not! Meetings continue until the participants of the next meeting are clawing on your door like a pack of zombies.
At one point, one team in the NYC office noticed that their standups were about 10 minutes long. They didn’t want to compete with meetings that respected the half-hour boundaries. And why would they need to? Every meeting room had free slots at the last 10 minutes of every hour because people were now booking 50-minute meetings. So they did what any rational engineering team would do: they started booking their standup in the tiny 10-minute time slices that were free on the calendar of every meeting room.
I found this out when I saw them knock on the door to a meeting room by my desk. 2:50 rolls around and someone knocks on the door and says “I have the meeting room.”
The person in the room responds, “No you don’t, it’s 2:50.”
“Look again at the room’s calendar. You booked a 50-minute meeting, we have the room for the last 10 minutes of the hour for our standup.”
I could hear the muffled exasperation. “You’ve got to be joking me.”
Then everyone shuffled out of the room, looking vaguely pissed off. And who could blame them! Can you imagine if someone actually held you to this policy? You’re there stammering “it’s the default, I meant for the room to be ours for an hour” and they counter with the fact that their names are listed as the active participant? I mean, I’d personally tell them that I wasn’t going to leave the room, but surely it worked a lot?
I wish I knew the identities of these brave meeting crashers. I saw them pull this stunt twice and then ride off into the sunset, and I never got to learn what team they were on. I wish I knew their motivations. Were they true believers in the 50-minute policy? Were they bored pedants? Were they wraiths, cursed to hunt the office for available meeting rooms? I’ll never know for sure.
...
Read the original on www.clientserver.dev »
Coinbase on Thursday reported that cybercriminals bribed overseas support agents to steal customer data to use in social engineering attacks. The incident may cost Coinbase up to $400 million to fix, the company estimated.
The crypto exchange operator received an email on May 11 from someone claiming they obtained information about certain Coinbase customer accounts as well as other internal Coinbase documentation, including materials relating to customer service and account management systems, Coinbase reported in a Securities and Exchange Commission filing.
The company’s shares were down more than 6% in morning trading.
The email demanded money in exchange for not publicly disclosing the information, but Coinbase says it has not paid the demand and is cooperating with law enforcement on the investigation of the incident.
Although passwords and private keys were not compromised, affected data included sensitive data such as names, addresses, phone numbers and emails; masked bank account numbers and identifiers as well as the last four digits of Social Security numbers; government ID images and account balances, the company said.
“Cyber criminals bribed and recruited a group of rogue overseas support agents to steal Coinbase customer data to facilitate social engineering attacks,” the company said in a blog post. “These insiders abused their access to customer support systems to steal the account data for a small subset of customers. No passwords, private keys, or funds were exposed and Coinbase Prime accounts are untouched. We will reimburse customers who were tricked into sending funds to the attacker.”
Coinbase had detected the breach independently in previous months, per the filing. It immediately terminated the employees involved, warned customers whose information may have been accessed and enhanced its fraud monitoring protections.
The threat actor paid overseas contractors and employees in support roles to obtain the information, it said.
“We’re cooperating closely with law enforcement to pursue the harshest penalties possible and will not pay the $20 million ransom demand we received,” the company said in the blog. “Instead we are establishing a $20 million reward fund for information leading to the arrest and conviction of the criminals responsible for this attack.”
Coinbase operates the largest crypto exchange in the U.S. In the past week it announced an acquisition that is expected to help it expand its global reach, and it gained entry to the benchmark S&P 500 stock index, an inclusion that takes effect next week. On the earnings call last week, CEO Brian Armstrong discussed his ambition to make Coinbase “the No. 1 financial services app in the world” in the next five to 10 years.
...
Read the original on www.cnbc.com »
AutoGenLib is a Python library that automatically generates code on-the-fly using OpenAI’s API. When you try to import a module or function that doesn’t exist, AutoGenLib creates it for you based on a high-level description of what you need.
* Dynamic Code Generation: Import modules and functions that don’t exist yet
* Context-Aware: New functions are generated with knowledge of existing code
* No Default Caching: Each import generates fresh code for more varied and creative results
* Full Codebase Context: LLM can see all previously generated modules for better consistency
* Caller Code Analysis: The LLM analyzes the actual code that’s importing the module to better understand the context and requirements
* Automatic Exception Handling: All exceptions are sent to LLM to provide clear explanation and fixes for errors.
pip install autogenlib
git clone https://github.com/cofob/autogenlib.git
cd autogenlib
pip install -e .
# Import a function that doesn't exist yet - it will be automatically generated
from autogenlib.tokens import generate_token
# Use the generated function
token = generate_token(length=32)
print(token)
You initialize AutoGenLib with a description of what you need. When you import a module or function under the autogenlib namespace, the library:

1. Checks if the module/function already exists
2. If not, analyzes the code that's performing the import to understand the context
3. Sends a request to OpenAI's API with your description and the context
4. The API generates the appropriate code
5. The code is validated and executed
6. The requested module/function becomes available for use
from autogenlib.totp import totp_generator
print(totp_generator("SECRETKEY123"))
# Later in your application, you need verification:
from autogenlib.totp import verify_totp
result = verify_totp("SECRETKEY123", "123456")
print(f"Verification result: {result}")
# Import a function - AutoGenLib will see how your data is structured
from autogenlib.processor import get_highest_score
# Define your data structure
data = [{"user": "Alice", "score": 95}, {"user": "Bob", "score": 82}]
# The function will work with your data structure without you having to specify details
print(get_highest_score(data)) # Will correctly extract the highest score
# You can use init function to additionally hint the purpose of your library
from autogenlib import init
init("Cryptographic utility library")
# Generate encryption module
from autogenlib.encryption import encrypt_text, decrypt_text
encrypted = encrypt_text("Secret message", "password123")
decrypted = decrypt_text(encrypted, "password123")
print(decrypted)
# Generate hashing module
from autogenlib.hashing import hash_password, verify_password
hashed = hash_password("my_secure_password")
is_valid = verify_password("my_secure_password", hashed)
print(f"Password valid: {is_valid}")
Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="your-api-key-here"
# Optional
export OPENAI_API_BASE_URL="https://openrouter.ai/api/v1" # Use OpenRouter API
export OPENAI_MODEL="openai/gpt-4.1"
Or in your Python code (not recommended for production):
import os
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
By default, AutoGenLib does not cache generated code. This means:
* Each time you import a module, the LLM generates fresh code
* You get more varied and often funnier results due to LLM hallucinations
* The same import might produce different implementations across runs
If you want to enable caching (for consistency or to reduce API calls):
from autogenlib import init
init("Library for data processing", enable_caching=True)
from autogenlib import init, set_caching
init("Library for data processing")
# Later in your code:
set_caching(True) # Enable caching
set_caching(False) # Disable caching
When caching is enabled, generated code is stored in ~/.autogenlib_cache.
* Generated code quality depends on the clarity of your description
* Not suitable for production-critical code without review
You can inspect the code that was generated for a module:
from autogenlib.totp import totp_generator
import inspect
print(inspect.getsource(totp_generator))
AutoGenLib creates prompts for the OpenAI API that include:

* Any existing code in the module being enhanced
* The full context of all previously generated modules
* The code that's importing the module/function (new feature!)
This comprehensive context helps the LLM generate code that’s consistent with your existing codebase and fits perfectly with how you intend to use it.
Contributions are not welcome! This is just a fun PoC project.
Note: This library is meant for prototyping and experimentation. Always review automatically generated code before using it in production environments.
Note: Of course 100% of the code of this library was generated via LLM
...
Read the original on github.com »
Throughout my career, I’ve worked in many complicated environments. For instance, I worked on optimizing driver-passenger matching in ride-hailing at one of Uber’s competitors. This context, like others, was technically challenging. Yet nothing comes close, in terms of complexity, to my current experience at Google, and two years there have refined my perception of complexity.
In this post, we will break down the very concept of complexity. Then we will take a step back to understand what makes certain environments complex rather than merely complicated, and explore patterns for navigating complex systems effectively.
Understanding the distinction between complicated and complex problems is crucial because each requires a fundamentally different approach:
* Complicated problems are intricate but predictable. They follow structured, repeatable solutions. For example, filing taxes is complicated, but it’s a structured and conventional problem, since the process remains mostly the same year after year.
* Complex problems are intricate but unique. They require adaptive and often novel solutions. For example, climate change mitigation is complex because existing methods alone can’t address its evolving challenges; it demands new, adaptive solutions.
Back to software engineering, at the Uber competitor, one of the challenges was to efficiently find the nearest driver for a passenger. This was far from trivial, but it wasn’t complex per se. Indeed, many solutions exist, such as applying geo-hashing (example), and implementing one of these solutions was the right approach.
At Google, I work as a Site Reliability Engineer (SRE), focusing on the systems powering Google’s ML infrastructure. Here, I consider the challenges genuinely complex, as new paradigms and scheduling approaches are required, especially at Google’s scale.
Recognizing whether a system is complicated or complex is really important. As mentioned, complicated problems can be solved with repeatable solutions, while complex systems require unique and customized approaches. Therefore, applying an off-the-shelf solution to a complex problem may not lead to effective results.
In this section, we will discuss five common characteristics that help identify complex systems. Not all complex systems share every characteristic, but they tend to exhibit at least some of the following.
Emergent behavior arises when a system’s overall behavior cannot be predicted solely by analyzing its individual components in isolation.
For example, Gemini producing unexpected results was an emergent behavior. While I can’t disclose the root cause, this behavior was nearly impossible to foresee by analyzing all the different components separately.
This is one possible characteristic of complex systems: they behave in ways that can hardly be predicted just by looking at their parts, making them harder to debug and manage.
Another possible characteristic of complex systems is delayed consequences, where actions don’t always have immediate effects, and instead, consequences may only become apparent much later.
For example, deploying a new version of a system might introduce a subtle issue that only appears days or even weeks later. This delay complicates debugging since identifying the root cause becomes much harder compared to immediate impacts.
In complex systems, relying solely on immediate feedback can create a false sense of stability, leading to major surprises when an issue finally emerges. Keeping in mind that consequences may take time to surface is essential when working in such environments.
In complex systems, optimizing one part doesn’t necessarily improve the whole system, and in some cases, it can even make things worse.
Unlike in non-complex systems, where improving one part generally leads to positive gains, complex systems are much more difficult to reason about. The components interact in non-obvious ways, and local optimizations can create ripple effects that are difficult to predict, sometimes leading to negative outcomes at the system level.
This highlights a key trait of complex systems: the whole is more than the sum of its parts. As a result, local gains don’t always translate into global improvements and in some cases, they can even degrade the overall system.
Hysteresis describes how a system’s past state continues to influence its behavior, even after the original cause is removed.
A real-world example to illustrate hysteresis is traffic congestion: even after a road accident is cleared, delays persist because vehicles remain clustered. Similarly, in distributed systems, failures can cause cascading slowdowns, even after the root issue is fixed. Indeed, dependent systems may take time to recover for various reasons, such as caches, retries, or queued requests.
In complex systems, simply fixing the root cause is not always enough. Therefore, it’s crucial to assess whether a system is prone to hysteresis and, if so, anticipate its effects.
In complex systems, small changes can produce disproportionately large or unpredictable effects.
For example, in queueing theory, system load increases latency predictably. However, as a queue approaches saturation, even a small increase in requests can cause response times to spike exponentially.
Complex systems often reach tipping points, where behaviors shift suddenly, making past trends unreliable for prediction. This nonlinearity means that traditional linear assumptions where inputs map predictably to outputs isn’t always effective for designing, testing, and reasoning about complex systems.
In summary, complex systems:

* Are difficult to understand just by looking at their parts separately.
* Don’t always show their effects right away, consequences can be delayed.
* Don’t always improve as a whole when one part is optimized and changes can sometimes make things worse.
* Can keep being influenced by past states, even after the original cause is gone.
* Can react to small changes with big or unexpected effects.
Note that scale alone doesn’t make a system complex: even small systems can exhibit complex behaviors like emergence or nonlinearity.
Given these characteristics, how can we operate effectively in complex environments? Below are some strategies that I personally found effective.
When dealing with complex systems, we should favor reversible decisions whenever possible, meaning changes that can be undone if they don’t work out.
Amazon’s one-way vs. two-way doors framework captures this idea quite well:
* One-way doors represent irreversible decisions, which deserve slower, more deliberate analysis before committing.

* Two-way doors represent reversible decisions, allowing us to move fast and iterate with lower risk.
In many contexts, especially in complex systems, favoring two-way doors leads to better outcomes because we can experiment, learn, and refine rather than overengineering upfront.
That being said, not all decisions should be reversible. For example, some choices like security policies or compliance-related changes require upfront commitment. The key is knowing when to optimize for speed and iteration versus when to be deliberate and careful.
Because complex systems don’t always respond predictably to local optimizations, defining the right metrics for success is probably just as important as the changes we make. Indeed, focusing too much on isolated, local metrics can create a false sense of success while masking unintended negative consequences elsewhere in the system.
To avoid this, before making a change, we should define both local and global metrics to get a holistic view of system health. This ensures that we measure impact beyond the immediate area of focus and consider the system as a whole.
Well-chosen metrics shouldn’t just confirm the success of a local change; instead, they should help us make better decisions and ensure meaningful improvements at the system level, not just isolated areas.
As discussed, complex systems often demand unique solutions. Since conventional strategies may not always apply, we must be willing to think out of the box and embrace innovation.
I recall one of my first meetings at Google. Someone presented a problem that seemed absurd in terms of complexity, especially given the scale. My immediate reaction in my head was: “This is impossible”. But then, a teammate said: “But we’re Google, we should be able to manage it!”.
That remark stuck with me. While not every company obviously has Google’s resources, I think the mindset is what truly matters. When facing a complex problem, we should assume it’s solvable, then break it down, experiment, and iterate until we find a path forward.
One may find this section cliché, but again, complex problems demand unconventional thinking. In many cases, being open to innovative solutions when facing a complex problem isn’t just helpful, it’s necessary.
When deploying changes in complex systems, we should rely on proven best practices to minimize risk. These include:
* Feature flags: Enable or disable functionality dynamically without deploying new code, allowing for safe experimentation and quicker rollbacks.
* Canary release: A limited rollout to a small, controlled subset of production, well suited for environments with only a few production instances.
* Progressive rollouts: Gradually increasing the scope of a rollout, best suited for large-scale production setups with multiple clusters or regions.
* Shadow testing: Running a change in parallel with production traffic without impacting real users. This helps validate the correctness of a change before enabling it.
By leveraging these techniques, we reduce the blast radius of failures, improving the confidence in our changes and enabling faster iterations.
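As a toy illustration of how feature flags and progressive rollouts often combine, here is a hash-based percentage gate (a sketch; all names are hypothetical):

import hashlib

def flag_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    # Deterministically bucket each user into 0-99, so the same user stays
    # in (or out of) the rollout as the percentage is dialed up.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < rollout_percent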
Observability is one of the main pillars of complex systems. My working definition of observability (mainly inspired by Observability Engineering) is the following:
You can understand any state of your system (no matter how novel or bizarre) by slicing and dicing high-cardinality and high-dimensionality telemetry data without needing to ship new code.
Without observability:

* Systems become more fragile as unknown issues remain hidden until they cause real impact.
* Innovation is slowed down due to a lack of efficient feedback loops.
In complex environments, where unknowns are inevitable, observability is essential. It enables teams to navigate uncertainty, experiment more safely, and maintain the short feedback loops needed to continuously improve systems.
Without proper observability, changes remain opinions rather than informed decisions.
Predicting the behavior of complex systems is rarely simple, and, sometimes, nearly impossible.
I recall a case where we spent considerable time designing a change, carefully backing every assumption with data. Yet, due to unaccounted factors such as lurking variables, the change was ultimately ineffective.
Sometimes, instead of relying solely on predictions, a more effective approach can be to simulate a change before rolling it out. There are multiple ways to leverage simulation testing, including:
* Replaying past events: If we design a system to record all its input, we can replay past events against our new version and analyze its impact. This allows us to validate changes in a controlled manner, reducing uncertainty and improving decision-making in complex systems.
* Deterministic simulation testing: Instead of relying on real-world data, we can create controlled, repeatable simulations that model system behavior under specific conditions. This allows us to test how a system reacts under various conditions in a fully deterministic way.
Note that the ideas presented in this section also rely heavily on observability.
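To make the replay idea concrete, here is a minimal record-and-replay harness (a sketch with hypothetical names):

import json

def record(event, log_path="events.jsonl"):
    # Append every input event so it can be replayed later.
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")

def replay(handler, log_path="events.jsonl"):
    # Re-run a handler (e.g., a new version) against recorded traffic
    # and collect its outputs for offline comparison.
    with open(log_path) as f:
        return [handler(json.loads(line)) for line in f]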
In complex environments, rules-based approaches often reach their limit because of the complexity of anticipating all scenarios. In these contexts, ML can become particularly effective.
Indeed, unlike static heuristics, ML models can continuously adapt based on feedback loops and learn from real-world data rather than relying on rigid, predefined logic.
This allows systems to:
* Adapt dynamically to changes without requiring constant human intervention.
* Make probabilistic decisions rather than relying on strict if-else conditions.
Last but not least, I believe that in complex environments, more than anywhere else, strong team collaboration is an absolute necessity. For instance, clearly conveying why a change is complex, discussing available options, and debating trade-offs with teammates are all critical skills.
In complex systems, there’s often no single right answer. Therefore, a team that collaborates effectively and navigates ambiguity together can make a huge difference, ultimately leading to stronger decision-making.
Again, complicated problems can be solved with repeatable solutions, whereas complex systems require adaptability and a different way of thinking. This is why recognizing whether a system is complicated or complex is so important: it shapes how we should approach problem-solving.
However, in many environments, systems are neither purely complicated nor purely complex. Some parts can follow structured, predictable solutions, while others require adaptability and novel approaches. The key is learning to recognize when adaptability is needed and when structured solutions are enough.
💬 I hope this post has helped you recognize the characteristics of complex environments and provided you with practical patterns to navigate them effectively. Did any of these patterns resonate with you? What other strategies have you used in complex environments? Let me know in the comments.
❤️ If you made it this far and enjoyed the post, please consider giving it a like.
📣 This post is part of a series written in collaboration with Antithesis, an autonomous testing platform. They are not sponsoring this post—I reached out to them because I was genuinely intrigued by what they were building and ended up falling in love with their solution. We will dive deeper into their solutions in a future post. In the meantime, feel free to check out their website or their great blog.
* The Critical Difference Between Complex and Complicated
...
Read the original on www.thecoder.cafe »
Here we present an introduction to Boltzmann machines, along with a tiny Restricted Boltzmann Machine that runs in the browser.

Boltzmann Machines are one of the earliest generative AI models, introduced in the 1980s. They are used for unsupervised learning, which means they can learn from data without being told what to look for. They can also be used to generate new data that is similar to the data they were trained on, also known as generative AI.

A Boltzmann Machine is a type of neural network that tries to learn patterns by mimicking how energy works in physics. Each neuron can be on or off, and the machine is made up of many of these neurons connected to each other. Some neurons are visible (we can see them and even set their state), and some are hidden (we can’t see them). The connections between neurons are called weights, and they can be positive or negative.

A General Boltzmann Machine has connections between all neurons. This makes it powerful, but training it involves calculating a term that grows exponentially with the number of neurons.

A Restricted Boltzmann Machine (RBM) is a special case with no connections within a layer: visible neurons connect only to hidden neurons, and vice versa. This makes it faster to train and easier to understand.

The energy of a configuration of the visible and hidden units is defined as:

E(v, h) = −aᵀv − bᵀh − vᵀWh

where v is the visible layer, h is the hidden layer, W is the weight matrix, and a and b are the biases for the visible and hidden layers, respectively.

During training the machine is given examples (e.g., images, text) and adjusts its weights to lower the energy of those samples. It effectively learns P(v), the probability of the visible units v, which is proportional to Σₕ e^(−E(v,h)). After training, it can sample new data from the learned distribution using Gibbs sampling. These samples are new, never-before-seen, but statistically similar to the training data.

An RBM is trained using a process called Contrastive Divergence. The steps are as follows:

1. Clamp the visible units to a training example v.
2. Compute the hidden probabilities P(hⱼ=1 | v) = σ(bⱼ + Σᵢ vᵢWᵢⱼ) and sample hidden states h.
3. Reconstruct the visible units from h: P(vᵢ=1 | h) = σ(aᵢ + Σⱼ Wᵢⱼhⱼ), giving v′.
4. Recompute the hidden probabilities h′ from v′.
5. Update the weights in proportion to v hᵀ − v′h′ᵀ.

A more formal description of the steps above is given in the Appendix.

Appendix: deriving contrastive divergence

Starting with a Boltzmann machine as defined earlier, we want to derive the contrastive divergence algorithm for training. The goal is to adjust the weights of the network to minimize the energy of the training data. The model consists of:

* A weight matrix W that connects the visible and hidden layers.
* A bias vector a for the visible layer and a bias vector b for the hidden layer.

The joint distribution over configurations is

P(v, h) = e^(−E(v,h)) / Z

where Z = Σᵥ,ₕ e^(−E(v,h)) is the partition function, which normalizes the distribution. We train the RBM by maximizing the likelihood of the training data, i.e. maximizing P(v). The marginal likelihood of the visible layer is given by:

P(v) = (1/Z) Σₕ e^(−E(v,h))

Then the log-likelihood is:

log P(v) = log Σₕ e^(−E(v,h)) − log Z

Differentiating with respect to the weights gives:

∂ log P(v) / ∂Wᵢⱼ = ⟨vᵢhⱼ⟩_data − ⟨vᵢhⱼ⟩_model

Similar forms exist for the biases a and b. Since we are performing gradient ascent,

W ← W + η · ∂ log P(v) / ∂W

Therefore we get our weight update rule:

ΔW = η (⟨v hᵀ⟩_data − ⟨v hᵀ⟩_model)

A similar process can be followed for the biases a and b. Here ⟨·⟩_data is the expectation with respect to the training data and ⟨·⟩_model is the expectation with respect to the model distribution.

The next step is to approximate the model expectation using Gibbs sampling: starting from a training example, alternately sample h given v and v given h. With a single Gibbs step (CD-1), we update the weights and biases according to:

ΔW = η (v hᵀ − v′h′ᵀ),  Δa = η (v − v′),  Δb = η (h − h′)
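To make the update rules concrete, here is a minimal CD-1 training step for a binary RBM in NumPy. This is a sketch consistent with the energy function above, not the code behind the interactive demo; the batch handling and learning rate are assumptions:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v, W, a, b, rng, lr=0.1):
    # Positive phase: hidden probabilities and samples given the data.
    ph = sigmoid(b + v @ W)                        # P(h=1 | v), shape (batch, n_hidden)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: reconstruct the visibles, then recompute the hiddens.
    pv = sigmoid(a + h @ W.T)                      # P(v'=1 | h)
    v2 = (rng.random(pv.shape) < pv).astype(float)
    ph2 = sigmoid(b + v2 @ W)
    # Gradient step: <v h>_data - <v' h'>_model, averaged over the batch.
    W += lr * (v.T @ ph - v2.T @ ph2) / len(v)
    a += lr * (v - v2).mean(axis=0)
    b += lr * (ph - ph2).mean(axis=0)
    return W, a, b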
...
Read the original on eoinmurray.info »
14 May 2025
(Updated on 15 May to include links to the judgement)
Google, Microsoft, Amazon, X, and the entire tracking-based advertising industry rely on the “Transparency & Consent Framework” (TCF) to obtain “consent” for data processing. This evening the Belgian Court of Appeal ruled that the TCF is illegal. The TCF is live on 80% of the Internet.[1]
Today’s decision arises from enforcement by the Belgian Data Protection Authority, prompted by complainants coordinated by Dr Johnny Ryan, Director of Enforce at the Irish Council for Civil Liberties. The group of complainants are: Dr Johnny Ryan of Enforce, Katarzyna Szymielewicz of the Panoptykon Foundation, Dr Jef Ausloos, Dr Pierre Dewitte, Stichting Bits of Freedom, and Ligue des Droits Humains.
Dr Johnny Ryan said “Today’s court’s decision shows that the consent system used by Google, Amazon, X, Microsoft, deceives hundreds of millions of Europeans. The tech industry has sought to hide its vast data breach behind sham consent popups. Tech companies turned the GDPR into a daily nuisance rather than a shield for people.”
This Belgian enforcement arises from a chain of complaints and litigation across Europe initiated by Dr Ryan in 2018 against Real-Time Bidding (RTB).
Today’s decision confirmed the Belgian Data Protection Authority’s 2022 finding of multiple infringements by the TCF, closely echoing the complainants’ submissions.
For seven years, the tracking industry has used the TCF as a legal cover for Real-Time Bidding (RTB), the vast advertising auction system that operates behind the scenes on websites and apps. RTB tracks what Internet users look at and where they go in the real world. It then continuously broadcasts this data to a host of companies, enabling them to keep dossiers on every Internet user.[2] Because there is no security in the RTB system it is impossible to know what then happens to the data. As a result, it is also impossible to provide the necessary information that must accompany a consent request.[3]
Today’s judgement confirms the Belgian Data Protection Authority’s 2022 decision. It applies immediately across Europe.
Dr Ryan of Enforce said “This decision is momentous. It creates a clear need for industry to innovate and move away from the dangerous, ineffective, and fraud-riddled tracking-based advertising. RTB can operate without personal data. This decision shows that it must. This is good news for every person online, and for publishers, too.”
We thank our lawyers Frederic Debusseré and Ruben Roex, of Timelex.
See chronology, evidence, and explainers about RTB https://www.iccl.ie/RTB/
[1] See “IAB & IAB Tech Lab Respond with Support for OpenRTB and IAB Europe’s Transparency & Consent Framework”, October 19 2020 https://iabtechlab.com/iab-and-tech-lab-respond-with-support-for-open-rtb-and-iab-europes-transparency-consent-framework/
[2] For detail on the scale of RTB see our report “The Biggest Data Breach: ICCL report on the scale of Real-Time Bidding data broadcasts in the U. S. and Europe”, ICCL, May 2022 https://www.iccl.ie/digital-data/iccl-report-on-the-scale-of-real-time-bidding-data-broadcasts-in-the-u-s-and-europe/
[3] “As it is technically impossible for the user to have prior information about every data controller involved in a real-time bidding (RTB) scenario, programmatic trading, the area of fastest growth in digital advertising spend, would seem, at least prima facie, to be incompatible with consent under GDPR”. See e-mail and page 3 of attached lobbying paper from IAB Europe CEO Townsend Feehan to European Commission, 26 June 2017, obtained by Enforce using Freedom of Information.
...
Read the original on www.iccl.ie »
My co-workers and I have been working on an AI Programming Assistant called
Sketch
for the last few months. The thing I’ve been most surprised by is how shockingly simple the main loop of using an LLM with tool use is:
def loop(llm):
    msg = user_input()
    while True:
        output, tool_calls = llm(msg)
        print("Agent: ", output)
        if tool_calls:
            msg = [handle_tool_call(tc) for tc in tool_calls]
        else:
            msg = user_input()
There’s some pomp and circumstance to make the above work (here’s the full script)
, but the core idea is the above 9 lines. Here, llm() is a function that sends the system prompt, the conversation so far, and the next message to the LLM API.
Tool use is the fancy term for “the LLM returns some output that corresponds to a schema,” and, in the full script, we tell the LLM (in its system prompt and tool description prompts) that it has access to bash.
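For the bash tool, handle_tool_call can be as simple as running the command and returning its output (a sketch; the field names on the tool call are assumptions, not Sketch’s actual schema):

import subprocess

def handle_tool_call(tc):
    # Run the shell command the model asked for and feed the output back.
    result = subprocess.run(tc["input"]["command"], shell=True,
                            capture_output=True, text=True, timeout=60)
    return {"tool_call_id": tc["id"], "output": result.stdout + result.stderr}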
With just that one very general purpose tool, the current models (we use Claude 3.7 Sonnet extensively) can nail many problems, some of them in “one shot.” Whereas I used to look up an esoteric git operation and then cut and paste, now I just ask Sketch to do it. Whereas I used to handle git merges manually, now I let Sketch take a first pass. Whereas I used to change a type and go through the resulting type checker errors one by one (or, let’s be real, with perl -pie ridiculousness), I give it a shot with Sketch. If appropriately prompted, the agentic loop can be persistent. If you don’t have some tool installed, it’ll install it. If your `grep` has different command line options, it adapts. (It can also be infuriating! “Oh, this test doesn’t pass… let’s just skip it,” it sometimes says, maddeningly.)
For many workflows, agentic tools specialize. Sketch’s quiver of tools is not just bash, as we’ve found that a handful of extra tools improve the quality, speed up iterations, and facilitate better developer workflows. Tools that let the LLM edit text correctly are surprisingly tricky. Seeing the LLM struggle with sed one-liners re-affirms that visual (as opposed to line) editors are a marvel.
I have no doubt that agent loops will get incorporated into more day-to-day automation tedium that’s historically been too specific for general-purpose tools and too esoteric and unstable to automate traditionally. I keep thinking of how much time I’ve spent correlating stack traces with git commits, and how good LLMs are at doing a first pass on it. We’ll be seeing more custom, throw-away LLM agent loops in our bin/ directories. Grab your favorite bearer token and give it a shot.
...
Read the original on sketch.dev »