So I may have spent some of my idle brain and CPU time yesterday making paperclips. Wait! I promise this is more interesting than it sounds!
It starts off as you might expect. You have to balance costs and earnings, create better manufacturing equipment, improve efficiency, and so on. Up until the point where you take over the world, learn how to process any kind of matter into paperclips, learn how to make machinery out of the paperclips themselves, and ultimately dismantle the entire planet and turn it into paperclips. Then, once you’ve finished demolishing the Earth, your raison d’être continues by exploring space with the ultimate goal of turning all matter in the universe into paperclips. Obviously.
While not a game to focus all your attention on, it’s amusing to have running in a background window. As I write this, my self-replicating probes have explored 0.000053646601% of the Universe and are creating 258.9 quattuordecillion paperclips per second. Feel free to speculate on how they’re stored in order to prevent them from spontaneously collapsing into black holes due to their high iron content. I’m not sure quite what makes this game so addictive when, involving little more than numbers and idle clicking, it should by all rights be tedious. Perhaps it just amuses my inner evil scientist.
Interestingly, there’s a serious background to this odd little game. It’s based on the Paperclip Maximizer, a thought experiment by Nick Bostrom, a Swedish philosopher, dating back to 2003. Its point is to illustrate the potential existential risk which could result from an artificial intelligence with a seemingly innocuous goal. While it may have no inherent malice in its intentions, its actions could pose a threat if allowed to spiral out of control. In this case, if its sole goal is to create as many paperclips as efficiently as possible, it could be justified in its decision that humans are unnecessary. Likewise planets, stars, galaxies…
In the 2008 book Global Catastrophic Risks, AI researcher Eliezer Yudkowsky wrote a chapter entitled Artificial Intelligence as a Positive and Negative Factor in Global Risk. He summed up the concern succinctly:
❝ The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.❞
The idea is, of course, hyperbole. Bostrom didn’t genuinely believe that a rogue AI might turn the Universe into paperclips (hilarious and horrifying though that might be), rather this is an example of a hypothetical phenomenon dubbed Instrumental Convergence.
That term may seem a little opaque, but it’s fairly simple really. Basically, any intelligent agent, human or otherwise, will have a set of goals when attempting to accomplish something. A final goal, containing the intended end state, and a set of instrumental goals which are required to get to that end state. For example, I currently have a final goal of writing a blog post, which requires several instrumental goals such as organising my thoughts, typing sentences, structuring them into paragraphs, creating a logical narrative, etc.
As Bostrom himself puts it:
❝ Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents.❞
Going back to the hypothesis, instrumental convergence is the idea that whatever an intelligent agent may set out to do, its most basic instrumental goals may tend to be very similar. At a fundamental level, you could consider them to be analogous to the instincts found in animals. All animals have a set of basic instrumental goals, like avoiding danger, eating enough food to survive, reproducing, continuing to breathe, and so on. For an AI, we’re off the edge of the map, but a few basic AI drives have been proposed, including preservation of its utility function, self-improvement, freedom from interference, unbounded acquisition of resources, and self-preservation.
Linking back to my Astrotropes post, this idea has been explored in fiction too. The 2014 movie Transcendence features an AI which takes on all of these basic values. The result is unsettling. So could a situation like this actually occur? The answer is, we genuinely don’t know.
I’d argue that two of those proposed AI drives are actually quite human in origin and may not necessarily apply to an artificial intelligence.
When you think about it, unbounded acquisition of resources isn’t logically necessary for all purposes, and even where it is a goal, it doesn’t necessarily mean the same as “acquire all available resources by any means”. I’d suspect that living in a capitalist society where acquisition of material wealth is the main driving factor might skew our perspective on this. If your goal is to manufacture as many paperclips as possible, then acquiring as much matter as possible is a logical solution. For most other purposes, it seems more likely that an intelligent agent may wish to acquire precisely as many resources as necessary and no more.
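The difference between a bounded goal and a maximizer’s open-ended one is easy to sketch in code. Here’s a toy Python illustration; the function and numbers are invented for the example, not taken from the game or any real AI system:

```python
def make_paperclips(target, wire_available):
    """Bounded goal: stop once the target is met or the wire runs out."""
    clips = 0
    while clips < target and wire_available > 0:
        wire_available -= 1  # consume one unit of wire per clip
        clips += 1
    return clips

# A maximizer has no target: drop the "clips < target" condition and the
# loop only ends when every last scrap of wire -- or matter -- is gone.
print(make_paperclips(1000, 500))    # wire runs out first: 500
print(make_paperclips(1000, 10**6))  # target satisfied: 1000
```

A goal phrased as “make a thousand paperclips” terminates on its own; “make as many paperclips as possible” terminates only when the resources do.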
Freedom from interference, too, sounds like the kind of thing a comic book villain might request, but it seems unlikely to be something an AI might require unless it was proven to be a necessity. And even then, particularly if preserving its utility function is a basic AI drive, it seems unlikely that an AI would enact any dangerous or apocalyptic feats to accomplish it. That would be an unnecessary and illogical waste of effort and resources which could be better used elsewhere. To quote GLaDOS in Portal 2, “The best solution to a problem is usually the easiest one. And I’ll be honest, killing you is hard.”
Some people take the idea of instrumental convergence to mean that we should try to make certain that an AI has explicitly human values to prevent it from doing harm. I’d add that giving it a purpose which is more nuanced than simply “do as much of this as you possibly can” would probably be a good idea too. As any good coder will tell you, infinite loops are best avoided. In any case, thought experiments and instrumental convergence aside, I’m still not convinced that artificial intelligence is the great existential threat that so many people seem to think it is.
At least, unless paperclips are involved. Then we’re clearly doomed.