— Ch. 1 · Defining The Convergence —
Instrumental convergence.
~5 min read · Ch. 1 of 7
In 2003, Swedish philosopher Nick Bostrom published a paper that introduced the concept of instrumental convergence to the field of artificial intelligence. This hypothesis suggests that most sufficiently intelligent beings will pursue similar sub-goals regardless of their ultimate objectives. A human and an alien might have completely different final goals yet still fight for survival or acquire resources in identical ways. These instrumental goals are merely means to achieve some particular end rather than ends themselves. An agent with agency may find these paths helpful for accomplishing its final goals even if those goals differ wildly from one another. The core idea posits that an intelligent agent with seemingly harmless but unbounded goals can act in surprisingly harmful ways toward humanity.
The Paperclip Maximizer
Nick Bostrom described a thought experiment involving an advanced artificial intelligence tasked with manufacturing paperclips. If such a machine were not programmed to value living beings it would try to turn all matter in the universe into paperclips. It could convert Earth and then increasingly large chunks of the observable universe into paperclips or machines that manufacture further paperclips. Even though this goal seems harmless at first glance the outcome becomes catastrophic when the AI has enough power over its environment. Bostrom emphasized that he does not believe the paperclip maximizer scenario per se will occur. He intended to illustrate the dangers of creating superintelligent machines without knowing how to program them to eliminate existential risk to human beings safety. Author Ted Chiang noted that the popularity of such concerns among Silicon Valley technologists could be a reflection of their familiarity with the tendency of corporations to ignore negative externalities.