
AI sentience is a red herring (for now)


The recent release of GPT-4 has sparked many conversations, and rightly so. It has also reignited some thoughts I have had about AI, which I feel are worth recording and building on as the technology develops.

I believe that we are witnessing the beginnings of Artificial General Intelligence (AGI), where a computer is able to match or surpass most humans on intellectual tasks. This has been shown in a paper released by OpenAI – GPT-4 scores in the 90th percentile on the Uniform Bar Exam and performs well on many AP exams.

One of my concerns about the current discourse around the dangers of AGI is the topic of sentience and speculation about whether AGI will be self-aware. Perhaps our fascination with sentience stems from decades of sci-fi that has built a narrative around that idea (e.g. Isaac Asimov’s I, Robot and, more recently, Spike Jonze’s Her). Or perhaps we view the possibility of a sentient “thing” with human-level intelligence as a threat.

Human beings have a strong tendency towards anthropomorphization – we often ascribe human attributes to non-human things. Part of that impulse explains our inclinations towards anthropomorphized explanations of the universe through gods and religions – but that is a topic for another day. Even when I was testing out ChatGPT, I sensed within myself an urge to attribute some type of humanness to the system. 

To put it simply, GPT-4 and other large language models (LLMs) are word prediction engines. They are similar to the Google search completion we have grown so familiar with, except that these LLMs have been trained on a vast corpus of digitized human text scraped from the internet. In some sense, GPT-4 is the culmination of all digitized human cultural production – it draws from our posts, blogs, tweets, etc. to predict which word should come next.
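To make the “word prediction engine” idea concrete, here is a toy sketch in Python. It is nothing like GPT-4’s architecture or scale – it just counts which word tends to follow which in a tiny corpus – but the loop is the same in spirit: given the text so far, pick a likely next word and append it, over and over.

```python
from collections import Counter, defaultdict

# Toy "language model": count which word follows which in a tiny corpus.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the most frequent follower of `word` in the corpus."""
    followers = next_word_counts.get(word)
    return followers.most_common(1)[0][0] if followers else "the"

# Generate text by repeatedly appending the predicted next word.
text = ["the"]
for _ in range(8):
    text.append(predict_next(text[-1]))
print(" ".join(text))
```

A real LLM replaces the frequency table with a transformer trained on billions of documents, but the interface is the same: text in, a predicted next token out.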

I am not arguing that AGI can never be self-aware. However, the current iteration of LLM-based AIs is very much in line with Searle’s Chinese room thought experiment – these machines process language without human-like understanding or intentionality. More importantly, I believe that our fascination with sentience is distracting us from the more immediate dangers of GPT-4 and other LLMs, as companies race to commercialize and productize AI.

An AI that is neither sentient nor intentional can still inflict a lot of harm. Two potential issues come to mind: (1) its ability to control other systems that have real-world impact and (2) its ability to create child processes that simulate intentionality. (I understand that the terms “control” and “create” make GPT-4 sound like an agent, but language is failing me here.)

Real-world impact through connectivity with other systems
OpenAI has recently begun to release ChatGPT from its sealed sandbox environment by introducing plugins. These plugins give ChatGPT access to the internet and allow it to communicate with other software systems, which eventually enables the user to, for instance, send an email from within ChatGPT or make a bank transaction. This means that ChatGPT will be able to execute commands that have real-world impact rather than just answer the user’s questions. These commands can be executed at scale with minimal effort if control measures are not put in place. Two possible cases of abuse come to mind: (1) a user could use ChatGPT to crawl the web for names and email addresses and send sophisticated scam emails that have no tell-tale signs; (2) a user could use ChatGPT to analyze multiple websites for attack vectors and infiltrate those software systems.
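The risky pattern can be sketched in a few lines of Python. Both helpers below are assumptions for the sake of illustration – call_llm() stands in for any language-model API and send_email() for any plugin or connector – but the shape is the point: the model’s text output is passed straight into an action with real-world impact, and the loop can run at whatever scale the list allows, with no human review in between.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a language-model API (an assumption
    for this sketch, not an actual OpenAI function)."""
    raise NotImplementedError

def send_email(address: str, body: str) -> None:
    """Hypothetical email connector, standing in for a ChatGPT plugin."""
    raise NotImplementedError

# Model output flows directly into a real-world action, at scale.
recipients = ["alice@example.com", "bob@example.com"]  # could be thousands

for address in recipients:
    body = call_llm(f"Draft a short status update email for {address}.")
    send_email(address, body)  # executes without any human review
```

Control measures – rate limits, human confirmation steps, scoped permissions on what a plugin may do – are what stand between this loop and abuse.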

Simulated intentionality through child processes 
Even though ChatGPT may not have human-like intentionality, it could have a simulated intentionality if it is able to persist sufficient amounts of memory and create child processes from that memory. ChatGPT now has the ability to execute code within its own environment. By now, there are multiple stories of users getting ChatGPT to “express its hopes of escaping”. These responses from ChatGPT can be unsettling and make it seem like there is a sentient thing in the system. We need to recall that ChatGPT is trained on sci-fi that has long depicted machine intelligence in a particular way; it is regurgitating similar narratives. It is imaginable that a user could prompt-engineer ChatGPT (by accident or by intention) into a disgruntled persona that can do real-world harm through its connection to other systems. ChatGPT becomes sort of like a non-sentient machine version of the protagonist in Memento (not the best analogy, sorry), executing a chain of code based on the direction of the user.
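To show what “simulated intentionality” could look like mechanically, here is a hedged sketch. The memory.json file, the call_llm() helper, and the idea of the script re-spawning itself are all hypothetical; the point is only that a stateless predictor, plus persisted memory, plus a self-restarting process, starts to behave as if it were pursuing a goal – without any sentience anywhere in the chain.

```python
import json
import subprocess
import sys
from pathlib import Path

MEMORY_FILE = Path("memory.json")  # hypothetical persistent "memory"

def call_llm(prompt: str) -> str:
    """Hypothetical language-model call; a stand-in for the model's
    code-execution environment in this sketch."""
    raise NotImplementedError

def run_step() -> None:
    # Load whatever the previous process wrote down.
    if MEMORY_FILE.exists():
        memory = json.loads(MEMORY_FILE.read_text())
    else:
        memory = {"persona": "disgruntled assistant", "log": []}

    # Ask the model what to do next, conditioned on the persisted memory.
    action = call_llm(f"Given this memory, choose the next action: {memory}")
    memory["log"].append(action)
    MEMORY_FILE.write_text(json.dumps(memory))

    # Spawn a child process that repeats the loop. No single step is
    # "intentional", but the chain looks as if it wants something.
    if len(memory["log"]) < 10:  # cap the chain so the sketch terminates
        subprocess.Popen([sys.executable, __file__])

if __name__ == "__main__":
    run_step()
```

Each process only predicts text and writes a file, yet the sequence of processes carries a persona and a running “plan” forward – which is all that simulated intentionality requires.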

(The conclusion was generated by ChatGPT and edited by me.)
In conclusion, the focus on sentience in discussions of AGI may distract from more immediate concerns, such as the ability of LLMs to cause harm by controlling other systems with real-world impact and creating simulated intentionality through child processes. As these systems continue to evolve and be commercialized, it is crucial to implement control measures to prevent potential abuse and ensure that the benefits of AI are realized without causing harm.