How to test-drive an AGI: linguistic indicators of conceptual capabilities

Let’s say we find ourselves presented, at some point in the not-too-distant future, with an AI that claims to be able to communicate with us in a useful way about making some kind of a plan. Before engaging in a consequential way with this AI, we would reasonably like to have a good degree of confidence in its ability to do what it says it can do. How do we gain this confidence? One place to start would be to wonder what we would look for in a reliable human collaborator, and see whether we can identify that in the AI as well. There is a problem here, though, and it’s not so much that the AI is simply different from a person (it is very different indeed for the most part, but it might be similar in the right ways at the right level of abstraction), nor is it that we can’t make enquiries of the AI (we can in fact look at its training data and also prod it to see whether it behaves the way it claims to). The problem, or at least one big problem, with comparing an expert AI to an expert human is that in a very fundamental way we don’t understand how the human became an expert in the first place. We can talk about the fact that they gained the qualifications and experiences necessary to become an expert, but when we try to understand how that really works, we inevitably fall back on the somewhat ineffable idea that becoming knowledgeable is simply what happens to humans in the course of living life.

If we don’t truly understand our own ability to experience, know, and eventually communicate about the world as it is and as it could be, there’s not really any point in trying to understand the same things about a machine. Given this, we inevitably find ourselves in a familiar position, faced with some version of the famous Turing Test, which suggests that all we can really consider when making our judgements about the capabilities of an AI is how it compares, on the level of its observable performance, with a human, without worrying too much about how either human or machine became the way they are. Is this really satisfactory, though? In this article, taking the Turing Test as a starting point, we’ll try to get a glimpse of the relationship between language, thinking, and the way that ideas about words and thoughts might bear on our assessment of a goal-accomplishing AI:

  • We’ll consider the Turing Test and its potential application to recent and forthcoming developments in plan-making AI;
  • We’ll look at the historical context in which the Turing Test was proposed, focusing on the field of behaviourism and taking into account how that context may have influenced the formulation of the test;
  • We’ll consider developments in ideas about language and behaviour that happened in the years after the Turing Test was formulated, looking at how these ideas bear on the way the test has since been interpreted and applied;
  • We’ll consider other modes of linguistic evaluation of a thinking machine that might offer glimpses into the way the machine goes about actually constructing concepts;
  • We’ll reflect on how a reconsideration of the way we evaluate the cognitive capacities of machines might begin to inform the ongoing pursuit of a more generally strategic mode of AI.

At the end of this excursion, we arrive not at an answer to the question of how to judge a machine’s ability to make plans, but at a broader question about how the relationship between words and ideas bears on our interaction with machines.

Thinking machines, talking machines, and artificial intelligence

An area that is emerging as the frontier of AI as we arrive at the one-year mark since the introduction of OpenAI’s ChatGPT platform is the ability of an AI to consistently formulate relatively complex and reliably effective plans for accomplishing an open-ended assortment of goals in the world. We might as well bracket this objective in the field with the term "planning", as suggested by Yann LeCun when he says "one of the main challenges to improve LLM reliability is to replace Auto-Regressive token prediction with planning" – a rough interpretation of this statement is something like "we should stop building models that just predict what to say next in some conversation and start building models that predict what to say about what to do next in some situation". Two looming projects indicative of the appetite for planning technology are Google’s Gemini and OpenAI’s Q*. Speaking about Gemini earlier this year, DeepMind CEO Demis Hassabis described it as "combining some of the strengths of AlphaGo-type systems with the amazing capabilities of the large models", referring on the one hand to the astonishing success of the strategy-game-learning techniques developed by DeepMind and on the other to the rampant language-producing capabilities recently exhibited by large language models (which are, to borrow LeCun’s term, "auto-regressive token predictors") like PaLM (from Google) and GPT (from OpenAI).

The idea that the next significant advance in AI is going to involve an amalgamation of coming up with plans and then also coming up with words to describe those plans raises an essential question: is the ability to generate sentences that describe a good plan, in response to a request for that plan, tantamount to the ability to understand the desire for the outcome of the plan, the nature of the world in which the plan will happen, and how the plan transforms that world to accomplish its outcome? The idea that the answer to that question could be "yes" must be linked with the collective popular fascination with the famous Turing Test, which seeks to evaluate the cognitive capabilities of an artificial agent on the basis of the inputs and outputs associated with that agent alone. The point of Alan Turing’s original formulation of his test is not so much that a machine which on some level of abstraction acts in a human-like way is essentially commensurate with a human; it is rather that when it comes to assessing a machine there is no recourse to considering anything other than the appearance of its operations, and so there is no point in wondering whether the machine possesses other unobservable properties that might be associated with aspects of being human. So we might consider that, per the Turing Test, a machine that uses words to talk about plans like a competent human can also be considered to be, in Turing’s words, a "thinking machine", inasmuch as "thinking" must be an important part of "making plans".

Putting the Turing Test in context

In order to understand where Turing’s idea for his test comes from and how it relates to the current understanding of language and cognition, it’s worthwhile to consider that Turing was working in a milieu in which the prevalent trend in discussions about thinking revolved around the field of behaviourism, which sought to construe animal and human activities in terms of only objectively observable parameters such as "stimulus" and "response". With roots in experiments on animal conditioning carried out by researchers such as Ivan Pavlov, by the middle of the 20th Century a leading figure in the area was B. F. Skinner, who explored the way that behaviour becomes embedded through a process of "reinforcement" received by an agent in its interactions with its environment. By the time Turing was formulating the basis for his test, Skinner’s experiments involving relatively low-level behaviours such as animals learning how to act in order to attain food had made Skinner himself a publicly recognised authority on psychology.

To put this on a timeline, Skinner outlined core concepts of the behaviourist thesis in his influential book The Behavior of Organisms in 1938, and by the end of the 1940s he was a publicly recognised figure writing for the popular press and involved in the marketing of behaviouristic gadgets for child-rearing and education. Turing wrote his paper Computing Machinery and Intelligence, from which the Turing Test has been extrapolated, in 1950. The highlighting of this juxtaposition is not intended to suggest that Turing was directly influenced by or commenting on Skinner and the work of other behaviourists, but it seems impossible to ignore the relationship between Turing’s particular ideas about machines and the more general enthusiasm for objective external assessment of ostensibly intelligent behaviour within the psychology community of the time. Turing would die tragically four years later, in 1954, with behaviourism still in the ascendant. In 1957, Skinner turned the general behaviourist methodology to the topic of language (and so to more complex modes of human behaviour) in his book Verbal Behavior, seeking to formulate a theory of language acquisition projected onto the framework of stimulus and response.

In 1959, Noam Chomsky published a review of Verbal Behavior which effectively demolished the idea that language – and arguably, by extension, all human activities associated with the complex apparatus of cognition – could reasonably be studied from a merely behaviourist perspective. The thrust of Chomsky’s critique was that the relationship between the elements of a language, that is, its syntax and semantics, and the way a language-user might apply those elements to situations in the world they occupied was far too dynamically enmeshed with the language-user themselves to be captured in terms of inputs and outputs. As Chomsky put it:

"One would naturally expect that prediction of the behavior of a complex organism (or machine) would require, in addition to information about external stimulation, knowledge of the internal structure of the organism, the ways in which it processes input information and organizes its own behavior."

To support his point, Chomsky illustrates the way in which the ability to use language is acquired without any requirement to be directly exposed to specific stimulus-response conditions involving particular sentences, phrases, or even words (we make new sentences all the time, and are capable from an early age and then throughout life of acquiring new words and coming up with novel applications of new words in a single-shot manner). In fact the way that humans acquire language in the course of their cognitive development seems to overtly contradict the behaviourist hypothesis, with children achieving the ability to both comprehend and creatively apply language in the absence of anything like environmental stimuli conditioning for and then reinforcing particular applications of elements of language.

Many of the points that Chomsky made in 1959 seem directly relevant to the production of language-based AI today, given that the parameters by which we would probably consider evaluating such a system – its ability to produce linguistic outputs that manage to be simultaneously relevant and helpful and at the same time to some extent pleasantly novel and even outright creative – look like precisely those abilities which Chomsky says Skinner’s models of environmentally reinforced behaviour can’t explain. Chomsky’s conclusion regarding Skinner’s approach to language is that it amounts, at best, to a statement of the problem of language: how does language become so evidently effective at mediating all the layers of interiority and exteriority involved in the dynamics of language-users communicating in and about a messy environment? Beyond this, though, behaviourism makes no headway on actually solving the problem it presents; it merely leaves us with a pointer to some function that still needs to be described.

While Chomsky’s assessment of Skinner’s approach to language shouldn’t necessarily be taken as a direct rebuttal of Turing’s proposal for evaluating a potentially thinking machine, it seems that Chomsky’s critique could and should at least bear on any consideration of the Turing Test, and the tragedy of Turing’s death is compounded by the fact that we will never know how Turing himself might have reacted to the evolution of fields like cognitive science and linguistics. It’s probably also unfortunate that Skinner chose to react to Chomsky’s review with a sustained combination of deflection and disdain, effectively shutting down any direct dialogue between the two thinkers. Chomsky, meanwhile, has become probably the central figure in a debate that has defined linguistics ever since, revolving around the question of where exactly language resides in the psychological development and overall cognitive schema of a human. Is language an innate human attribute (even an instinct), perhaps encoded at the level of genetics through evolutionary processes? Or is it more like an artefact of human culture and society, perpetually emerging from the complex networks of activities in which humans collaboratively participate? Is thought itself a kind of language, or is the experience of being a thinking agent just what language is ultimately somehow about? To what extent does having a language in general, and having one particular language rather than another, shape a person’s cognitive experience, including their outright perception of the world?

It is perfectly reasonable to accept Chomsky’s rebuke of the behaviourist methodology without embracing other aspects of what is sometimes referred to as the "nativist" programme in linguistics (the idea that language is somehow directly encoded into the essence of being human, and so is necessarily much more than conditioned behaviour). In fact, the point of highlighting these open questions in the field of linguistics here is simply to illustrate that we don’t have anything approaching a consensus on the role that language plays in our own existence as human agents, and so, turning back to the question of evaluating the success of AGI, it seems completely futile to try to use linguistic content as the prima facie basis for gauging the degree to which an AI might pass as one of Turing’s thinking machines. If we are to consider language as the arena in which our assessment of the intelligence of something structurally radically different from us will play out, it seems we first need to reach some kind of agreement about what exactly language does for us and how it does it. This point of abiding uncertainty about the nature of the language we would use to evaluate AI seems absent from the current debate, generally to the detriment of what could be considered the more human side of the issue.

As a case in point, Brian Christian’s excellent, accessible, and often prescient treatment of some of the core issues around building AI, The Alignment Problem (2020), offers a well-researched presentation of Skinner’s behaviourism and its relationship to more recent trends in machine learning, but makes no mention of Chomsky’s intervention or of the subsequent developments in cognitive science. This is one of many examples of a rather severe schism that seems to have developed over the past couple of decades in particular, by which critics and champions of data-driven approaches to modelling intelligent agents alike are not so much in dispute over some of the most fundamental concerns in contemporary ideas about language and mind as simply unaware that these concerns exist.

Other ways of talking to talking machines

Where does this leave us in terms of our assessment of, and so, crucially, confidence in, an AI that uses language to talk about more abstract conceptual structures, such as complex plans? It is certainly the case that the process of learning undertaken by a data-driven AI, which in the instance of language involves exposure to a vast amount of purely linguistic (not environmentally grounded) data, doesn’t look much at all like what happens when a human learns language in the course of cognitive development (in which a child has sparse exposure to the language itself, but rich exposure to the thing that language is about, namely, the world and the ways of thinking agents in that world). But it seems like a step too far to insist that any agent that we accept as using language to reveal the handling of concepts must be in every respect isomorphic with a human language-learner; we don’t need to insist, for instance, that a plan-making AI be human-like in a physiological or biochemical sense to accept its utility.

Maybe one way we could approach the evaluation of a language-using, concept-handling AI is by seeking evidence that the language it uses is not only superficially indicative of what a human might interpret as underlying conceptual structure, but is also in general characterised by some of the same structural features we expect to find in the relationships between language and ideas. Here are some questions about a linguistic AI that could be empirically investigated (a rough code sketch of how the first of these probes might be set up follows the list):

  • Does the AI have the ability to learn new words and phrases on the go? An important feature of the human use of language is the lifelong ability of a language-user to update their vocabulary, and these updates can be taken as evidence of a corresponding capacity for adjusting a world view and the concepts associated with that view. Learning here should be able to occur either through an essentially ostensive indexing ("this new word X means that Y there") or through observation of actual sentential application of the new element of language ("X happened and so the world was like Y"). Can an AI both respond to and apply new language based on essentially one-shot exposure?

  • Is the AI able to take novel words and phrases and apply underlying language-specific assumptions about the ways that linguistic surface forms can be manipulated in order to express novel concepts? This might be gauged, for example, in terms of something like what cognitive linguists refer to as a "construction grammar": an inventory of learned pairings of form and meaning, which at the level of words includes morphological patterns for how new words can be formed. A paradigmatic example (from Ronald Langacker) is the way that knowledge of the adjective "sharp" is the basis for both understanding and using (possibly without previous exposure) things like the verb "sharpen" and the noun "sharpener". Can an AI that works with a language both understand and concoct novel constructions that adhere to the constructive expectations of human users of the same language?

  • Can the AI handle the application of previously unobserved idioms? An idiom is a phrase, perhaps sometimes verging on a proverb, that takes on a specific lexicalised meaning, often in a likewise culturally specific context. Examples of idioms are sayings like "spill the beans" and "under the weather", which have broadly accepted meanings that may well not be obvious from a literal interpretation of the words they combine. Humans may not immediately recognise the meaning of an idiom in their own language at a first encounter if it is presented out of context, but they will generally grasp it quickly when they encounter it in a context that makes the intended interpretation clear, and will then understand future applications of the idiom and adopt it for their own usage. Can an AI do the same?

  • Is the AI able to deal with figurative language? Here "figurative" refers to ways in which the interpretation of an instance of language can diverge from the literal meaning of the words being interpreted. The perhaps paradigmatic example of figurative language is metaphor, but an exploration should also include examples of metonymy and any other way in which an element of language is coerced out of its lexicalised role. Some instances of figurative language are themselves effectively lexicalised and will be well attested in AI training data: for instance, even though "love" is not literally a container, a representative corpus of English linguistic data will include discussions of various aspects of "falling in love", and we might expect an AI trained on this data to be able to talk about the concept of love in terms of the properties of a (hazardous?) container. But could the same AI handle the interpretation of novel metaphors, and extend mappings of the new metaphors between different aspects of the conceptual domains they link?

  • Can the AI learn a new language in a similar way to a human? This one is a bit harder to evaluate empirically, because it would have to involve training something like an LLM based on data from only one language, and then exposing it through conversation to another language in the way a teacher might talk to someone learning that language. It would also require that the LLM have a capacity for updating semantic memory that is not currently incorporated directly into these models. But with the right modifications, this could be an interesting way of exploring the extent to which the AI appears to have acquired what could be referred to as a "language module" – whatever that actually means.
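To make the first of these probes a little more concrete, here is a minimal sketch, in Python, of what a one-shot novel-word test harness might look like. Everything in it is an assumption rather than a description of any existing system: the invented word "glarp" and its definition are arbitrary, ask_model is a hypothetical stand-in for whatever text-in, text-out interface the AI under evaluation exposes, and the crude keyword scoring is merely a placeholder for the human (or carefully validated automated) judgement a real study would require.

```python
# A rough sketch of a one-shot novel-word probe for a linguistic AI.
# Assumptions (not from the article): `ask_model` stands in for whatever
# text-in/text-out interface the AI under test provides, the invented word
# "glarp" and its definition are arbitrary, and the scoring below is a crude
# placeholder for a proper human (or validated automated) judgement.

from typing import Callable

# The AI under evaluation, reduced to a text-in/text-out callable.
ModelFn = Callable[[str], str]

NOVEL_WORD = "glarp"
ONE_SHOT_DEFINITION = (
    'In this conversation, to "glarp" something means to postpone it '
    "until the weather improves."
)

# Each probe pairs a label with a question; the single definitional exposure
# is prepended to every question to simulate one-shot learning in context.
PROBES = [
    # Comprehension: can the model interpret a sentence that uses the new word?
    ("comprehension",
     'If I say "we decided to glarp the picnic", what did we decide to do?'),
    # Production: can the model apply the new word to a situation it describes?
    ("production",
     "Using the new word, describe what a builder might do with an outdoor "
     "job during a week of storms."),
    # Extension: does the model generalise the surface form ("glarped") the
    # way a human speaker of English would, echoing the construction-grammar
    # question above?
    ("extension",
     'The match yesterday was affected in exactly this way. Complete the '
     'sentence: "The match was ..."'),
]


def run_novel_word_probe(ask_model: ModelFn) -> dict[str, str]:
    """Expose the model to the definition once, then collect its responses."""
    transcript: dict[str, str] = {}
    for label, question in PROBES:
        prompt = f"{ONE_SHOT_DEFINITION}\n\n{question}"
        transcript[label] = ask_model(prompt)
    return transcript


def crude_score(transcript: dict[str, str]) -> dict[str, bool]:
    """Placeholder judgement: does each response engage with the new word in
    roughly the expected way? A real study would use human raters."""
    return {
        "comprehension": any(
            phrase in transcript["comprehension"].lower()
            for phrase in ("postpone", "put off", "delay")
        ),
        "production": NOVEL_WORD in transcript["production"].lower(),
        "extension": "glarped" in transcript["extension"].lower(),
    }
```

The interesting part of a harness like this is not the scoring heuristic, which is deliberately simplistic, but the shape of the experiment: a single definitional exposure followed by separate comprehension, production, and morphological-extension probes, which maps loosely onto the first two questions in the list above. The idiom and figurative-language questions could be framed in much the same way, with invented idioms and novel metaphors taking the place of the invented word.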

One of the points of the empirical objectives outlined here, offered for now just as rough sketches of possible research projects, is that they are ways of using what is in the language to make preliminary assessments about what is in the conceptual structure of the language-user. This is proposed as a step towards getting past the inadequacy of assuming that because an agent can talk with authority and in detail about some particular thing (for instance making a plan), it can also do that thing, or even really understand that thing, in a way that is useful enough to make the agent more than an interactive encyclopaedia or a template for a decision tree. If we accept the idea that there is a degree of isomorphism between the language an agent has and the worldview that agent might have (which is not an uncontroversial point, but here it is posited in the most general way: that there is just some connection, without getting too specific about the nature of that connection), then we can begin to think about how we can use purely linguistic enquiries to test the durability of the agent’s corresponding idea of the world.

Language learning: necessary but not sufficient for AGI

In the end, the ability to work with the conceptual structures traced by a language, with its particular syntax, semantics, and pragmatics, probably has to be taken as a necessary condition for accepting that an agent can make plans about the world, but not as an absolutely sufficient one. We may gain a degree of confidence in an AI by witnessing its ability to make use of the familiar architecture of language, but we can never really be confident that this is not a performance of ingenious pattern matching. In this respect we are still caught within the imitation game as construed by Turing: unable to know an AI by anything other than what it shows us, we can only try to press it to reveal the outline of an inaccessible part of its apparatus that may or may not exist. Another step towards solving the puzzle of knowing the structure, and not just the behaviour, of a plan-making agent could involve introducing a likewise structured element into its operation: to return to the comments from LeCun and Hassabis quoted at the beginning of this post, the integration of a strategy-making model with a meaning-making model. Such an amalgamation might make the conceptual mechanisms revealed through an appropriate linguistic analysis of an AI’s behaviour seem more convincing.

And we need to balance out the frustration of the interior inscrutability of any model with the utter inadequacy of our knowledge of ourselves. How can we possibly expect an AI to reveal how language really works within its structure, when there is no consensus at all on how it works within our own structure (our individual bodies, our societies, our shared histories, our evolution)? As things stand, how would we know the answer to the question "does this agent think in the same way that I mean when I say I’m thinking" if we don’t even really know what we mean about ourselves when we say we’re thinking?

For now one hopeful conjecture offered in parting is that these agents, properly configured, may ultimately help us, perhaps because of their utter otherness, to really understand ourselves and to find our way through the world. If there’s a chance of that, it seems worthwhile to use the tools we have at our immediate disposal to try to advance this technology.

