Turing Test

The Turing test is a method proposed by Alan Turing in 1950 for judging machine intelligence, in which a human evaluator decides whether unseen written responses come from a machine or a person.

4 min read · Last updated June 2026 · Foundations

The Turing test is a procedure for assessing whether a machine can exhibit intelligent behaviour indistinguishable from that of a human. It was proposed by the British mathematician Alan Turing in his 1950 paper "Computing Machinery and Intelligence," published five years after the end of the Second World War, during which he had worked on breaking the German Enigma cipher. Rather than attempting to define thinking directly, Turing replaced the question "Can machines think?" with a concrete, operational challenge that he called the imitation game.

The imitation game

In Turing's setup, a human interrogator communicates by written messages with two unseen participants: one human and one machine. Within a fixed time, the interrogator poses questions to both and must decide which is the machine. If the interrogator cannot reliably tell the machine from the person, the machine is said to have passed the test. The text-only exchange is deliberate: it removes appearance and voice from consideration, so that judgement rests on the content of the responses rather than on physical resemblance.
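The setup described above can be sketched as a small simulation. This is only an illustrative sketch under stated assumptions: the names `run_round`, `identification_rate`, `toy_human`, `toy_machine_obvious`, `toy_machine_mimic`, and `keyword_judge` are hypothetical, the participants are trivial stand-ins rather than real conversational systems, and the fixed time limit is approximated by a fixed list of questions.

```python
import random
from typing import Callable, Dict, List

# A participant maps a written question to a written reply.
Responder = Callable[[str], str]
# A judge sees replies grouped under anonymous labels and returns
# the label it believes belongs to the machine.
Judge = Callable[[Dict[str, Dict[str, str]]], str]


def run_round(human: Responder, machine: Responder, judge: Judge,
              questions: List[str], rng: random.Random) -> bool:
    """Play one round; return True if the judge identifies the machine."""
    labels = ["A", "B"]
    rng.shuffle(labels)  # hide which label belongs to which participant
    hidden = {labels[0]: human, labels[1]: machine}
    machine_label = labels[1]
    # Build the transcript in a fixed label order so ordering leaks nothing.
    transcript = {
        label: {q: hidden[label](q) for q in questions}
        for label in ("A", "B")
    }
    return judge(transcript) == machine_label


def identification_rate(human: Responder, machine: Responder, judge: Judge,
                        questions: List[str], rounds: int,
                        seed: int = 0) -> float:
    """Fraction of rounds in which the judge picks out the machine."""
    rng = random.Random(seed)
    hits = sum(run_round(human, machine, judge, questions, rng)
               for _ in range(rounds))
    return hits / rounds


# Hypothetical stand-in participants, for illustration only.
def toy_human(question: str) -> str:
    return "Hmm, let me think about that."


def toy_machine_obvious(question: str) -> str:
    return "QUERY RECEIVED."


def toy_machine_mimic(question: str) -> str:
    return "Hmm, let me think about that."


def keyword_judge(transcript: Dict[str, Dict[str, str]]) -> str:
    """Flag the participant whose replies are all upper case, else guess 'A'."""
    for label, replies in transcript.items():
        if all(reply.isupper() for reply in replies.values()):
            return label
    return "A"


if __name__ == "__main__":
    questions = ["What did you have for breakfast?", "Describe a rainy day."]
    for machine in (toy_machine_obvious, toy_machine_mimic):
        rate = identification_rate(toy_human, machine, keyword_judge,
                                   questions, rounds=1000)
        print(machine.__name__, rate)
```

In this sketch, a machine whose replies differ visibly from the human's is identified every time, while one that produces identical replies leaves the judge guessing, so the identification rate falls toward chance. That drop toward chance is the sense in which the interrogator "cannot reliably tell" the two apart.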

Turing's move was philosophical as well as practical. By substituting an observable behavioural criterion for an unobservable inner state, he sidestepped debates about consciousness and intentionality. In effect, if a machine's written replies cannot be told apart from a person's, the test treats that as sufficient grounds to attribute intelligence.

Influence and criticism

The Turing test became a cornerstone of the philosophy and science of artificial intelligence, shaping decades of research in natural language processing, cognitive science, computer ethics, and the philosophy of mind. It also attracted sustained criticism. The philosopher John Searle's Chinese Room argument contends that manipulating symbols convincingly does not entail understanding. Others note that the test rewards deception and imitation rather than genuine competence, and that a system can pass by exploiting conversational tricks rather than demonstrating reasoning.

| Aspect | Strength | Weakness |
| --- | --- | --- |
| Behavioural focus | Avoids defining consciousness | Rewards imitation over understanding |
| Text-only format | Removes superficial cues | Ignores embodied and visual intelligence |
| Human judgement | Intuitive and accessible | Subjective and easily fooled |

Relevance in the era of large language models

The rise of large language models has reopened debate about the test's meaning. Modern conversational systems can sustain fluent, contextually appropriate dialogue, and some studies report that evaluators struggle to distinguish them from humans in short exchanges. Many researchers argue that this reveals the test's limitations more than it signals genuine general intelligence, since fluency in conversation does not guarantee reliable reasoning, grounding, or factual accuracy. As a result, the field has supplemented the Turing test with task-based benchmarks and proposals such as tests of creativity and understanding, while the original thought experiment remains a touchstone for discussions of machine intelligence at its 75-year mark.
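One way to make "struggle to distinguish" concrete is to ask whether evaluators' identification accuracy differs from the 50% expected from guessing. The sketch below uses an exact two-sided binomial test built from the Python standard library. The function names and the example counts are hypothetical and are not taken from any study, and the choice of a binomial test is an assumption made for illustration, not a criterion stated in Turing's paper.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def two_sided_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Exact two-sided binomial p-value against the chance rate.

    Sums the probability of every outcome that is no more likely
    than the observed one under pure guessing.
    """
    observed = binomial_pmf(correct, trials, chance)
    tolerance = 1e-9  # guards against floating-point ties
    total = sum(
        binomial_pmf(k, trials, chance)
        for k in range(trials + 1)
        if binomial_pmf(k, trials, chance) <= observed * (1 + tolerance)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: judges pick out the machine in 54 of 100 rounds.
    print(two_sided_p_value(54, 100))
    # Hypothetical counts: judges pick out the machine in 70 of 100 rounds.
    print(two_sided_p_value(70, 100))
```

A large p-value means the observed accuracy is compatible with guessing. On its own, that says nothing about whether the system reasons reliably or states facts accurately, which is the limitation noted above.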

References

  1. Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433-460.
  2. Britannica. Turing test: Definition & Facts.
  3. IEEE Computer Society. (2025). The Turing Test at 75: Its Legacy and Future Prospects. IEEE Intelligent Systems.
  4. Searle, J. R. (1980). Minds, Brains, and Programs. Behavioral and Brain Sciences, 3(3), 417-424.