
The History of Artificial Intelligence: From Turing to Today

Key Takeaway: Artificial intelligence emerged from a 1956 summer workshop, survived two funding collapses, and exploded into a $294.1 billion industry through a pattern that repeats: breakthrough, hype, disappointment, quiet progress, breakthrough again.

Core Elements:

  • Turing’s 1950 question through the Dartmouth Conference founding moment
  • The golden age of symbolic AI and its limitations
  • Two AI winters and what caused each collapse
  • The deep learning revolution triggered by AlexNet in 2012
  • The transformer era: GPT, Claude, Gemini, and the 2022 ChatGPT inflection point
  • Current state: $500B OpenAI valuation, 800M weekly users, AGI predictions for 2026-2029

Critical Rules:

  • AI progress follows boom-bust cycles driven by overpromising and underdelivering
  • Compute limitations blocked algorithms that existed decades before they became practical
  • Data availability transformed what was possible more than algorithmic innovation alone
  • The transformer architecture introduced in 2017 powers every major current model
  • History suggests caution about timeline predictions while confirming genuine capability growth

What Sets This Apart: This timeline connects technical milestones to funding patterns and cultural moments, showing why AI winters happened and what conditions enabled the current explosion.

Next Steps: Trace the path from philosophical speculation to trillion-dollar valuations. Understanding this history reveals patterns that illuminate current debates about AGI timelines and AI risk.


Timeline Overview

The arc of AI history spans 75 years from philosophical question to global infrastructure. These milestones mark the critical transitions.

  1. 1950 — Alan Turing publishes “Computing Machinery and Intelligence”
  2. 1956 — Dartmouth Conference coins “artificial intelligence”
  3. 1958 — Frank Rosenblatt creates the Perceptron
  4. 1966 — ELIZA becomes the first chatbot
  5. 1974-1980 — First AI Winter
  6. 1987-1993 — Second AI Winter
  7. 1997 — Deep Blue defeats chess champion Kasparov
  8. 2012 — AlexNet revolutionizes deep learning
  9. 2017 — Transformer architecture invented
  10. 2022 — ChatGPT launches, AI enters mainstream
  11. 2025 — GPT-5, Claude Opus 4.5, AI agents emerge

The Philosophical Foundations (Pre-1940)

Ancient Dreams of Artificial Beings

Humanity imagined artificial life long before computers existed. Greek mythology described Talos, a bronze automaton protecting Crete. Medieval legends told of the Golem of Prague, a clay figure animated to defend. The Jaquet-Droz automata of the 1770s wrote and drew through mechanical precision.

Mary Shelley’s 1818 “Frankenstein” marked literature’s first serious exploration of creating artificial life and the consequences of that creation. These cultural precedents shaped how society would later interpret and fear AI.

Mathematical Foundations

The logical architecture enabling AI emerged across the 19th century. Charles Babbage designed the Analytical Engine in 1837, the first general-purpose computer concept. Ada Lovelace wrote the first algorithm for it in 1843 and recognized that machines might extend beyond pure calculation. George Boole developed Boolean algebra in 1854, establishing the AND, OR, and NOT operations underlying all digital computation.

Lovelace also articulated the first limitation claim about machine intelligence: “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.” This tension between capability and origination continues in debates about whether current AI truly understands or merely pattern-matches.


The Birth of AI (1940-1956)

McCulloch-Pitts Neural Network (1943)

Warren McCulloch, a neuroscientist, and Walter Pitts, a logician, published “A Logical Calculus of Ideas Immanent in Nervous Activity.” Their paper demonstrated that neurons could be modeled as logic gates. This mathematical framework established that networks of simple units could, in principle, compute anything computable. Every neural network since builds on this foundation.
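The core idea fits in a few lines. The unit below is a minimal illustrative sketch, not the paper's original formalism: a neuron fires when its weighted inputs reach a threshold, and different wirings of the same unit yield different logic gates.

```python
def mcp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fires (1) iff the weighted input sum reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# The same unit, wired differently, computes different logic gates
def AND(a, b): return mcp_neuron([a, b], [1, 1], threshold=2)
def OR(a, b):  return mcp_neuron([a, b], [1, 1], threshold=1)
def NOT(a):    return mcp_neuron([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(f"AND({a},{b})={AND(a, b)}  OR({a},{b})={OR(a, b)}")
```

Since any Boolean circuit can be assembled from AND, OR, and NOT, networks of such units can in principle compute anything computable, which is exactly the paper's claim.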

Alan Turing’s Vision (1950)

Turing’s paper “Computing Machinery and Intelligence” opened with the question that defined the field: “I propose to consider the question, ‘Can machines think?’”

Rather than debating consciousness philosophically, Turing proposed the Imitation Game. A human judge converses with a human and a machine through text. If the judge cannot reliably distinguish them, the machine passes. This behavioral test sidestepped questions about inner experience.

Turing addressed objections systematically: theological concerns about souls, mathematical limitations, Lady Lovelace’s originality objection. He predicted that by 2000, machines would fool judges 30% of the time in five-minute conversations. Current language models exceed this threshold, though whether passing the test demonstrates intelligence remains contested.

The First Neural Network Hardware: SNARC (1951)

Marvin Minsky and Dean Edmonds built the Stochastic Neural Analog Reinforcement Calculator at Princeton. Using 3,000 vacuum tubes to simulate 40 neurons, SNARC learned to navigate a virtual maze. This first hardware implementation proved neural networks could learn from experience, not just execute programmed rules.

Arthur Samuel’s Checkers Program (1952)

At IBM, Arthur Samuel created a checkers program that improved by playing against itself. By 1962, it defeated a checkers master. More importantly, Samuel coined the term that would define a field: “Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.”

The Dartmouth Conference (1956)

AI’s formal birth occurred during an eight-week summer workshop at Dartmouth College. John McCarthy, who organized the conference, proposed the name “artificial intelligence.” The attendees became the field’s founding figures.

Key Participants:

  • John McCarthy (Dartmouth): Coined “artificial intelligence,” later created Lisp
  • Marvin Minsky (Harvard/MIT): Neural networks, Society of Mind theory
  • Claude Shannon (Bell Labs): Founder of information theory
  • Nathaniel Rochester (IBM): Designed the IBM 701
  • Allen Newell and Herbert Simon (Carnegie Mellon): Created Logic Theorist

The proposal declared: “Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

This optimism set expectations that would prove premature. Participants believed human-level AI was perhaps 20 years away. That timeline would prove wrong by decades, establishing a pattern of overconfidence that persists.


The Golden Age of AI (1956-1974)

The Perceptron (1958)

Frank Rosenblatt at Cornell created the first trainable neural network. Funded by the Navy, the Perceptron could learn to classify images. Media coverage reflected the era’s optimism. The New York Times described “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”

Reality was more constrained. Single-layer perceptrons had fundamental limitations that would later be exposed.
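Rosenblatt’s learning rule itself is simple: nudge the weights toward the target whenever a prediction is wrong. A minimal sketch, using the linearly separable OR function as illustrative training data (the learning rate and epoch count are arbitrary choices):

```python
def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0

def train_perceptron(data, epochs=10, lr=1.0):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            err = target - predict(w, b, x1, x2)  # -1, 0, or +1
            w[0] += lr * err * x1                 # Rosenblatt's update rule
            w[1] += lr * err * x2
            b += lr * err
    return w, b

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(OR_DATA)
print([predict(w, b, x1, x2) for (x1, x2), _ in OR_DATA])  # [0, 1, 1, 1]
```

The same loop never converges if the classes cannot be separated by a straight line, which is precisely the limitation Minsky and Papert would expose in 1969.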

Lisp Programming Language (1958)

John McCarthy created Lisp at MIT. For decades, it served as the dominant AI programming language. Its influence on functional programming persists in modern languages.

ELIZA: The First Chatbot (1966)

Joseph Weizenbaum at MIT created ELIZA, a program simulating a Rogerian psychotherapist. Using simple pattern matching and substitution, ELIZA produced responses like:

User: I’m unhappy.
ELIZA: Do you think coming here will help you not to be unhappy?

ELIZA had no understanding. It matched patterns and transformed text. Yet users attributed empathy and comprehension to it. Weizenbaum was disturbed by how readily people trusted the program: “I had not realized… that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people.”
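The technique can be suggested in miniature. The two rules below are illustrative stand-ins, not Weizenbaum’s actual DOCTOR script: match a pattern, capture a fragment, and reflect it back inside a canned template.

```python
import re

# ELIZA-style rules: a pattern to match, and a template that reflects the capture back
RULES = [
    (re.compile(r"i'?m (.*)", re.I), "Do you think coming here will help you not to be {0}?"),
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
]

def respond(utterance):
    for pattern, template in RULES:
        m = pattern.search(utterance)
        if m:
            return template.format(m.group(1).rstrip(".!?"))
    return "Please go on."  # fallback when no rule matches

print(respond("I'm unhappy."))  # Do you think coming here will help you not to be unhappy?
```

No rule encodes any notion of unhappiness; the program only rearranges the user’s own words, yet the effect on users was powerful.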

This “ELIZA effect” persists. Users today attribute understanding and consciousness to ChatGPT despite its fundamentally similar pattern-matching nature at larger scale.

Perceptrons Book: The Blow That Killed Neural Networks (1969)

Marvin Minsky and Seymour Papert published “Perceptrons,” mathematically proving the limitations of single-layer networks. These networks could not solve problems that were not linearly separable, including something as simple as XOR.

The book implied, incorrectly, that multi-layer networks would face similar limitations. This perception devastated neural network funding for over fifteen years. Researchers abandoned the approach for symbolic AI. The field would not recover until backpropagation proved practical in the 1980s.


The First AI Winter (1974-1980)

The Lighthill Report (1973)

The British government commissioned Sir James Lighthill to evaluate AI progress. His report delivered harsh criticism. AI had failed to achieve its ambitious goals, particularly in robotics and language understanding.

The consequence was immediate: Britain cut nearly all AI funding. The pattern of overpromising followed by funding collapse established itself.

DARPA Funding Cuts

The US Defense Advanced Research Projects Agency had funded most American AI research through the 1960s. When machine translation failed to handle real languages and speech recognition failed on continuous speech, DARPA reduced AI funding dramatically.

Why the First Winter Happened

The causes were structural, not incidental. Researchers had promised human-level AI within 20 years. Computational power could not scale algorithms to real problems. Data was scarce. The “combinatorial explosion” meant problems grew exponentially harder as they grew larger.

Moravec’s Paradox captured an uncomfortable truth: tasks easy for humans proved hard for AI, and tasks hard for humans proved easier. Playing chess was more tractable than recognizing faces or walking across a room.


The Expert Systems Boom (1980-1987)

What Are Expert Systems?

Expert systems encoded human expertise into rule-based programs. Using IF-THEN logic without learning, they addressed narrow practical problems. Commercial interest in AI revived.

MYCIN: Medical AI Pioneer (1976)

Stanford’s MYCIN diagnosed bacterial infections and recommended antibiotics using roughly 600 rules extracted from physicians. In studies, it performed at expert level. It was never deployed. Liability concerns and physician resistance blocked adoption, establishing a pattern that would repeat in medical AI.

R1/XCON: AI Saves Millions (1982)

Digital Equipment Corporation’s R1 system configured VAX computers. It replaced human experts doing tedious configuration work. DEC reported $40 million in annual savings. This represented the first major commercial AI success, proving the technology could deliver measurable business value.

Japan’s Fifth Generation Project (1982)

The Japanese government announced an $850 million, ten-year initiative to create revolutionary “fifth generation” computers. The announcement triggered panic in the US and Europe, driving funding increases including DARPA’s Strategic Computing Initiative.

The project quietly ended in 1992, having failed to achieve its goals. But the competitive response it provoked shaped AI development globally.


The Second AI Winter (1987-1993)

Why Expert Systems Failed

Expert systems hit fundamental barriers. The knowledge acquisition bottleneck made extracting rules from experts laborious. Systems were brittle, failing on edge cases not covered by rules. Rule bases became maintenance nightmares as they grew, with rules contradicting each other.

Most damagingly, desktop computers caught up to specialized AI hardware. The economic case for $100,000 Lisp machines collapsed when $10,000 workstations could do the job.

The Lisp Machine Crash

Companies like Symbolics had built dedicated hardware optimized for AI programming. When general-purpose computers became powerful enough, the market evaporated nearly overnight. Symbolics went bankrupt. The crash represented AI’s first major commercial failure.

AI Becomes a “Dirty Word”

Researchers avoided the term “artificial intelligence” to escape stigma. They rebranded their work as machine learning, informatics, knowledge systems, or computational intelligence. Funding dried up. Academic skepticism dominated. The field survived but went underground, pursuing incremental progress without grand promises.


The Quiet Revolution (1993-2006)

Neural Networks Revival: Backpropagation Works

The backpropagation algorithm had existed since 1974, when Paul Werbos developed it in his doctoral thesis. Rumelhart, Hinton, and Williams popularized it in 1986. By the 1990s, computers were finally powerful enough to make multi-layer networks practical.

These networks solved the XOR problem that single-layer perceptrons could not. Minsky and Papert had been wrong about multi-layer limitations. Neural networks began their comeback.
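The representational point is easy to see with hand-picked weights (a sketch; backpropagation’s contribution was finding weights like these automatically from data rather than by hand):

```python
def step(x):
    return 1 if x >= 0 else 0

def neuron(inputs, weights, bias):
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)

def xor(a, b):
    h1 = neuron([a, b], [1, 1], -0.5)      # hidden unit computing OR
    h2 = neuron([a, b], [-1, -1], 1.5)     # hidden unit computing NAND
    return neuron([h1, h2], [1, 1], -1.5)  # output unit: AND of the two

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

No single threshold unit can produce that truth table, but one hidden layer suffices, which is why the 1969 pessimism about multi-layer networks did not hold.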

Deep Blue vs. Kasparov (1997)

IBM’s supercomputer faced world chess champion Garry Kasparov. In their 1996 match, Kasparov won 4-2. In the 1997 rematch, Deep Blue won 3.5-2.5. A computer had defeated the world’s best chess player under tournament conditions for the first time.

Deep Blue evaluated 200 million positions per second using brute force plus chess knowledge. It was not general intelligence. It could not play checkers. But the cultural impact was enormous, forcing public reckoning with machine capability.

DARPA Grand Challenge (2004-2005)

DARPA sponsored an autonomous vehicle race across the Mojave Desert. In 2004, no vehicle completed the 150-mile course. The best managed 7.4 miles. In 2005, Stanford’s “Stanley” won, completing 132 miles.

This proved autonomous vehicles were possible and directly led to Google’s self-driving car project.


The Deep Learning Revolution (2006-2012)

Geoffrey Hinton’s Deep Belief Networks (2006)

Geoffrey Hinton and colleagues published “A Fast Learning Algorithm for Deep Belief Nets.” The paper introduced greedy layer-wise pretraining, a practical workaround for the vanishing gradient problem that had made training deep networks impractical. Multi-layer networks could finally learn effectively.

Hinton had continued neural network research through both AI winters when the approach was unfashionable. His persistence earned him recognition as the “father of deep learning” and later a Turing Award.

ImageNet: The Dataset That Changed Everything (2009)

Fei-Fei Li at Stanford created ImageNet: 14 million labeled images across 20,000+ categories. This unprecedented scale enabled training at a level previously impossible. The annual ImageNet competition drove rapid progress as teams competed to reduce error rates.

IBM Watson Wins Jeopardy! (2011)

Watson defeated champions Ken Jennings and Brad Rutter on Jeopardy!, demonstrating natural language processing, information retrieval, and machine learning in combination. The public demonstration showed AI could handle ambiguous, complex questions.

IBM marketed Watson aggressively afterward. Applications produced mixed results, but the cultural moment reinforced AI’s return to prominence.

AlexNet: The Moment Everything Changed (2012)

The ImageNet competition produced a result that shocked the field. The University of Toronto team of Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton submitted AlexNet, a deep convolutional neural network trained on GPUs.

AlexNet achieved 15.3% top-5 error. Second place scored 26.2%. The gap was unprecedented. Deep learning did not merely win. It dominated.

The implications were immediate. Every major technology company pivoted to deep learning. GPU demand exploded. The modern AI era began.


The Rise of Modern AI (2014-2022)

GANs: AI That Creates (2014)

Ian Goodfellow at the University of Montreal invented Generative Adversarial Networks. Two networks compete: a generator creates images, a discriminator detects fakes. Both improve through competition.

GANs enabled the generative AI that would later produce DALL-E, Midjourney, and Stable Diffusion.

AlphaGo Shocks the World (2016)

DeepMind’s AlphaGo faced Lee Sedol, a 9-dan Go professional. Go had been considered ten years away for AI due to its complexity: more possible positions than atoms in the universe.

AlphaGo won 4-1. Move 37 in Game 2 demonstrated something unprecedented: a creative move no human would play but that proved brilliant. AI had shown something that looked like intuition.

The Transformer Revolution (2017)

Google researchers published “Attention Is All You Need.” The paper introduced the transformer architecture using self-attention mechanisms.

Unlike previous approaches, transformers could be parallelized efficiently. They handled long-range dependencies in text. They scaled with more data and compute.

Every major language model since builds on this paper: GPT, BERT, Claude, Gemini, Llama, Grok. It stands as the decade’s most consequential AI publication.
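At the heart of the architecture is scaled dot-product attention: every position scores its relevance to every other position, then mixes their values according to those scores. A minimal NumPy sketch (single head, no masking, random data purely for illustration):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — each query attends to all keys in parallel."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over keys
    return weights @ V                                    # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 token positions, model dimension 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because every position is processed in one matrix multiplication rather than sequentially, the computation parallelizes across the whole sequence, which is what let transformers scale where recurrent networks could not.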

GPT-2: Too Dangerous to Release? (2019)

OpenAI’s GPT-2 had 1.5 billion parameters. The organization initially withheld the full model, citing concerns about misuse for generating misinformation. The decision sparked debate about AI safety and broader discussions about responsible model release.

OpenAI eventually released the full model. The concerns it raised about AI-generated content would prove prescient.

GPT-3 and the Scaling Era (2020)

GPT-3 reached 175 billion parameters. It demonstrated few-shot learning: performing tasks from examples in the prompt without specific training. The API release triggered a startup explosion as developers built on OpenAI’s infrastructure.

The model showed emergent capabilities that appeared at scale but not in smaller models, raising questions about what larger models might achieve.

ChatGPT: AI Goes Mainstream (November 30, 2022)

OpenAI released ChatGPT, GPT-3.5 fine-tuned with RLHF (Reinforcement Learning from Human Feedback). Free public access and a conversational interface made AI accessible to anyone.

The adoption rate was unprecedented: one million users in five days, 100 million monthly active users by January 2023. It became the fastest-growing consumer application in history.

ChatGPT triggered an AI gold rush. Every major technology company accelerated AI development. The industry would never be the same.


The Explosion: 2023-2025

2023: The Year of Foundation Models

The pace of releases accelerated dramatically.

March 2023: GPT-4 launched with multimodal capability (text and images). Claude debuted from Anthropic. GitHub Copilot X was announced.

July 2023: Meta released Llama 2 as open source, democratizing access to capable models. Claude 2 launched.

December 2023: Google announced Gemini. Mistral emerged as a European competitor.

2024: Multimodal and Competition

March 2024: Claude 3 family released (Haiku, Sonnet, Opus).

April 2024: Llama 3 launched.

May 2024: GPT-4o introduced native multimodal processing.

Gemini 1.5 Pro demonstrated a one-million-token context window. Multimodal capability became standard. Context windows expanded dramatically. Open-source models closed the gap with proprietary systems.

2025: The Current State

The release pace intensified further.

Date | Model | Company
February 5, 2025 | Gemini 2.0 Flash | Google
February 17, 2025 | Grok 3 | xAI
April 2025 | Llama 4 | Meta
May 22, 2025 | Claude 4 | Anthropic
July 9, 2025 | Grok 4 | xAI
August 7, 2025 | GPT-5 | OpenAI
September 29, 2025 | Claude Sonnet 4.5 | Anthropic
September 30, 2025 | Sora 2 | OpenAI
November 17, 2025 | Grok 4.1 | xAI
November 24, 2025 | Claude Opus 4.5 | Anthropic

Company Valuations (Late 2025):

Company | Valuation | Annual Revenue
OpenAI | ~$500B | $13B
xAI | ~$80-200B | $500M
Anthropic | $183B | $7B
Mistral | $14B (€12B) | n/a

Key 2025 Trends:

  • AI Agents: Systems that execute multi-step tasks, not just generate text
  • Reasoning models: Extended “thinking” before responding
  • Video generation: Sora 2 reached public access
  • Physical AI: Robotics integrating language models

Key Figures in AI History

The Founders

Alan Turing (1912-1954): British mathematician who established theoretical foundations. Proposed the Turing Test. Broke Enigma codes in World War II. Died tragically at 41.

John McCarthy (1927-2011): Coined “artificial intelligence.” Created Lisp. Founded Stanford AI Lab. Turing Award 1971.

Marvin Minsky (1927-2016): MIT AI Lab co-founder. Built SNARC neural network. Developed Society of Mind theory. Turing Award 1969. His Perceptrons book inadvertently damaged neural network research for years.

Claude Shannon (1916-2001): Father of information theory. Dartmouth Conference participant. Shannon entropy underlies all data compression.

Herbert Simon (1916-2001) and Allen Newell (1927-1992): Created Logic Theorist and General Problem Solver. Simon won the Nobel Prize in Economics (1978). Both received the Turing Award (1975).

The Deep Learning Pioneers

Geoffrey Hinton (b. 1947): “Father of deep learning.” Championed backpropagation. Created deep belief networks. Turing Award 2018. Left Google in 2023 citing AI safety concerns.

Yann LeCun (b. 1960): Invented convolutional neural networks. Chief AI Scientist at Meta until November 2025, when he departed to found his own company. Turing Award 2018. Argues current LLMs are a “dead end” requiring “world models.”

Yoshua Bengio (b. 1964): Pioneered attention mechanisms and word embeddings. Founded Mila (Montreal AI institute). Turing Award 2018. More concerned about AI safety than LeCun.

Fei-Fei Li (b. 1976): Created ImageNet, the dataset that enabled modern computer vision. Founding director of Stanford HAI. Advocates human-centered AI.

Current Leaders

Sam Altman (b. 1985): OpenAI CEO. Former Y Combinator president. Fired and rehired in November 2023 board drama. In 2025: “The path to AGI is clear.” Leads the company valued at approximately $500 billion.

Dario Amodei (b. 1983): Anthropic CEO and co-founder. Former OpenAI VP of Research. Left over safety disagreements. Created Constitutional AI. Predicts AGI by 2026-2027 and suggests AI could “double human lifespan” within 5-10 years.

Demis Hassabis (b. 1976): DeepMind founder and CEO. Child chess prodigy and game designer. Led AlphaGo and AlphaFold projects. In late 2025: AGI is “5-10 years away” requiring “1-2 breakthroughs” in reasoning and memory.

Elon Musk (b. 1971): xAI founder. OpenAI co-founder who departed. Grok 4 claimed as “most intelligent model.” xAI raised $20 billion in September 2025.


Lessons from AI History

The Hype Cycle Pattern

The pattern repeats: breakthrough generates excitement, excitement drives overpromising, overpromising leads to disappointment, disappointment triggers funding collapse, and quiet progress continues until the next breakthrough.

This happened in the 1970s when symbolic AI failed to scale. It happened in the late 1980s when expert systems proved brittle. Whether the current explosion follows the same pattern or represents something fundamentally different remains the central question of this moment.

Compute Is Often the Bottleneck

Algorithms frequently existed decades before they became practical. Backpropagation was developed in 1974 but did not prove useful until the 2000s. Neural network concepts from 1943 required until 2012 for hardware to catch up. Current frontier models require $100+ million in compute for training runs.

The gap between theoretical capability and practical implementation has repeatedly defined AI progress.

Data Matters as Much as Algorithms

ImageNet in 2009 made modern computer vision possible. Internet-scale text enabled large language models. The quality and composition of data determines what AI learns, including biases. Algorithms without appropriate data produce nothing useful.


Frequently Asked Questions

When was artificial intelligence invented?

AI as an academic field was born at the 1956 Dartmouth Conference. Foundational concepts date to Turing’s 1950 paper, and neural network mathematics to 1943.

Who invented artificial intelligence?

AI has many founders. Key figures include Alan Turing (theoretical foundations), John McCarthy (coined the term), Marvin Minsky, Claude Shannon, Herbert Simon, and Allen Newell.

What caused the AI winters?

Unrealistic expectations, overpromises to funders, compute limitations, and algorithms that could not scale. Both winters occurred when AI failed to deliver on ambitious timelines.

Why did deep learning suddenly work in 2012?

Three factors converged: sufficient compute (GPUs), sufficient data (ImageNet), and improved algorithms (dropout, ReLU activations). AlexNet demonstrated this combination could dramatically outperform alternatives.
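The two algorithmic pieces named above are small enough to show directly. A sketch of each, using the inverted-dropout formulation and illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # ReLU passes positives through and clips negatives to zero, avoiding the
    # saturation of sigmoid/tanh that starved deep networks of gradient signal
    return np.maximum(0, x)

def dropout(x, p=0.5, training=True):
    # Inverted dropout: randomly zero units during training and rescale the rest,
    # so activations at inference time need no adjustment
    if not training:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1 - p)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))                      # negatives clipped to zero
print(dropout(relu(x), p=0.5))      # random half of the units zeroed, rest doubled
```

Neither trick is deep on its own; their significance in 2012 was that, combined with GPUs and ImageNet-scale data, they made training an eight-layer network tractable.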


Conclusion

The history of artificial intelligence follows a 75-year arc from Turing’s question to trillion-dollar valuations. The pattern that emerges is consistent: breakthrough, hype, disappointment, quiet progress, breakthrough again.

Two AI winters demonstrated what happens when promises outpace capability. The compute and data constraints that blocked progress eventually yielded to Moore’s Law and internet-scale information. AlexNet in 2012 and ChatGPT in 2022 marked inflection points that redirected the entire technology industry.

Understanding this history illuminates current debates. When Sam Altman says the path to AGI is clear and Yann LeCun says current approaches are dead ends, history suggests both certainty and skepticism have been wrong before. The $500 billion valuations and 800 million weekly users represent genuine capability growth. Whether they represent the threshold of general intelligence or another peak before disappointment remains to be seen.

The past 75 years suggest caution about predictions and confidence about continued capability growth. Whatever comes next will build on everything that came before.


Sources:

  • Turing’s 1950 paper: “Computing Machinery and Intelligence,” Mind
  • Dartmouth Conference: Conference proposal and proceedings (1956)
  • McCulloch-Pitts paper: “A Logical Calculus of Ideas Immanent in Nervous Activity” (1943)
  • Minsky and Papert: Perceptrons (1969)
  • Lighthill Report: UK Science Research Council (1973)
  • MYCIN: Shortliffe, E.H., Stanford University (1976)
  • R1/XCON savings: McDermott, J., AI Magazine (1982)
  • Deep Blue vs. Kasparov: IBM archives, news coverage (1997)
  • AlexNet: Krizhevsky, Sutskever, Hinton, ImageNet 2012
  • “Attention Is All You Need”: Vaswani et al. (2017)
  • ChatGPT adoption: SimilarWeb, company announcements
  • Company valuations: Late 2025 funding rounds and reports
  • Model release dates: Company announcements (2025)
  • Expert predictions: Public statements from Altman, Amodei, Hassabis, LeCun (2025)
  • Market data: Fortune Business Insights