Interactive Reasoning Benchmark
The first eval that measures human-like intelligence in AI.
ARC-AGI-3 is designed to measure AI system generalization and intelligence through skill-acquisition efficiency in novel, unseen environments.
The in-progress benchmark dataset will consist of ~100 unique environments split into public and private evaluation sets, where AI agents must perceive, decide, and act over multiple steps without prior instructions.
ARC-AGI-3 is currently in development. The early preview is limited to 6 games (3 public, 3 to be released in Aug '25). Development began in early 2025, and the full benchmark is set to launch in 2026.
Static benchmarks have traditionally been the yardstick for measuring intelligence, but they lack the bandwidth to measure its full spectrum.
Interactive Reasoning Benchmarks (IRBs) test for a much broader scope of capabilities: exploration, planning, memory, reflection, and adjustment toward a goal. Game environments provide a rich medium to test this kind of experience-driven competence.
We can declare the arrival of AGI when we build an artificial system that matches the learning efficiency of humans.
Humans are the only existence proof of general intelligence. Human-level intelligence is inherently interactive: it unfolds over time, drawing on experience as we explore an environment, plan, reflect, and adjust toward a goal. By testing intelligence over time, we can observe extended trajectories, planning horizons, memory compression (distilling past states into future decisions), self-reflection, and in-context plan execution.
Game environments provide an ideal medium to test interactivity. They strike a unique balance, offering clear rules, goals, and feedback while still requiring the test-taker to engage in complex planning and learning.
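To make the interaction loop concrete, here is a minimal sketch of the reset/step contract an interactive benchmark environment might expose. The toy grid game below is invented purely for illustration and is not the actual ARC-AGI-3 interface:

```python
import random


class ToyGridGame:
    """A toy interactive environment: the agent starts at cell 0 on a
    1-D grid and must discover, purely by acting, which action moves it
    toward a hidden goal cell. No instructions are given up front."""

    def __init__(self, size: int = 5):
        self.size = size

    def reset(self) -> int:
        self.pos = 0
        self.goal = random.randrange(1, self.size)
        return self.pos  # observation: the agent's current cell

    def step(self, action: int) -> tuple[int, float, bool]:
        # Action 1 moves right, action 0 moves left; any other action
        # does nothing -- part of what the agent must learn by trying.
        if action == 1:
            self.pos = min(self.pos + 1, self.size - 1)
        elif action == 0:
            self.pos = max(self.pos - 1, 0)
        done = self.pos == self.goal
        return self.pos, (1.0 if done else 0.0), done
```

The essential shape is the loop itself: the agent sees only observations and rewards, and must infer the rules of the game from experience.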
We've seen echoes of this in earlier eras; Atari games have been widely used in the past. But the shortcomings of those agents were clear: they couldn't generalize beyond memorized pixels, relied on built-in human priors, ignored sample efficiency, encoded their developers' intelligence, and faced no true hidden test set.
ARC-AGI-3 will overcome these shortcomings by introducing a new set of hand-crafted, novel environments designed to test the skill-acquisition efficiency of artificial systems as compared to humans.
It will rely on the established ARC-AGI pillars (core knowledge priors only; no reliance on language, trivia, or vast training data) to evaluate performance against human baselines.
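Purely as an assumption-laden sketch (ARC-AGI-3's actual scoring rules are not specified here), one plausible way to operationalize skill-acquisition efficiency is score earned per action taken, normalized against a human baseline:

```python
def acquisition_efficiency(score: float, actions_taken: int) -> float:
    """Score earned per action spent exploring, learning, and playing."""
    return score / max(actions_taken, 1)


def relative_to_human(agent_score: float, agent_actions: int,
                      human_score: float, human_actions: int) -> float:
    """Illustrative metric: a value >= 1.0 would mean the agent acquires
    skill at least as efficiently as the human baseline on this game."""
    return (acquisition_efficiency(agent_score, agent_actions)
            / acquisition_efficiency(human_score, human_actions))
```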
IRBs aren't just better metrics; they're a clear signal that there is a wide gap between human and artificial intelligence.
As long as that gap remains, we do not have AGI.
ARC Prize is partnering with HuggingFace to host a competition that harnesses the collective intelligence of our community to evaluate how current AI performs on ARC-AGI-3.
We need your help to build agents that can play and learn, so we can calibrate difficulty and refine game design. We're open to a mix of language-model and reinforcement-learning-based approaches. Help us learn where current frontier AI stands: build on top of the ARC-AGI API to create agents that can play ARC-AGI-3. Ready to build? See the ARC-AGI-3 API documentation.
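As a starting point, here is a minimal agent scaffold sketched against a hypothetical HTTP client. The base URL, endpoint paths, and payload fields below are placeholders invented for this example; consult the ARC-AGI-3 API documentation for the real interface:

```python
import random

import requests

BASE_URL = "https://example.invalid/arc-agi-3"  # placeholder, not the real API


def choose_action(obs, history) -> int:
    # Placeholder policy: act randomly. A real agent would use the
    # trajectory in `history` to plan, reflect, and adjust -- e.g. by
    # prompting a language model or querying a learned policy.
    return random.randrange(4)


def play_episode(game_id: str, max_steps: int = 100) -> float:
    """Run one perceive-decide-act loop against a (hypothetical) game server."""
    obs = requests.post(f"{BASE_URL}/games/{game_id}/reset").json()
    history: list[tuple] = []  # raw trajectory the agent can reflect on
    total_reward = 0.0
    for _ in range(max_steps):
        action = choose_action(obs, history)
        resp = requests.post(f"{BASE_URL}/games/{game_id}/step",
                             json={"action": action}).json()
        history.append((obs, action, resp["reward"]))
        total_reward += resp["reward"]
        obs = resp["observation"]
        if resp["done"]:
            break
    return total_reward
```

Swapping out `choose_action` for a language-model or reinforcement-learning policy is where the interesting work begins.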
One of the primary challenges in developing ARC-AGI-3 is generating innovative game ideas.
Great ideas can come from anyone, anywhere. That's why we're calling on the community to contribute creative game concepts. While we can't guarantee that every submission will be implemented, your inspiration and enthusiasm are invaluable to us.
Have a game idea?
For a non-profit, building over 100 games is a worthy challenge, and we're able to do this important work thanks to the generous support of our incredible sponsors.
Every donation above $5,000 directly funds the creation of one new ARC-AGI-3 game.
Interested? Please consider making a donation today.
To hear more about the gap between AI and AGI, watch the recent YC AI Startup School talk from ARC-AGI creator François Chollet.