ARC-AGI Community Leaderboard

ARC-AGI has gained significant popularity over the past two years, and we've been overwhelmed by the number of researchers and builders who want to showcase their work to the community. The ARC-AGI Community Leaderboard provides a landing spot for these submissions, where the community can review, discuss, and verify results together.

Community Leaderboard submissions must be general purpose and reproducible. Scores are self-reported unless noted otherwise: results on the ARC-AGI-1 and ARC-AGI-2 semi-private sets are run and verified by ARC Prize, and everything else is scored on a public set and self-reported. We can't verify the authenticity of self-reported scores and won't independently verify submissions except in extraordinary cases, so we encourage the community to explore each submission and validate the results for themselves. We reserve the right to determine which submissions qualify. For more on how we approach testing, see our testing policy.

To submit your work, head to the ARC-AGI Community Leaderboard repo on GitHub.

| Name | Description | Authors | Benchmark | Score | Cost | Date |
|---|---|---|---|---|---|---|
| baseline1 | Coding agent that builds and verifies an executable Python world model, then plans through it. | Sergey Rodionov | ARC-AGI-3 (Public Demo) | 63.7% | $350 | 2026-05-20 |
| Vision - Continual Learning v1 | Multimodal agent with continual-learning weights carried across games and levels. | Vansh | ARC-AGI-3 (Public Demo) | 63.1% | $4,788 | 2026-05-18 |
| OpenClaw | OpenClaw harness adapted to play ARC-AGI-3, with memory and code execution tools allowed. | ARC Prize Foundation | ARC-AGI-3 (Public Demo) | 5.2% | $2,912 | 2026-05-15 |
| Human Intelligence Harness | Maximum human intelligence built into an agent harness. | ARC Prize Foundation | ARC-AGI-3 (Public Demo) | 95.3% | - | 2026-04-14 |
| TELL | Single-conversation agent that compounds confirmed knowledge in a MEMORY.md file. | Hesai, Waci | ARC-AGI-3 (Public Demo) | 43.9% | $1,406 | 2026-04-09 |
| a-evolve MAS Evolved | Evolved multi-agent orchestrator with 9 learned skills mined from competition logs. | Zhan Shi, Hanqing Lu, Bing He, Yisi Sang, Minhua Lin | ARC-AGI-3 (Public Demo) | 12.3% | $5,300 | 2026-04-09 |
| Read-Grep-Bash Agent | A coding agent that uses search and Python scripting over game logs. | Alexis Fox, Junlin Wang, Paul Rosu, Bhuwan Dhingra | ARC-AGI-3 (Public Demo) | 50.2% | - | 2026-03-13 |
| Evolutionary Test-Time Compute with Natural Language Instructions | Evolves natural language instructions instead of code. | Jeremy Berman | ARC-AGI-2 (Semi-Private) | 29.4% | $3,648 | 2025-09-16 |
| Efficient Evolutionary Program Synthesis | Evolves a growing library of Python programs with an LLM. | Eric Pang | ARC-AGI-2 (Semi-Private) | 26.0% | $476 | 2025-09-01 |
| Tiny Recursive Model (TRM) | 7M parameter recursive model with think-act refinement loops. | Alexia Jolicoeur-Martineau | ARC-AGI-2 (Public Train) | 7.8% | $252 | 2025-07-01 |
| Hierarchical Reasoning Model (HRM) | Brain-inspired 27M parameter model with iterative refinement. | Sapient Intelligence | ARC-AGI-2 (Semi-Private) | 2.0% | $201 | 2025-06-08 |
| Evolutionary Test-time Compute | Genetic algorithm over LLM-generated Python transforms. | Jeremy Berman | ARC-AGI-1 (Semi-Private) | 53.6% | $2,900 | 2024-12-18 |
| Ryan Greenblatt | LLM generates and refines thousands of candidate programs per task. | Ryan Greenblatt | ARC-AGI-1 (Semi-Private) | 43.0% | $40,000 | 2024-06-17 |