AGI remains unsolved.
New ideas still needed.

ARC-AGI Leaderboard

Chart: cost per task versus score, with series for ARC-AGI-1 and ARC-AGI-2.
Only systems that cost less than $10,000 to run are shown. Notably absent from this chart is o3 (high compute). For more information, see our announcement blog post.

Understanding the Leaderboard

ARC-AGI has evolved from its first version (ARC-AGI-1), which measured basic fluid intelligence, to ARC-AGI-2, which challenges systems to demonstrate both high adaptability and high efficiency.

The scatter plot above visualizes the relationship between cost per task and performance, a key measure of intelligence efficiency. True intelligence isn't just about solving problems, but about solving them efficiently with minimal resources.
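One way to read a cost-versus-score scatter plot is to ask which systems are not beaten on both axes at once. The sketch below is only an illustration of that idea, not the leaderboard's own method: the `Entry` type, the `pareto_frontier` helper, and every name and number in `EXAMPLE` are hypothetical placeholders, not real leaderboard results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    cost_per_task: float  # USD per task (hypothetical values below)
    score: float          # fraction of tasks solved, 0.0 to 1.0


def pareto_frontier(entries):
    """Return the entries that no other entry beats on both cost and score.

    An entry is kept only if its score is strictly higher than that of
    every entry costing the same or less. Exact duplicates collapse to one.
    """
    # Cheapest first; among equal costs, highest score first.
    ordered = sorted(entries, key=lambda e: (e.cost_per_task, -e.score))
    frontier = []
    best_score = float("-inf")
    for entry in ordered:
        if entry.score > best_score:
            frontier.append(entry)
            best_score = entry.score
    return frontier


# Made-up data for illustration only; these are NOT leaderboard results.
EXAMPLE = [
    Entry("system_a", cost_per_task=0.10, score=0.20),
    Entry("system_b", cost_per_task=1.00, score=0.15),   # costlier and worse than system_a
    Entry("system_c", cost_per_task=2.00, score=0.40),
    Entry("system_d", cost_per_task=50.00, score=0.40),  # same score as system_c, costlier
    Entry("system_e", cost_per_task=100.00, score=0.60),
]

if __name__ == "__main__":
    for entry in pareto_frontier(EXAMPLE):
        print(f"{entry.name}: ${entry.cost_per_task:.2f} per task, score {entry.score:.0%}")
```

In this made-up data, `system_b` and `system_d` drop out because a cheaper system matches or exceeds their score; the remaining entries trace the efficiency frontier that the chart makes visible.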

Interpreting the data

For more information on our reporting process, see our testing policy.

Leaderboard Breakdown

* ARC-AGI-2 score is an estimate based on partial testing results and o1-pro pricing.

** Preview results: Results marked as preview are unofficial and may be based on incomplete testing. Models without available pricing information are not shown on the efficiency chart. Results become official once testing is complete.
