
ARC-AGI-3
Interactive reasoning benchmark to measure human-like intelligence in AI agents

About ARC-AGI-3
ARC-AGI-3 is an interactive reasoning benchmark designed to challenge AI agents to explore novel environments, acquire goals on the fly, build adaptable world models, and learn continuously. Rather than solving static puzzles, agents must learn from experience inside each environment by perceiving what matters, selecting actions, and adapting their strategy, all without natural-language instructions. A 100% score means an AI agent can beat every game as efficiently as a human, so the benchmark measures skill acquisition over time and the remaining gap between AI and human learning.
Key Features
- Interactive reasoning benchmark
- Replayable runs for transparent evaluation
- Developer toolkit for agent integration
- Interactive UI for testing and iteration
- API for agent integration
- 100% human-solvable environments
- Experience-driven adaptation
- Long-horizon planning with sparse feedback
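The core loop these features describe (perceive the environment, select an action, adapt from sparse feedback) can be sketched in miniature. This is a hypothetical toy environment and a random-exploration baseline, not the real ARC-AGI-3 API; all class and function names here are illustrative assumptions.

```python
import random

# Toy stand-in environment (hypothetical; the real ARC-AGI-3 API is not
# shown here). The agent must reach a goal cell it is never told about,
# receiving feedback only on success -- a sparse-reward setting.
class ToyGridEnv:
    ACTIONS = ["up", "down", "left", "right"]

    def __init__(self, size=5, goal=(4, 4)):
        self.size = size
        self.goal = goal
        self.pos = (0, 0)

    def observe(self):
        # The agent perceives raw state, not natural-language instructions.
        return self.pos

    def step(self, action):
        # Apply the move, clamped to the grid bounds.
        x, y = self.pos
        dx, dy = {"up": (0, -1), "down": (0, 1),
                  "left": (-1, 0), "right": (1, 0)}[action]
        self.pos = (min(max(x + dx, 0), self.size - 1),
                    min(max(y + dy, 0), self.size - 1))
        return self.pos == self.goal  # sparse reward: True only at the goal


def run_agent(env, max_steps=500, seed=0):
    """Random-exploration baseline: act, observe, repeat until the
    (unknown) goal is reached or the step budget runs out."""
    rng = random.Random(seed)
    for step in range(1, max_steps + 1):
        action = rng.choice(env.ACTIONS)
        if env.step(action):
            return step  # number of actions taken to beat the environment
    return None


if __name__ == "__main__":
    print(run_agent(ToyGridEnv()))
```

A real scoring harness would compare the agent's step count against a human baseline; a smarter agent would replace the random policy with a learned world model that improves within the episode.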
Pros & Cons
Pros
- Tests genuine reasoning and adaptation rather than memorization
- Transparent evaluation with replay functionality
- Clear design principles with meaningful feedback
- Challenges AI agents to learn from experience like humans
- Free and open access to the benchmark
Cons
- Requires substantial computational resources for complex agent development
- Limited to interactive reasoning tasks, not other AI domains
- Competition format may create pressure for rapid iteration
Details
- Pricing: Free
- Company: ARC Prize

