PACE: A Proxy for Agentic Capability Evaluation

Yueqi Song, Lintang Sutawika, Jiarui Liu, Lindia Tjuatja, Jiayi Geng
2026-07-01

The PACE framework investigates whether scores on expensive, time-consuming agentic benchmarks such as SWE-Bench and GAIA can be predicted from cheaper, non-agentic LLM benchmarks. By focusing on individual capabilities such as reas…
