Nvidia publishes the first AgentPerf infrastructure results
AgentPerf measures real-world agentic workloads, with Nvidia claiming Blackwell Ultra runs up to 20 times more agents per megawatt than Hopper.
Nvidia published the first results from AgentPerf, which it describes as the industry's first infrastructure benchmark built around real agentic AI workloads. The benchmark uses coding-agent trajectories across more than 12 programming languages and measures how many simultaneous tasks a platform can support while meeting responsiveness thresholds. Nvidia says its GB300 NVL72 system ran up to 20 times more agents per megawatt than an H200 system on the initial DeepSeek V4 Pro workload. The result is vendor-published and should be independently scrutinized, but AgentPerf addresses an important gap: chatbot inference benchmarks do not capture the long sequences, tool delays, and repeated model calls that determine agent economics.
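To make the headline metric concrete, here is a minimal sketch of how an "agents per megawatt" figure could be derived from the two quantities the brief mentions: the number of simultaneous tasks that still meet a responsiveness threshold, and the power drawn. This is not AgentPerf's actual scoring procedure, which the brief does not describe. The function names, the toy latency model, and the 120 kW power figure are all hypothetical and chosen only for illustration.

```python
from typing import Callable


def max_agents_within_threshold(
    meets_threshold: Callable[[int], bool], upper_bound: int
) -> int:
    """Return the largest concurrency n in [0, upper_bound] that passes.

    Assumes responsiveness only gets worse as n grows, so a binary
    search over n is valid. (Hypothetical helper, not part of AgentPerf.)
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if meets_threshold(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


def agents_per_megawatt(agents: int, power_kw: float) -> float:
    """Normalize a concurrent-agent count by power draw, in agents per MW."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return agents * 1000.0 / power_kw


def toy_platform_meets_threshold(n: int) -> bool:
    """Made-up latency model: 500 ms base plus 10 ms per concurrent agent,
    checked against a made-up 2000 ms responsiveness threshold."""
    latency_ms = 500 + 10 * n
    return latency_ms <= 2000


if __name__ == "__main__":
    agents = max_agents_within_threshold(toy_platform_meets_threshold, 10_000)
    # 120 kW is an illustrative power figure, not a measured one.
    print(agents, agents_per_megawatt(agents, power_kw=120.0))
```

Comparing two platforms on this basis means dividing one platform's agents-per-megawatt value by the other's, which is why both the responsiveness threshold and the power measurement need to be held constant for a ratio such as "20 times" to be meaningful.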
Key details: June 12, 2026; first published AgentPerf results; uses real coding-agent trajectories; Nvidia claims up to 20x more agents per megawatt.