AutoBench Run 4 - November 2025
Date: November 28, 2025
Version: 2025-11-28
Models: 32
New Models: 18
Latest AutoBench run, with models including Gemini 3 Pro, GPT-5.1, Grok 4.1, and more
View Results→
View Results→
Latest AutoBench run with enhanced metrics including evaluation iterations and fail rates
View Results→
Second major AutoBench run with o4-mini, GPT-4.1-mini, Gemini 2.5 Pro Preview, Claude 3.7 Sonnet:thinking, etc.
View Results→