AutoBench Agronomy LLM Benchmark - December 2025
The first AutoBench run for the Agronomy domain, covering Gemini 3 Pro, GPT 5.1, Grok 4.1, Opus 4.5, and more.
Date: December 10, 2025
Version: 2025-12-10
Models: 40
New Models: 17
Run data
| Model | Average (All Topics) |
|---|---|
| | 140.66s (#40) |
| | 112.19s (#39) |
| | 74.34s (#38) |
| | 74.18s (#37) |
| | 71.34s (#36) |
| | 70.41s (#35) |
| | 68.36s (#34) |
| | 68.03s (#33) |
| | 66.00s (#32) |
| | 61.60s (#31) |
| | 53.70s (#30) |
| | 52.84s (#29) |
| | 50.84s (#28) |
| | 50.43s (#27) |
| | 46.15s (#26) |
| | 45.41s (#25) |
| | 42.23s (#24) |
| | 35.68s (#23) |
| | 35.56s (#22) |
| | 35.26s (#21) |
| | 34.63s (#20) |
| | 32.19s (#19) |
| | 30.64s (#18) |
| | 29.33s (#17) |
| | 26.09s (#16) |
| | 24.09s (#15) |
| | 23.30s (#14) |
| | 21.87s (#13) |
| | 21.11s (#12) |
| | 19.89s (#11) |
| | 17.50s (#10) |
| | 16.98s (#9) |
| | 15.16s (#8) |
| | 14.87s (#7) |
| | 12.37s (#6) |
| | 12.09s (#5) |
| | 10.98s (#4) |
| | 7.84s (#3) |
| | 7.51s (#2) |
| | 6.53s (#1) |