AutoBench Agronomy LLM Benchmark - December 2025
The first AutoBench run for the Agronomy domain, covering models including Gemini 3 Pro, GPT 5.1, Grok 4.1, and Opus 4.5.
Past
Date: December 10, 2025
Version: 2025-12-10
Models: 40
New Models: 17
Run data
| Model | AutoBench |
|---|---|
| | 2.9 (#40) |
| | 3.43 (#39) |
| | 3.44 (#38) |
| | 3.48 (#37) |
| | 3.51 (#36) |
| | 3.61 (#35) |
| | 3.66 (#34) |
| | 3.68 (#33) |
| | 3.91 (#32) |
| | 4.16 (#31) |
| | 4.18 (#30) |
| | 4.27 (#29) |
| | 4.28 (#28) |
| | 4.32 (#27) |
| | 4.33 (#26) |
| | 4.34 (#25) |
| | 4.38 (#24) |
| | 4.38 (#23) |
| | 4.44 (#22) |
| | 4.45 (#21) |
| | 4.45 (#20) |
| | 4.46 (#19) |
| | 4.47 (#18) |
| | 4.52 (#17) |
| | 4.52 (#16) |
| | 4.54 (#15) |
| | 4.54 (#14) |
| | 4.56 (#13) |
| | 4.56 (#12) |
| | 4.57 (#11) |
| | 4.58 (#10) |
| | 4.58 (#9) |
| | 4.59 (#8) |
| | 4.59 (#7) |
| | 4.6 (#6) |
| | 4.63 (#5) |
| | 4.64 (#4) |
| | 4.64 (#3) |
| | 4.83 (#2) |
| | 4.85 (#1) |