
AutoBench Agronomy LLM Benchmark - December 2025

The first AutoBench run for the Agronomy domain, covering Gemini 3 Pro, GPT 5.1, Grok 4.1, Opus 4.5, and more.

Past
Date: December 10, 2025
Version: 2025-12-10
Models: 40
New Models: 17

Run data

Model
Average (All Topics)
6.53s (#1)
7.51s (#2)
7.84s (#3)
10.98s (#4)
12.09s (#5)
12.37s (#6)
14.87s (#7)
15.16s (#8)
16.98s (#9)
17.50s (#10)
19.89s (#11)
21.11s (#12)
21.87s (#13)
23.30s (#14)
24.09s (#15)
26.09s (#16)
29.33s (#17)
30.64s (#18)
32.19s (#19)
34.63s (#20)
35.26s (#21)
35.56s (#22)
35.68s (#23)
42.23s (#24)
45.41s (#25)
46.15s (#26)
50.43s (#27)
50.84s (#28)
52.84s (#29)
53.70s (#30)
61.60s (#31)
66.00s (#32)
68.03s (#33)
68.36s (#34)
70.41s (#35)
71.34s (#36)
74.18s (#37)
74.34s (#38)
112.19s (#39)
140.66s (#40)