AA score/cost comparison

Snapshot #2 fetched 2026-05-12 00:00:09 · 7.55 MB raw HTML · 516 models

Snapshot

Pick the stored fetch snapshot to compare.

Scoring

Choose the quality benchmark and final score formula.
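
The score formula itself is not shown on this page. In the rows listed under the result set, the Qual column agrees with the plain average of the Intel, Code, and Agent columns to within rounding of the displayed values. A minimal sketch under that assumption follows; the function name `quality_score` is hypothetical and the equal weighting is inferred from the numbers, not confirmed by the page.

```python
def quality_score(intel: float, code: float, agent: float) -> float:
    """Assumed composite: unweighted mean of the three benchmark columns."""
    return round((intel + code + agent) / 3, 1)


# GPT-5.5 (xhigh) row: Intel 60.2, Code 59.1, Agent 74.1
print(quality_score(60.2, 59.1, 74.1))
```

With the current settings the Score column equals Qual in every row, so no further adjustment is modeled here.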

Cost adjustment

Tune formulas that account for benchmark cost.
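
The cost formulas are not spelled out on this page. The Pen column in the result table is consistent with roughly ten times the base-10 logarithm of the Cost$ value, floored at zero (for example $3,357 gives 35.3 and $0.10 gives 0.0). The sketch below encodes that reading; the function name `cost_penalty`, the `scale` parameter, and the zero floor are assumptions inferred from the displayed numbers.

```python
import math


def cost_penalty(cost_usd: float, scale: float = 10.0) -> float:
    """Assumed penalty: scale * log10(cost), never below zero."""
    if cost_usd <= 0:
        return 0.0  # log10 is undefined here; treat as no penalty
    return round(max(0.0, scale * math.log10(cost_usd)), 1)


# Costs taken from the result table
for cost in (3357, 1199, 4.27, 0.10):
    print(cost, cost_penalty(cost))
```

A logarithmic penalty of this kind grows by a fixed amount for each tenfold increase in cost, which keeps very expensive runs from dominating the adjustment.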

Result set

Control table filtering, ordering, and row count.

Top quality model in this view: GPT-5.5 (xhigh) (64.5 pts, $3,357) · sorted by score

Historic #1 winner

Top row for each successful snapshot using the current mode/calc/cost settings and score sort.

Latest #1: GPT-5.5 (xhigh), score 64.5

[Chart: #1 score by snapshot, series GPT-5.5 (xhigh); y-axis 63.5 to 65.5]

Recent winner changes

Run  Fetched              Winner           Score  Qual  Cost
#2   2026-05-12 00:00:09  GPT-5.5 (xhigh)  64.5   64.5  $3,357
#1   2026-05-11 23:17:33  GPT-5.5 (xhigh)  64.5   64.5  $3,357
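
The winner history above takes the top row of each snapshot under a score sort. A rough sketch of that selection is shown below, assuming each snapshot is a list of rows with a score and a cost. The names `Row`, `top_row`, and `winner_history` are hypothetical, and the tie-break toward the cheaper model is an assumption, since the page does not state one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    score: float
    cost_usd: float


def top_row(rows):
    """Return the first row under a descending score sort.

    Ties go to the cheaper model (an assumed tie-break).
    """
    return max(rows, key=lambda r: (r.score, -r.cost_usd))


def winner_history(snapshots):
    """Map each snapshot label to its top row, skipping snapshots with no rows."""
    return {label: top_row(rows) for label, rows in snapshots.items() if rows}


# Only snapshot #2 rows are listed on this page, so only that one is filled in.
snapshots = {
    "#2 2026-05-12 00:00:09": [
        Row("GPT-5.5 (high)", 63.1, 2159),
        Row("GPT-5.5 (xhigh)", 64.5, 3357),
        Row("Gemini 3.1 Pro Preview", 57.3, 892),
    ],
}

for label, row in winner_history(snapshots).items():
    print(label, row.model, row.score)
```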
#   Pareto  Model                                       Provider             Released     Cost$    $/Q  Qual   ΔTop  Intel  Code  Agent   Pen  Score
1           GPT-5.5 (xhigh)                             OpenAI               2026-04-23  $3,357  52.05  64.5    0.0   60.2  59.1   74.1  35.3   64.5
2           GPT-5.5 (high)                              OpenAI               2026-04-23  $2,159  34.21  63.1   -1.4   58.9  58.5   72.0  33.3   63.1
3           GPT-5.5 (medium)                            OpenAI               2026-04-23  $1,199  19.73  60.8   -3.7   56.7  56.2   69.4  30.8   60.8
4           Gemini 3.1 Pro Preview                      Google               2026-02-19    $892  15.58  57.3   -7.2   57.2  55.5   59.1  29.5   57.3
5           MiMo-V2.5-Pro                               Xiaomi               2026-04-22    $462   8.30  55.6   -8.9   53.8  45.5   67.4  26.6   55.6
6           Grok 4.3                                    xAI                  2026-04-30    $395   7.40  53.4  -11.1   53.2  41.0   65.9  26.0   53.4
7           MiMo-V2.5                                   Xiaomi               2026-04-22    $207   3.97  52.2  -12.3   49.0  42.1   65.5  23.2   52.2
8           MiniMax-M2.7                                MiniMax              2026-03-18    $176   3.44  51.0  -13.5   49.6  41.9   61.5  22.4   51.0
9           DeepSeek V4 Flash (Reasoning, Max Effort)   DeepSeek             2026-04-24    $113   2.31  48.8  -15.7   46.5  38.7   61.3  20.5   48.8
10          DeepSeek V4 Flash (Reasoning, High Effort)  DeepSeek             2026-04-24   $57.2   1.20  47.5  -17.0   44.9  39.8   57.8  17.6   47.5
11          DeepSeek V4 Flash (Non-reasoning)           DeepSeek             2026-04-24   $40.0   0.90  44.3  -20.2   36.5  35.1   61.3  16.0   44.3
12          Grok 4.1 Fast (Reasoning)                   xAI                  2025-11-19   $39.6   1.00  39.6  -24.9   38.6  30.9   49.3  16.0   39.6
13          MiMo-V2-Flash (Non-reasoning)               Xiaomi               2025-12-16   $21.4   0.62  34.5  -30.0   30.4  25.8   47.3  13.3   34.5
14          Grok 4.1 Fast (Non-reasoning)               xAI                  2025-11-19   $21.4   0.84  25.3  -39.2   23.6  19.5   33.0  13.3   25.3
15          Grok 4 Fast (Non-reasoning)                 xAI                  2025-09-19   $17.4   0.71  24.7  -39.8   23.1  19.0   31.9  12.4   24.7
16          gpt-oss-120B (low)                          OpenAI               2025-08-05   $15.9   0.70  22.7  -41.8   24.5  15.5   28.0  12.0   22.7
17          gpt-oss-20B (low)                           OpenAI               2025-08-05   $7.68   0.40  19.0  -45.5   20.8  14.4   21.9   8.9   19.0
18          Qwen3.5 0.8B (Non-reasoning)                Alibaba              2026-03-02   $6.67   0.61  10.9  -53.6    9.9   1.0   21.7   8.2   10.9
19          Granite 4.0 H Small                         IBM                  2025-09-22   $4.48   0.54   8.4  -56.1   10.8   8.5    5.8   6.5    8.4
20          Phi-4                                       Microsoft            2024-12-12   $4.27   0.59   7.2  -57.3   10.4  11.2    0.0   6.3    7.2
21          Apertus 70B Instruct                        Swiss AI Initiative  2025-09-02   $3.78   0.82   4.6  -59.9    7.7   1.9    4.3   5.8    4.6
22          Apertus 8B Instruct                         Swiss AI Initiative  2025-09-02   $0.10   0.03   3.7  -60.8    5.9   1.4    3.8   0.0    3.7
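
Several columns in the table can be approximately reproduced from Cost$ and Qual alone: $/Q looks like cost divided by quality, and ΔTop looks like the gap to the highest Qual in the view. The sketch below computes both, plus a Pareto flag using the usual cost/quality dominance test. The function name `derive_columns` and the dominance rule are assumptions; the page does not say how it marks Pareto rows, and small differences from the displayed $/Q values are expected because the inputs shown are rounded.

```python
def derive_columns(rows):
    """rows: list of (model, cost_usd, qual) tuples; returns one dict per row."""
    top_qual = max(qual for _, _, qual in rows)
    derived = []
    for model, cost, qual in rows:
        # Dominated: some other row costs no more and scores no worse,
        # and is strictly better on at least one of the two axes.
        dominated = any(
            c <= cost and q >= qual and (c < cost or q > qual)
            for _, c, q in rows
        )
        derived.append({
            "model": model,
            "usd_per_q": round(cost / qual, 2) if qual > 0 else None,
            "delta_top": round(qual - top_qual, 1),
            "pareto": not dominated,
        })
    return derived


# Four rows copied from the table (Cost$ and Qual as displayed)
rows = [
    ("GPT-5.5 (xhigh)", 3357, 64.5),
    ("Grok 4.1 Fast (Reasoning)", 39.6, 39.6),
    ("MiMo-V2-Flash (Non-reasoning)", 21.4, 34.5),
    ("Grok 4.1 Fast (Non-reasoning)", 21.4, 25.3),
]

for d in derive_columns(rows):
    print(d)
```

With the rounded costs shown, Grok 4.1 Fast (Non-reasoning) comes out dominated by MiMo-V2-Flash (Non-reasoning) at the same displayed cost; the page's own flag could differ if the unrounded costs differ.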