AI Models Toplist by Benchmarks

Model Toplist by ReasoningAverage Benchmark (the higher the better)

Rank  Model                          ReasoningAverage
1     Claude 4 Sonnet Thinking       95.25
2     o3 Pro High                    94.67
3     o3 High                        94.67
4     Gemini 2.5 Pro (Max Thinking)  94.28
5     Gemini 2.5 Pro                 93.72
6     DeepSeek R1 (2025-05-28)       91.08
7     o3 Medium                      91
8     Claude 4 Opus Thinking         90.47
9     o4-Mini High                   88.11
10    Grok 3 Mini Beta (High)        87.61

Model DataAnalysisAverage Leaderboard (the higher the better)

Model Toplist by MathematicsAverage Benchmark (the higher the better)

Rank  Model                          MathematicsAverage
1     Claude 4 Opus Thinking         88.25
2     DeepSeek R1 (2025-05-28)       85.26
3     Claude 4 Sonnet Thinking       85.25
4     o3 High                        85
5     o4-Mini High                   84.9
6     o3 Pro High                    84.75
7     Gemini 2.5 Pro (Max Thinking)  84.19
8     Gemini 2.5 Flash               84.1
9     Gemini 2.5 Pro                 83.33
10    o4-Mini Medium                 81.02

Model LanguageAverage Leaderboard (the higher the better)

Model Toplist by CodingAverage Benchmark (the higher the better)

Rank  Model                          CodingAverage
1     o4-Mini High                   79.98
2     Claude 4 Sonnet                78.25
3     o3 Medium                      77.86
4     ChatGPT-4o                     77.48
5     o3 Pro High                    76.78
6     o3 High                        76.71
7     DeepSeek R1                    76.07
8     GPT-4.5 Preview                76.07
9     Claude 3.7 Sonnet              74.28
10    o4-Mini Medium                 74.22

Model IFAverage Leaderboard (the higher the better)