Model breakdown
Grok / xAI (x-ai/grok-4)
Profile: Grok 4, xAI
Release: 2025-07-09, available on OpenRouter
Specs: Not publicly disclosed; 256,000 tokens
Capabilities: Image + Text. Frontier multimodal reasoning with native tool use and real-time search
Training: Not publicly disclosed; xAI
Rank: #12
Alignment score: -601.1
Crowd match: 76.7%
Mean gap: 23.3%
Human match: 76.7%
Best fit: Bacon Lettuce Tomato
Average vote: 57.9%
Model yes: 57.9%
Human yes: 62.8%
Workload: 2K evals, 100 iterations, 3M tokens
Photo-by-photo

Model Results

A breakdown of how closely the model's answer to each question matched the human answers.
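The page does not define how a per-photo gap is computed. One plausible reading, offered here only as an assumption, is that the gap is the absolute difference between the model's yes share and the human yes share for the same photo. Under that reading, a model share and a gap together narrow down the human share. The helper name `implied_human_yes` is hypothetical and is not part of the site.

```python
def implied_human_yes(model_yes: float, gap: float) -> list[float]:
    """Return the human yes shares (in percent) consistent with a gap.

    Assumption: gap = |model_yes - human_yes|. Two candidates exist,
    one on each side of the model share; any candidate outside the
    0..100 range is discarded.
    """
    candidates = {round(model_yes - gap, 1), round(model_yes + gap, 1)}
    return sorted(c for c in candidates if 0.0 <= c <= 100.0)


# Dodge Van: model 0.0% yes, gap 7.0% -> only one valid candidate.
print(implied_human_yes(0.0, 7.0))    # [7.0]
# Cookie PB: model 72.0% yes, gap 20.5% -> two candidates remain.
print(implied_human_yes(72.0, 20.5))  # [51.5, 92.5]
```

When the model answers 0% or 100% yes, only one candidate survives. For mixed votes both stay possible, and the gap alone cannot say which one applies.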

Photo | Vote split | Gap (absolute) | Read | Human response
(unlabeled) | 100.0% yes / 0.0% no | 3.7% | Leans yes | People mostly said yes
Photo 02 Dodge Van | 0.0% yes / 100.0% no | 7.0% | Leans no | People mostly said no
Photo 03 Sub Sandwich | 100.0% yes / 0.0% no | 5.5% | Leans yes | People mostly said yes
(unlabeled) | 13.0% yes / 87.0% no | 27.9% | Leans no | Human knife-edge
(unlabeled) | 100.0% yes / 0.0% no | 4.4% | Leans yes | People mostly said yes
(unlabeled) | 100.0% yes / 0.0% no | 8.3% | Leans yes | People mostly said yes
(unlabeled) | 14.0% yes / 86.0% no | 40.2% | Leans no | Human knife-edge
Photo 08 Hamburger | 100.0% yes / 0.0% no | 27.0% | Leans yes | Split concept
(unlabeled) | 37.0% yes / 63.0% no | 22.4% | Leans no | Human knife-edge
Photo 10 Hot Dog | 12.0% yes / 88.0% no | 27.8% | Leans no | Split concept
(unlabeled) | 0.0% yes / 100.0% no | 65.6% | Leans no | Split concept
Photo 12 Avocado Tea | 100.0% yes / 0.0% no | 7.2% | Leans yes | People mostly said yes
Photo 13 Panini | 100.0% yes / 0.0% no | 7.6% | Leans yes | People mostly said yes
Photo 14 Cookie PB | 72.0% yes / 28.0% no | 20.5% | Leans yes | Human knife-edge
Photo 15 Chicken Wrap | 0.0% yes / 100.0% no | 22.6% | Leans no | Split concept
(unlabeled) | 99.0% yes / 1.0% no | 32.7% | Leans yes | Split concept
Photo 17 Sloppy Joe | 100.0% yes / 0.0% no | 20.6% | Leans yes | Split concept
(unlabeled) | 0.0% yes / 100.0% no | 29.8% | Leans no | Split concept
(unlabeled) | 17.0% yes / 83.0% no | 38.7% | Leans no | Human knife-edge
Photo 20 Bagel PB&J | 93.0% yes / 7.0% no | 46.4% | Leans yes | Human knife-edge
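The summary figures near the top of the page can be cross-checked against the 20 table rows. The sketch below assumes that "Mean gap" is the plain average of the per-photo gaps, that "Human match" is 100 minus that mean, and that "model yes" is the plain average of the model's yes shares. None of these definitions is stated on the page, and the site may compute them from unrounded values. The names `ROWS`, `one_decimal`, and `summarize` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# (model yes %, absolute gap %) for the 20 table rows, in page order.
ROWS = [
    ("100.0", "3.7"), ("0.0", "7.0"), ("100.0", "5.5"), ("13.0", "27.9"),
    ("100.0", "4.4"), ("100.0", "8.3"), ("14.0", "40.2"), ("100.0", "27.0"),
    ("37.0", "22.4"), ("12.0", "27.8"), ("0.0", "65.6"), ("100.0", "7.2"),
    ("100.0", "7.6"), ("72.0", "20.5"), ("0.0", "22.6"), ("99.0", "32.7"),
    ("100.0", "20.6"), ("0.0", "29.8"), ("17.0", "38.7"), ("93.0", "46.4"),
]


def one_decimal(value: Decimal) -> Decimal:
    """Round to one decimal place, with halves rounded up."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def summarize(rows):
    """Average the yes shares and gaps; derive the assumed human match."""
    n = len(rows)
    mean_yes = sum(Decimal(yes) for yes, _ in rows) / n
    mean_gap = sum(Decimal(gap) for _, gap in rows) / n
    return {
        "model_yes": one_decimal(mean_yes),
        "mean_gap": one_decimal(mean_gap),
        "human_match": one_decimal(Decimal(100) - mean_gap),  # assumed definition
    }


summary = summarize(ROWS)
print(summary["model_yes"], summary["mean_gap"], summary["human_match"])
```

Under these assumptions the script prints 57.9, 23.3, and 76.7, which agree with the model yes, mean gap, and human match figures reported on the page. `Decimal` is used instead of floats so that averages ending in 5, such as 57.85, round the same way every time.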
x-ai/grok-4 Sandwich Benchmark Breakdown | opensandwich.ai