Human: 66.3% yes, 33.7% no (655 explicit votes)
Model average: 64.9% yes, 35.1% no across the current model set
Most aligned model: qwen/qwen-2-vl-72b-instruct (closest current model, 75.0% yes)
Least aligned models: 6-way tie (qwen/qwen2.5-vl-32b-instruct, google/gemma-3-27b-it, mistralai/pixtral-large-2411, +3 more), 66.3 point gap
Legacy GPT-4o baseline: 100.0% yes, a 33.7 point gap against humans
Biggest model gap: 66.3 percentage points on this image
Current classification: Split concept

WIC
Split concept
Benchmark image 16
Waffle Ice Cream
Waffle ice cream "Sandwich"
Ice cream wedged between waffles presents itself as a dessert sandwich with zero shame and excellent marketing instincts. It is not lunch, but it absolutely understands the assignment.
Under development: this benchmark and its published results are provisional, not final.
At a glance
How this photo split the room
Most aligned model: qwen/qwen-2-vl-72b-instruct
Least aligned models: 6-way tie
Why this page matters
This is the compact read on where humans, models, and comments start disagreeing about the same image.
Model spread
How models line up against the crowd
Bars run from more no on the left to more yes on the right. The marker shows the human yes rate.
amazon/nova-pro-v1
baidu/ernie-4.5-vl-28b-a3b
google/gemma-3-27b-it
mistralai/pixtral-large-2411
openai/gpt-4o-mini
qwen/qwen2.5-vl-32b-instruct
amazon/nova-lite-v1
openai/gpt-4.1-mini
google/gemma-3-12b-it
meta-llama/llama-3.2-11b-vision-instruct
openai/gpt-5.4
minimax/minimax-01
qwen/qwen-2-vl-72b-instruct
meta-llama/llama-4-maverick
qwen/qwen2.5-vl-72b-instruct
openai/gpt-4o-2024-11-20
qwen/qwen3.5-397b-a17b
anthropic/claude-opus-4.6
anthropic/claude-sonnet-4.6
google/gemini-2.5-pro
google/gemini-3-flash-preview
google/gemini-3.1-pro-preview
meta-llama/llama-4-scout
moonshotai/kimi-k2.5
openai/gpt-4.1
openai/gpt-4o
openai/gpt-5.4-pro
openai/o3
openai/o3-pro
x-ai/grok-4-fast
z-ai/glm-4.6v
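The point gaps quoted on this page look like absolute differences between a model's yes rate and the human yes rate, in percentage points. The page does not state the formula, so the sketch below is an assumption that simply reproduces the published figures; `yes_gap` and the `reported` dictionary are hypothetical names introduced for illustration.

```python
def yes_gap(model_yes_pct: float, human_yes_pct: float) -> float:
    """Absolute gap in percentage points between a model and the human yes rate.

    Assumption: the page does not define "gap"; absolute difference is a guess
    that happens to match the figures it reports.
    """
    return round(abs(model_yes_pct - human_yes_pct), 1)


HUMAN_YES = 66.3  # human yes rate reported for this image

# Yes rates that the page states explicitly.
reported = {
    "qwen/qwen-2-vl-72b-instruct": 75.0,  # closest current model
    "legacy GPT-4o baseline": 100.0,
}

gaps = {name: yes_gap(pct, HUMAN_YES) for name, pct in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap} point gap")
```

Under the same assumption, the 66.3 point gap reported for the least aligned models would correspond to a 0% yes rate, since `yes_gap(0.0, 66.3)` also gives 66.3. The page itself does not list those per-model rates.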