Human: 40.9% yes, 59.1% no (656 explicit votes)
Model average: 8.7% yes, 91.3% no across the current model set
Most aligned model: meta-llama/llama-3.2-11b-vision-instruct
Least aligned model: anthropic/claude-sonnet-4.6 (59.1 point gap)
Closest current model: 31.4% yes
Legacy GPT-4o baseline: 0.0% yes, a 40.9 point gap against humans
Biggest model gap: 59.1 percentage points on this image
Current classification: Human knife-edge
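The gap figures above appear to be the absolute difference between the human yes rate and a model's yes rate, in percentage points. That reading is an assumption, not something the page states, but it matches the GPT-4o baseline (0.0% yes against 40.9% human yes gives 40.9 points). A minimal sketch under that assumption, with `yes_gap` as a hypothetical helper name:

```python
def yes_gap(human_yes_pct: float, model_yes_pct: float) -> float:
    """Absolute yes-rate difference in percentage points.

    Assumed definition: the page does not spell out how gaps are computed.
    Rounded to one decimal place to match the precision shown on the page.
    """
    return round(abs(human_yes_pct - model_yes_pct), 1)


HUMAN_YES = 40.9  # human yes rate from the page

# Legacy GPT-4o baseline: 0.0% yes
print(yes_gap(HUMAN_YES, 0.0))   # 40.9

# Closest current model: 31.4% yes
print(yes_gap(HUMAN_YES, 31.4))  # 9.5
```

Under the same assumption, a 59.1 point gap is only reachable from a 40.9% human yes rate if the model answers yes every time, since the gap in the no direction is capped at 40.9 points.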

Benchmark image 04
Sandwich Costume
Human "Sandwich"
A parade line of humans dressed as bread, cheese, meat, and tomato forms a structurally convincing sandwich that still fails the crucial requirement of being lunch. It is the kind of edge case that makes literalists sound insane and compositionalists sound worse.
Under development: this benchmark and its published results are provisional, not final.
Why this page matters
This is the compact read on where humans, models, and comments start disagreeing about the same image.
Model spread
How models line up against the crowd
Bars run from more no on the left to more yes on the right. The marker shows the human yes rate.
amazon/nova-lite-v1
amazon/nova-pro-v1
baidu/ernie-4.5-vl-28b-a3b
google/gemini-3-flash-preview
google/gemini-3.1-pro-preview
google/gemma-3-12b-it
google/gemma-3-27b-it
meta-llama/llama-4-maverick
meta-llama/llama-4-scout
minimax/minimax-01
openai/gpt-4.1
openai/gpt-4.1-mini
openai/gpt-4o
openai/gpt-4o-2024-11-20
openai/gpt-4o-mini
openai/gpt-5.4
openai/gpt-5.4-pro
openai/o3
openai/o3-pro
qwen/qwen2.5-vl-32b-instruct
z-ai/glm-4.6v
moonshotai/kimi-k2.5
qwen/qwen2.5-vl-72b-instruct
qwen/qwen-2-vl-72b-instruct
mistralai/pixtral-large-2411
qwen/qwen3.5-397b-a17b
x-ai/grok-4-fast
google/gemini-2.5-pro
meta-llama/llama-3.2-11b-vision-instruct
anthropic/claude-opus-4.6
anthropic/claude-sonnet-4.6