[P] LLM Jigsaw: Benchmarking Spatial Reasoning in VLMs - frontier models hit a wall at 5×5 puzzles
r/MachineLearning
21 ↑
2 💬
1 week ago
LLM Jigsaw: Benchmarking Spatial Reasoning in VLMs - GPT 5.2 and other frontier models hit a wall at 5×5 puzzles
LLM Jigsaw: Benchmarking Spatial Reasoning in VLMs - Claude hits a wall at 3×3, other frontier models at 5×5 puzzles