Seq2seq models reconstruct visual jigsaw puzzles without seeing them
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 2
citation-polarity summary
fields: cs.CV (2)
years: 2026 (2)
roles: background (2)
polarities: background (2)
representative citing papers
Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.
citing papers explorer
- The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments
  Introduces the GAP datasets of synthetic, heavily eroded, irregular puzzle pieces modeled on archaeological fragments, along with PuzzleFlow, a ViT and flow-matching framework that outperforms prior jigsaw solvers on these datasets.
- Do multimodal models imagine electric sheep?
  Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.