Entropy-adaptive Gumbel-Sinkhorn formulation for unsupervised permutation learning that modulates temperature per assignment to address non-uniform uncertainty.
Seq2seq models reconstruct visual jigsaw puzzles without seeing them
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 2
citation-polarity summary
years: 2026 (3)
roles: background 2
polarities: background 2
representative citing papers
Introduces GAP datasets of synthetic heavily eroded irregular puzzle pieces modeled on archaeological fragments and PuzzleFlow, a ViT and flow-matching framework that outperforms prior jigsaw solvers on these datasets.
Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.
citing papers explorer
- Do multimodal models imagine electric sheep?
Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.