Transformers learn connectivity on low-dimensional grid graphs but fail on high-dimensional grids or graphs with many disconnected components, with larger models showing better generalization on grids.
Entropy law: The story behind data compression and llm performance
3 Pith papers cite this work. Polarity classification is still indexing.
years
2025 3verdicts
UNVERDICTED 3representative citing papers
MIST models up to 10x larger than prior work, fine-tuned on over 400 structure-property tasks, match or exceed SOTA on benchmarks and demonstrate zero-shot olfactory perception mapping consistent with hyperbolic geometry.
Selecting preference pairs whose DPO implicit reward gap is small yields better LLM alignment than random or baseline selection while using only 10% of the data.
citing papers explorer
-
Transformers Can Learn Connectivity in Some Graphs but Not Others
Transformers learn connectivity on low-dimensional grid graphs but fail on high-dimensional grids or graphs with many disconnected components, with larger models showing better generalization on grids.
-
Foundation Models for Discovery and Exploration in Chemical Space
MIST models up to 10x larger than prior work, fine-tuned on over 400 structure-property tasks, match or exceed SOTA on benchmarks and demonstrate zero-shot olfactory perception mapping consistent with hyperbolic geometry.
-
Difficulty-Based Preference Data Selection by DPO Implicit Reward Gap
Selecting preference pairs whose DPO implicit reward gap is small yields better LLM alignment than random or baseline selection while using only 10% of the data.