PaLI jointly scales a 4B-parameter vision transformer with language models on a new 10B multilingual image-text dataset to reach state-of-the-art results on vision-language tasks while keeping a simple modular design.
Deep Learning
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1
polarities
background 1
citing papers explorer
- PaLI: A Jointly-Scaled Multilingual Language-Image Model
  PaLI jointly scales a 4B-parameter vision transformer with language models on a new 10B multilingual image-text dataset to reach state-of-the-art results on vision-language tasks while keeping a simple modular design.
- SPaCe: Unlocking Sample-Efficient Large Language Models Training With Self-Pace Curriculum Learning
  SPaCe uses semantic clustering to shrink training sets and a multi-armed bandit to adaptively select samples, matching or beating baselines on reasoning benchmarks with up to 100x fewer examples.
- Bayesian Surrogate Training on Multiple Data Sources: A Hybrid Modeling Strategy
  Two hybrid Bayesian surrogate training approaches integrate simulation and real-world data via a weighting strategy independent of surrogate family, shown in synthetic and real case studies to improve accuracy and diagnose simulation issues.
- Hierarchical Reasoning Model
  HRM is a recurrent architecture with high-level planning and low-level execution modules that reaches near-perfect accuracy on complex Sudoku, maze navigation, and ARC benchmarks using 27M parameters and 1000 samples without pre-training or CoT supervision.