SPEC CPU 2026 presents a new benchmark suite using open-source apps, expanded multithreading, and Rolling-Round-Robin Rate to address gaps in evaluating heterogeneous multiprogrammed CPU performance.
Insights into deepseek-v3: Scaling challenges and reflections on hardware for ai architectures
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 5roles
background 1polarities
background 1representative citing papers
Entrain reduces microbatch workload variability by up to 10.6x and improves multimodal LLM training throughput by 1.4x via static model parallelism and deferred hierarchical microbatch assignment.
Sim-FA is a new simulator that instruments FlashAttention-3 for cycle-accurate GPGPU analysis, achieving 5.7% average error on H800 while explaining inaccuracies in existing DRAM traffic models.
MLLMs exhibit a consistent recognition-reasoning inversion on discrete visual symbols across domains, underperforming on elementary perception while appearing competent on higher-level reasoning via linguistic compensation.
PRISM introduces a probabilistic performance modeling framework that quantifies guarantees on training time for large-scale distributed systems under runtime variability.
citing papers explorer
-
SPEC CPU: The Next Generation
SPEC CPU 2026 presents a new benchmark suite using open-source apps, expanded multithreading, and Rolling-Round-Robin Rate to address gaps in evaluating heterogeneous multiprogrammed CPU performance.
-
Addressing Variable Heterogeneity in Distributed Multimodal Training with Entrain
Entrain reduces microbatch workload variability by up to 10.6x and improves multimodal LLM training throughput by 1.4x via static model parallelism and deferred hierarchical microbatch assignment.
-
Sim-FA: A GPGPU Simulator Framework for Fine-Grained FlashAttention Pipeline Analysis
Sim-FA is a new simulator that instruments FlashAttention-3 for cycle-accurate GPGPU analysis, achieving 5.7% average error on H800 while explaining inaccuracies in existing DRAM traffic models.
-
Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
MLLMs exhibit a consistent recognition-reasoning inversion on discrete visual symbols across domains, underperforming on elementary perception while appearing competent on higher-level reasoning via linguistic compensation.