A GEMM-centric taxonomy and unified benchmark show static depth pruning as the strongest Pareto-optimal baseline for LLM inference acceleration, with the frontier shifting to dynamic depth then static width pruning as quality loss rises.
A survey of scientific large language models: From data foundations to agent frontiers
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 8roles
background 1polarities
background 1representative citing papers
Battery-Sim-Agent reframes inverse battery parameter estimation as an LLM reasoning task in closed loop with a simulator and outperforms Bayesian optimization baselines on diverse benchmarks.
ReCrit frames critic interaction as a correctness-transition problem and uses quadrant-based RL rewards to improve LLM performance on scientific reasoning benchmarks by rewarding corrections and robustness while penalizing sycophancy.
TeleCom-Bench reveals LLMs reach 90% on telecom intent and entity tasks but drop to 30% on solution generation and root cause analysis in live network scenarios.
AgentEconomist is an end-to-end agentic system with idea development, experimental design, and execution stages that uses a large economics paper database to produce research ideas with better literature grounding, novelty, and insight than generic LLMs.
Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.
AblateCell reproduces baselines in three single-cell perturbation repositories with 88.9% success and recovers ground-truth critical components with 93.3% accuracy via closed-loop ablation.
citing papers explorer
-
Heterogeneous Scientific Foundation Model Collaboration
Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.