Deep sets
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
Self-consistency training on real data improves amortized Bayesian model comparison accuracy under distribution shifts, especially in open-world misspecification when analytic or locally accurate surrogate likelihoods are available.
citing papers explorer
- AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents
  AssayBench is a new gene-ranking benchmark for phenotypic CRISPR screens that shows zero-shot generalist LLMs outperform both biology-specific LLMs and trainable baselines on adjusted nDCG.
- Improving the Accuracy of Amortized Model Comparison with Self-Consistency
  Self-consistency training on real data improves amortized Bayesian model comparison accuracy under distribution shifts, especially in open-world misspecification when analytic or locally accurate surrogate likelihoods are available.