ViroBench benchmarks 66 nucleotide foundation models on viral tasks, finding weak extrapolation under shifts, decoupling of likelihood from functional validity in generation, and greater value from taxonomic diversity than model scale.
Generator: a long-context generative genomic foundation model
5 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (5)
verdicts: UNVERDICTED (5)
representative citing papers
LDARNet learns adaptive token boundaries via dynamic chunking in a genomic foundation model and reports gains on histone modification tasks over larger models.
AURORA is a representation learning framework that uses contextual orthogonalization and relational alignment to create disentangled, geometrically interpretable latent spaces in healthcare foundation models.
WISTERIA learns robust clinical representations from noisy EHR labels by enforcing consistency across multiple weak supervision views plus ontology regularization.
DNA pretraining suffers from inappropriate evaluation datasets, flawed neighbor-masking, and neglected vocabulary design; the authors supply guidelines and a reproducible testbed to fix them.
citing papers explorer
- LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling