Kingma and Jimmy Ba
4 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Bridging Sequence and Graph Structure for Epigenetic Age Prediction
  A sequence-graph model using gated modulation of methylation signals by eight handcrafted DNA sequence features achieves 3.149 years MAE on 3707 samples, a 12.8% gain over graph baselines.
- Fitting Multilinear Polynomials for Logic Gate Networks
  Fitting logic gates as 4D multilinear polynomials with covariance Jacobian selection matches or beats 16D softmax baselines on seven datasets and remains stable at 12-layer depth where the baseline drops 37 points on CIFAR-10.
- OSDN: Improving Delta Rule with Provable Online Preconditioning in Linear Attention
  OSDN adds online diagonal preconditioning to the Delta Rule, preserving chunkwise parallelism while proving super-geometric convergence and delivering 32-39% recall gains at 340M-1.3B scales.
- Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency
  LBW-Guard is a bounded autonomous control layer above AdamW that improves stability, reduces perplexity, and speeds up training for Qwen2.5 models under learning-rate stress on WikiText-103.