Understanding reinforcement learning for model training, and future directions with grape. arXiv preprint arXiv:2509.04501, 2025

Rohit Patel · 2025 · arXiv 2509.04501

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Training-Inference Kernel Contracts: Bounding Divergence in Post-Training and Deployment

cs.LG · 2026-05-26 · unverdicted · novelty 5.0

Introduces a kernel contracts framework with derived bounds on divergence from logit drift to reward drift, specialized for RL post-training under support and norm assumptions.

citing papers explorer

Training-Inference Kernel Contracts: Bounding Divergence in Post-Training and Deployment

cs.LG · 2026-05-26 · unverdicted · none · ref 18
