Review history

arxiv: 2604.18966 · 2 revisions

Self-Improving Tabular Language Models via Iterative Reward-Guided Post-Training

  1. 2026-05-21 UNVERDICTED LOW v0.9.0 novelty 5.0
    43053 ms 5847 in 1249 out 2026-05-21T00:13:19.507341+00:00
  2. 2026-05-10 UNVERDICTED LOW v0.9.0 novelty 7.0
    36734 ms 5573 in 1324 out 2026-05-10T03:01:14.803598+00:00