pith. machine review for the scientific record.

Review history

arxiv: 2605.04539 · 3 revisions

RLearner-LLM: Balancing Logical Grounding and Fluency in Large Language Models via Hybrid Direct Preference Optimization

  1. 2026-05-13 UNVERDICTED LOW v0.9.0 novelty 6.0
    66860 ms 5641 in 1428 out 2026-05-13T07:02:54.627587+00:00
  2. 2026-05-12 UNVERDICTED LOW v0.9.0 novelty 6.0
    49506 ms 5641 in 1227 out 2026-05-12T03:22:24.978979+00:00
  3. 2026-05-08 UNVERDICTED LOW v0.9.0 novelty 5.0
    76395 ms 5641 in 1227 out 2026-05-08T16:49:01.763802+00:00