Missing old logits in asynchronous agentic RL entangle the discrepancy and staleness terms in PPO's off-policy correction; exact acquisition methods and a revised PPO-EWMA restore decoupled updates, with reported gains in speed and performance.
Ernie 5.0 technical report
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
PaddleOCR-VL-1.6 improves on PaddleOCR-VL-1.5 via region-aware data optimization and progressive post-training to reach 96.33% on OmniDocBench v1.6.
citing papers explorer
- PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training