E-grpo: High entropy steps drive effective reinforcement learning for flow models

Shengjun Zhang, Zhang Zhang, Chensheng Dai, Yueqi Duan · 2026 · arXiv 2601.00423

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 2

citation-polarity summary

background 1 unclear 1

representative citing papers

Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation

cs.CV · 2026-04-21 · unverdicted · novelty 7.0

OTCA improves GRPO training for visual generation by estimating step importance in trajectories and adaptively weighting multiple reward objectives.

Sketch Then Paint: Hierarchical Reinforcement Learning for Diffusion Multi-Modal Large Language Models

cs.AI · 2026-05-16 · unverdicted · novelty 6.0

Proposes HT-GRPO with sketch-then-paint staged updates, prompt-conditioned importance ratios, and hierarchical credit assignment for dMLLMs, reporting gains on GenEval and DPG plus quality metrics.

Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Super-Linear Advantage Shaping (SLAS) introduces a non-linear geometric policy update for RL post-training of text-to-image models that reshapes the local policy space via advantage-dependent Fisher-Rao weighting to reduce reward hacking and improve performance over GRPO baselines.

Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval

cs.IR · 2026-05-07 · unverdicted · novelty 5.0

SIRA compresses multi-round exploratory retrieval into one LLM-guided, corpus-statistic-validated weighted BM25 query and reports superior results over dense retrievers and agentic baselines on BEIR benchmarks.

A Systematic Post-Train Framework for Video Generation

cs.CV · 2026-04-28 · unverdicted · novelty 5.0

A post-training pipeline for video generation models combines SFT, RLHF with novel GRPO, prompt enhancement, and inference optimization to improve visual quality, temporal coherence, and instruction following.

citing papers explorer

Showing 5 of 5 citing papers.

Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation cs.CV · 2026-04-21 · unverdicted · none · ref 50
OTCA improves GRPO training for visual generation by estimating step importance in trajectories and adaptively weighting multiple reward objectives.
Sketch Then Paint: Hierarchical Reinforcement Learning for Diffusion Multi-Modal Large Language Models cs.AI · 2026-05-16 · unverdicted · none · ref 49
Proposes HT-GRPO with sketch-then-paint staged updates, prompt-conditioned importance ratios, and hierarchical credit assignment for dMLLMs, reporting gains on GenEval and DPG plus quality metrics.
Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping cs.CV · 2026-05-11 · unverdicted · none · ref 117
Super-Linear Advantage Shaping (SLAS) introduces a non-linear geometric policy update for RL post-training of text-to-image models that reshapes the local policy space via advantage-dependent Fisher-Rao weighting to reduce reward hacking and improve performance over GRPO baselines.
Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval cs.IR · 2026-05-07 · unverdicted · none · ref 17
SIRA compresses multi-round exploratory retrieval into one LLM-guided, corpus-statistic-validated weighted BM25 query and reports superior results over dense retrievers and agentic baselines on BEIR benchmarks.
A Systematic Post-Train Framework for Video Generation cs.CV · 2026-04-28 · unverdicted · none · ref 27
A post-training pipeline for video generation models combines SFT, RLHF with novel GRPO, prompt enhancement, and inference optimization to improve visual quality, temporal coherence, and instruction following.

E-grpo: High entropy steps drive effective reinforcement learning for flow models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer