DoMinO reformulates discrete flow matching sampling as an MDP for unbiased RL fine-tuning with new TV regularizers, yielding better enhancer activity and naturalness on DNA design tasks.
arXiv preprint arXiv:2509.22623 , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.LG 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
dFlowGRPO is a new rate-aware RL method for discrete flow models that outperforms prior GRPO approaches on image generation and matches continuous flow models while supporting broad probability paths.
citing papers explorer
-
Discrete Flow Matching Policy Optimization
DoMinO reformulates discrete flow matching sampling as an MDP for unbiased RL fine-tuning with new TV regularizers, yielding better enhancer activity and naturalness on DNA design tasks.
-
dFlowGRPO: Rate-Aware Policy Optimization for Discrete Flow Models
dFlowGRPO is a new rate-aware RL method for discrete flow models that outperforms prior GRPO approaches on image generation and matches continuous flow models while supporting broad probability paths.