Task-reward reinforcement learning yields robust gains on math benchmarks for models like Llama-3.2-3B while distribution sharpening alone delivers only limited and unstable improvements.
Flow network based generative models for non-iterative diverse candidate generation
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
AlphaSAGE is a GFlowNet framework with an RGCN structure-aware encoder and dense multi-faceted rewards that mines diverse, novel, and predictive formulaic alphas for quantitative trading.
citing papers explorer
-
Beyond Distribution Sharpening: The Importance of Task Rewards
Task-reward reinforcement learning yields robust gains on math benchmarks for models like Llama-3.2-3B while distribution sharpening alone delivers only limited and unstable improvements.
-
AlphaSAGE: Structure-Aware Alpha Mining via GFlowNets for Robust Exploration
AlphaSAGE is a GFlowNet framework with an RGCN structure-aware encoder and dense multi-faceted rewards that mines diverse, novel, and predictive formulaic alphas for quantitative trading.