Cost-Aware SGD samples by gradient-norm-to-cost ratio and is instantiated as Cost-Aware GRPO for length-dependent policy gradients, reducing tokens used in LLM RL while matching baseline accuracy.
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
A multi-domain matrix framework is proposed for HR decision support in startups, illustrated by a case study identifying workload imbalances and informing hiring.
citing papers explorer
- Cost-Aware Learning
Cost-Aware SGD samples by gradient-norm-to-cost ratio and is instantiated as Cost-Aware GRPO for length-dependent policy gradients, reducing tokens used in LLM RL while matching baseline accuracy.