R2IF: Aligning Reasoning with Decisions via Composite Rewards for Interpretable LLM Function Calling
cs.LG · 2026-04-22