Qwen Team
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
citing papers explorer
- Random Reward Phase-Type Distributions with Applications in Latent Severity Modeling
  Random-reward discrete phase-type distributions are defined and used to construct the two-parameter Inertia-Escalation model for latent severity, with parameter inference and validation on warfare and churn data.
- Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition
  A self-paced curriculum learning module with dual-level difficulty scoring improves weighted F1 scores by 1.2-10.4% when added to existing multimodal emotion recognition models on IEMOCAP and MELD.
- Dynamic Scaled Gradient Descent for Stable Fine-Tuning for Classifications
  Dynamic scaled gradient descent prevents fine-tuning collapse by dynamically down-weighting gradients of correct examples, yielding lower performance variance and higher accuracy than standard methods on classification benchmarks.
- Enhancing LLM-based Search Agents via Contribution Weighted Group Relative Policy Optimization
  CW-GRPO weights GRPO advantages with per-round contribution scores from an LLM judge, improving search agent performance by 5.0% on Qwen3-8B and 6.3% on Qwen3-1.7B over standard GRPO.