Tong Yu

Identifiers

  • name variant Tong Yu 0.60 · backfill

Papers (17)

  1. F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking · cs.LG · 2026 · author #8
  2. OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents · cs.AI · 2026 · author #6
  3. MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization · cs.LG · 2026 · author #6
  4. FERA: Uncertainty-Aware Federated Reasoning for Large Language Models · cs.CL · 2026 · author #6
  5. Skill-R1: Agent Skill Evolution via Reinforcement Learning · cs.LG · 2026 · author #7
  6. Skill-CMIB: Multimodal Agent Skill for Consistent Action via Conditional Multimodal Information Bottleneck · cs.LG · 2026 · author #3
  7. A Survey on LLM-based Conversational User Simulation · cs.CL · 2026 · author #17
  8. Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning · cs.LG · 2026 · author #17
  9. CachePrune: Teaching LLMs What Not to Follow via KV-Cache Editing · cs.CR · 2025 · author #4
  10. Federated Large Language Models: Current Progress and Future Directions · cs.LG · 2024 · author #6
  11. Figure Captioning with Reasoning and Sequence-Level Training · cs.CV · 2019 · author #6
  12. Privacy Partitioning: Protecting User Data During the Deep Learning Inference Phase · cs.CR · 2018 · author #4
  13. Superconductivity in Li6P electride · cond-mat.supr-con · 2018 · author #3
  14. Understanding and Improving Recurrent Networks for Human Activity Recognition by Continuous Attention · cs.LG · 2018 · author #3
  15. Semi-Supervised Convolutional Neural Networks for Human Activity Recognition · cs.LG · 2018 · author #2
  16. Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models · cs.CL · 2017 · author #2
  17. SpectralLeader: Online Spectral Learning for Single Topic Models · cs.LG · 2017 · author #1

Mentions

  • 2409.15723 #6 · arxiv_oai · confidence 0.70 Tong Yu

Frequent Coauthors