Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

years: 2026 (2), 2024 (1)

verdicts: unverdicted (3)

representative citing papers

Pareto-Guided Optimal Transport for Multi-Reward Alignment

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

PG-OT builds prompt-specific Pareto frontiers and applies distribution-aware optimal transport to improve multi-reward alignment while introducing JDR and JCR metrics to measure synergy and hacking.

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

cs.AI · 2024-08-13 · unverdicted · novelty 6.0

Agent Q integrates MCTS-guided search, self-critique, and off-policy DPO to train LLM agents that outperform behavior cloning and reinforced fine-tuning baselines in WebShop and achieve up to 95.4% success in real-world booking scenarios.

Controllable Molecular Generative Foundation Models

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

CoMole combines motif-aware graph diffusion with RL policy optimization to deliver controllable molecular generation that outperforms baselines on nine targets across materials and drug benchmarks while keeping high validity.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Pareto-Guided Optimal Transport for Multi-Reward Alignment · cs.CV · 2026-05-13 · unverdicted · none · ref 24

  • Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents · cs.AI · 2024-08-13 · unverdicted · none · ref 234

  • Controllable Molecular Generative Foundation Models · cs.LG · 2026-05-14 · unverdicted · none · ref 30