Toolforge: A data synthesis pipeline for multi-hop search without real-world APIs

Hao Chen, Zhexin Hu, Jiajun Chai, Haocheng Yang, Hang He, Xiaohan Wang, Wei Lin, Luhang Wang, Guojun Yin, et al. · 2025 · arXiv 2512.16149

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · baseline 1

citation-polarity summary

background 1 · baseline 1

representative citing papers

RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems

cs.IR · 2026-05-12 · unverdicted · novelty 7.0

RecRM-Bench is a new large-scale benchmark dataset and framework for multi-dimensional reward modeling in agentic recommender systems, spanning instruction following, factual consistency, query-item relevance, and user behavior prediction.

$\pi$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data

cs.LG · 2026-04-15

citing papers explorer

Showing 2 of 2 citing papers.

RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems · cs.IR · 2026-05-12 · unverdicted · none · ref 5
$\pi$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data · cs.LG · 2026-04-15 · unreviewed · ref 2
