4 Pith papers cite this work.
Citing papers by year: 2026 (4)
citing papers explorer
- SAGE: A Service Agent Graph-guided Evaluation Benchmark
  SAGE is a new multi-agent benchmark that formalizes service SOPs as dynamic dialogue graphs to measure LLM agents on logical compliance and path coverage, uncovering an execution gap and empathy resilience across 27 models in 6 scenarios.
- Dual-Tree LLM-Enhanced Negative Sampling for Implicit Collaborative Filtering
  DTL-NS introduces hierarchical index trees and LLM inference on item-ID encodings to identify false negatives and perform multi-view hard negative sampling for improved implicit CF recommendation.
- A General Framework for Multimodal LLM-Based Multimedia Understanding in Large-Scale Recommendation Systems
  A framework integrates MM-LLMs into recommendation systems via caption generation as categorical features, reporting 0.35% offline AUC lift and 0.02% online metric improvement.
- BEAR: Towards Beam-Search-Aware Optimization for Recommendation with Large Language Models