Title resolution pending

URL: https://aclanthology · 2024 · DOI: 10.18653/v1/2025.findings-acl.1357

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

SEQUOR: A Multi-Turn Benchmark for Realistic Constraint Following

cs.CL · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

AI models lose over 40% accuracy following multiple constraints in long multi-turn conversations and over 11% even with a single constraint as length increases, per the new SEQUOR benchmark.

Synthetic Users, Real Differences: an Evaluation Framework for User Simulation in Multi-Turn Conversations

cs.CL · 2026-05-04 · unverdicted · novelty 6.0

Realsim shows simulated users fail to reproduce communication frictions present in real multi-turn chatbot dialogues, yielding overly optimistic evaluations with domain-dependent variability.

citing papers explorer

Showing 2 of 2 citing papers.

SEQUOR: A Multi-Turn Benchmark for Realistic Constraint Following
cs.CL · 2026-05-07 · unverdicted · none · ref 1 · 2 links

Synthetic Users, Real Differences: an Evaluation Framework for User Simulation in Multi-Turn Conversations
cs.CL · 2026-05-04 · unverdicted · none · ref 13
