13 participants became convinced AI understands human values after chatbot interactions evaluated with the VAPT toolkit.
Does My Chatbot Have an Agenda? Understanding Human and
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: other (1)
polarities: unclear (1)
representative citing papers
PAPEL, a parent-AI collaborative system with four modules grounded in play scenes, led to more integrated parent utterances and increased parent-child conversational turns in a study of 16 dyads compared to a chatbot baseline.
Inverse Turing Bench: Evaluating Language Models as Judges of Human vs. AI Dialogue
Inverse Turing Bench evaluates LLMs on distinguishing human-human from human-AI dialogues, with GPTZero at 89.41%, Claude Opus-4.6 at 77.92%, and GPT-5.5 at 75.94% accuracy.