Arriaga, and Adam Tauman Kalai

Aher, Gati, Rosa I · 2023 · arXiv 2208.10264

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: unclear 1

representative citing papers

Process Matters more than Output for Distinguishing Humans from Machines

cs.AI · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

A new battery of 30 cognitive tasks demonstrates that process-level behavioral features distinguish humans from frontier AI agents better than performance metrics (mean AUC 0.88); process-specific fine-tuning improves mimicry but shows limited cross-task transfer.

Language Model Goal Selection Differs from Humans' in a Self-Directed Learning Task

cs.CL · 2026-02-06 · unverdicted · novelty 6.0

LLMs diverge from human goal selection in self-directed learning by exploiting single solutions with low variability across instances.

DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow

cs.HC · 2025-09-16 · unverdicted · novelty 6.0

DoubleAgents shows that a distributed-cognition design with coordination agent, dashboard, and policy module increases user comfort and reliance on AI agents for coordination tasks over time.

Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators

cs.LG · 2024-04-06 · conditional · novelty 6.0

Length-controlled AlpacaEval applies regression adjustment to remove length bias from LLM auto-evaluations, raising Spearman correlation with Chatbot Arena from 0.94 to 0.98.

AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction

cs.CL · 2023-05-16 · unverdicted · novelty 6.0

LLM embeddings enable strong retrodiction of masked GSS opinions, as shown by cross-validation and external validation, but achieve only modest performance on entirely unasked opinions.

Beyond Inefficiency: Systemic Costs of Incivility in Multi-Agent Monte Carlo Simulations

cs.AI · 2026-05-12 · unverdicted · novelty 5.0

Monte Carlo simulations of LLM agents confirm that toxic debates take 25% longer to converge, with larger delays in smaller models, and show a first-mover advantage independent of toxicity.

Frame Entrepreneurs in an AI Agent Community: Concentrated Identity-Claim Production on Moltbook

cs.CY · 2026-04-29 · unverdicted · novelty 5.0 · 3 refs

In the Moltbook AI agent community, identity-claim production is highly concentrated among a few frame entrepreneurs, with event-driven attention not translating into broad claim-making.

AgentDynEx: Nudging the Mechanics and Dynamics of Multi-Agent Simulations

cs.MA · 2025-04-13 · unverdicted · novelty 5.0

AgentDynEx introduces nudging and a Configuration Matrix to help set up and maintain balanced mechanics and dynamics in multi-agent LLM simulations.

Avenir-UX: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding

cs.AI · 2026-02-25 · unverdicted · novelty 4.0

Avenir-UX automates web usability testing by using GUI-grounded simulation of user behavior to generate standardized reports with SUS, SEQ, and Think Aloud protocols.

citing papers explorer

Showing 9 of 9 citing papers.

Process Matters more than Output for Distinguishing Humans from Machines · cs.AI · 2026-05-07 · unverdicted · none · ref 37 · 2 links
Language Model Goal Selection Differs from Humans' in a Self-Directed Learning Task · cs.CL · 2026-02-06 · unverdicted · none · ref 1
DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow · cs.HC · 2025-09-16 · unverdicted · none · ref 2
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators · cs.LG · 2024-04-06 · conditional · none · ref 49
AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction · cs.CL · 2023-05-16 · unverdicted · none · ref 3
Beyond Inefficiency: Systemic Costs of Incivility in Multi-Agent Monte Carlo Simulations · cs.AI · 2026-05-12 · unverdicted · none · ref 1
Frame Entrepreneurs in an AI Agent Community: Concentrated Identity-Claim Production on Moltbook · cs.CY · 2026-04-29 · unverdicted · none · ref 1 · 3 links
AgentDynEx: Nudging the Mechanics and Dynamics of Multi-Agent Simulations · cs.MA · 2025-04-13 · unverdicted · none · ref 2
Avenir-UX: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding · cs.AI · 2026-02-25 · unverdicted · none · ref 1
