Computing inter-rater reliability and its variance in the presence of high agreement · 2008

2 Pith papers cite this work.

representative citing papers

WASIL: In-the-Wild Arabic Spoken Interactions with LLMs

cs.SD · 2026-05-09 · accept · novelty 7.0

WASIL is a released dataset of 8,529 in-the-wild Arabic spoken LLM interactions with audio, ASR hypotheses, responses, explicit like/dislike feedback, answerability annotations, a 2,000-turn MSA and dialect test set, and a reference-free multi-judge LLM evaluation method.

Linking Behaviour and Perception to Evaluate Meaningful Human Control over Partially Automated Driving

cs.HC · 2026-05-01 · unverdicted · novelty 5.0

Simulator experiments revealed correlations between steering conflicts, reaction times, and drivers' sense of control in partial automation, highlighting design needs for better intention alignment and intervention ease.

citing papers explorer

Showing 2 of 2 citing papers.

WASIL: In-the-Wild Arabic Spoken Interactions with LLMs · cs.SD · 2026-05-09 · accept · none · ref 46
Linking Behaviour and Perception to Evaluate Meaningful Human Control over Partially Automated Driving · cs.HC · 2026-05-01 · unverdicted · none · ref 37
