2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- LLMs for automatic annotation of Mandarin narrative transcripts
  LLMs reach near-human agreement (κ = 0.794 vs. human-human κ = 0.872) when annotating Mandarin narrative macrostructure with the MAIN framework and cut annotation time by 65 percent, but they are less reliable on young adult narratives, which show greater lexical variation.
- Annotation Quality in Aspect-Based Sentiment Analysis: A Case Study Comparing Experts, Students, Crowdworkers, and Large Language Model
  Expert re-annotations of a German ABSA dataset serve as ground truth for evaluating how annotations from students, crowdworkers, and LLMs affect inter-annotator agreement and downstream performance on the ACSA and TASD tasks with BERT, T5, and LLaMA models.