arXiv preprint arXiv:2310.19596, 2023

Ruoyu Zhang, Yanzeng Li, Yongliang Ma, Ming Zhou, Lei Zou · 2023 · arXiv 2310.19596

5 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation

cs.CL · 2025-11-03 · unverdicted · novelty 7.0

LLMs produce stable cognitive distortion labels that improve downstream model performance, paired with a kappa-based framework for dataset-agnostic evaluation in subjective NLP tasks.

A Scalable Tool for Measuring Manner and Result Verbs in Developmental Language Research

cs.CL · 2026-05-15 · conditional · novelty 6.0

A RoBERTa classifier trained on LLM-generated manner/result verb annotations from extended VerbNet data reaches up to 89.6% accuracy on held-out gold-standard sets.

Can We Trust a Black-box LLM? LLM Untrustworthy Boundary Detection via Bias-Diffusion and Multi-Agent Reinforcement Learning

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

GMRL-BD detects untrustworthy topic boundaries for black-box LLMs by combining bias-diffusion on a Wikipedia KG with multi-agent RL, supported by a released dataset labeling biases in models like Llama2 and Qwen2.

Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

cs.LG · 2026-03-28 · unverdicted · novelty 5.0

EXPONA improves automated data labeling by exploring multi-level label functions and applying reliability filters, achieving up to 98.9% coverage and 46% gains in downstream weighted F1 on eleven datasets.

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

cs.CL · 2024-12-07 · accept · novelty 3.0

A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.

citing papers explorer

Showing 5 of 5 citing papers.

Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation
cs.CL · 2025-11-03 · unverdicted · none · ref 30
A Scalable Tool for Measuring Manner and Result Verbs in Developmental Language Research
cs.CL · 2026-05-15 · conditional · none · ref 54
Can We Trust a Black-box LLM? LLM Untrustworthy Boundary Detection via Bias-Diffusion and Multi-Agent Reinforcement Learning
cs.AI · 2026-04-07 · unverdicted · none · ref 42
Structured Exploration and Exploitation of Label Functions for Automated Data Annotation
cs.LG · 2026-03-28 · unverdicted · none · ref 24
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
cs.CL · 2024-12-07 · accept · none · ref 293
