Large language models for data annotation and synthesis: A survey

Tan, Z · 2024 · arXiv 2402.13446

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 1 dataset 1 method 1

citation-polarity summary

background 2 use method 1

representative citing papers

Retrieval-Augmented Large Language Models for Evidence-Informed Guidance on Cannabidiol Use in Older Adults

cs.IR · 2026-01-16 · unverdicted · novelty 7.0

Retrieval-augmented LLMs produce more cautious and guideline-aligned recommendations on cannabidiol for older adults than standalone models, demonstrated via automated evaluation on 64 diverse scenarios.

CAPC-CG: A Large-Scale, Expert-Directed LLM-Annotated Corpus of Adaptive Policy Communication in China

cs.CL · 2025-10-10 · accept · novelty 7.0 · 2 refs

Releases the first large-scale expert-annotated corpus of Chinese central government policy directives (1949-2023) labeled with a five-color clear/ambiguous taxonomy and high inter-annotator reliability.

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

cs.CR · 2026-05-11 · conditional · novelty 6.0

Generative AI enables scalable, context-aware spear phishing by extracting profiles from public social media, producing emails that outperform real-world phishing samples in personalization and lower recipient suspicion.

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

cs.CR · 2026-05-07 · unverdicted · novelty 6.0

LLM agents can reconstruct high-fidelity personal profiles from minimal PII seeds with over 90% accuracy in under 10 minutes at less than $3 cost, exposing three escalating tiers of privacy risks.

Can We Trust a Black-box LLM? LLM Untrustworthy Boundary Detection via Bias-Diffusion and Multi-Agent Reinforcement Learning

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

GMRL-BD detects untrustworthy topic boundaries for black-box LLMs by combining bias-diffusion on a Wikipedia KG with multi-agent RL, supported by a released dataset labeling biases in models like Llama2 and Qwen2.

A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation

cs.CV · 2025-03-03 · unverdicted · novelty 6.0

A new leaf-instance dataset for soybean-cotton detection and segmentation collected across growth stages and conditions from commercial farms is presented and validated with YOLOv11.

Rethinking Math Reasoning Evaluation: A Robust LLM-as-a-Judge Framework Beyond Symbolic Rigidity

cs.AI · 2026-04-24 · unverdicted · novelty 5.0

An LLM-as-a-judge evaluation framework for math reasoning outperforms symbolic methods by accurately assessing diverse answer representations and formats.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

cs.AI · 2025-01-16 · unverdicted · novelty 3.0

The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.

citing papers explorer

Showing 9 of 9 citing papers.

Retrieval-Augmented Large Language Models for Evidence-Informed Guidance on Cannabidiol Use in Older Adults cs.IR · 2026-01-16 · unverdicted · none · ref 73
Retrieval-augmented LLMs produce more cautious and guideline-aligned recommendations on cannabidiol for older adults than standalone models, demonstrated via automated evaluation on 64 diverse scenarios.
CAPC-CG: A Large-Scale, Expert-Directed LLM-Annotated Corpus of Adaptive Policy Communication in China cs.CL · 2025-10-10 · accept · none · ref 1 · 2 links
Releases the first large-scale expert-annotated corpus of Chinese central government policy directives (1949-2023) labeled with a five-color clear/ambiguous taxonomy and high inter-annotator reliability.
Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data cs.CR · 2026-05-11 · conditional · none · ref 84
Generative AI enables scalable, context-aware spear phishing by extracting profiles from public social media, producing emails that outperform real-world phishing samples in personalization and lower recipient suspicion.
Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents cs.CR · 2026-05-07 · unverdicted · none · ref 31
LLM agents can reconstruct high-fidelity personal profiles from minimal PII seeds with over 90% accuracy in under 10 minutes at less than $3 cost, exposing three escalating tiers of privacy risks.
Can We Trust a Black-box LLM? LLM Untrustworthy Boundary Detection via Bias-Diffusion and Multi-Agent Reinforcement Learning cs.AI · 2026-04-07 · unverdicted · none · ref 35
GMRL-BD detects untrustworthy topic boundaries for black-box LLMs by combining bias-diffusion on a Wikipedia KG with multi-agent RL, supported by a released dataset labeling biases in models like Llama2 and Qwen2.
A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation cs.CV · 2025-03-03 · unverdicted · none · ref 22
A new leaf-instance dataset for soybean-cotton detection and segmentation collected across growth stages and conditions from commercial farms is presented and validated with YOLOv11.
Rethinking Math Reasoning Evaluation: A Robust LLM-as-a-Judge Framework Beyond Symbolic Rigidity cs.AI · 2026-04-24 · unverdicted · none · ref 25
An LLM-as-a-judge evaluation framework for math reasoning outperforms symbolic methods by accurately assessing diverse answer representations and formats.
A Survey of Scaling in Large Language Model Reasoning cs.AI · 2025-04-02 · unverdicted · none · ref 191
A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models cs.AI · 2025-01-16 · unverdicted · none · ref 144
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.

Large language models for data annotation and synthesis: A survey

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer