arXiv preprint arXiv:1901.11196

Jason Wei, Kai Zou · 2019 · arXiv 1901.11196

16 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

ClassEval-Pro: A Cross-Domain Benchmark for Class-Level Code Generation

cs.SE · 2026-04-29 · unverdicted · novelty 7.0

ClassEval-Pro benchmark shows frontier LLMs achieve at most 45.6% Pass@1 on class-level code tasks, with logic errors (56%) and dependency errors (38%) as dominant failure modes.

Transition-Matrix Regularization for Next Dialogue Act Prediction in Counselling Conversations

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

KL regularization aligning model predictions with empirical transition patterns improves macro-F1 by 9-42% in next dialogue act prediction on German counselling data and transfers to other datasets.

Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

SDBN introduces adversarial training to PEFT via two variants using character-level edits and LLM-generated perturbations, claiming improved robustness and generalization on NLP benchmarks in low-resource noisy settings.

Organizational Control Layer: Governance Infrastructure at the Execution Boundary of LLM Agent Systems

cs.MA · 2026-06-03 · unverdicted · novelty 6.0

OCL is a governance layer for LLM agents that cuts unsafe executions from 88% to near-zero and raises valid success from 12% to 96% in adversarial buyer-seller negotiations across frontier LLMs.

From Pre-trained Models to Large Language Models: A Comprehensive Survey of AI-Driven Psychological Computing

cs.CY · 2026-03-12 · unverdicted · novelty 6.0

The paper introduces a new taxonomy that groups AI-driven psychological computing tasks by their underlying computational patterns into four categories and reviews over 300 works from the pre-trained model to LLM eras.

Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.

ART: Automatic multi-step reasoning and tool-use for large language models

cs.CL · 2023-03-16 · unverdicted · novelty 6.0

ART automatically generates multi-step reasoning programs with tool integration for LLMs, yielding substantial gains over few-shot and auto-CoT prompting on BigBench and MMLU while matching hand-crafted CoT on most tasks.

Text and Code Embeddings by Contrastive Pre-Training

cs.CL · 2022-01-24 · unverdicted · novelty 6.0

Contrastive pre-training on unsupervised data at scale creates text and code embeddings that set new state-of-the-art results on classification and semantic search benchmarks.

Cross Paraphrastic Invariance Learning for Hallucination Detection

cs.CL · 2026-06-06 · unverdicted · novelty 5.0

CPIL is a contrastive two-stage method that enforces paraphrase invariance on limited labeled data to outperform baselines in hallucination detection across 11 tasks.

ROGLE: Robust Global-Local Alignment with Automated Region Supervision for Text-Based Person Search

cs.CV · 2026-06-01 · unverdicted · novelty 5.0 · 2 refs

ROGLE introduces automated pseudo region-sentence pairs via RSM and multi-granular learning to boost fine-grained alignment in text-based person search, plus the P-VLG benchmark with over 100k annotated regions.

Model-Agnostic Meta Learning for Class Imbalance Adaptation

cs.CL · 2026-04-20 · conditional · novelty 5.0

HAMR combines meta-learning with hardness-aware weighting and neighborhood resampling to improve minority-class performance on imbalanced NLP datasets.

What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review

cs.SE · 2026-04-01 · accept · novelty 5.0

Systematic review of 80 papers shows TTP extraction shifting to transformer and LLM methods but limited by narrow datasets, single-label focus, and low reproducibility.

Duluth at SemEval-2026 Task 6: DeBERTa with LLM-Augmented Data for Unmasking Political Question Evasions

cs.CL · 2026-04-22 · unverdicted · novelty 3.0

DeBERTa-V3-base with focal loss, discourse features, and LLM-augmented data for minority classes achieves 0.76 Macro F1 on clarity-level classification of political QA pairs, ranking 8th in SemEval-2026 Task 6.

Enhancing LLMs for Identifying and Prioritizing Important Medical Jargons from Electronic Health Record Notes Utilizing Data Augmentation

cs.CL · 2025-02-22 · unverdicted · novelty 3.0

Fine-tuning and data augmentation improve LLM performance on medical jargon extraction and prioritization from EHR notes, with augmented open-source models sometimes outperforming closed-source ones on 106 annotated notes.

Multilingual Polarization Detection Using Transformer-Based Models with Class Weighting and Threshold Tuning

cs.CL · 2026-06-29 · unverdicted · novelty 2.0

Transformer models with class weighting and threshold tuning achieve competitive F1 scores on three subtasks of multilingual polarization detection.

Intelligent Agents with Emotional Intelligence: Current Trends, Challenges, and Future Prospects

cs.HC · 2025-10-11 · unverdicted · novelty 2.0

A holistic survey of affective computing for intelligent agents covering emotion understanding via multimodal data, affective cognition, emotional expression synthesis, key challenges, and future directions emphasizing generative technologies.

citing papers explorer

Showing 9 of 9 citing papers after filters.

Transition-Matrix Regularization for Next Dialogue Act Prediction in Counselling Conversations

cs.CL · 2026-04-20 · unverdicted · none · ref 61

KL regularization aligning model predictions with empirical transition patterns improves macro-F1 by 9-42% in next dialogue act prediction on German counselling data and transfers to other datasets.

Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning

cs.CL · 2026-06-09 · unverdicted · none · ref 36

SDBN introduces adversarial training to PEFT via two variants using character-level edits and LLM-generated perturbations, claiming improved robustness and generalization on NLP benchmarks in low-resource noisy settings.

Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

cs.CL · 2025-08-21 · unverdicted · none · ref 22

Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.

ART: Automatic multi-step reasoning and tool-use for large language models

cs.CL · 2023-03-16 · unverdicted · none · ref 80

ART automatically generates multi-step reasoning programs with tool integration for LLMs, yielding substantial gains over few-shot and auto-CoT prompting on BigBench and MMLU while matching hand-crafted CoT on most tasks.

Text and Code Embeddings by Contrastive Pre-Training

cs.CL · 2022-01-24 · unverdicted · none · ref 26

Contrastive pre-training on unsupervised data at scale creates text and code embeddings that set new state-of-the-art results on classification and semantic search benchmarks.

Cross Paraphrastic Invariance Learning for Hallucination Detection

cs.CL · 2026-06-06 · unverdicted · none · ref 26

CPIL is a contrastive two-stage method that enforces paraphrase invariance on limited labeled data to outperform baselines in hallucination detection across 11 tasks.

Duluth at SemEval-2026 Task 6: DeBERTa with LLM-Augmented Data for Unmasking Political Question Evasions

cs.CL · 2026-04-22 · unverdicted · none · ref 2

DeBERTa-V3-base with focal loss, discourse features, and LLM-augmented data for minority classes achieves 0.76 Macro F1 on clarity-level classification of political QA pairs, ranking 8th in SemEval-2026 Task 6.

Enhancing LLMs for Identifying and Prioritizing Important Medical Jargons from Electronic Health Record Notes Utilizing Data Augmentation

cs.CL · 2025-02-22 · unverdicted · none · ref 99

Fine-tuning and data augmentation improve LLM performance on medical jargon extraction and prioritization from EHR notes, with augmented open-source models sometimes outperforming closed-source ones on 106 annotated notes.

Multilingual Polarization Detection Using Transformer-Based Models with Class Weighting and Threshold Tuning

cs.CL · 2026-06-29 · unverdicted · none · ref 31

Transformer models with class weighting and threshold tuning achieve competitive F1 scores on three subtasks of multilingual polarization detection.
