{"total":15,"items":[{"citing_arxiv_id":"2606.30857","ref_index":31,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Multilingual Polarization Detection Using Transformer-Based Models with Class Weighting and Threshold Tuning","primary_cat":"cs.CL","submitted_at":"2026-06-29T19:42:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"Transformer models with class weighting and threshold tuning achieve competitive F1 scores on three subtasks of multilingual polarization detection.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.08157","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Cross Paraphrastic Invariance Learning for Hallucination Detection","primary_cat":"cs.CL","submitted_at":"2026-06-06T13:13:12+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CPIL is a contrastive two-stage method that enforces paraphrase invariance on limited labeled data to outperform baselines in hallucination detection across 11 tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04306","ref_index":155,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Organizational Control Layer: Governance Infrastructure at the Execution Boundary of LLM Agent Systems","primary_cat":"cs.MA","submitted_at":"2026-06-03T00:25:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"OCL is a governance layer for LLM agents that cuts unsafe executions from 88% to near-zero and raises valid success from 12% to 96% in adversarial buyer-seller negotiations across frontier LLMs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.01825","ref_index":176,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ROGLE: Robust Global-Local Alignment with Automated Region Supervision for Text-Based Person Search","primary_cat":"cs.CV","submitted_at":"2026-06-01T07:41:44+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.26923","ref_index":44,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ClassEval-Pro: A Cross-Domain Benchmark for Class-Level Code Generation","primary_cat":"cs.SE","submitted_at":"2026-04-29T17:38:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ClassEval-Pro benchmark shows frontier LLMs achieve at most 45.6% Pass@1 on class-level code tasks, with logic errors (56%) and dependency errors (38%) as dominant failure modes.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"27850(2026). [42] Jason Wei and Kai Zou. 2019. EDA: Easy Data Augmentation Techniques for Boost- ing Performance on Text Classification Tasks.arXiv preprint arXiv:1901.11196 (2019). [43] Can Xu, Qingfeng Sun, Kai Zheng, et al. 2025. WizardLM: Empowering large pre-trained language models to follow complex instructions.arXiv preprint arXiv:2304.12244(2025). [44] Yisen Xu, Jinqiu Yang, and Tse-Hsun Chen. 2026. SWE-Refactor: A Repository- Level Benchmark for Real-World LLM-Based Code Refactoring.arXiv preprint arXiv:2602.03712(2026). doi:10.48550/arXiv.2602.03712 [45] An Yang, Anfeng Li, et al . 2025. Qwen3 Technical Report.arXiv preprint arXiv:2505.09388(2025). [46] Puyu Zeng, Zhaoxi Wang, Zhixu Duan, Liang Feng, Shaobo Wang, Cunxi-"},{"citing_arxiv_id":"2604.20168","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Duluth at SemEval-2026 Task 6: DeBERTa with LLM-Augmented Data for Unmasking Political Question Evasions","primary_cat":"cs.CL","submitted_at":"2026-04-22T04:18:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"DeBERTa-V3-base with focal loss, discourse features, and LLM-augmented data for minority classes achieves 0.76 Macro F1 on clarity-level classification of political QA pairs, ranking 8th in SemEval-2026 Task 6.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18759","ref_index":45,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Model-Agnostic Meta Learning for Class Imbalance Adaptation","primary_cat":"cs.CL","submitted_at":"2026-04-20T19:07:56+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"HAMR combines meta-learning with hardness-aware weighting and neighborhood resampling to improve minority-class performance on imbalanced NLP datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18539","ref_index":61,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Transition-Matrix Regularization for Next Dialogue Act Prediction in Counselling Conversations","primary_cat":"cs.CL","submitted_at":"2026-04-20T17:33:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"KL regularization aligning model predictions with empirical transition patterns improves macro-F1 by 9-42% in next dialogue act prediction on German counselling data and transfers to other datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02377","ref_index":160,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review","primary_cat":"cs.SE","submitted_at":"2026-04-01T06:25:55+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Systematic review of 80 papers shows TTP extraction shifting to transformer and LLM methods but limited by narrow datasets, single-label focus, and low reproducibility.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"P32 DECEPT-CTI: A Framework for Enhancing Cyber Deception Strategies through NLP-based Extraction of CTI from Unstructured Reports [128] P33 DEEPCAPA: Identifying Malicious Capabilities in Windows Malware [150] P34 Discovering attacker profiles using process mining and the MITRE ATT&CK taxonomy [121] P35 Enhanced small-scale APT knowledge graph embedding via spatio-temporal attribute reasoning and adversarial negative sampling [160] P36 Enhancements to Threat, Vulnerability, and Mitigation Knowledge for Cyber Analytics, Hunting, and Simula- tions [50] P37 Entity and relation extractions for threat intelligence knowledge graphs [96] P38 Evaluating Text Augmentation for Boosting the Automatic Mapping of Vulnerability Information to Adversary Techniques [48] P39 Explainable cyber threat behavior identification based on self-adversarial topic generation [46]"},{"citing_arxiv_id":"2604.03259","ref_index":34,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Pre-trained Models to Large Language Models: A Comprehensive Survey of AI-Driven Psychological Computing","primary_cat":"cs.CY","submitted_at":"2026-03-12T04:03:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The paper introduces a new taxonomy that groups AI-driven psychological computing tasks by their underlying computational patterns into four categories and reviews over 300 works from the pre-trained model to LLM eras.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.20657","ref_index":272,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Intelligent Agents with Emotional Intelligence: Current Trends, Challenges, and Future Prospects","primary_cat":"cs.HC","submitted_at":"2025-10-11T07:40:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A holistic survey of affective computing for intelligent agents covering emotion understanding via multimodal data, affective cognition, emotional expression synthesis, key challenges, and future directions emphasizing generative technologies.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"through pre-trained models, thereby enhancing the model's overall capacity for generalization. Traditional augmentation techniques, such as Easy Data Augmentation (EDA) [271], use synonym replacement, word insertion, deletion, and word shuffling. Experimental results indicate that these methods significantly en- hance the performance of deep learning architectures such as RoBERTa [272], particularly in low-resource scenarios, by enriching datasets with more diverse and balanced examples. Noise in data:Noise or irrelevant information that disrupts the recognition process can affect emotion recognition systems, especially those relying on sensors or video recordings. Noise may arise from poor light- ing, background interference in video data, or ambient noise in audio recordings."},{"citing_arxiv_id":"2508.15202","ref_index":22,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models","primary_cat":"cs.CL","submitted_at":"2025-08-21T03:31:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.16022","ref_index":99,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Enhancing LLMs for Identifying and Prioritizing Important Medical Jargons from Electronic Health Record Notes Utilizing Data Augmentation","primary_cat":"cs.CL","submitted_at":"2025-02-22T00:50:01+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"Fine-tuning and data augmentation improve LLM performance on medical jargon extraction and prioritization from EHR notes, with augmented open-source models sometimes outperforming closed-source ones on 106 annotated notes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2303.09014","ref_index":80,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ART: Automatic multi-step reasoning and tool-use for large language models","primary_cat":"cs.CL","submitted_at":"2023-03-16T01:04:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ART automatically generates multi-step reasoning programs with tool integration for LLMs, yielding substantial gains over few-shot and auto-CoT prompting on BigBench and MMLU while matching hand-crafted CoT on most tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2201.10005","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Text and Code Embeddings by Contrastive Pre-Training","primary_cat":"cs.CL","submitted_at":"2022-01-24T23:36:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Contrastive pre-training on unsupervised data at scale creates text and code embeddings that set new state-of-the-art results on classification and semantic search benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}