Contrastive pair presentations yield exact identifiability characterizations via a geometric refinement of Angluin's condition, a new contrastive closure dimension for generation, mutual incomparability with text identification, and a single algorithm that tolerates any finite corruption budget.
30th USENIX security symposium (USENIX Security 21) , pages=
8 Pith papers cite this work. Polarity classification is still indexing.
years
2026 8representative citing papers
LLMs copy biased analyst ratings in investment decisions but a new detection method encourages independent reasoning and can improve stock return predictions beyond human levels.
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
A test-driven pipeline with an auto-constructed privacy feature library detects 2.56 times more confirmed privacy leaks in LLM-based code generation than existing baselines.
ZCP detects direct and evasive data contamination in LLMs by truncating CoT reasoning and contrasting zero-CoT accuracy on original versus perturbed isomorphic datasets, plus a Contamination Confidence metric.
BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.
ALDEN boosts private data extraction rates from RAG systems by combining active learning for query diversification with dynamic estimation of the underlying knowledge-base topic distribution.
citing papers explorer
-
Contrastive Identification and Generation in the Limit
Contrastive pair presentations yield exact identifiability characterizations via a geometric refinement of Angluin's condition, a new contrastive closure dimension for generation, mutual incomparability with text identification, and a single algorithm that tolerates any finite corruption budget.
-
Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain
LLMs copy biased analyst ratings in investment decisions but a new detection method encourages independent reasoning and can improve stock return predictions beyond human levels.
-
Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
-
Probing Privacy Leaks in LLM-based Code Generation via Test Generation
A test-driven pipeline with an auto-constructed privacy feature library detects 2.56 times more confirmed privacy leaks in LLM-based code generation than existing baselines.
-
The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation
ZCP detects direct and evasive data contamination in LLMs by truncating CoT reasoning and contrasting zero-CoT accuracy on original versus perturbed isomorphic datasets, plus a Contamination Confidence metric.
-
Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective
BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.
-
ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation
ALDEN boosts private data extraction rates from RAG systems by combining active learning for query diversification with dynamic estimation of the underlying knowledge-base topic distribution.
- Less is More: Geometric Unlearning for LLMs with Minimal Data Disclosure