The study decomposes memorization risks in code LLMs into unintentional and malicious disclosure, demonstrates assessment methods on OLMo models and Dolma data, and finds that data changes affect risks differently depending on sensitive information type.
Automated detection of password leakage from public GitHub repositories
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CR (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Malicious and Unintentional Disclosure Risks in Large Language Models for Code Generation