Hybrid vector-search plus fingerprinting pipeline for LLM code provenance achieves Winnowing-level MRR on short snippets and up to 5.4% better on longer ones at logarithmic query time.
Plagbench: Exploring the duality of large language models in plagiarism generation and detection. CoRR, abs/2406.16288, 2024.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Efficient and Scalable Provenance Tracking for LLM-Generated Code Snippets