A survey of LLM copyright protection that unifies text watermarking, model watermarking, and model fingerprinting while presenting new coverage of fingerprint transfer and removal.
Canonical reference
Title resolution pending
Canonical reference. 100% of citing Pith papers cite this work as background.
citation-role summary
citation-polarity summary
roles
background 5polarities
background 5representative citing papers
Fine-tuning 13 CodeLMs on a constructed CLB dataset with nine interaction types improves detection, with UniXcoder-base reaching F1 0.7407 and small models outperforming large ones.
ToxiShield delivers a real-time GitHub extension with a BERT toxicity detector at 98% accuracy, a Claude-based coach, and a fine-tuned Llama reframer at 95% style transfer accuracy, validated by a 10-person TAM study.
The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.
The paper introduces a new taxonomy for model merging methods and reviews their applications in LLMs, MLLMs, continual learning, multi-task learning, and other subfields while outlining open challenges.
A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark comparisons.
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.
citing papers explorer
-
Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends
A survey of LLM copyright protection that unifies text watermarking, model watermarking, and model fingerprinting while presenting new coverage of fingerprint transfer and removal.
-
Fine-Tuning Code Language Models to Detect Cross-Language Bugs
Fine-tuning 13 CodeLMs on a constructed CLB dataset with nine interaction types improves detection, with UniXcoder-base reaching F1 0.7407 and small models outperforming large ones.
-
ToxiShield: Promoting Inclusive Developer Communication through Real-Time Toxicity Filtering
ToxiShield delivers a real-time GitHub extension with a BERT toxicity detector at 98% accuracy, a Claude-based coach, and a fine-tuned Llama reframer at 95% style transfer accuracy, validated by a 10-person TAM study.
-
Hallucination of Multimodal Large Language Models: A Survey
The survey organizes causes of hallucinations in MLLMs, reviews evaluation benchmarks and metrics, and outlines mitigation approaches plus open questions.
-
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
The paper introduces a new taxonomy for model merging methods and reviews their applications in LLMs, MLLMs, continual learning, multi-task learning, and other subfields while outlining open challenges.
-
A Survey on Large Language Models for Code Generation
A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark comparisons.
-
Data-Centric Foundation Models in Computational Healthcare: A Survey
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.