C-Mining automatically mines high-fidelity Culture Points from raw multilingual text by treating cross-lingual geometric isolation in embeddings as a quantifiable signal for cultural specificity, then uses them to synthesize better instruction data.
Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
5
Pith papers citing it
citation-role summary
background 1
dataset 1
citation-polarity summary
representative citing papers
Code language models show no transferable security understanding from code diffs alone, rely on commit messages, miss over 93% of fixes at 0.5% false positive rate, and suffer large drops under group or temporal splits.
PAFT improves LLM-based program repair pass rates by up to 65.6% while cutting average edit distance by up to 32.6% through explicit preservation signals and curriculum training.
Search-R3 trains LLMs to output search embeddings as a direct product of step-by-step reasoning via supervised pre-training and a specialized RL environment that avoids full corpus re-encoding.