Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
hub
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
11 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.
AnyBand-Diff is a spectral-prior-guided diffusion model that unifies remote sensing image generation and band repair while maintaining radiometric fidelity through physics-guided sampling and multi-scale losses.
UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
LLaMA-Adapter turns frozen LLaMA 7B into a capable instruction follower using only 1.2M new parameters and zero-init attention, matching Alpaca while extending to image-conditioned reasoning on ScienceQA and COCO.
ECMRNet is a continual-learning restoration network that decomposes features into isolated groups, expands new groups for novel degradations, prunes via structural entropy, and mines historical components for compound degradations in open-world TIR imaging.
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
LanguageBind aligns video, infrared, depth, and audio to a frozen language encoder via contrastive learning on the new VIDAL-10M dataset, extending video-language pretraining to N modalities.
Recursive paraphrasing attacks substantially lower detection rates for multiple AI text detectors with only minor quality loss, while a theoretical analysis ties best-case AUROC to total variation distance between human and AI distributions.
RAFT aligns generative models by ranking samples with a reward model and fine-tuning only on the top-ranked outputs, reporting gains on reward scores and automated metrics for LLMs and diffusion models.
citing papers explorer
-
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.
-
Functionalization via Structure Completion and Motion Rectification
Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.
-
AnyBand-Diff: A Unified Remote Sensing Image Generation and Band Repair Framework with Spectral Priors
AnyBand-Diff is a spectral-prior-guided diffusion model that unifies remote sensing image generation and band repair while maintaining radiometric fidelity through physics-guided sampling and multi-scale losses.
-
Learning Interactive Real-World Simulators
UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.
-
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
-
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
LLaMA-Adapter turns frozen LLaMA 7B into a capable instruction follower using only 1.2M new parameters and zero-init attention, matching Alpaca while extending to image-conditioned reasoning on ScienceQA and COCO.
-
Expandable, Compressible, Mineable: Open-World Thermal Image Restoration
ECMRNet is a continual-learning restoration network that decomposes features into isolated groups, expands new groups for novel degradations, prunes via structural entropy, and mines historical components for compound degradations in open-world TIR imaging.
-
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
-
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
LanguageBind aligns video, infrared, depth, and audio to a frozen language encoder via contrastive learning on the new VIDAL-10M dataset, extending video-language pretraining to N modalities.
-
Can AI-Generated Text be Reliably Detected?
Recursive paraphrasing attacks substantially lower detection rates for multiple AI text detectors with only minor quality loss, while a theoretical analysis ties best-case AUROC to total variation distance between human and AI distributions.
-
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
RAFT aligns generative models by ranking samples with a reward model and fine-tuning only on the top-ranked outputs, reporting gains on reward scores and automated metrics for LLMs and diffusion models.