pith. sign in

super hub Mixed citations

HuggingFace's Transformers: State-of-the-art Natural Language Processing

Mixed citation behavior. Most common role is background (54%).

110 Pith papers citing it
Background 54% of classified citations
abstract

Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. \textit{Transformers} is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. \textit{Transformers} is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at \url{https://github.com/huggingface/transformers}.

hub tools

citation-role summary

background 14 method 8 other 4

citation-polarity summary

claims ledger

  • abstract Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. \textit{Transformers} is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the art Transformer architectures under a unified API. Backing this library is a curated collection of pretrain

authors

co-cited works

clear filters

representative citing papers

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

cs.AI · 2026-05-07 · unverdicted · novelty 8.0

VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.

Editing Models with Task Arithmetic

cs.LG · 2022-12-08 · accept · novelty 8.0

Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

Discovering Latent Knowledge in Language Models Without Supervision

cs.CL · 2022-12-07 · conditional · novelty 8.0

An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.

Test-Time Training Undermines Safety Guardrails

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

Test-time training enables three new threat models that raise jailbreak attack success rates on language models to averages of 95% and 93% ASR@10 under LoRA for few-shot and generation-phase attacks across model families.

Interference-Aware Multi-Task Unlearning

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

Introduces interference-aware multi-task unlearning with task-aware gradient projection and instance-level gradient orthogonalization, reducing interference scores by 30.3% and 52.9% on vision benchmarks.

Massive Activations in Large Language Models

cs.CL · 2024-02-27 · unverdicted · novelty 7.0

Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.

citing papers explorer

Showing 10 of 10 citing papers after filters.

  • EnergyAgentBench: Benchmarking LLM Agents on Live Energy Infrastructure Data econ.EM · 2026-05-13 · accept · none · ref 33 · internal anchor

    EnergyAgentBench is a new benchmark with 70 task variants that evaluates LLM agents on live energy data for datacenter siting, long-horizon optimization, and causal grid diagnosis.

  • RULER: What's the Real Context Size of Your Long-Context Language Models? cs.CL · 2024-04-09 · accept · none · ref 34 · internal anchor

    RULER shows most long-context LMs drop sharply in performance on complex tasks as length and difficulty increase, with only half maintaining results at 32K tokens.

  • Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling cs.CL · 2023-04-03 · accept · none · ref 241 · internal anchor

    Pythia releases 16 identically trained LLMs with full checkpoints and data tools to study training dynamics, scaling, memorization, and bias in language models.

  • Editing Models with Task Arithmetic cs.LG · 2022-12-08 · accept · none · ref 104 · internal anchor

    Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

  • Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks cs.CL · 2020-05-22 · accept · none · ref 70 · internal anchor

    RAG models set new state-of-the-art results on open-domain QA by retrieving Wikipedia passages and conditioning a generative model on them, while also producing more factual text than parametric baselines.

  • Zephyr: Direct Distillation of LM Alignment cs.LG · 2023-10-25 · accept · none · ref 34 · internal anchor

    Zephyr-7B achieves state-of-the-art chat benchmark results among 7B models by distilling alignment via dDPO on AI feedback preferences, surpassing the 70B Llama-2-Chat model on MT-Bench with no human data required.

  • Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection cs.CV · 2023-03-09 · accept · none · ref 49 · internal anchor

    Grounding DINO fuses language and vision via feature enhancer, language-guided query selection, and cross-modality decoder in a DINO backbone, achieving 52.5 AP zero-shot on COCO and a new record of 26.1 AP mean on ODinW.

  • Scaling Laws and Interpretability of Learning from Repeated Data cs.LG · 2022-05-21 · accept · none · ref 11 · internal anchor

    Repeating 0.1% of training data 100 times degrades an 800M parameter model's performance to that of a 400M model by damaging copying mechanisms and induction heads associated with generalization.

  • Ranking Reasoning LLMs under Test-Time Scaling cs.LG · 2026-03-11 · accept · none · ref 9 · internal anchor

    Many established statistical ranking techniques produce orderings of reasoning LLMs under test-time scaling that closely match a Bayesian gold standard, with mean Kendall tau_b of 0.93-0.95 at full trials and best methods reaching 0.86 at single trials.

  • Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities cs.LG · 2024-08-14 · accept · none · ref 251 · internal anchor

    The paper introduces a new taxonomy for model merging methods and reviews their applications in LLMs, MLLMs, continual learning, multi-task learning, and other subfields while outlining open challenges.