pith. sign in

mega hub Canonical reference

LLaMA: Open and Efficient Foundation Language Models

Canonical reference. 82% of citing Pith papers cite this work as background.

1042 Pith papers citing it
Background 82% of classified citations
abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

hub tools

citation-role summary

background 206 method 19 baseline 8 other 6 dataset 1 extension 1

citation-polarity summary

claims ledger

  • abstract We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

mega hub controls

Recognition alignment

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

co-cited works

clear filters

representative citing papers

Privacy Auditing with Zero (0) Training Run

cs.CR · 2026-05-14 · unverdicted · novelty 8.0

Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.

Backdoor Attacks on Decentralised Post-Training

cs.CR · 2026-03-31 · conditional · novelty 8.0 · 2 refs

An adversary controlling an intermediate pipeline stage in decentralized LLM post-training can inject a backdoor that reduces alignment from 80% to 6%, with the backdoor persisting in 60% of cases even after subsequent safety training.

BEAVER: An Enterprise Benchmark for Text-to-SQL

cs.CL · 2024-09-03 · unverdicted · novelty 8.0

BEAVER is the first text-to-SQL benchmark from private enterprise data warehouses, revealing SOTA agentic frameworks achieve only 10.8% accuracy on complex real-world queries.

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model using input-dependent selective SSMs that achieves SOTA results across modalities and matches twice-larger Transformers on language modeling with 5x higher inference throughput.

Instruction Tuning with GPT-4

cs.CL · 2023-04-06 · unverdicted · novelty 8.0

GPT-4-generated instruction data produces superior zero-shot performance in finetuned LLaMA models versus prior state-of-the-art data.

End-to-End Text Line Detection and Ordering

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

Orli is an autoregressive image-to-sequence model that jointly detects text lines and determines their reading order on historical documents via chord-frame baselines, trained on 196k pages across ten scripts.

RWGBench: Evaluating Scholarly Positioning in Related Work Generation

cs.DL · 2026-05-30 · unverdicted · novelty 7.0

RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.

citing papers explorer

Showing 8 of 8 citing papers after filters.