GPT - N eo X -20 B : An open-source autoregressive language model

Black, Sidney, Biderman, Stella, Hallahan, Eric, Anthony, Quentin, Gao, Leo, Golding, Laurence · 2022 · DOI 10.18653/v1/2022.bigscience-1.9

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

open at publisher browse 7 citing papers

citation-role summary

method 2 background 1

citation-polarity summary

use method 2 background 1

representative citing papers

Provable Joint Decontamination for Benchmarking Multiple Large Language Models

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

JECS aggregates per-model conformal p-values via their maximum and reconstructs a conservative envelope of the max-p null distribution to select benchmarks with global contamination rate control.

Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

Unpack decomposes transformer credit via a unified backward recursion on the φ(S)U template, recovering known IOI circuits with mode labels and showing consistent duplicate-name suppression across Pythia scales from a single forward pass.

A Bitter Lesson for Data Filtering

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

With enough compute, large models benefit from training on unfiltered data that includes low-quality and distractor examples instead of requiring high-quality filtered data.

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets

cs.AI · 2023-10-10 · unverdicted · novelty 6.0

At sufficient scale, LLMs linearly represent the truth value of factual statements, as shown by visualizations, cross-dataset generalization, and causal interventions that flip truth judgments.

BloombergGPT: A Large Language Model for Finance

cs.LG · 2023-03-30 · conditional · novelty 6.0

BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

cs.CL · 2023-03-15 · unverdicted · novelty 6.0

SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation tasks.

Sessa: Selective State Space Attention

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

Sessa integrates attention within recurrent paths to achieve power-law memory tails and flexible non-decaying selective retrieval, outperforming baselines on long-context tasks.

citing papers explorer

Showing 7 of 7 citing papers.

Provable Joint Decontamination for Benchmarking Multiple Large Language Models cs.LG · 2026-05-20 · unverdicted · none · ref 125
JECS aggregates per-model conformal p-values via their maximum and reconstructs a conservative envelope of the max-p null distribution to select benchmarks with global contamination rate control.
Every Component is a Lookup: Token Attribution and Composition from a Single Decomposition cs.LG · 2026-05-22 · unverdicted · none · ref 14
Unpack decomposes transformer credit via a unified backward recursion on the φ(S)U template, recovering known IOI circuits with mode labels and showing consistent duplicate-name suppression across Pythia scales from a single forward pass.
A Bitter Lesson for Data Filtering cs.LG · 2026-05-19 · unverdicted · none · ref 12
With enough compute, large models benefit from training on unfiltered data that includes low-quality and distractor examples instead of requiring high-quality filtered data.
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets cs.AI · 2023-10-10 · unverdicted · none · ref 41
At sufficient scale, LLMs linearly represent the truth value of factual statements, as shown by visualizations, cross-dataset generalization, and causal interventions that flip truth judgments.
BloombergGPT: A Large Language Model for Finance cs.LG · 2023-03-30 · conditional · none · ref 9
BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models cs.CL · 2023-03-15 · unverdicted · none · ref 3
SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation tasks.
Sessa: Selective State Space Attention cs.LG · 2026-04-20 · unverdicted · none · ref 35
Sessa integrates attention within recurrent paths to achieve power-law memory tails and flexible non-decaying selective retrieval, outperforming baselines on long-context tasks.

GPT - N eo X -20 B : An open-source autoregressive language model

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer