GPT-4 Technical Report

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. · 2023

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

BARISTA: A Multi-Task Egocentric Benchmark for Compositional Visual Understanding

cs.CV · 2026-05-12 · conditional · novelty 7.0

BARISTA introduces a densely annotated egocentric coffee-preparation video dataset and multi-task benchmark that reveals performance variation across models on compositional visual tasks.

BEAVER: An Efficient Deterministic LLM Verifier

cs.AI · 2025-12-05 · unverdicted · novelty 7.0

BEAVER is the first practical deterministic verifier that maintains sound probability bounds on LLM safety properties using token tries and frontier data structures, finding 2-3x more violations than sampling at 1/10 the compute.

Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection

cs.CL · 2026-05-09 · unverdicted · novelty 5.0 · 2 refs

A lightweight max-pooling network with MLP detects LLM hallucinations competitively without semantic consistency computations by adaptively aggregating internal token features.

citing papers explorer

Showing 3 of 3 citing papers.

BARISTA: A Multi-Task Egocentric Benchmark for Compositional Visual Understanding

cs.CV · 2026-05-12 · conditional · none · ref 19

BEAVER: An Efficient Deterministic LLM Verifier

cs.AI · 2025-12-05 · unverdicted · none · ref 1

Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection

cs.CL · 2026-05-09 · unverdicted · none · ref 1 · 2 links
