Maas, Raymond E

Andrew L · 2011

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Detecting Pretraining Data from Large Language Models

cs.CL · 2023-10-25 · conditional · novelty 7.0

Min-K% Prob detects pretraining data in LLMs by flagging outlier low-probability words in text, achieving 7.4% better performance than prior methods on the new WIKIMIA benchmark.

Steering Language Models With Activation Engineering

cs.CL · 2023-08-20 · unverdicted · novelty 7.0

Activation Addition steers language models by adding contrastive activation vectors from prompt pairs to control high-level properties like sentiment and toxicity at inference time without training.

Linear Representations of Hierarchical Concepts in Language Models

cs.CL · 2026-04-09 · unverdicted · novelty 6.0

Language models encode concept hierarchies as linear transformations that are domain-specific yet structurally similar across domains.

NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

cs.CL · 2024-05-27 · accept · novelty 6.0

NV-Embed achieves first place on the MTEB leaderboard across 56 tasks by combining a latent attention layer, causal-mask removal, two-stage contrastive training, and data curation for LLM-based embedding models.

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

cs.LG · 2023-04-13 · unverdicted · novelty 5.0

RAFT aligns generative models by ranking samples with a reward model and fine-tuning only on the top-ranked outputs, reporting gains on reward scores and automated metrics for LLMs and diffusion models.

Similarity-Distance-Magnitude Activations

cs.LG · 2025-09-16

citing papers explorer

Detecting Pretraining Data from Large Language Models · cs.CL · 2023-10-25 · conditional · none · ref 35
Steering Language Models With Activation Engineering · cs.CL · 2023-08-20 · unverdicted · none · ref 29
Linear Representations of Hierarchical Concepts in Language Models · cs.CL · 2026-04-09 · unverdicted · none · ref 19
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models · cs.CL · 2024-05-27 · accept · none · ref 39
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment · cs.LG · 2023-04-13 · unverdicted · none · ref 36
Similarity-Distance-Magnitude Activations · cs.LG · 2025-09-16 · unreviewed · ref 20
