TruthfulQA: Measuring How Models Mimic Human Falsehoods

2021

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results

cs.DC · 2025-02-13 · unverdicted · novelty 5.0

AIvaluateXR benchmarks 17 LLMs across four XR platforms on performance, speed, memory and battery metrics and proposes a 3D Pareto optimality method to identify optimal on-device model-device pairs.

Exposing the Ghost in the Transformer: Abnormal Detection for Large Language Models via Hidden State Forensics

cs.CR · 2025-04-01 · unverdicted · novelty 3.0

A framework detects LLM anomalies including hallucinations, jailbreaks, and backdoors by forensic inspection of layer-wise hidden state patterns, reporting over 95% accuracy with minimal computational overhead.

citing papers explorer

Showing 2 of 2 citing papers.

AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results
cs.DC · 2025-02-13 · unverdicted · none · ref 53

Exposing the Ghost in the Transformer: Abnormal Detection for Large Language Models via Hidden State Forensics
cs.CR · 2025-04-01 · unverdicted · none · ref 25
