
The Llama 3 herd of models. arXiv e-prints, pages arXiv–2407

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. · 2024

15 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background: 1 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

STRABLE: Benchmarking Tabular Machine Learning with Strings

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.

When RL Meets Adaptive Speculative Training: A Unified Training-Serving System

cs.LG · 2026-02-06 · conditional · novelty 7.0

Aurora unifies speculative decoder training and serving via asynchronous RL on inference traces, delivering 1.5x day-0 speedup on frontier models and 1.25x adaptation gains on distribution shifts.

Scaling Latent Reasoning via Looped Language Models

cs.CL · 2025-10-29 · unverdicted · novelty 7.0

Looped language models with latent iterative computation and entropy-regularized depth allocation achieve performance matching up to 12B standard LLMs through superior knowledge manipulation.

SynBench: A Benchmark for Differentially Private Text Generation

cs.AI · 2025-09-18 · conditional · novelty 7.0

SynBench benchmarks DP text generators across nine datasets and uses a new MIA to show that public pre-training on portions of private data overestimates synthetic text quality and breaks DP privacy bounds.

Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning

cs.CL · 2026-05-20 · unverdicted · novelty 6.0

Introduces a distributional alignment metric, d_NTP, and a linear regression method for task vectors, LTV, that improves accuracy by 9.2% over baselines on classification and regression tasks across multiple LLMs.

The Power of Order: Fooling LLMs with Adversarial Table Permutations

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

Semantically invariant row and column permutations in tables can cause LLMs to output incorrect answers, and a gradient-based attack called ATP efficiently finds such permutations that degrade performance across many models.

Neuro-Symbolic Proof Generation for Scaling Systems Software Verification

cs.AI · 2026-03-20 · conditional · novelty 6.0

A neuro-symbolic system using LLM-guided best-first search and Isabelle tools proves up to 77.6% of theorems on the seL4 benchmark, outperforming prior LLM methods and Sledgehammer.

FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation

cs.CR · 2026-03-10 · unverdicted · novelty 6.0

FlexServe achieves up to 10x faster time-to-first-token than standard approaches for secure LLM inference on mobile devices by using flexible resource isolation in TrustZone.

veScale-FSDP: Flexible and High-Performance FSDP at Scale

cs.DC · 2026-02-25 · unverdicted · novelty 6.0

veScale-FSDP uses RaggedShard and structure-aware planning to support block-wise quantization and non-element-wise optimizers while delivering 5-66% higher throughput and 16-30% lower memory than prior FSDP systems at massive scale.

Scalable Generation and Validation of Isomorphic Physics Problems with GenAI

cs.CY · 2026-02-04 · unverdicted · novelty 6.0

A GenAI framework generates isomorphic physics problem banks via prompt chaining and validates them with 17 language models that correlate with student performance (ρ up to 0.594), achieving homogeneous difficulty in 73% of deployed banks.

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

cs.CV · 2026-02-02 · unverdicted · novelty 6.0

ReAlign corrects the modality gap in unpaired data to let MLLMs learn visual distributions from text alone before instruction tuning, reducing dependence on expensive paired corpora.

Reasoning through Verifiable Forecast Actions: Consistency-Grounded RL for Financial LLMs

cs.LG · 2026-05-21 · unverdicted · novelty 5.0

StockR1 unifies LLM-based financial reasoning and time-series forecasting by emitting verifiable forecast actions that condition a decoder, optimized via consistency-grounded RL to improve accuracy on QA and prediction tasks.

Latent Action Reparameterization for Efficient Agent Inference

cs.AI · 2026-05-18 · unverdicted · novelty 5.0

LAR learns a compact latent action space from trajectories that shortens the effective decision horizon for LLM agents, reducing token count and inference time while preserving task success.

Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity

cs.LG · 2025-12-14 · unverdicted · novelty 5.0 · 2 refs

SPON adds a small set of trainable input-independent activation vectors as representational anchors, trained by distribution matching, to stabilize sparse activation in LLMs and recover performance lost to hidden-state distribution shifts.

What Is Preference Optimization Doing, and Why?

cs.LG · 2025-11-30 · unverdicted · novelty 5.0

Gradient analysis and ablations show DPO and PPO have different target directions and component roles in preference optimization for LLMs.

citing papers explorer


STRABLE: Benchmarking Tabular Machine Learning with Strings
cs.LG · 2026-05-12 · unverdicted · none · ref 11

When RL Meets Adaptive Speculative Training: A Unified Training-Serving System
cs.LG · 2026-02-06 · conditional · none · ref 10

Scaling Latent Reasoning via Looped Language Models
cs.CL · 2025-10-29 · unverdicted · none · ref 5

SynBench: A Benchmark for Differentially Private Text Generation
cs.AI · 2025-09-18 · conditional · none · ref 13

Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning
cs.CL · 2026-05-20 · unverdicted · none · ref 11

The Power of Order: Fooling LLMs with Adversarial Table Permutations
cs.LG · 2026-05-01 · unverdicted · none · ref 15

Neuro-Symbolic Proof Generation for Scaling Systems Software Verification
cs.AI · 2026-03-20 · conditional · none · ref 13

FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation
cs.CR · 2026-03-10 · unverdicted · none · ref 20

veScale-FSDP: Flexible and High-Performance FSDP at Scale
cs.DC · 2026-02-25 · unverdicted · none · ref 3

Scalable Generation and Validation of Isomorphic Physics Problems with GenAI
cs.CY · 2026-02-04 · unverdicted · none · ref 25

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models
cs.CV · 2026-02-02 · unverdicted · none · ref 7

Reasoning through Verifiable Forecast Actions: Consistency-Grounded RL for Financial LLMs
cs.LG · 2026-05-21 · unverdicted · none · ref 53

Latent Action Reparameterization for Efficient Agent Inference
cs.AI · 2026-05-18 · unverdicted · none · ref 12

Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity
cs.LG · 2025-12-14 · unverdicted · none · ref 5 · 2 links

What Is Preference Optimization Doing, and Why?
cs.LG · 2025-11-30 · unverdicted · none · ref 18
