LLaMA: Open and Efficient Foundation Language Models

2023 · cs.CL · arXiv 2302.13971

Canonical reference. 82% of citing Pith papers cite this work as background.

1111 Pith papers citing it

Background: 82% of classified citations

abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

citation-role summary

background 206 · method 19 · baseline 8 · other 6 · dataset 1 · extension 1

citation-polarity summary

background 198 · use method 20 · unclear 13 · baseline 7 · extend 1 · support 1 · use dataset 1

Recognition alignment

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

representative citing papers

Privacy Auditing with Zero (0) Training Run

cs.CR · 2026-05-14 · unverdicted · novelty 8.0

Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.

Effective Context in Transformers: An Analysis of Fragmentation and Tokenization

cs.LG · 2026-05-13 · unverdicted · novelty 8.0

Fragmentation strictly raises optimal finite-context log-loss on Markov sources while tokenization can make a short token window equivalent to a longer source window under reliability and compression conditions.

Grid Games: The Power of Multiple Grids for Quantizing Large Language Models

cs.LG · 2026-05-12 · accept · novelty 8.0

Allowing each quantization group to select among multiple 4-bit grids improves accuracy over single-grid FP4 for both post-training and pre-training of LLMs.

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Adaptive scheduling of interventions in discrete diffusion language models, timed to attribute-specific commitment schedules discovered with sparse autoencoders, delivers precise multi-attribute steering up to 93% strength while preserving generation quality.

When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds

cs.LG · 2026-05-07 · unverdicted · novelty 8.0

SignSGD provably beats SGD by a factor of d under sparse noise via matched ℓ1-norm upper and lower bounds, with an equivalent result for Muon on matrices, and this predicts faster GPT-2 pretraining.

Backdoor Attacks on Decentralised Post-Training

cs.CR · 2026-03-31 · conditional · novelty 8.0 · 2 refs

An adversary controlling an intermediate pipeline stage in decentralized LLM post-training can inject a backdoor that reduces alignment from 80% to 6%, with the backdoor persisting in 60% of cases even after subsequent safety training.

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers

cs.SE · 2025-06-16 · conditional · novelty 8.0

First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.

BEAVER: An Enterprise Benchmark for Text-to-SQL

cs.CL · 2024-09-03 · unverdicted · novelty 8.0

BEAVER is the first text-to-SQL benchmark from private enterprise data warehouses, revealing SOTA agentic frameworks achieve only 10.8% accuracy on complex real-world queries.

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

cs.CV · 2024-08-23 · conditional · novelty 8.0

MME-RealWorld is the largest manually annotated high-resolution benchmark for MLLMs, where even the best models achieve less than 60% accuracy on challenging real-world tasks.

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

cs.CR · 2024-06-19 · unverdicted · novelty 8.0

AgentDojo introduces an extensible evaluation framework populated with realistic agent tasks and security test cases to measure prompt injection robustness in tool-using LLM agents.

AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments

cs.HC · 2024-05-13 · conditional · novelty 8.0

AgentClinic is a multimodal agent benchmark demonstrating that LLM diagnostic accuracy on MedQA drops to below one-tenth in sequential clinical simulations, with Claude-3.5 leading and large tool-use differences across models.

ORPO: Monolithic Preference Optimization without Reference Model

cs.CL · 2024-03-12 · conditional · novelty 8.0

ORPO performs preference alignment during supervised fine-tuning via a monolithic odds ratio penalty, allowing 7B models to outperform larger state-of-the-art models on alignment benchmarks.

Bridging Language and Items for Retrieval and Recommendation: Benchmarking LLMs as Semantic Encoders

cs.IR · 2024-03-06 · unverdicted · novelty 8.0

BLaIR is a new benchmark and 570M-review dataset showing that LLM performance rankings on recommendation tasks have little correlation with rankings on general embedding benchmarks like MTEB.

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model using input-dependent selective SSMs that achieves SOTA results across modalities and matches twice-larger Transformers on language modeling with 5x higher inference throughput.

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

cs.CL · 2023-11-27 · unverdicted · novelty 8.0

MMMU provides 11.5K heterogeneous college-level multimodal questions that current models solve at 56-59% accuracy, establishing a new standard for expert multimodal evaluation.

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

cs.CL · 2023-05-17 · accept · novelty 8.0

Tree of Thoughts enables language models to solve complex planning tasks by generating, evaluating, and searching over coherent intermediate thoughts in a tree, raising Game of 24 success from 4% to 74% with GPT-4.

API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

cs.CL · 2023-04-14 · conditional · novelty 8.0

API-Bank is a new benchmark and training dataset for tool-augmented LLMs that shows fine-tuned models can approach GPT-3.5 tool-use effectiveness.

Instruction Tuning with GPT-4

cs.CL · 2023-04-06 · unverdicted · novelty 8.0

GPT-4-generated instruction data produces superior zero-shot performance in finetuned LLaMA models versus prior state-of-the-art data.

Language-Assisted Super-Resolution from Real-World Low-Resolution Patches

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

LA-SR redefines unpaired super-resolution in language space by projecting images into a semantically rich representation and applying vision-language model guided losses to handle real-world degradations extracted from depth variations.

Probing Memorization of Tabular In-Context Learning

cs.LG · 2026-06-30 · unverdicted · novelty 7.0

A new probing framework detects moderate parametric memorization signals in tabular in-context learning models under single-task fine-tuning, strongest on low-cardinality tasks, but signals largely disappear under realistic training.

Search for Truth from Reasoning: A Dynamic Representation Editing Framework for Steering LLM Trajectories

cs.AI · 2026-06-26 · unverdicted · novelty 7.0

DynaSteer dynamically steers LLM reasoning trajectories toward truth via pattern clustering, Fisher-LDA projection, and entropy-triggered representation edits, improving performance on MATH and generalizing to coding.

A Sensitivity-Aware Test Collection for Search Among Personal Information

cs.IR · 2026-06-25 · accept · novelty 7.0

A new sensitivity-labeled test collection is released from Enron emails with crowdsourced queries, relevance judgments, and LLM extensions for evaluating sensitivity-aware search.

Large Language Model Teaches Visual Students: Cross-Modality Transfer of Fine-Grained Conceptual Knowledge

cs.CV · 2026-06-25 · unverdicted · novelty 7.0

LaViD distills LLM conceptual knowledge to vision models via LLM-generated MCQ soft labels, outperforming vision-language distillation baselines on fine-grained benchmarks while improving robustness on spurious correlation datasets.

PatternGSL: A Structured Specification Language for Template-Free and Simulation-Ready 3D Garments

cs.CV · 2026-06-23 · unverdicted · novelty 7.0

PatternGSL is a new template-free specification language for complete sewing patterns that enables direct single-image prediction of simulation-ready garments via a vision-language model, supported by a new 300K paired dataset.

citing papers explorer

Showing 50 of 1111 citing papers.

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision cs.CV · 2026-05-07 · unverdicted · none · ref 50 · internal anchor
Using understanding tasks as direct supervision during post-training improves image generation and editing in unified multimodal models.
Closing the Loop: Unified 3D Scene Generation and Immersive Interaction via LLM-RL Coupling cs.CV · 2026-05-07 · unverdicted · none · ref 40 · internal anchor
A closed-loop system couples LLM-based 3D scene generation with RL optimization and VR user interactions to produce adaptive, immersive environments, claiming SOTA results on the ALFRED benchmark.
Predict-then-Diffuse: Adaptive Response Length for Compute-Budgeted Inference in Diffusion LLMs cs.LG · 2026-05-05 · unverdicted · none · ref 2 · 2 links · internal anchor
Predict-then-Diffuse predicts response length for diffusion LLMs before inference, cutting FLOPs with a data-driven safety buffer while preserving output quality.
ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity cs.LG · 2026-05-05 · unverdicted · none · ref 12 · internal anchor
ELAS pre-trains low-rank LLMs by applying 2:4 activation sparsity after squared ReLU to cut memory and accelerate training with minimal performance loss.
Trust, but Verify: Peeling Low-Bit Transformer Networks for Training Monitoring cs.LG · 2026-05-04 · unverdicted · none · ref 10 · internal anchor
A layer-wise peeling framework creates reference bounds to diagnose under-optimized layers in trained decoder-only transformers, including low-bit and quantized versions.
U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning cs.AI · 2026-05-04 · unverdicted · none · ref 113 · internal anchor
U-Define improves user control in LLM planning by letting people define hard rules and soft preferences in natural language with matching verification methods, raising usefulness and satisfaction scores.
Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions cs.AI · 2026-05-04 · unverdicted · none · ref 23 · internal anchor
BerLU constructs a C1-differentiable activation with Lipschitz constant 1 via Bernstein polynomial approximation, showing better performance and efficiency than baselines on image classification with ViTs and CNNs.
ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring cs.CL · 2026-05-04 · unverdicted · none · ref 50 · internal anchor
ARGUS uses a Prosecutor-Defender-Umpire multi-agent setup plus RAG and chain-of-thought rewards to adapt ad policy enforcement to new regulations using minimal fresh labels.
From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs cs.CV · 2026-05-04 · unverdicted · none · ref 63 · internal anchor
SFI-Bench shows current multimodal LLMs struggle to integrate spatial memory with functional reasoning and external knowledge in video tasks.
AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments cs.LG · 2026-05-01 · unverdicted · none · ref 21 · internal anchor
AdaMeZO adapts Adam moment estimates to zeroth-order LLM fine-tuning without extra memory storage, outperforming MeZO with up to 70% fewer forward passes.
Assert, don't describe: Linguistic features that shift LLM reasoning about animal welfare cs.CL · 2026-04-30 · unverdicted · none · ref 5 · internal anchor
Assertive linguistic features in training data increase LLMs' pro-animal-welfare reasoning while hedged and sensory-description features decrease it.
Caracal: Causal Architecture via Spectral Mixing cs.LG · 2026-04-30 · unverdicted · none · ref 63 · internal anchor
Caracal is a Fourier-based sequence mixing architecture that achieves causal autoregressive modeling with standard operators and competitive performance on long sequences.
Uni-HOI:A Unified framework for Learning the Joint distribution of Text and Human-Object Interaction cs.CV · 2026-04-30 · unverdicted · none · ref 39 · internal anchor
Uni-HOI learns the joint distribution of text, human motion, and object motion using LLMs and VQ-VAEs in a two-stage training process for multiple HOI tasks.
FED-FSTQ: Fisher-Guided Token Quantization for Communication-Efficient Federated Fine-Tuning of LLMs on Edge Devices cs.LG · 2026-04-28 · unverdicted · none · ref 2 · internal anchor
Fed-FSTQ reduces uplink traffic by 46x and improves time-to-accuracy by 52% in federated LLM fine-tuning using Fisher-guided token quantization and selection.
Structural Pruning of Large Vision Language Models: A Comprehensive Study on Pruning Dynamics, Recovery, and Data Efficiency cs.CL · 2026-04-27 · conditional · none · ref 4 · internal anchor
Widthwise pruning of LVLM language backbones combined with supervised finetuning and hidden-state distillation recovers over 95% performance using just 5% of data across 3B-7B models.
TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training cs.DC · 2026-04-27 · unverdicted · none · ref 51 · internal anchor
TACO compresses tensor-parallel intermediate tensors with an adaptive FP8 scheme and fused kernels, yielding up to 1.87X throughput gains on GPT and Qwen models with near-lossless accuracy.
Thinking Like a Clinician: A Cognitive AI Agent for Clinical Diagnosis via Panoramic Profiling and Adversarial Debate cs.AI · 2026-04-26 · unverdicted · none · ref 2 · internal anchor
DxChain uses panoramic patient profiling, Med-ToT planning, and adversarial angel-devil debates to reduce LLM hallucinations in clinical diagnosis, achieving SOTA accuracy and consistency on two MIMIC-IV benchmarks.
Hybrid JIT-CUDA Graph Optimization for Low-Latency Large Language Model Inference cs.LG · 2026-04-25 · unverdicted · none · ref 26 · internal anchor
A hybrid JIT-CUDA Graph framework reduces TTFT by up to 66% and P99 latency versus TensorRT-LLM for single-GPU LLaMA-2 7B inference on short prompts.
When Does Removing LayerNorm Help? Activation Bounding as a Regime-Dependent Implicit Regularizer cs.LG · 2026-04-25 · unverdicted · none · ref 27 · internal anchor
DyT improves validation loss 27% at 64M params/1M tokens but worsens it 19% at 118M tokens, with saturation levels predicting the sign of the effect.
Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models cs.IR · 2026-04-25 · unverdicted · none · ref 41 · internal anchor
Prompt chaining with off-the-shelf LLMs outperforms in-context learning and BERT for 1st- and 2nd-level classification on the ORKG taxonomy using the FORC dataset, but struggles at the 3rd level.
Sapiens2 cs.CV · 2026-04-23 · unverdicted · none · ref 26 · internal anchor
Sapiens2 improves pretraining, data scale, and architecture over its predecessor to set new state-of-the-art results on human pose estimation, body-part segmentation, normal estimation, and new tasks like pointmap and albedo estimation.
LayerTracer: A Joint Task-Particle and Vulnerable-Layer Analysis framework for Arbitrary Large Language Model Architectures cs.CL · 2026-04-22 · unverdicted · none · ref 6 · internal anchor
LayerTracer defines task particles as the first layer where target token probability rises sharply and vulnerable layers via maximum JS divergence after masking, showing task particles in deep layers and greater robustness in larger models.
Measuring the Machine: Evaluating Generative AI as Pluralist Sociotechical Systems cs.AI · 2026-04-22 · unverdicted · none · ref 19 · internal anchor
Generative AI must be evaluated as recursive pluralist sociotechnical systems via MaSH Loops and distributional World Values Benchmarks instead of static functionalist or prescriptive tests.
Multi-Perspective Evidence Synthesis and Reasoning for Unsupervised Multimodal Entity Linking cs.CL · 2026-04-22 · unverdicted · none · ref 43 · internal anchor
MSR-MEL synthesizes instance-centric, group-level, lexical, and statistical evidence with LLMs and asymmetric teacher-student GNNs to outperform prior unsupervised methods on multimodal entity linking benchmarks.
RADS: Reinforcement Learning-Based Sample Selection Improves Transfer Learning in Low-resource and Imbalanced Clinical Settings cs.CL · 2026-04-22 · unverdicted · none · ref 4 · internal anchor
RADS applies reinforcement learning to pick informative samples for transfer learning, improving performance over uncertainty and diversity sampling in low-resource imbalanced clinical settings.
Absorber LLM: Harnessing Causal Synchronization for Test-Time Training cs.LG · 2026-04-22 · unverdicted · none · ref 84 · internal anchor
Absorber LLM introduces causal synchronization to absorb context into parameters for memory-efficient long-context LLM inference while preserving causal effects.
Commonsense Knowledge with Negation: A Resource to Enhance Negation Understanding cs.CL · 2026-04-21 · unverdicted · none · ref 6 · internal anchor
Augmenting commonsense knowledge corpora with negation produces over 2M new triples that benefit LLM negation understanding when used for pre-training.
Self-Improving Tabular Language Models via Iterative Reward-Guided Post-Training cs.LG · 2026-04-21 · unverdicted · none · ref 70 · internal anchor
TabGRAA applies group-relative advantage alignment in an iterative reward-guided post-training loop to improve tabular language model generators on fidelity, utility, and privacy trade-offs across five benchmarks.
An Empirical Study of Multi-Generation Sampling for Jailbreak Detection in Large Language Models cs.CL · 2026-04-20 · unverdicted · none · ref 32 · internal anchor
Multi-generation sampling from LLMs uncovers more jailbreak behaviors than single generations, with the largest gains from one to moderate sample counts and diminishing returns thereafter.
Sessa: Selective State Space Attention cs.LG · 2026-04-20 · unverdicted · none · ref 34 · internal anchor
Sessa integrates attention within recurrent paths to achieve power-law memory tails and flexible non-decaying selective retrieval, outperforming baselines on long-context tasks.
Learning Invariant Modality Representation for Robust Multimodal Learning from a Causal Inference Perspective cs.LG · 2026-04-20 · unverdicted · none · ref 32 · internal anchor
CmIR uses causal inference to separate invariant causal representations from spurious ones in multimodal data, improving generalization under distribution shifts and noise via invariance, mutual information, and reconstruction constraints.
Understanding the Prompt Sensitivity cs.CL · 2026-04-20 · unverdicted · none · ref 48 · internal anchor
LLMs disperse meaning-preserving prompts internally instead of clustering them, which produces an excessively high upper bound on output log-probability differences via Taylor expansion and Cauchy-Schwarz.
LEPO: Latent Reasoning Policy Optimization for Large Language Models cs.LG · 2026-04-20 · unverdicted · none · ref 10 · internal anchor
LEPO applies RL to continuous latent representations in LLMs by injecting Gumbel-Softmax stochasticity for diverse trajectory sampling and unified gradient estimation, outperforming existing discrete and latent RL methods.
Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective cs.CR · 2026-04-20 · unverdicted · none · ref 14 · internal anchor
BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.
ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning cs.RO · 2026-04-20 · unverdicted · none · ref 35 · internal anchor
ReFineVLA adds teacher-generated reasoning steps to VLA training and reports state-of-the-art success rates on SimplerEnv WidowX and Google Robot benchmarks.
Rethinking Data Curation in LLM Training: Online Reweighting Offers Better Generalization than Offline Methods cs.LG · 2026-04-19 · unverdicted · none · ref 43 · internal anchor
ADAPT is an online reweighting framework for LLM training that outperforms offline data selection and mixing methods in cross-benchmark generalization under equal compute.
Calibrating Model-Based Evaluation Metrics for Summarization cs.CL · 2026-04-19 · unverdicted · none · ref 22 · internal anchor
A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.
Pruning Unsafe Tickets: A Resource-Efficient Framework for Safer and More Robust LLMs cs.LG · 2026-04-17 · unverdicted · none · ref 44 · internal anchor
Pruning removes 'unsafe tickets' from LLMs via gradient-free attribution, reducing harmful outputs and jailbreak vulnerability with minimal utility loss.
IUQ: Interrogative Uncertainty Quantification for Long-Form Large Language Model Generation cs.CL · 2026-04-16 · unverdicted · none · ref 41 · internal anchor
IUQ quantifies claim-level uncertainty in long-form LLM generation by combining inter-sample consistency and intra-sample faithfulness through an interrogate-then-respond approach and outperforms baselines on two datasets.
ToxiShield: Promoting Inclusive Developer Communication through Real-Time Toxicity Filtering cs.SE · 2026-04-15 · unverdicted · none · ref 51 · internal anchor
ToxiShield delivers a real-time GitHub extension with a BERT toxicity detector at 98% accuracy, a Claude-based coach, and a fine-tuned Llama reframer at 95% style transfer accuracy, validated by a 10-person TAM study.
HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System cs.CV · 2026-04-15 · unverdicted · none · ref 37 · 2 links · internal anchor
HiVLA decouples VLM-based semantic planning with visual grounding from a cascaded cross-attention DiT action expert, outperforming end-to-end VLAs on long-horizon and fine-grained manipulation.
MAny: Merge Anything for Multimodal Continual Instruction Tuning cs.LG · 2026-04-15 · unverdicted · none · ref 12 · internal anchor
MAny addresses dual-forgetting in multimodal continual instruction tuning via CPM and LPM merging strategies, delivering up to 8.57% accuracy gains on UCIT benchmarks without additional training.
Identifying and Mitigating Gender Cues in Academic Recommendation Letters: An Interpretability Case Study cs.LG · 2026-04-14 · unverdicted · none · ref 44 · internal anchor
Transformer models detect applicant gender in de-gendered academic recommendation letters via implicit linguistic patterns such as associations with words like 'emotional' and 'humanitarian', and removing these cues reduces but does not eliminate prediction accuracy above chance.
Please Make it Sound like Human: Encoder-Decoder vs. Decoder-Only Transformers for AI-to-Human Text Style Transfer cs.CL · 2026-04-13 · unverdicted · none · ref 11 · internal anchor
BART-large outperforms Mistral-7B in AI-to-human style transfer with higher reference similarity scores and far fewer parameters, while showing that marker shift can reflect overshoot rather than accurate transfer.
ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation cs.RO · 2026-04-13 · unverdicted · none · ref 45 · internal anchor
Compositional Simulation generates scalable real-world robot training data by combining classical simulation with neural simulation in a closed-loop real-sim-real augmentation pipeline.
SignReasoner: Compositional Reasoning for Complex Traffic Sign Understanding via Functional Structure Units cs.CV · 2026-04-12 · unverdicted · none · ref 27 · internal anchor
SignReasoner decomposes traffic signs into functional structure units and uses a two-stage VLM post-training pipeline to achieve state-of-the-art compositional reasoning on a new benchmark.
Wearable AI in the Era of Large Sensor Models eess.SP · 2026-04-11 · unverdicted · none · ref 36 · internal anchor
Large Sensor Models trained on large-scale multimodal wearable data can provide a scalable, general framework for wearable AI by learning transferable representations across modalities and tasks.
SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models cs.CL · 2026-04-11 · unverdicted · none · ref 37 · internal anchor
SEPTQ simplifies LLM post-training quantization to two steps via static global importance scoring and mask-guided column-wise weight updates, claiming superior results over baselines in low-bit settings.
Policy-Aware Edge LLM-RAG Framework for Internet of Battlefield Things Mission Orchestration cs.NI · 2026-04-10 · unverdicted · none · ref 4 · internal anchor
PA-LLM-RAG adds policy retrieval and dual-LLM verification to enable reliable low-latency mission orchestration in simulated IoBT environments, with Gemma-2B reaching 100% policy compliance at 4.17s latency.
Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion cs.CV · 2026-04-10 · unverdicted · none · ref 42 · internal anchor
CLDyN establishes a closed-loop semantic transmission chain with a Requirement-driven Semantic Compensation module to make infrared-visible fusion adapt to diverse downstream tasks.
