mega hub Canonical reference

LLaMA: Open and Efficient Foundation Language Models

2023 · cs.CL · arXiv 2302.13971

Canonical reference. 82% of citing Pith papers cite this work as background.

1086 Pith papers citing it

Background 82% of classified citations

abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

citation-role summary

background 206 · method 19 · baseline 8 · other 6 · dataset 1 · extension 1

citation-polarity summary

background 198 · use method 20 · unclear 13 · baseline 7 · extend 1 · support 1 · use dataset 1
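
As a rough check on the 82% background figure quoted near the top of the page, the polarity counts above can be turned into percentage shares. This is a minimal sketch that assumes the headline figure is derived from these polarity counts, which the page does not state; the names `polarity_counts` and `shares` are illustrative only.

```python
# Citation-polarity counts copied from the summary above.
polarity_counts = {
    "background": 198,
    "use method": 20,
    "unclear": 13,
    "baseline": 7,
    "extend": 1,
    "support": 1,
    "use dataset": 1,
}


def shares(counts):
    """Return each label's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total_classified = sum(polarity_counts.values())
polarity_shares = shares(polarity_counts)

print(total_classified)               # 241
print(polarity_shares["background"])  # 82
```

Under that assumption, 198 of 241 classified citations gives the quoted 82%. The citation-role counts sum to the same 241, where the background role (206) works out to about 85%.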

representative citing papers

Privacy Auditing with Zero (0) Training Run

cs.CR · 2026-05-14 · unverdicted · novelty 8.0

Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.

Effective Context in Transformers: An Analysis of Fragmentation and Tokenization

cs.LG · 2026-05-13 · unverdicted · novelty 8.0

Fragmentation strictly raises optimal finite-context log-loss on Markov sources while tokenization can make a short token window equivalent to a longer source window under reliability and compression conditions.

Grid Games: The Power of Multiple Grids for Quantizing Large Language Models

cs.LG · 2026-05-12 · accept · novelty 8.0

Allowing each quantization group to select among multiple 4-bit grids improves accuracy over single-grid FP4 for both post-training and pre-training of LLMs.

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Adaptive scheduling of interventions in discrete diffusion language models, timed to attribute-specific commitment schedules discovered with sparse autoencoders, delivers precise multi-attribute steering up to 93% strength while preserving generation quality.

When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds

cs.LG · 2026-05-07 · unverdicted · novelty 8.0

SignSGD provably beats SGD by a factor of d under sparse noise via matched ℓ1-norm upper and lower bounds, with an equivalent result for Muon on matrices, and this predicts faster GPT-2 pretraining.

Backdoor Attacks on Decentralised Post-Training

cs.CR · 2026-03-31 · conditional · novelty 8.0 · 2 refs

An adversary controlling an intermediate pipeline stage in decentralized LLM post-training can inject a backdoor that reduces alignment from 80% to 6%, with the backdoor persisting in 60% of cases even after subsequent safety training.

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers

cs.SE · 2025-06-16 · conditional · novelty 8.0

First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.

BEAVER: An Enterprise Benchmark for Text-to-SQL

cs.CL · 2024-09-03 · unverdicted · novelty 8.0

BEAVER is the first text-to-SQL benchmark from private enterprise data warehouses, revealing SOTA agentic frameworks achieve only 10.8% accuracy on complex real-world queries.

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

cs.CV · 2024-08-23 · conditional · novelty 8.0

MME-RealWorld is the largest manually annotated high-resolution benchmark for MLLMs, where even the best models achieve less than 60% accuracy on challenging real-world tasks.

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

cs.CR · 2024-06-19 · unverdicted · novelty 8.0

AgentDojo introduces an extensible evaluation framework populated with realistic agent tasks and security test cases to measure prompt injection robustness in tool-using LLM agents.

AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments

cs.HC · 2024-05-13 · conditional · novelty 8.0

AgentClinic is a multimodal agent benchmark demonstrating that LLM diagnostic accuracy can drop to below one-tenth of its MedQA accuracy in sequential clinical simulations, with Claude-3.5 leading and large differences in tool use across models.

ORPO: Monolithic Preference Optimization without Reference Model

cs.CL · 2024-03-12 · conditional · novelty 8.0

ORPO performs preference alignment during supervised fine-tuning via a monolithic odds ratio penalty, allowing 7B models to outperform larger state-of-the-art models on alignment benchmarks.

Bridging Language and Items for Retrieval and Recommendation: Benchmarking LLMs as Semantic Encoders

cs.IR · 2024-03-06 · unverdicted · novelty 8.0

BLaIR is a new benchmark and 570M-review dataset showing that LLM performance rankings on recommendation tasks have little correlation with rankings on general embedding benchmarks like MTEB.

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model using input-dependent selective SSMs that achieves SOTA results across modalities and matches twice-larger Transformers on language modeling with 5x higher inference throughput.

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

cs.CL · 2023-11-27 · unverdicted · novelty 8.0

MMMU provides 11.5K heterogeneous college-level multimodal questions that current models solve at 56-59% accuracy, establishing a new standard for expert multimodal evaluation.

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

cs.CL · 2023-05-17 · accept · novelty 8.0

Tree of Thoughts enables language models to solve complex planning tasks by generating, evaluating, and searching over coherent intermediate thoughts in a tree, raising Game of 24 success from 4% to 74% with GPT-4.

API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs

cs.CL · 2023-04-14 · conditional · novelty 8.0

API-Bank is a new benchmark and training dataset for tool-augmented LLMs that shows fine-tuned models can approach GPT-3.5 tool-use effectiveness.

Instruction Tuning with GPT-4

cs.CL · 2023-04-06 · unverdicted · novelty 8.0

GPT-4-generated instruction data produces superior zero-shot performance in finetuned LLaMA models versus prior state-of-the-art data.

A Sensitivity-Aware Test Collection for Search Among Personal Information

cs.IR · 2026-06-25 · accept · novelty 7.0

A new sensitivity-labeled test collection is released from Enron emails with crowdsourced queries, relevance judgments, and LLM extensions for evaluating sensitivity-aware search.

Moving Beyond Diversity: Visual Token Pruning as Subspace Reconstruction for Efficient VLMs

cs.CV · 2026-06-17 · unverdicted · novelty 7.0

SPARE reformulates visual token pruning as column subset selection to minimize reconstruction error and uses anti-relevance for context-aware selection in VLMs.

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

cs.DC · 2026-06-07 · conditional · novelty 7.0

APEX4 co-designs pure INT4 GEMM kernels with ρ-aware granularity adaptation to deliver up to 2.09× end-to-end speedup on GPUs with low ρ while keeping LLaMA-2-70B perplexity within 0.63 of FP16.

End-to-End Text Line Detection and Ordering

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

Orli is an autoregressive image-to-sequence model that jointly detects text lines and determines their reading order on historical documents via chord-frame baselines, trained on 196k pages across ten scripts.

When Knowledge Is Not Free: Cost-Aware Evidence Selection in Retrieval-Augmented Generation

cs.CL · 2026-06-01 · unverdicted · novelty 7.0

Defines cost-aware RAG with evidence cost tiers and shows static selectors are brittle while agentic LLM-based selection is promising but model-dependent.

RWGBench: Evaluating Scholarly Positioning in Related Work Generation

cs.DL · 2026-05-30 · unverdicted · novelty 7.0

RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.

citing papers explorer

Showing 50 of 1086 citing papers.

Learning Long-term Motion Embeddings for Efficient Kinematics Generation cs.CV · 2026-04-13 · unverdicted · none · ref 40 · internal anchor
A 64x temporally compressed motion embedding learned from trackers enables efficient conditional flow-matching generation of long-term motions that outperform video models and task-specific methods.
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models cs.AR · 2026-04-13 · unverdicted · none · ref 8 · internal anchor
A CIM-based hardware-software co-design in 65nm achieves up to 7.3x higher throughput and 49.59x better energy efficiency than NVIDIA Orin Nano for LLaMA3.2-1B, averaging 336 tokens/s and 173 tokens/J under INT4 across multiple SLMs.
Geometry-Aware Localized Watermarking for Copyright Protection in Embedding-as-a-Service cs.CR · 2026-04-13 · unverdicted · none · ref 33 · internal anchor
GeoMark decouples local watermark triggering from centralized ownership attribution using geometry-separated anchors and adaptive neighborhoods to improve robustness against paraphrasing, dimension changes, and clustering attacks while preserving utility.
Position-Agnostic Pre-Projection for Transformer Attention: Nonlinear Feature Construction and Content Skip Before Q/K/V cs.CL · 2026-04-12 · unverdicted · none · ref 12 · internal anchor
A position-agnostic nonlinear pre-projection MLP plus content skip connection in transformer attention improves LAMBADA accuracy by 40.6% and reduces perplexity by 39% on 160M-scale models.
Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis cs.CV · 2026-04-11 · unverdicted · none · ref 21 · internal anchor
Transferring a 2D MLLM to 3D CT inputs via parameter reuse, a Text-Guided Hierarchical MoE framework, and two-stage training yields better performance than prior 3D medical MLLMs on medical report generation and visual question answering.
Neuro-Oracle: A Trajectory-Aware Agentic RAG Framework for Interpretable Epilepsy Surgical Prognosis cs.MM · 2026-04-10 · unverdicted · none · ref 27 · internal anchor
Neuro-Oracle distills longitudinal MRI changes into trajectory vectors via a 3D Siamese encoder, retrieves similar cases, and generates LLM-based prognoses, achieving AUC 0.834-0.905 on a resection-type proxy task versus 0.793 for single-timepoint baseline.
In-situ process monitoring for defect detection in wire-arc additive manufacturing: an agentic AI approach cs.AI · 2026-04-10 · unverdicted · none · ref 72 · internal anchor
A multi-agent AI framework using processing and acoustic agents achieves 91.6% accuracy and 0.821 F1 score for in-situ porosity defect detection in wire-arc additive manufacturing.
Pioneer Agent: Continual Improvement of Small Language Models in Production cs.AI · 2026-04-10 · unverdicted · none · ref 88 · internal anchor
Pioneer Agent automates the full lifecycle of adapting and continually improving small language models via diagnosis-driven data synthesis and regression-constrained retraining, delivering gains of 1.6-83.8 points on benchmarks and large lifts in production-style tasks.
Discrete Token Modeling for Multi-Stem Music Source Separation with Language Models eess.AS · 2026-04-10 · unverdicted · none · ref 30 · internal anchor
A Conformer-conditioned decoder-only language model generates discrete tokens via a neural audio codec to separate four music stems, reaching near state-of-the-art perceptual quality and top NISQA on vocals in MUSDB18-HQ tests.
Continual Distillation of Teachers from Different Domains cs.LG · 2026-04-10 · conditional · none · ref 32 · internal anchor
SE2D stabilizes continual distillation across heterogeneous teachers by preserving logits on external unlabeled data to mitigate unseen knowledge forgetting.
Memory-Efficient Transfer Learning with Fading Side Networks via Masked Dual Path Distillation cs.CV · 2026-04-10 · unverdicted · none · ref 87 · internal anchor
MDPD mutually distills knowledge between a frozen backbone and a learnable side network during fine-tuning, then discards the side network at inference to accelerate speed by at least 25% while preserving accuracy.
MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning cs.LG · 2026-04-10 · unverdicted · none · ref 64 · internal anchor
MP-ISMoE uses Gaussian noise perturbed iterative quantization and interactive side mixture-of-experts to deliver higher accuracy than prior memory-efficient transfer learning methods while keeping similar parameter and memory usage.
A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need cs.LG · 2026-04-09 · unverdicted · none · ref 29 · internal anchor
Frozen random backbones with low-rank LoRA adapters recover 96-100% of fully trained performance on diverse architectures while training only 0.5-40% of parameters.
LMGenDrive: Bridging Multimodal Understanding and Generative World Modeling for End-to-End Driving cs.CV · 2026-04-09 · unverdicted · none · ref 48 · internal anchor
LMGenDrive unifies LLM-based multimodal understanding with generative world models to output both future driving videos and control signals for end-to-end closed-loop autonomous driving.
ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification cs.LG · 2026-04-09 · unverdicted · none · ref 19 · internal anchor
ADAPT is a new pre-training paradigm that aligns physical properties of time-series data to allow simultaneous training on 162 diverse classification datasets, achieving new state-of-the-art performance.
SeLaR: Selective Latent Reasoning in Large Language Models cs.CL · 2026-04-09 · unverdicted · none · ref 40 · internal anchor
SeLaR selectively applies latent soft reasoning in LLMs via entropy gating and contrastive regularization, outperforming standard CoT on five benchmarks without training.
SMART: When is it Actually Worth Expanding a Speculative Tree? cs.DC · 2026-04-09 · unverdicted · none · ref 32 · internal anchor
SMART uses marginal benefit-cost analysis to dynamically build efficient speculative trees, achieving 15-20% additional speedup in LLM and MLLM inference.
Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator cs.CV · 2026-04-09 · unverdicted · none · ref 32 · internal anchor
Uni-ViGU unifies video generation and understanding by extending a diffusion video generator with unified continuous-discrete flow matching, modality-driven MoE layers, and bidirectional training stages that repurpose generative knowledge for discriminative tasks.
Reasoning-Based Refinement of Unsupervised Text Clusters with LLMs cs.CL · 2026-04-08 · unverdicted · none · ref 49 · internal anchor
LLM reasoning refines unsupervised text clusters via coherence checks, redundancy removal, and label grounding, yielding better coherence and human-aligned labels on social media data.
Walk the Talk: Bridging the Reasoning-Action Gap for Thinking with Images via Multimodal Agentic Policy Optimization cs.CV · 2026-04-08 · unverdicted · none · ref 22 · internal anchor
MAPO improves multimodal chain-of-thought reasoning by requiring explicit textual descriptions of visual tool results and using a novel advantage estimator that combines semantic alignment with task rewards.
AgentGate: A Lightweight Structured Routing Engine for the Internet of Agents cs.AI · 2026-04-08 · unverdicted · none · ref 23 · internal anchor
AgentGate decomposes routing into action decision and structural grounding stages, allowing small 3B-7B models to dispatch queries competitively on a curated benchmark after targeted fine-tuning.
Visual prompting reimagined: The power of the Activation Prompts cs.CV · 2026-04-07 · unverdicted · none · ref 1 · internal anchor
Activation prompts on intermediate layers outperform input-level visual prompting and parameter-efficient fine-tuning in accuracy and efficiency across 29 datasets.
In-Place Test-Time Training cs.LG · 2026-04-07 · conditional · none · ref 52 · internal anchor
In-Place TTT adapts LLM MLP projection matrices at test time with a next-token-aligned objective and chunk-wise updates, enabling better long-context performance as a drop-in enhancement.
Controlling Distributional Bias in Multi-Round LLM Generation via KL-Optimized Fine-Tuning cs.CL · 2026-04-07 · unverdicted · none · ref 41 · internal anchor
A hybrid fine-tuning objective using KL divergence for token calibration and Kahneman-Tversky optimization for semantic binding enables LLMs to produce outputs that match desired attribute distributions across repeated prompts.
Semantic Communication with an LLM-enabled Knowledge Base eess.SP · 2026-04-07 · unverdicted · none · ref 22 · internal anchor
SC-LMKB uses LLM-generated data with cross-domain fusion to cut hallucinations and delivers up to 72.6% gains on cross-modality retrieval tasks over standard semantic communication.
The Energy Cost of Execution-Idle in GPU Clusters cs.DC · 2026-04-06 · unverdicted · none · ref 53 · internal anchor
Execution-idle accounts for 19.7% of GPU execution time and 10.7% of energy in a large cluster, motivating power management that treats it as a distinct operating state.
InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories cs.AI · 2026-04-05 · unverdicted · none · ref 32 · internal anchor
InsTraj generates realistic, instruction-faithful GPS trajectories by using an LLM to parse natural-language travel intent and a multimodal diffusion transformer to produce the paths.
Embedding Enhancement via Fine-Tuned Language Models for Learner-Item Cognitive Modeling cs.CL · 2026-04-05 · unverdicted · none · ref 33 · internal anchor
EduEmbed fine-tunes language models in two stages to add semantic information to learner-item embeddings and improve performance on cognitive diagnosis and adaptive testing tasks.
CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks cs.CR · 2026-04-05 · unverdicted · none · ref 27 · internal anchor
CoopGuard deploys cooperative agents to track conversation history and counter evolving multi-round attacks on LLMs, achieving a 78.9% reduction in attack success rate on a new 5,200-sample benchmark.
RUQuant: Towards Refining Uniform Quantization for Large Language Models cs.CL · 2026-04-05 · unverdicted · none · ref 31 · internal anchor
RUQuant uses block-wise composite orthogonal matrices from Householder reflections and Givens rotations plus a fine-tuned global reflection to achieve 99.8% full-precision accuracy at W6A6 and 97% at W4A4 for 13B LLMs in about one minute.
PolySwarm: A Multi-Agent Large Language Model Framework for Prediction Market Trading and Latency Arbitrage cs.AI · 2026-04-04 · unverdicted · none · ref 21 · internal anchor
PolySwarm aggregates predictions from 50 LLM personas for Polymarket trading using Bayesian combination and divergence metrics, outperforming single models in calibration while adding latency arbitrage via CEX price models.
CoME-VL: Scaling Complementary Multi-Encoder Vision-Language Learning cs.CV · 2026-04-03 · unverdicted · none · ref 61 · internal anchor
CoME-VL fuses contrastive and self-supervised vision encoders via entropy-guided multi-layer aggregation and RoPE cross-attention to improve vision-language model performance on benchmarks.
Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models cs.CL · 2026-04-03 · unverdicted · none · ref 3 · internal anchor
Multiple-Debias reduces gender, racial, and religious biases in multilingual pre-trained language models more effectively than monolingual methods by integrating counterfactual augmentation and self-debiasing across pre- and post-processing stages in four languages.
Visual Instruction-Finetuned Language Model for Versatile Brain MR Image Tasks cs.CV · 2026-04-03 · unverdicted · none · ref 52 · internal anchor
LLaBIT is a single instruction-finetuned LLM that performs report generation, VQA, segmentation, and translation on brain MRI images while outperforming task-specific models.
WIO: Upload-Enabled Computational Storage on CXL SSDs cs.OS · 2026-04-02 · unverdicted · none · ref 56 · internal anchor
WIO enables reversible computational storage on CXL SSDs via WebAssembly actors and zero-copy migration, achieving up to 2x throughput and 3.75x lower write latency.
Metriplector: From Field Theory to Neural Architecture cs.AI · 2026-03-31 · unverdicted · none · ref 6 · internal anchor
Metriplector treats neural computation as coupled metriplectic field dynamics whose stress-energy tensor readout achieves competitive results on vision, control, Sudoku, language modeling, and pathfinding with small parameter counts.
An Underexplored Frontier: Large Language Models for Rare Disease Patient Education and Communication -- A scoping review cs.CL · 2026-03-30 · accept · none · ref 34 · internal anchor
A scoping review of 12 studies finds LLM applications for rare disease patient education remain early-stage, dominated by general models like ChatGPT focused on curated question-answering with limited real-world or patient-centered evaluation.
Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM cs.CV · 2026-03-29 · unverdicted · none · ref 3 · internal anchor
Chat-Scene++ improves 3D scene understanding in multimodal LLMs by representing scenes as context-rich object sequences with identifier tokens and grounded chain-of-thought reasoning, reaching state-of-the-art on five benchmarks using pre-trained encoders.
Make Tracking Easy: Neural Motion Retargeting for Humanoid Whole-body Control cs.RO · 2026-03-23 · unverdicted · none · ref 34 · internal anchor
NMR uses VAE-based clustered expert physics refinement and a CNN-Transformer to learn dynamics-aware retargeting, eliminating joint jumps and self-collisions on Unitree G1 while accelerating downstream control policies.
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following cs.CV · 2026-03-19 · unverdicted · none · ref 21 · internal anchor
Instruction-free tuning of LVLMs on medical image-description pairs via momentum proxy instructions and response shuffling achieves SOTA accuracy on VQA tasks across SKINCON, WBCAtt, CBIS, and MIMIC-CXR.
ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors cs.RO · 2026-03-16 · conditional · none · ref 6 · internal anchor
ExpertGen generates high-success expert policies in simulation from imperfect priors by freezing a diffusion behavior model and optimizing its initial noise via RL, then distills them for real-robot deployment.
Performance Isolation and Semantic Determinism in Efficient GPU Spatial Sharing cs.DC · 2026-03-16 · unverdicted · none · ref 66 · internal anchor
CoGPU resolves the tradeoff in GPU sharing by introducing GPU coroutines for semantic-preserving resource migration, delivering up to 79.2% higher training throughput and zero token mismatch in inference.
Joint Optimization of Multi-agent Memory System cs.MA · 2026-03-13 · unverdicted · none · ref 27 · internal anchor
CoMAM jointly optimizes agents in multi-agent LLM memory systems via end-to-end RL and adaptive credit assignment to improve collaboration and performance.
Deterministic Differentiable Structured Pruning for Large Language Models cs.LG · 2026-03-09 · unverdicted · none · ref 6 · internal anchor
DDP replaces stochastic hard-concrete masks with a deterministic soft surrogate for l0-constrained structured pruning, delivering 1% performance loss on Qwen3 models at 20% sparsity and faster convergence than prior methods.
Data Agent: Learning to Select Data via End-to-End Dynamic Optimization cs.LG · 2026-03-08 · unverdicted · none · ref 14 · internal anchor
Data Agent learns a co-evolving sample selection policy end-to-end that accelerates training by over 50% on ImageNet-1k and MMLU with no performance loss.
SeedPolicy: Horizon Scaling via Self-Evolving Diffusion Policy for Robot Manipulation cs.RO · 2026-03-05 · conditional · none · ref 34 · internal anchor
SeedPolicy introduces self-evolving gated attention to extend the temporal horizon of diffusion policies, yielding 36.8% and 169% relative gains over standard DP on clean and randomized RoboTwin 2.0 tasks.
TagaVLM: Topology-Aware Global Action Reasoning for Vision-Language Navigation cs.CV · 2026-03-03 · conditional · none · ref 35 · internal anchor
TagaVLM embeds topological structures into VLMs via residual attention and interleaved prompts, achieving 51.09% success rate on R2R unseen environments and outperforming prior large-model methods.
ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation cs.CL · 2026-03-03 · unverdicted · none · ref 53 · internal anchor
ACE-Merging estimates task input covariances from parameter differences to enable closed-form data-free merging that reduces interference and outperforms prior baselines on vision and language tasks.
DARTH-PUM: A Hybrid Processing-Using-Memory Architecture cs.AR · 2026-02-17 · unverdicted · none · ref 141 · internal anchor
DARTH-PUM integrates analog and Boolean PUM with optimized peripherals, coordination hardware, and a programming interface to run kernels like AES, CNNs, and LLMs fully in memory, achieving speedups of 59.4x, 14.8x, and 40.8x over an analog-plus-CPU baseline.
LLaMo: Scaling Pretrained Language Models for Unified Motion Understanding and Generation with Continuous Autoregressive Tokens cs.CV · 2026-02-12 · unverdicted · none · ref 65 · internal anchor
LLaMo scales pretrained LLMs for unified motion-language tasks by encoding motion into continuous causal latents and adding a flow-matching head for real-time autoregressive generation and captioning.
