Liger Kernel: Efficient Triton Kernels for

Pin-Lun Hsu, Yun Dai, Vignesh Kothapalli, Qingquan Song, Shao Tang, Siyu Zhu, Steven Shimizu, Shivam Sahni, Haowen Ning, Yanning Chen · 2024 · arXiv 2410.10989

11 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Animation2Code: Evaluating Temporal Visual Reasoning in Video-to-Code Generation

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

Animation2Code benchmark with 1,069 videos tests VLMs on generating animation code, showing persistent failures in temporal consistency despite good visual matches.

CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs

cs.LG · 2026-05-19 · unverdicted · novelty 7.0 · 2 refs

CODA re-expresses most non-attention Transformer computations as GEMM-plus-epilogue programs using a constrained set of composable primitives to keep intermediate results on-chip and cut global memory traffic.

Decoupling KL and Trajectories: A Unified Perspective for SFT, DAgger, Offline RL, and OPD in LLM Distillation

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Decoupling prefix source from token-level KL direction in autoregressive sequence KL yields four objectives unifying SFT, DAgger, offline RL and OPD, with KL mixing and entropy-gated curriculum improving math reasoning accuracy and shortening responses.

Graphs of Research: Citation Evolution Graphs as Supervision for Research Idea Generation

cs.CL · 2026-05-14 · unverdicted · novelty 7.0

GoR extracts citation DAGs using position, frequency, predecessor links and time, then fine-tunes Qwen2.5-7B on 498 seed papers to generate ideas, claiming SOTA over gpt-4o baselines via LLM judges.

BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs

cs.CL · 2026-04-07 · unverdicted · novelty 7.0

BOSCH decomposes attention-head selection for short-context hybridization into layer probing, adaptive ratio assignment, and grouped binary optimization, yielding better efficiency-performance tradeoffs than static or layer-wise baselines.

Bayesian Preference Learning for Test-Time Steerable Reward Models

cs.LG · 2026-02-09 · unverdicted · novelty 7.0

ICRM casts reward modeling as amortized variational inference over a latent preference probability with a Beta prior, enabling test-time adaptation to unseen preferences and improving benchmark performance.

Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation

cs.CL · 2026-05-15 · unverdicted · novelty 6.0 · 3 refs

A new 30k-instance semantic segmentation dataset plus block distillation with sink tokens, dropout, and weighted loss lets block-attention models reach near full-attention performance on long texts.

Faster and Memory-Efficient Training of Sequential Recommendation Models for Large Catalogs

cs.IR · 2025-08-13 · accept · novelty 6.0

CCE- is a Triton kernel implementation of cross-entropy loss with negative sampling that reduces memory by more than 10x and accelerates training by up to 2x for large-catalog sequential recommenders.

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

cs.CL · 2025-02-24 · unverdicted · novelty 6.0

LongSpec achieves up to 3.26x speedup over Flash Attention baselines on long-context datasets via memory-efficient drafting and verification techniques.

Genome-Factory: A Library for Tuning, Deploying, and Interpreting Genomic Foundation Models

q-bio.GN · 2025-09-13 · conditional · novelty 5.0

Genome-Factory is an open-source Python library that integrates data pipelines, model tuning, inference, benchmarks, and biological interpretation for genomic foundation models.

NVILA: Efficient Frontier Visual Language Models

cs.CV · 2024-12-05 · unverdicted · novelty 5.0

NVILA improves on VILA with a scale-then-compress visual token strategy and full-lifecycle efficiency optimizations, matching or exceeding leading VLMs on image and video benchmarks while reducing training cost 1.9-5.1x and latencies 1.2-2.8x.

