Flex attention: A programming model for generating optimized attention kernels
2 Pith papers cite this work.
citing papers explorer
- LPM 1.0: Video-based Character Performance Model
  LPM 1.0 generates infinite-length, identity-stable, real-time audio-visual conversational performances for single characters using a distilled causal diffusion transformer and a new benchmark.
- Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs
  Neptune introduces dependency-breaking fusion with algebraic corrections for reduction sequences, generating FlashAttention-like kernels from plain attention code with 1.35x average speedup across ten benchmarks and four GPU architectures.