CODA re-expresses most non-attention Transformer computations as GEMM-plus-epilogue programs using a constrained set of composable primitives to keep intermediate results on-chip and cut global memory traffic.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 2verdicts
UNVERDICTED 2representative citing papers
CoFrGeNets implement a continued-fraction function class as plug-in replacements for transformer blocks, delivering competitive or superior downstream performance on GPT2-xl and Llama3-scale models with one-half to two-thirds the parameters.
citing papers explorer
-
CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs
CODA re-expresses most non-attention Transformer computations as GEMM-plus-epilogue programs using a constrained set of composable primitives to keep intermediate results on-chip and cut global memory traffic.
-
CoFrGeNet: Continued Fraction Architectures for Language Generation
CoFrGeNets implement a continued-fraction function class as plug-in replacements for transformer blocks, delivering competitive or superior downstream performance on GPT2-xl and Llama3-scale models with one-half to two-thirds the parameters.