ByT5: Towards a token-free future with pre-trained byte-to-byte models
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
An LLM-enhanced Viterbi decoder achieves roughly 1.5 dB extra coding gain in block error rate and over 50% better semantic similarity than conventional Viterbi for constraint-length-3 convolutional codes on AWGN channels.
A compact Mamba-2 model performs end-to-end byte-level network traffic classification without tokenization or pre-training and remains competitive with substantially larger pre-trained systems.
citing papers explorer
- The Query Channel: Information-Theoretic Limits of Masking-Based Explanations
  Masking-based explanations are governed by the information capacity of the query channel, with reliable recovery achievable below capacity via sparse maximum-likelihood decoding but impossible above it.
- LLM-Viterbi: Semantic-Aware Decoding for Convolutional Codes
  An LLM-enhanced Viterbi decoder achieves roughly 1.5 dB extra coding gain in block error rate and over 50% better semantic similarity than conventional Viterbi for constraint-length-3 convolutional codes on AWGN channels.
- MambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining
  A compact Mamba-2 model performs end-to-end byte-level network traffic classification without tokenization or pre-training and remains competitive with substantially larger pre-trained systems.