Keyframe Insertion for Random Access and Packet-Loss Repair in H.264/AVC, H.265/HEVC, and H.266/VVC
4 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Network-Efficient World Model Token Streaming
  An adaptive delta-prioritization algorithm using cosine distance and Hamming-drift thresholds improves embedding distortion by 4.8-7.2% and next-token perplexity by 2.1-6.3% over periodic keyframing at matched low bitrates for tokenized driving world models.
- Benchmarking and Enhancing VLM for Compressed Image Understanding
  Introduces a benchmark for VLMs on compressed images and a universal adaptor to improve performance across codecs and bitrates.
- Gaussian Field Representations for Turbulent Flow: Compression, Scale Separation, and Physical Fidelity
  Gaussian primitives compress 3D Taylor-Green vortex flows at ratios over 1000x while preserving velocity but degrading enstrophy, with anisotropic extensions recovering small-scale vortical structures better than baseline or other variants.
- A Compact Hybrid Convolution-Frequency State Space Network for Learned Image Compression
  HCFSSNet uses convolutional layers plus a Vision Frequency State Space block with omni-directional scanning and frequency reweighting to reach competitive rate-distortion performance in learned image compression.