pith. machine review for the scientific record. sign in

Soundstream: An end-to-end neural audio codec

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

years

2026 3 2025 1

representative citing papers

ENSEMBITS: an alphabet of protein conformational ensembles

cs.LG · 2026-05-13 · unverdicted · novelty 8.0 · 2 refs

Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.

FAST: Efficient Action Tokenization for Vision-Language-Action Models

cs.RO · 2025-01-16 · unverdicted · novelty 6.0

FAST applies discrete cosine transform to robot action sequences for efficient tokenization, enabling autoregressive VLAs to succeed on high-frequency dexterous tasks and scale to 10k hours of data while matching diffusion VLA performance with up to 5x faster training.

Woosh: A Sound Effects Foundation Model

cs.SD · 2026-04-02 · accept · novelty 5.0

Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.

citing papers explorer

Showing 4 of 4 citing papers.

  • ENSEMBITS: an alphabet of protein conformational ensembles cs.LG · 2026-05-13 · unverdicted · none · ref 28 · 2 links

    Ensembits is the first tokenizer of protein conformational ensembles that outperforms static tokenizers on RMSF prediction and matches them on function and mutation tasks while using less pretraining data.

  • Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation cs.IR · 2026-04-14 · unverdicted · none · ref 68

    A jointly learned hierarchical index with cross-attention and residual quantization scales exact retrieval in foundational recommendation models, deployed at Meta with additional performance from test-time training on index nodes.

  • FAST: Efficient Action Tokenization for Vision-Language-Action Models cs.RO · 2025-01-16 · unverdicted · none · ref 69

    FAST applies discrete cosine transform to robot action sequences for efficient tokenization, enabling autoregressive VLAs to succeed on high-frequency dexterous tasks and scale to 10k hours of data while matching diffusion VLA performance with up to 5x faster training.

  • Woosh: A Sound Effects Foundation Model cs.SD · 2026-04-02 · accept · none · ref 21

    Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.