DinoSR: Self-distillation and online clustering for self-supervised speech representation learning, 2024a

AlexanderH · arXiv 2305.10005

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

cs.CL · 2024-11-07 · conditional · novelty 6.0

MoT decouples non-embedding parameters by modality in transformers to match dense multi-modal performance with roughly one-third to one-half the FLOPs.

citing papers explorer

Showing 1 of 1 citing paper.

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

cs.CL · 2024-11-07 · conditional · none · ref 21

MoT decouples non-embedding parameters by modality in transformers to match dense multi-modal performance with roughly one-third to one-half the FLOPs.
