What language model architecture and pretraining objective works best for zero-shot generalization?

2 Pith papers cite this work. Polarity classification is still indexing.

years: 2026 (2)

verdicts: unverdicted (2)

representative citing papers

Text-Utilization for Encoder-dominated Speech Recognition Models

cs.CL · 2026-04-29 · unverdicted · novelty 5.0

Encoder-dominated ASR models that use text-only data via modality matching and downsampling achieve performance comparable to larger-decoder models on LibriSpeech, with simple random duration approaches proving effective.

LLMs and Speech: Integration vs. Combination

eess.AS · 2026-03-16 · unverdicted · novelty 4.0

Tight integration of acoustic models with LLMs for ASR is ablated against shallow fusion across label units, fine-tuning strategies, LLM sizes, and joint CTC decoding to mitigate hallucinations.

citing papers explorer

  • Text-Utilization for Encoder-dominated Speech Recognition Models · cs.CL · 2026-04-29 · unverdicted · none · ref 18

  • LLMs and Speech: Integration vs. Combination · eess.AS · 2026-03-16 · unverdicted · none · ref 23