Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
LLM embeddings from clinical records, fused with tabular data via gradient-boosted trees, predict post-traumatic epilepsy at AUC-ROC 0.892 and AUPRC 0.798.
citing papers explorer
- Step-Audio 2 Technical Report
  Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.
- Predicting Post-Traumatic Epilepsy from Clinical Records using Large Language Model Embeddings
  LLM embeddings from clinical records, fused with tabular data via gradient-boosted trees, predict post-traumatic epilepsy at AUC-ROC 0.892 and AUPRC 0.798.