hub Mixed citations

author Zhou, A

Mixed citation behavior. Most common role is background (57%).

34 Pith papers citing it
Background 57% of classified citations

citation-role summary

background 4 · method 2 · dataset 1

representative citing papers

DECKER: Domain-invariant Embedding for Cross-Keyboard Extraction and Recognition

cs.CR · 2026-05-05 · unverdicted · novelty 6.0

DECKER is a domain-invariant four-stage framework (keyboard normalization, adversarial disentanglement, cross-keyboard contrastive alignment, acoustic style randomization) plus LLM post-processing that improves keystroke inference over baselines on the new HEAR dataset, especially in cross-keyboard

STAMP: Spatial-Temporal Adapter with Multi-Head Pooling

cs.LG · 2025-11-13 · unverdicted · novelty 6.0

The STAMP adapter enables general time-series foundation models to match specialized EEG foundation models on clinical classification tasks across 8 benchmarks while using few trainable parameters.

Step-Audio 2 Technical Report

cs.CL · 2025-07-22 · unverdicted · novelty 6.0

Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.

WorldSpeech: A Multilingual Speech Corpus from Around the World

cs.CL · 2026-05-09 · unverdicted · novelty 5.0 · 2 refs

WorldSpeech supplies 65k hours of multilingual aligned speech data across 76 languages and delivers 63.5% average relative WER reduction after fine-tuning ASR models on 11 typologically diverse languages.

Audio Spoof Detection with GaborNet

cs.SD · 2026-04-21 · unverdicted · novelty 5.0

GaborNet replaces sinc functions with Gabor filters in raw-audio neural networks and is tested for audio spoof detection with augmentations in RawNet2 and RawGAT-ST.

Qwen3.5-Omni Technical Report

cs.CL · 2026-04-17 · unverdicted · novelty 5.0

Qwen3.5-Omni scales an omnimodal model to hundreds of billions of parameters with 256k context, introduces ARIA for stable speech synthesis, and reports SOTA performance on 215 audio-visual benchmarks while adding multilingual and audio-visual coding capabilities.
