URL http://proceedings.mlr.press/v37/allamanis15.html

· 2020 · arXiv 2006.11477

17 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

other 1

citation-polarity summary

unclear 1

representative citing papers

Lip Forcing: Few-Step Autoregressive Diffusion for Real-time Lip Synchronization

cs.CV · 2026-06-09 · conditional · novelty 8.0

Lip Forcing distills a 14B bidirectional video diffusion teacher into autoregressive students that achieve real-time lip synchronization at 31 FPS using two denoising steps without CFG.

Moshi: a speech-text foundation model for real-time dialogue

eess.AS · 2024-09-17 · accept · novelty 7.0

Moshi is the first real-time full-duplex spoken large language model that casts dialogue as speech-to-speech generation using parallel audio streams and an inner monologue of time-aligned text tokens.

Voxtral Realtime

cs.AI · 2026-02-11 · unverdicted · novelty 6.0

Voxtral Realtime is an end-to-end trained streaming ASR model that achieves Whisper-level transcription quality at 480ms delay after scaling pretraining across 13 languages.

Step-Audio 2 Technical Report

cs.CL · 2025-07-22 · unverdicted · novelty 6.0

Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.

Atlas: Few-shot Learning with Retrieval Augmented Language Models

cs.CL · 2022-08-05 · unverdicted · novelty 6.0

Atlas reaches over 42% accuracy on Natural Questions with only 64 examples, outperforming a 540B-parameter model by 3% with 50x fewer parameters.

Unsupervised Dense Information Retrieval with Contrastive Learning

cs.IR · 2021-12-16 · unverdicted · novelty 6.0

Contrastive learning trains unsupervised dense retrievers that beat BM25 on most BEIR datasets and support cross-lingual retrieval across scripts.

Evaluating Large Language Models Trained on Code

cs.LG · 2021-07-07 · accept · novelty 6.0

Codex achieves 28.8% pass@1 on HumanEval, rising to 70.2% with 100 samples per problem via repeated sampling.

wav2VOT: Automatic estimation of voice onset time, closure duration, and burst realisation with wav2vec2

cs.SD · 2026-06-27 · unverdicted · novelty 5.0

wav2VOT shows wav2vec2 can estimate voice onset time and related stop consonant features with accuracy comparable to existing tools on unseen data and higher accuracy after fine-tuning.

CoughPhase-CLR: Designing an acoustics-informed foundation model for coughing sound classification

cs.SD · 2026-06-19 · unverdicted · novelty 5.0

CoughPhase-CLR uses cough physiological phases to build contrastive positive pairs, outperforming random cropping on downstream tasks including COVID-19 detection and COPD classification.

F3-Tokenizer: Taming Audio Autoencoder Latents for Understanding and Generation

cs.SD · 2026-06-04 · unverdicted · novelty 5.0

F3-Tokenizer adapts audio autoencoder latents with noise-regularized bottleneck (channel normalization and stochastic perturbation) and a representation encoder (RQ-MTP plus frozen-LLM supervision) to support both high-dimensional understanding representations and normalized continuous generation ta

Audio Deepfake Detection with Half-Truth Localisation Using Cross-Attentive Feature Fusion

cs.SD · 2026-05-28 · unverdicted · novelty 5.0

CAFNet performs joint ternary classification and temporal boundary regression for half-truth audio deepfakes via cross-attentive fusion of MFCC, LFCC, and Chroma-STFT features, reporting 92.71% accuracy and 0.075s MAE on MLADDC T2+T3.

Enhancing ASR Performance in the Medical Domain for Dravidian Languages

eess.AS · 2026-04-10 · unverdicted · novelty 5.0

A hybrid confidence-aware ASR training framework with learnable weights reduces Telugu medical WER from 24.3% to 15.8% and Kannada from 31.7% to 25.4%, outperforming standard fine-tuning.

Bridging the Usability Gap: Lessons from Interpreting Studies for Machine Interpreting Design

cs.CL · 2026-06-14 · unverdicted · novelty 4.0

Machine interpreting should shift from fidelity metrics to three design priorities—agency, grounding, and experience—drawn from interpreting studies to close the usability gap with human-mediated communication.

Domain-Adapted Fine-Tuning of ECG Foundation Models for Multi-Label Structural Heart Disease Screening

cs.LG · 2026-04-25 · unverdicted · novelty 4.0

Domain-adapted ECG foundation models with self-supervised pretraining and selective fine-tuning reach macro-AUROC 0.8509 for multi-label structural heart disease detection on the EchoNext benchmark.

Spatial Speech Perception Systems: A Survey of Sound Source Localization, Directional Enhancement, and Speech Recognition

eess.AS · 2026-07-02 · unverdicted · novelty 2.0

A survey of spatial speech perception systems covering sound source localization, directional enhancement, and automatic speech recognition methods and their integration.

Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers

cs.LG · 2026-05-14 · unverdicted · novelty 2.0

Empirical comparison of LSTM, GNN, and Transformer architectures for NBA trajectory forecasting finds hybrid LSTM with contextual information yields lowest FDE of 1.51m over horizons up to 2s.

MMTalker: Multiresolution 3D Talking Head Synthesis with Multimodal Feature Fusion

cs.CV · 2026-04-03

citing papers explorer

Showing 13 of 13 citing papers after filters.

Voxtral Realtime cs.AI · 2026-02-11 · unverdicted · none · ref 2
Voxtral Realtime is an end-to-end trained streaming ASR model that achieves Whisper-level transcription quality at 480ms delay after scaling pretraining across 13 languages.
Step-Audio 2 Technical Report cs.CL · 2025-07-22 · unverdicted · none · ref 3
Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.
Atlas: Few-shot Learning with Retrieval Augmented Language Models cs.CL · 2022-08-05 · unverdicted · none · ref 119
Atlas reaches over 42% accuracy on Natural Questions with only 64 examples, outperforming a 540B-parameter model by 3% with 50x fewer parameters.
Unsupervised Dense Information Retrieval with Contrastive Learning cs.IR · 2021-12-16 · unverdicted · none · ref 91
Contrastive learning trains unsupervised dense retrievers that beat BM25 on most BEIR datasets and support cross-lingual retrieval across scripts.
wav2VOT: Automatic estimation of voice onset time, closure duration, and burst realisation with wav2vec2 cs.SD · 2026-06-27 · unverdicted · none · ref 29
wav2VOT shows wav2vec2 can estimate voice onset time and related stop consonant features with accuracy comparable to existing tools on unseen data and higher accuracy after fine-tuning.
CoughPhase-CLR: Designing an acoustics-informed foundation model for coughing sound classification cs.SD · 2026-06-19 · unverdicted · none · ref 43
CoughPhase-CLR uses cough physiological phases to build contrastive positive pairs, outperforming random cropping on downstream tasks including COVID-19 detection and COPD classification.
F3-Tokenizer: Taming Audio Autoencoder Latents for Understanding and Generation cs.SD · 2026-06-04 · unverdicted · none · ref 1
F3-Tokenizer adapts audio autoencoder latents with noise-regularized bottleneck (channel normalization and stochastic perturbation) and a representation encoder (RQ-MTP plus frozen-LLM supervision) to support both high-dimensional understanding representations and normalized continuous generation ta
Audio Deepfake Detection with Half-Truth Localisation Using Cross-Attentive Feature Fusion cs.SD · 2026-05-28 · unverdicted · none · ref 15
CAFNet performs joint ternary classification and temporal boundary regression for half-truth audio deepfakes via cross-attentive fusion of MFCC, LFCC, and Chroma-STFT features, reporting 92.71% accuracy and 0.075s MAE on MLADDC T2+T3.
Enhancing ASR Performance in the Medical Domain for Dravidian Languages eess.AS · 2026-04-10 · unverdicted · none · ref 16
A hybrid confidence-aware ASR training framework with learnable weights reduces Telugu medical WER from 24.3% to 15.8% and Kannada from 31.7% to 25.4%, outperforming standard fine-tuning.
Bridging the Usability Gap: Lessons from Interpreting Studies for Machine Interpreting Design cs.CL · 2026-06-14 · unverdicted · none · ref 133
Machine interpreting should shift from fidelity metrics to three design priorities—agency, grounding, and experience—drawn from interpreting studies to close the usability gap with human-mediated communication.
Domain-Adapted Fine-Tuning of ECG Foundation Models for Multi-Label Structural Heart Disease Screening cs.LG · 2026-04-25 · unverdicted · none · ref 20
Domain-adapted ECG foundation models with self-supervised pretraining and selective fine-tuning reach macro-AUROC 0.8509 for multi-label structural heart disease detection on the EchoNext benchmark.
Spatial Speech Perception Systems: A Survey of Sound Source Localization, Directional Enhancement, and Speech Recognition eess.AS · 2026-07-02 · unverdicted · none · ref 35
A survey of spatial speech perception systems covering sound source localization, directional enhancement, and automatic speech recognition methods and their integration.
Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers cs.LG · 2026-05-14 · unverdicted · none · ref 46
Empirical comparison of LSTM, GNN, and Transformer architectures for NBA trajectory forecasting finds hybrid LSTM with contextual information yields lowest FDE of 1.51m over horizons up to 2s.
