Mixed citations

Title resolution pending

2020 · arXiv 0776.2020

Mixed citation behavior. The most common role is background (57% of classified citations).

20 Pith papers citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 4 · dataset 2 · method 1
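The 57% background share follows from the role counts above if the percentage is taken over the 7 classified citations rather than over all 20 citing papers. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the hub itself:

```python
from collections import Counter

# Role counts copied from the citation-role summary above.
role_counts = Counter({"background": 4, "dataset": 2, "method": 1})

# Assumption: the share is computed over classified citations only.
classified = sum(role_counts.values())

# Most common role and its share of classified citations, in percent.
top_role, top_count = role_counts.most_common(1)[0]
top_share = round(100 * top_count / classified)

print(f"{top_role}: {top_count}/{classified} = {top_share}%")
```

Under this reading, 4 of 7 classified citations are background, which rounds to the reported 57%.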

citation-polarity summary

background 4 · support 1 · use dataset 1 · use method 1

representative citing papers

A strongly annotated passive acoustic dataset for tropical bird monitoring

cs.SD · 2026-05-20 · accept · novelty 7.0 · 2 refs

PteroSet is a new strongly annotated dataset of 563 tropical bird recordings (73.62 h) containing 15,372 time-frequency labels for 168 species, released in COCO-style JSON with a binary bird detection baseline.

FLARE: Full-Modality Long-Video Audiovisual Retrieval Benchmark with User-Simulated Queries

cs.MM · 2026-05-11 · unverdicted · novelty 7.0

FLARE is a new benchmark with 399 long videos, 87k multimodal clips, and 275k user-style queries for testing audiovisual retrieval under caption and query regimes.

Geo2Sound: A Scalable Geo-Aligned Framework for Soundscape Generation from Satellite Imagery

cs.MM · 2026-04-16 · unverdicted · novelty 7.0

Geo2Sound generates geographically realistic soundscapes from satellite imagery via geospatial attribute modeling, semantic hypothesis expansion, and geo-acoustic alignment, achieving SOTA FAD of 1.765 on a new 20k-pair benchmark.

Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs

cs.CL · 2025-12-18 · unverdicted · novelty 7.0 · 2 refs

Cascaded systems remain the most reliable for speech translation overall, but recent SpeechLLMs match or outperform them in many conditions while standalone speech models lag.

Moshi: a speech-text foundation model for real-time dialogue

eess.AS · 2024-09-17 · accept · novelty 7.0

Moshi is the first real-time full-duplex spoken large language model that casts dialogue as speech-to-speech generation using parallel audio streams and an inner monologue of time-aligned text tokens.

Reliable model selection in the presence of parameter non-identifiability

stat.ME · 2026-05-19 · unverdicted · novelty 6.0

Proposes adaptive multiple importance sampling for robust Bayesian model evidence estimation under parameter non-identifiability, shown to outperform deterministic methods on ecological case studies while being cheaper than MCMC.

Annotation-free deep learning for detection and segmentation of fetal germinal matrix-intraventricular hemorrhage in brain MRI

eess.IV · 2026-05-10 · conditional · novelty 6.0

FreeHemoSeg detects fetal GMH-IVH on T2-weighted MRI with high sensitivity and specificity and moderate segmentation accuracy using pseudo-image synthesis from normal scans, outperforming supervised and unsupervised baselines in internal and external validation.

Aspect-Aware Content-Based Recommendations for Mathematical Research Papers

cs.IR · 2026-05-05 · unverdicted · novelty 6.0

The authors introduce aspect-aware datasets GoldRiM and SilverRiM for math papers and AchGNN, a heterogeneous GNN that outperforms prior methods by jointly modeling textual semantics, citations, and author lineage across aspects.

Almost for Free: Crafting Adversarial Examples with Convolutional Image Filters

cs.LG · 2026-05-01 · conditional · novelty 6.0

Optimized 3x3 adversarial image filters based on edge detection generate transferable untargeted attacks on neural networks with 30-80% success using only one pass and far fewer parameters than prior methods.

CAHAL: Clinically Applicable resolution enHAncement for Low-resolution MRI scans

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

CAHAL introduces a physics-informed mixture-of-experts super-resolution network for clinical MRI that conditions on resolution and anisotropy and uses edge-penalised, Fourier, and segmentation-guided losses to reduce hallucinations compared with prior generative methods.

SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation

cs.CL · 2025-12-24 · unverdicted · novelty 6.0 · 2 refs

SpidR-Adapt uses meta-learning with a first-order bi-level optimization heuristic to adapt speech representations to new languages with less than 1 hour of data, achieving 100x better efficiency than standard training.

Perceptual implications of automatic anonymization in pathological speech

eess.AS · 2025-05-01 · conditional · novelty 6.0 · 2 refs

Listeners detect automatic anonymization in pathological speech at 91-93% accuracy with a 30-point perceived quality drop, yet clinical severity ratings stay nearly unchanged for dysarthria, dysglossia, and dysphonia.

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

cs.CL · 2024-11-07 · conditional · novelty 6.0

MoT decouples non-embedding parameters by modality in transformers to match dense multi-modal performance with roughly one-third to one-half the FLOPs.

The Association of Transformer-based Sentiment Analysis with Symptom Distress and Deterioration in Routine Psychotherapy Care

cs.CL · 2026-05-11 · unverdicted · novelty 5.0

Transformer-derived sentiment features from therapy sessions correlate with emotional-valence components of the OQ-45 and differ significantly between patients identified as at risk of deterioration by rational and empirical outcome models.

Audio Spoof Detection with GaborNet

cs.SD · 2026-04-21 · unverdicted · novelty 5.0

GaborNet replaces sinc functions with Gabor filters in raw-audio neural networks and is tested for audio spoof detection with augmentations in RawNet2 and RawGAT-ST.

Foundation Models Defining A New Era In Sensor-based Human Activity Recognition: A Survey And Outlook

eess.SP · 2026-04-03 · accept · novelty 5.0

The survey organizes foundation models for sensor-based HAR into a lifecycle taxonomy and identifies three trajectories: HAR-specific models from scratch, adaptation of general time-series models, and integration with large language models.

Woosh: A Sound Effects Foundation Model

cs.SD · 2026-04-02 · accept · novelty 5.0

Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.

Learning to Hear Broken Motors: Signature-Guided Data Augmentation for Induction-Motor Diagnostics

cs.LG · 2025-06-10 · unverdicted · novelty 5.0

SGDA generates synthetic faults in the frequency domain from healthy signals to augment training data for ML-based induction motor diagnostics, claiming superior accuracy.

CCNETS: A Modular Causal Learning Framework for Pattern Recognition in Imbalanced Datasets

cs.LG · 2024-01-07 · unverdicted · novelty 5.0

CCNETS is a new modular causal framework using three cooperative modules and a Zoint mechanism to align synthetic data generation with classifier needs on imbalanced pattern recognition tasks.

Quantum-inspired tensor networks in machine learning models

cs.LG · 2026-04-15 · unverdicted · novelty 2.0

Tensor networks developed for quantum states are reviewed as tools for machine learning models, with assessment of their potential computational, explanatory, and privacy advantages alongside remaining challenges.

citing papers explorer

Showing 20 of 20 citing papers.

A strongly annotated passive acoustic dataset for tropical bird monitoring
cs.SD · 2026-05-20 · accept · none · ref 16 · 2 links

FLARE: Full-Modality Long-Video Audiovisual Retrieval Benchmark with User-Simulated Queries
cs.MM · 2026-05-11 · unverdicted · none · ref 10

Geo2Sound: A Scalable Geo-Aligned Framework for Soundscape Generation from Satellite Imagery
cs.MM · 2026-04-16 · unverdicted · none · ref 6

Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs
cs.CL · 2025-12-18 · unverdicted · none · ref 42 · 2 links

Moshi: a speech-text foundation model for real-time dialogue
eess.AS · 2024-09-17 · accept · none · ref 46

Reliable model selection in the presence of parameter non-identifiability
stat.ME · 2026-05-19 · unverdicted · none · ref 94

Annotation-free deep learning for detection and segmentation of fetal germinal matrix-intraventricular hemorrhage in brain MRI
eess.IV · 2026-05-10 · conditional · none · ref 12

Aspect-Aware Content-Based Recommendations for Mathematical Research Papers
cs.IR · 2026-05-05 · unverdicted · none · ref 12

Almost for Free: Crafting Adversarial Examples with Convolutional Image Filters
cs.LG · 2026-05-01 · conditional · none · ref 83

CAHAL: Clinically Applicable resolution enHAncement for Low-resolution MRI scans
cs.CV · 2026-04-20 · unverdicted · none · ref 259

SpidR-Adapt: A Universal Speech Representation Model for Few-Shot Adaptation
cs.CL · 2025-12-24 · unverdicted · none · ref 20 · 2 links

Perceptual implications of automatic anonymization in pathological speech
eess.AS · 2025-05-01 · conditional · none · ref 19 · 2 links

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
cs.CL · 2024-11-07 · conditional · none · ref 17

The Association of Transformer-based Sentiment Analysis with Symptom Distress and Deterioration in Routine Psychotherapy Care
cs.CL · 2026-05-11 · unverdicted · none · ref 33

Audio Spoof Detection with GaborNet
cs.SD · 2026-04-21 · unverdicted · none · ref 7

Foundation Models Defining A New Era In Sensor-based Human Activity Recognition: A Survey And Outlook
eess.SP · 2026-04-03 · accept · none · ref 89

Woosh: A Sound Effects Foundation Model
cs.SD · 2026-04-02 · accept · none · ref 43

Learning to Hear Broken Motors: Signature-Guided Data Augmentation for Induction-Motor Diagnostics
cs.LG · 2025-06-10 · unverdicted · none · ref 9

CCNETS: A Modular Causal Learning Framework for Pattern Recognition in Imbalanced Datasets
cs.LG · 2024-01-07 · unverdicted · none · ref 12

Quantum-inspired tensor networks in machine learning models
cs.LG · 2026-04-15 · unverdicted · none · ref 20
