Title resolution pending

Devlin, J · 2019

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries the DOI and OpenAlex lookups on its next pass.

representative citing papers

Why Training-Free Token Reduction Collapses: The Inherent Instability of Pairwise Scoring Signals

cs.AI · 2026-04-17 · unverdicted · novelty 7.0

Pairwise scoring signals in Vision Transformer token reduction are inherently unstable due to high perturbation counts and degrade in deep layers, causing collapse, while unary signals with triage enable CATIS to retain 96.9% accuracy at 63% FLOPs reduction on ViT-Large ImageNet-1K.

Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker

cs.CL · 2025-11-11 · unverdicted · novelty 7.0

UWE is a task-agnostic bi-encoder that uses many-to-many InfoNCE and token-level soft late interaction to achieve zero-shot ranking across unseen work-related target spaces while using far fewer parameters than Qwen3-8B and improving MAP by 4.4 points.

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering

cs.SD · 2025-11-12 · unverdicted · novelty 5.0

CLSR is an end-to-end contrastive language-speech retriever using an intermediate text-like conversion step to improve retrieval of relevant segments from long audio for spoken question answering.

BEFT: Bias-Efficient Fine-Tuning of Language Models in Low-Data Regimes

cs.CL · 2025-09-19 · conditional · novelty 5.0

Directly fine-tuning the value bias (b_v) in transformer projections outperforms fine-tuning b_q or b_k for downstream performance in low-data regimes across multiple LLM architectures.

citing papers explorer

Showing 4 of 4 citing papers.

Why Training-Free Token Reduction Collapses: The Inherent Instability of Pairwise Scoring Signals
cs.AI · 2026-04-17 · unverdicted · none · ref 14
Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker
cs.CL · 2025-11-11 · unverdicted · none · ref 15
End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering
cs.SD · 2025-11-12 · unverdicted · none · ref 9
BEFT: Bias-Efficient Fine-Tuning of Language Models in Low-Data Regimes
cs.CL · 2025-09-19 · conditional · none · ref 7
