hub

Patrick Wilhelm, Thorsten Wittkopp, and Odej Kao

Warner B, Chaffin A, Clavié B, et al · 2024 · arXiv 2412.13663

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

read on arXiv browse 11 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Is She Even Relevant? When BERT Ignores Explicit Gender Cues

cs.CL · 2026-05-08 · conditional · novelty 7.0

A Dutch BERT model encodes gender linearly by epoch 20 but does not dynamically update its representations when explicit female cues contradict learned stereotypical associations in short sentence templates.

ProteinJEPA: Latent prediction complements protein language models

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Masked-position MLM plus JEPA latent prediction outperforms MLM-only pretraining on 10-11 of 16 downstream tasks for 35M-150M protein models while JEPA alone fails.

HyperTransport: Amortized Conditioning of T2I Generative Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

HyperTransport amortizes activation steering for T2I models via a hypernetwork that predicts intervention parameters from CLIP embeddings, delivering 3600-7000x speedup and matching per-concept baselines on 167 unseen concepts.

NorBERTo: A ModernBERT Model Trained for Portuguese with 331 Billion Tokens Corpus

cs.CL · 2026-04-30 · unverdicted · novelty 7.0

NorBERTo, a ModernBERT encoder trained on the largest open Portuguese corpus of 331B tokens, reports top encoder results on several PLUE and ASSIN 2 tasks.

Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings

q-bio.QM · 2026-04-09 · unverdicted · novelty 7.0

Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.

GLiGuard: Schema-Conditioned Classification for LLM Safeguard

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

GLiGuard is a compact schema-conditioned bidirectional encoder that matches 7B-27B guard models on safety benchmarks while delivering up to 16x higher throughput and 17x lower latency.

Do Synthetic Trajectories Reflect Real Reward Hacking? A Systematic Study on Monitoring In-the-Wild Hacking in Code Generation

cs.LG · 2026-04-26 · unverdicted · novelty 6.0

Synthetic reward hacking data does not capture natural hacking behaviors in code generation RL, causing monitors trained on it to generalize poorly compared to those trained on in-the-wild trajectories.

Rag Performance Prediction for Question Answering

cs.CL · 2026-04-09 · unverdicted · novelty 6.0

A novel supervised predictor modeling semantic relationships among question, retrieved passages, and generated answer best forecasts when RAG improves QA performance.

Efficient Listwise Reranking with Compressed Document Representations

cs.IR · 2026-04-29 · unverdicted · novelty 5.0

RRK compresses documents to multi-token embeddings for efficient listwise reranking, enabling an 8B model to achieve 3x-18x speedups over smaller models with comparable or better effectiveness.

Commonsense Knowledge with Negation: A Resource to Enhance Negation Understanding

cs.CL · 2026-04-21 · unverdicted · novelty 5.0

Augmenting commonsense knowledge corpora with negation produces over 2M new triples that benefit LLM negation understanding when used for pre-training.

Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters

cs.CL · 2026-03-11 · unverdicted · novelty 4.0

Zero-shot GPT-OSS detects depression from 1,108 primary care encounter transcripts with AUPRC 0.51 and AUROC 0.77, with meaningful signals in the first 128 patient tokens and added value from dyadic mirroring.

citing papers explorer

Showing 11 of 11 citing papers.

Is She Even Relevant? When BERT Ignores Explicit Gender Cues cs.CL · 2026-05-08 · conditional · none · ref 7
A Dutch BERT model encodes gender linearly by epoch 20 but does not dynamically update its representations when explicit female cues contradict learned stereotypical associations in short sentence templates.
ProteinJEPA: Latent prediction complements protein language models cs.LG · 2026-05-08 · unverdicted · none · ref 18
Masked-position MLM plus JEPA latent prediction outperforms MLM-only pretraining on 10-11 of 16 downstream tasks for 35M-150M protein models while JEPA alone fails.
HyperTransport: Amortized Conditioning of T2I Generative Models cs.LG · 2026-05-07 · unverdicted · none · ref 10
HyperTransport amortizes activation steering for T2I models via a hypernetwork that predicts intervention parameters from CLIP embeddings, delivering 3600-7000x speedup and matching per-concept baselines on 167 unseen concepts.
NorBERTo: A ModernBERT Model Trained for Portuguese with 331 Billion Tokens Corpus cs.CL · 2026-04-30 · unverdicted · none · ref 4
NorBERTo, a ModernBERT encoder trained on the largest open Portuguese corpus of 331B tokens, reports top encoder results on several PLUE and ASSIN 2 tasks.
Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings q-bio.QM · 2026-04-09 · unverdicted · none · ref 5
Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.
GLiGuard: Schema-Conditioned Classification for LLM Safeguard cs.CL · 2026-05-08 · unverdicted · none · ref 17
GLiGuard is a compact schema-conditioned bidirectional encoder that matches 7B-27B guard models on safety benchmarks while delivering up to 16x higher throughput and 17x lower latency.
Do Synthetic Trajectories Reflect Real Reward Hacking? A Systematic Study on Monitoring In-the-Wild Hacking in Code Generation cs.LG · 2026-04-26 · unverdicted · none · ref 12
Synthetic reward hacking data does not capture natural hacking behaviors in code generation RL, causing monitors trained on it to generalize poorly compared to those trained on in-the-wild trajectories.
Rag Performance Prediction for Question Answering cs.CL · 2026-04-09 · unverdicted · none · ref 41
A novel supervised predictor modeling semantic relationships among question, retrieved passages, and generated answer best forecasts when RAG improves QA performance.
Efficient Listwise Reranking with Compressed Document Representations cs.IR · 2026-04-29 · unverdicted · none · ref 34
RRK compresses documents to multi-token embeddings for efficient listwise reranking, enabling an 8B model to achieve 3x-18x speedups over smaller models with comparable or better effectiveness.
Commonsense Knowledge with Negation: A Resource to Enhance Negation Understanding cs.CL · 2026-04-21 · unverdicted · none · ref 7
Augmenting commonsense knowledge corpora with negation produces over 2M new triples that benefit LLM negation understanding when used for pre-training.
Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters cs.CL · 2026-03-11 · unverdicted · none · ref 33
Zero-shot GPT-OSS detects depression from 1,108 primary care encounter transcripts with AUPRC 0.51 and AUROC 0.77, with meaningful signals in the first 128 patient tokens and added value from dyadic mirroring.

Patrick Wilhelm, Thorsten Wittkopp, and Odej Kao

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer