Reproducing NevIR: Negation in Neural Information Retrieval
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (6)
roles: dataset (1)
polarities: use dataset (1)
representative citing papers
Magis-Bench is a new benchmark of 74 magistrate-level legal writing tasks from Brazilian exams where the strongest LLMs reach only 6.97/10, showing judicial reasoning remains difficult for current models.
citing papers explorer
- Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio
  MetaSyn benchmark shows LLM pipelines recover at most 52.7% of ground-truth included studies due to screening failures on PI/ECO eligibility, despite 90.9% retrieval recall at K=200.
- RWGBench: Evaluating Scholarly Positioning in Related Work Generation
  RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.
- The DSA's Blind Spot: Algorithmic Audit of Advertising and Minor Profiling on TikTok
  TikTok formally complies with DSA rules against profiling minors but delivers 5-8 times stronger interest-based targeting through undisclosed influencer and promotional content.
- Align then Train: Efficient Retrieval Adapter Learning
  A two-stage adapter method aligns query and document embedding spaces to improve dense retrieval for complex queries using lightweight encoders and few labels.
- Context Convergence Improves Answering Inferential Questions
  Passages made from high-convergence sentences improve LLM performance on inferential questions compared to cosine similarity selection.