SummEval: Re-evaluating Summarization Evaluation

· 2021 · DOI 10.1162/tacl_

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

Presents cue interventions and tie-aware metrics to detect rationalization bias in LLM judges and demonstrates that PROOF-BEFORE-PREFERENCE reduces cue anchoring compared to baselines.

Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification

cs.CL · 2026-06-16 · unverdicted · novelty 6.0

A local cascade framework for educational dialogue de-identification reaches 0.958 macro F1 on math tutoring transcripts, outperforming same-family LLM-only and commercial baselines while remaining fully on-device.

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

cs.AI · 2026-04-13 · unverdicted · novelty 5.0

Anthropogenic Regional Adaptation with GG-EZ improves cultural relevance in multimodal vision-language models for Southeast Asia by 5-15% while retaining over 98% of global performance.

Less LLM, More Documents: Searching for Improved RAG

cs.IR · 2025-10-03 · unverdicted · novelty 4.0

Corpus scaling in RAG frequently matches the accuracy gains from larger LLMs on open-domain QA tasks, with mid-sized models benefiting most due to better passage coverage.

Overview of HIPE-2026: Person-Place Relation Extraction from Multilingual Historical Texts

cs.CL · 2026-06-24 · unverdicted · novelty 2.0

HIPE-2026 is an evaluation campaign with 17 teams testing relation extraction for person presence at locations in 19th-20th century newspapers across French, German, and English plus a literary generalization set.

citing papers explorer

Showing 5 of 5 citing papers after filters.

Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

cs.CL · 2026-05-13 · unverdicted · none · ref 3

Presents cue interventions and tie-aware metrics to detect rationalization bias in LLM judges and demonstrates that PROOF-BEFORE-PREFERENCE reduces cue anchoring compared to baselines.

Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification

cs.CL · 2026-06-16 · unverdicted · none · ref 11

A local cascade framework for educational dialogue de-identification reaches 0.958 macro F1 on math tutoring transcripts, outperforming same-family LLM-only and commercial baselines while remaining fully on-device.

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

cs.AI · 2026-04-13 · unverdicted · none · ref 37

Anthropogenic Regional Adaptation with GG-EZ improves cultural relevance in multimodal vision-language models for Southeast Asia by 5-15% while retaining over 98% of global performance.

Less LLM, More Documents: Searching for Improved RAG

cs.IR · 2025-10-03 · unverdicted · none · ref 17

Corpus scaling in RAG frequently matches the accuracy gains from larger LLMs on open-domain QA tasks, with mid-sized models benefiting most due to better passage coverage.

Overview of HIPE-2026: Person-Place Relation Extraction from Multilingual Historical Texts

cs.CL · 2026-06-24 · unverdicted · none · ref 23

HIPE-2026 is an evaluation campaign with 17 teams testing relation extraction for person presence at locations in 19th-20th century newspapers across French, German, and English plus a literary generalization set.
