Title resolution pending

Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q Weinberger, Yoav Artzi

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

CommonWhy: A Dataset for Evaluating Entity-Based Causal Commonsense Reasoning in Large Language Models

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

CommonWhy is a new dataset of 15,000 why-questions for evaluating LLMs on entity-based causal commonsense reasoning grounded in Wikidata.

DocShield: Towards AI Document Safety via Evidence-Grounded Agentic Reasoning

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

DocShield presents a new agentic reasoning framework using Cross-Cues-aware Chain of Thought to detect, localize, and explain text-centric forgeries in documents, with reported F1 gains of 41.4% over specialized methods and 23.4% over GPT-4o on T-IC13.

AdaQE-CG: Adaptive Query Expansion for Web-Scale Generative AI Model and Data Card Generation

cs.AI · 2026-03-16 · unverdicted · novelty 6.0

AdaQE-CG uses context-aware adaptive query expansion and inter-card knowledge transfer from a MetaGAI Pool to generate higher-quality model and data cards than prior methods, validated on the new expert-annotated MetaGAI-Bench.

Can Code Evaluation Metrics Detect Code Plagiarism?

cs.SE · 2026-04-28 · unverdicted · novelty 4.0

Code evaluation metrics like CrystalBLEU perform comparably to dedicated tools such as Dolos and JPlag when ranking plagiarized code pairs across modification levels on open datasets.

citing papers explorer

Showing 4 of 4 citing papers.

CommonWhy: A Dataset for Evaluating Entity-Based Causal Commonsense Reasoning in Large Language Models
cs.CL · 2026-05-13 · unverdicted · none · ref 66

DocShield: Towards AI Document Safety via Evidence-Grounded Agentic Reasoning
cs.CV · 2026-04-03 · unverdicted · none · ref 48

AdaQE-CG: Adaptive Query Expansion for Web-Scale Generative AI Model and Data Card Generation
cs.AI · 2026-03-16 · unverdicted · none · ref 49

Can Code Evaluation Metrics Detect Code Plagiarism?
cs.SE · 2026-04-28 · unverdicted · none · ref 30
