Title resolution pending

Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu · 2020

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

cs.CL · 2020-12-31 · conditional · novelty 8.0

The Pile is a newly constructed 825 GiB dataset from 22 diverse sources that enables language models to achieve better performance on academic, professional, and cross-domain tasks than models trained on Common Crawl variants.

Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA

cs.CL · 2021-10-04 · unverdicted · novelty 7.0

Proposes a textbook-based true/false QA task where PTLMs score ~50% closed-book even after pre-training on the text and ~60% open-book with retrieval.

Prefix-Tuning: Optimizing Continuous Prompts for Generation

cs.CL · 2021-01-01 · conditional · novelty 7.0

Prefix-tuning matches or exceeds fine-tuning on NLG tasks by optimizing a continuous prefix using 0.1% of parameters while keeping the LM frozen.

When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models

cs.CV · 2025-07-18 · unverdicted · novelty 6.0

The work identifies a small set of attention heads in VLMs that mediate conflicts between parametric knowledge and visual input, and shows that intervening on them steers model behavior while attention patterns provide precise image-region attribution.

citing papers explorer

Showing 4 of 4 citing papers.

The Pile: An 800GB Dataset of Diverse Text for Language Modeling cs.CL · 2020-12-31 · conditional · none · ref 80
The Pile is a newly constructed 825 GiB dataset from 22 diverse sources that enables language models to achieve better performance on academic, professional, and cross-domain tasks than models trained on Common Crawl variants.
Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA cs.CL · 2021-10-04 · unverdicted · none · ref 34
Proposes a textbook-based true/false QA task where PTLMs score ~50% closed-book even after pre-training on the text and ~60% open-book with retrieval.
Prefix-Tuning: Optimizing Continuous Prompts for Generation cs.CL · 2021-01-01 · conditional · none · ref 41
Prefix-tuning matches or exceeds fine-tuning on NLG tasks by optimizing a continuous prefix using 0.1% of parameters while keeping the LM frozen.
When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models cs.CV · 2025-07-18 · unverdicted · none · ref 34
The work identifies a small set of attention heads in VLMs that mediate conflicts between parametric knowledge and visual input, and shows that intervening on them steers model behavior while attention patterns provide precise image-region attribution.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer