Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation

Geiger, Atticus, Kyle Richardson, Christopher Potts · Apr 2020 · arXiv 2004.14623

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Localizing Model Behavior with Path Patching

cs.LG · 2023-04-12 · unverdicted · novelty 8.0

Path patching provides a method to express and quantitatively test hypotheses that neural network behaviors are localized to sets of paths.

How to use and interpret activation patching

cs.LG · 2024-04-23 · accept · novelty 5.0

Activation patching provides evidence about neural network circuits when the choice of metric is aligned with the hypothesis and common interpretation errors are avoided.

citing papers explorer

Showing 2 of 2 citing papers.

Localizing Model Behavior with Path Patching cs.LG · 2023-04-12 · unverdicted · none · ref 39
How to use and interpret activation patching cs.LG · 2024-04-23 · accept · none · ref 7
