Using Answer Set Programming for Commonsense Reasoning in the Winograd Schema Challenge
Pith reviewed 2026-05-24 16:11 UTC · model grok-4.3
The pith
Answer Set Programming on graph representations solves 240 out of 291 Winograd Schema problems with added commonsense knowledge.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a graph based representation of WSC problems combined with an Answer Set Programming encoding of graph-subgraph isomorphism, supplied with relevant commonsense knowledge, correctly resolves 240 out of 291 schemas. This encoding permits additional constraints to be added in an elaboration tolerant manner.
What carries the argument
Graph-subgraph isomorphism encoded in Answer Set Programming on representations of the schema and commonsense facts.
If this is right
- New commonsense rules can be added to the program without changes to the core encoding or solver.
- The stable models produced by ASP make the reasoning steps that select one referent over another explicit and inspectable.
- Unsolved cases indicate gaps in the supplied knowledge base rather than limits of the matching procedure itself.
- The same graph-plus-ASP structure can be reused for other language tasks that require matching a partial description to world knowledge.
Where Pith is reading between the lines
- The method could serve as a verification layer that checks or explains decisions made by statistical pronoun-resolution models.
- Systematically extending the commonsense graph until all 291 problems are covered would test whether knowledge coverage is the main remaining bottleneck.
- Similar graph encodings might transfer to coreference resolution in longer documents where multiple referents must be tracked.
Load-bearing premise
The additional commonsense knowledge supplied to the ASP program is accurate and sufficient to select the correct referent without allowing incorrect solutions.
What would settle it
Running the ASP program on the 51 unsolved problems and checking whether it returns the correct referent, no stable model, or an incorrect referent for each.
read the original abstract
The Winograd Schema Challenge (WSC) is a natural language understanding task proposed as an alternative to the Turing test in 2011. In this work we attempt to solve WSC problems by reasoning with additional knowledge. By using an approach built on top of graph-subgraph isomorphism encoded using Answer Set Programming (ASP) we were able to handle 240 out of 291 WSC problems. The ASP encoding allows us to add additional constraints in an elaboration tolerant manner. In the process we present a graph based representation of WSC problems as well as relevant commonsense knowledge. This paper is under consideration for acceptance in TPLP.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes solving Winograd Schema Challenge (WSC) instances via Answer Set Programming (ASP) encodings of graph-subgraph isomorphism augmented with additional commonsense knowledge constraints. It reports a success rate of 240 out of 291 problems and introduces a graph-based representation of the schemas and knowledge, highlighting the elaboration-tolerant addition of constraints in ASP.
Significance. If the approach can be shown to rely on a reusable, general commonsense theory rather than per-instance encodings, the result would illustrate a concrete application of ASP to commonsense reasoning in NLU. The reported count is an empirical demonstration on hand-crafted encodings; its significance hinges on whether the supplied knowledge is accurate, sufficient, and non-spurious without reverse-engineering from gold answers.
major comments (1)
- [Abstract] Abstract: The claim that the method handles 240/291 WSC problems supplies no error analysis, no description of the source or construction procedure for the commonsense knowledge base, and no verification that the ASP models select the intended referent rather than any consistent model. This is load-bearing for the central claim because the success rate depends on the additional knowledge distinguishing the correct referent without introducing extraneous solutions.
Simulated Author's Rebuttal
We thank the referee for their detailed feedback. We address the concerns about the abstract's claims and supporting evidence below, and we will revise the manuscript accordingly to strengthen the presentation of our results.
read point-by-point responses
-
Referee: [Abstract] Abstract: The claim that the method handles 240/291 WSC problems supplies no error analysis, no description of the source or construction procedure for the commonsense knowledge base, and no verification that the ASP models select the intended referent rather than any consistent model. This is load-bearing for the central claim because the success rate depends on the additional knowledge distinguishing the correct referent without introducing extraneous solutions.
Authors: We agree that the abstract is brief and omits these supporting details, which are important for evaluating the central empirical claim. The full manuscript presents the graph-based representation of WSC instances and the ASP encoding of graph-subgraph isomorphism, along with the elaboration-tolerant addition of commonsense constraints. However, we will revise to include: (1) an error analysis section categorizing solved and unsolved instances; (2) a description of the commonsense knowledge base, including its source in general commonsense principles (drawn from broad axioms rather than instance-specific reverse-engineering from gold labels) and the construction procedure; and (3) verification that the ASP stable models select the intended referent, by showing that the added constraints eliminate the incorrect candidate without permitting extraneous consistent models. These additions will be made in a revised version of the paper. revision: partial
Circularity Check
No circularity: empirical count from ASP solver on hand-crafted encodings
full rationale
The paper presents an empirical result obtained by encoding WSC problems as graphs, adding commonsense knowledge, and running an off-the-shelf ASP solver to count successes (240/291). No equations, fitted parameters, predictions derived from inputs, or self-citations appear in the provided text. The central claim reduces to an implementation and evaluation against external WSC instances rather than any self-referential derivation or renaming. This matches the default expectation of a non-circular empirical paper.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Graph-subgraph isomorphism can be encoded as an Answer Set Program whose stable models correspond to valid matches.
- domain assumption Additional commonsense knowledge can be expressed as elaboration-tolerant constraints that do not alter the core graph encoding.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
By using an approach built on top of graph-subgraph isomorphism encoded using Answer Set Programming (ASP) we were able to handle 240 out of 291 WSC problems.
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
The ASP encoding allows us to add additional constraints in an elaboration tolerant manner.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.