Beyond Advocacy: A Design Space for Replication-Related Studies
Pith reviewed 2026-05-15 16:55 UTC · model grok-4.3
The pith
Replication experimental design can be framed as a pairwise comparison across four practical dimensions, each with three levels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that replication design is usefully treated as a pairwise comparison problem and can be represented by a four-dimensional design space in which each dimension is defined by three comparison levels. This space allows replication studies to be identified, categorized, compared and analyzed systematically, supporting both retrospective characterization of past work and prospective planning of future studies.
What carries the argument
The four-dimensional design space that represents replication as pairwise comparisons between original and replication studies at three discrete levels per dimension.
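As a rough illustration of what such a space looks like as a data structure, here is a minimal Python sketch. The dimension names (Experiment, Data, Participant, Analysis) are the ones quoted from the paper further down this page. The level labels `LEVEL_1` to `LEVEL_3` and the `DesignPoint` class are placeholders invented for this sketch, since the page does not name the three comparison levels.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Level(Enum):
    # Placeholder labels (assumption): the three comparison levels
    # are not named in this summary.
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3


@dataclass(frozen=True)
class DesignPoint:
    """One position in the space: a comparison level for each dimension."""
    experiment: Level
    data: Level
    participant: Level
    analysis: Level


def all_design_points():
    """Enumerate every combination of levels across the four dimensions."""
    return [DesignPoint(*combo) for combo in product(Level, repeat=4)]


points = all_design_points()
print(len(points))  # 3 ** 4 = 81
```

Under these assumptions the space is small enough (81 positions) to enumerate in full, which is what makes systematic classification and gap-finding plausible in principle.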
If this is right
- Researchers gain a shared vocabulary for specifying exactly which aspects of an original study they intend to hold constant.
- Existing replication papers can be systematically classified and gaps in coverage become visible.
- Planning a new replication becomes a matter of choosing a point in the design space rather than an unstructured set of decisions.
- Reviewers and readers can more easily assess whether a claimed replication actually tests the intended claim.
Where Pith is reading between the lines
- The same grid could be applied to replication debates in fields outside HCI and visualization to test whether the four dimensions travel.
- Standard reporting templates for replication studies could incorporate explicit positions on each dimension.
- Meta-analyses of replication success rates might be refined by grouping studies that occupy similar positions in the space.
- Tool builders could create simple interfaces that let authors record their design choices along the four dimensions at submission time.
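The grouping and recording ideas in the list above could be prototyped along these lines. This is a sketch only: the study labels and their positions are invented placeholder data, the integer levels 1 to 3 stand in for the paper's unnamed comparison levels, and `group_by_position` and `differing_dimensions` are hypothetical helper names.

```python
from collections import defaultdict

# Dimension order used for every position tuple below.
DIMENSIONS = ("experiment", "data", "participant", "analysis")

# Invented placeholder records: label -> level (1..3) on each dimension.
studies = {
    "study_a": (1, 1, 2, 1),
    "study_b": (1, 1, 2, 1),
    "study_c": (3, 2, 2, 1),
}


def group_by_position(records):
    """Group study labels that occupy the same point in the design space."""
    groups = defaultdict(list)
    for name, position in records.items():
        groups[position].append(name)
    return dict(groups)


def differing_dimensions(a, b):
    """Name the dimensions on which two positions differ."""
    return [dim for dim, x, y in zip(DIMENSIONS, a, b) if x != y]


print(group_by_position(studies))
print(differing_dimensions(studies["study_a"], studies["study_c"]))
```

Studies sharing a tuple fall into one group, and the second helper names the dimensions on which two replications made different choices.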
Load-bearing premise
That the four chosen dimensions and their three comparison levels are enough to describe the important design decisions in any replication study without significant omissions.
What would settle it
A concrete replication study whose key decisions about what to keep or change cannot be placed on any of the four dimensions without forcing or leaving out major aspects of the design.
Original abstract
The importance of replication is often discussed and advocated -- not only in the domains of visualization and HCI, but in all scientific areas. When replicating a study, design decisions need to be made with regards which aspects of the original study will remain the same and which will be altered. We present a supporting multi-dimensional design space framework within which such decisions can be identified, categorized, compared and analyzed. The framework treats replication experimental design as a pairwise comparison problem, and represents the design by four practical dimensions defined by three comparison levels. The design space is therefore a framework that can be used for both retrospective characterization and prospective planning. We provide worked examples, and relate our framework to other attempts at describing the scope of replication studies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a multi-dimensional design space framework for replication-related studies in HCI and visualization. It frames replication experimental design as a pairwise comparison problem between an original study and its replication, represented by four practical dimensions each defined at three comparison levels. The framework is positioned as a tool for both retrospective characterization of existing replications and prospective planning of new ones. The manuscript supplies worked examples and relates the framework to prior attempts at scoping replication studies.
Significance. If the four dimensions and three levels prove jointly exhaustive, the framework would provide a structured, analyzable alternative to purely advocative discussions of replication, enabling clearer categorization and comparison of design choices. This could improve planning and evaluation of replications in visualization and HCI, particularly by making trade-offs explicit, though its value hinges on demonstrated coverage and usability beyond the provided examples.
major comments (2)
- [Framework definition section] The central claim that the four dimensions at three comparison levels comprehensively capture all relevant design decisions (abstract and framework definition) rests on worked examples without a systematic mapping from an enumerated set of replication decisions drawn from the literature. This leaves the exhaustiveness untested and risks omissions such as adjustments to statistical power, preregistration status, or hardware/platform constraints that may not align with the stated levels.
- [Worked examples section] The examples illustrate application of the framework but do not include an audit or coverage check against a broader sample of replication studies. Such a check is required to substantiate that the dimensions can identify, categorize, compare, and analyze decisions without significant gaps.
minor comments (2)
- The abstract would benefit from naming the four dimensions explicitly to allow readers to grasp the framework's structure immediately.
- Notation for the three comparison levels could be standardized with a table or diagram for easier reference across sections.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our design space framework for replication-related studies. We address each major comment below, clarifying our approach and outlining targeted revisions to strengthen the manuscript's justification without overstating the current evidence.
Point-by-point responses
- Referee: [Framework definition section] The central claim that the four dimensions at three comparison levels comprehensively capture all relevant design decisions (abstract and framework definition) rests on worked examples without a systematic mapping from an enumerated set of replication decisions drawn from the literature. This leaves the exhaustiveness untested and risks omissions such as adjustments to statistical power, preregistration status, or hardware/platform constraints that may not align with the stated levels.
  Authors: The four dimensions were derived from a review of replication literature in HCI and visualization, focusing on recurring pairwise design decisions between original and replication studies. We did not conduct a formal systematic enumeration of every possible decision across all papers, which is a limitation we acknowledge. However, we can demonstrate coverage for the cited examples: statistical power adjustments map to the 'analysis' dimension at the modified level; preregistration status is captured under the 'procedure' dimension as a protocol change; and hardware/platform constraints align with the 'environment' dimension. We will revise the framework definition section to include an explicit mapping table of common replication decisions from the literature to our dimensions and levels, addressing potential gaps directly.
  Revision: partial
- Referee: [Worked examples section] The examples illustrate application of the framework but do not include an audit or coverage check against a broader sample of replication studies. Such a check is required to substantiate that the dimensions can identify, categorize, compare, and analyze decisions without significant gaps.
  Authors: The worked examples were selected to span different replication types (direct, conceptual, and extension) to illustrate practical use for both retrospective and prospective purposes. We agree that a broader audit would provide stronger substantiation of coverage. We will add a new subsection (or appendix) auditing an additional set of 10-15 replication studies drawn from the HCI and visualization literature, explicitly mapping their design decisions to the four dimensions and three levels while noting any edge cases and how they are accommodated.
  Revision: yes
Circularity Check
No circularity: standalone conceptual framework with no derivations or self-referential reductions
full rationale
The manuscript proposes a multi-dimensional design space for replication-related studies as a conceptual tool, defining four practical dimensions at three comparison levels directly by author construction to support retrospective characterization and prospective planning. No equations, fitted parameters, predictions, or mathematical derivations appear in the provided text. The framework is presented as a new organizing structure with worked examples and relations to prior work, but without any load-bearing step that reduces by construction to its own inputs, self-citations, or fitted data. The central claim of utility for identifying and categorizing design decisions rests on the explicit definition of the dimensions and levels rather than any circular equivalence or imported uniqueness theorem. This is a self-contained conceptual contribution with no circular elements.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Replication studies require explicit decisions on which elements to hold constant versus alter relative to the original.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: The framework treats replication experimental design as a pairwise comparison problem, and represents the design by four practical dimensions defined by three comparison levels.
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: four practical dimensions: Experiment, Data, Participant, and Analysis
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.