
arxiv: 2508.02569 · v1 · submitted 2025-08-04 · 📊 stat.AP

Understanding Heterogeneity in Adaptation to Intermittent Water Supply: Clustering Household Types in Amman, Jordan

Pith reviewed 2026-05-19 00:38 UTC · model grok-4.3

keywords intermittent water supply · household adaptation · clustering analysis · Amman Jordan · water inequality · urban water systems · survey data · adaptive strategies

The pith

Household survey data from Amman reveals three distinct groups adapting differently to intermittent water supply.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a pipeline that applies hierarchical clustering to household survey responses collected in Amman. The method groups residents into three clusters separated by income, strength of water-related social networks, hours of supply, relocation history, and reported water quality issues. Each cluster shows its own pattern of adaptive actions, such as calling the utility or turning to alternate sources. A sympathetic reader would care because the work shows how adaptation to unreliable water creates different burdens inside one city and offers a repeatable way to spot those differences.

Core claim

Applying hierarchical clustering analysis together with Welch two-sample t-tests to the Amman household survey data identifies three clusters. The clusters differ systematically in income, water social network strength, supply duration, relocation status, and water quality problems. The same clusters also differ in the adaptive strategies their members use, including contacting the utility or seeking an alternate water source. This pattern demonstrates that adaptation to intermittent water supply is unequal within the city.

What carries the argument

Hierarchical clustering analysis combined with Welch two-sample t-tests applied to multidimensional household survey data, which partitions respondents into groups that share similar profiles and then tests which characteristics separate the groups.
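
The two-step pipeline described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the Ward linkage, the Euclidean distance, the one-vs-rest form of the Welch comparison, the column names, and the helper names `cluster_households` and `welch_one_vs_rest` are all assumptions, since the abstract does not state these choices.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import ttest_ind


def cluster_households(X, n_clusters=3):
    """Standardize columns, build a Ward-linkage tree, and cut it into n_clusters groups."""
    z = (X - X.mean(axis=0)) / X.std(axis=0)
    tree = linkage(z, method="ward")  # Ward + Euclidean is an assumption, not from the paper
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def welch_one_vs_rest(X, labels, feature_names):
    """Welch t-test of each cluster against all other households, per feature."""
    results = {}
    for k in np.unique(labels):
        inside = labels == k
        for j, name in enumerate(feature_names):
            # equal_var=False selects Welch's unequal-variance t-test
            stat, p = ttest_ind(X[inside, j], X[~inside, j], equal_var=False)
            results[(int(k), name)] = (float(stat), float(p))
    return results


# Synthetic stand-in for the survey: three well-separated groups of 40 households.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
X = np.vstack([rng.normal(c, 0.5, size=(40, 3)) for c in centers])
feature_names = ["income", "supply_hours", "network_strength"]  # hypothetical columns

labels = cluster_households(X)
tests = welch_one_vs_rest(X, labels, feature_names)
for (k, name), (stat, p) in sorted(tests.items()):
    print(f"cluster {k} vs rest, {name}: t={stat:.2f}, p={p:.3g}")
```

Because the t-tests run on the same variables that formed the clusters, their p-values describe how the groups differ rather than confirm that the groups exist.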

If this is right

  • Households in the three clusters follow measurably different strategies to secure water when supply is intermittent.
  • These differences produce unequal coping costs and outcomes across the city.
  • The clustering pipeline supplies a standardized way to detect similar household heterogeneity in other cities that have intermittent water supply.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same analysis could be repeated in additional cities to check whether three clusters with comparable profiles appear elsewhere or whether local conditions produce different groupings.
  • Utilities could use the cluster profiles to target communication or infrastructure upgrades toward the groups most likely to benefit.
  • Longitudinal surveys might reveal whether households move between clusters when supply reliability improves.

Load-bearing premise

The survey answers accurately record the main factors that shape how households respond to water shortages and the clustering step reliably finds real behavioral groups instead of noise.

What would settle it

Running the identical clustering procedure on a fresh, independent household survey from Amman and obtaining a different number of clusters or different separating characteristics would indicate that the three-group structure does not hold.

Original abstract

More than a billion people around the world experience intermittence in their water supply, posing challenges for urban households in Global South cities. An intermittent water supply (IWS) system prompts water users to adapt to service deficits which entails coping costs. Adaptation and its impacts can vary between households within the same city, leading to intra-urban inequality. Studies on household adaptation to IWS through survey data are limited to exploring income-based heterogeneity and do not account for the multidimensional and non-linear nature of the data. There is a need for a standardized methodology for understanding household responses to IWS that acknowledges the heterogeneity of households characterized by sets of multiple underlying factors and that is applicable across different settings. Here, we develop an analysis pipeline that applies hierarchical clustering analysis (HCA) in combination with the Welch-two-sample t-test on household survey data from Amman, Jordan. We identify three clusters of households distinguished by a set of characteristics including income, water social network, supply duration, relocation and water quality problems and identify their group-specific adaptive strategies such as contacting the utility or accessing an alternate water source. This study uncovers the unequal nature of IWS adaptation in Amman, giving insights into the link between household characteristics and adaptive behaviors, while proposing a standardized method to reveal relevant heterogeneity in households adapting to IWS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper develops an analysis pipeline that applies hierarchical clustering analysis (HCA) in combination with the Welch two-sample t-test to household survey data from Amman, Jordan. It identifies three clusters of households distinguished by characteristics including income, water social network, supply duration, relocation, and water quality problems, and identifies their group-specific adaptive strategies such as contacting the utility or accessing an alternate water source. The study proposes a standardized method to reveal relevant heterogeneity in households adapting to intermittent water supply.

Significance. If the clustering results prove robust, the work offers a standardized methodology for analyzing multidimensional heterogeneity in adaptation to intermittent water supply, extending beyond income-only analyses to identify distinct household types and their strategies. This could inform targeted policies addressing intra-urban inequalities in Global South cities facing IWS challenges.

major comments (2)
  1. [Methods section describing the analysis pipeline] The manuscript provides no information on cluster validation metrics (e.g., silhouette scores or cophenetic correlation), variable selection criteria, sample size, missing-data handling, or sensitivity to linkage method. This is load-bearing for the central claim because the reported three clusters and their adaptive strategies could reflect noise or arbitrary choices rather than genuine multidimensional structure.
  2. [Abstract and results] The choice of k=3 is presented without comparison to alternative values (k=2 or k=4), bootstrap stability checks, or details on distance metric and linkage criterion. This undermines the claim that the clusters reliably separate meaningful behavioral groups, because the Welch t-tests are applied post hoc to the same features used to form the clusters.
minor comments (1)
  1. [Abstract] Consider adding the total sample size and number of variables used in the HCA to provide immediate context for the scale and scope of the clustering analysis.
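
As an illustration of the validation metrics the referee names, the sketch below compares mean silhouette width across candidate values of k and reports the cophenetic correlation of the tree. It runs on synthetic data under assumed choices (Ward linkage, Euclidean distance); `mean_silhouette` is a hypothetical helper written here for self-containment, and none of the numbers it produces come from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def mean_silhouette(D, labels):
    """Average silhouette width computed from a square distance matrix D."""
    scores = []
    for i in range(len(labels)):
        own = labels == labels[i]
        own[i] = False  # exclude the point itself from its own cluster
        if not own.any():
            scores.append(0.0)  # singleton clusters score 0 by convention
            continue
        a = D[i, own].mean()  # mean distance to own cluster
        b = min(D[i, labels == k].mean() for k in np.unique(labels) if k != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Synthetic stand-in for standardized survey features (three planted groups).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
X = np.vstack([rng.normal(c, 0.5, size=(40, 3)) for c in centers])

condensed = pdist(X)              # pairwise Euclidean distances (assumed metric)
D = squareform(condensed)
tree = linkage(X, method="ward")  # assumed linkage

silhouette_by_k = {
    k: mean_silhouette(D, fcluster(tree, t=k, criterion="maxclust"))
    for k in range(2, 6)
}
best_k = max(silhouette_by_k, key=silhouette_by_k.get)
coph_corr, _ = cophenet(tree, condensed)  # how faithfully the tree preserves distances

print(silhouette_by_k, best_k, round(float(coph_corr), 3))
```

Repeating the loop with other `method` values (for example `"average"` or `"complete"`) would give a simple check of sensitivity to the linkage criterion.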

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving the transparency and robustness of the hierarchical clustering pipeline, which we address below by committing to specific revisions that strengthen the presentation of our methods and results without altering the core findings.

Point-by-point responses
  1. Referee: Methods section describing the analysis pipeline: The manuscript provides no information on cluster validation metrics (e.g., silhouette scores or cophenetic correlation), variable selection criteria, sample size, missing-data handling, or sensitivity to linkage method. This is load-bearing for the central claim because the reported three clusters and their adaptive strategies could reflect noise or arbitrary choices rather than genuine multidimensional structure.

    Authors: We agree that these details are essential for evaluating the analysis. In the revised manuscript we will expand the Methods section to report the sample size of the Amman household survey, the handling of missing data via complete-case analysis, the variable selection process drawing on established IWS adaptation literature, the specific distance metric and linkage method applied in HCA, silhouette scores and cophenetic correlation as validation metrics, and results from sensitivity checks across alternative linkage methods. These additions will demonstrate that the three-cluster solution is not an artifact of arbitrary choices. revision: yes

  2. Referee: Abstract and results: The choice of k=3 is presented without comparison to alternative values (k=2 or k=4), bootstrap stability checks, or details on distance metric and linkage criterion. This undermines the claim that the clusters reliably separate meaningful behavioral groups, as the Welch t-tests are applied post-hoc to the same features.

    Authors: We will revise both the abstract and results to include a systematic comparison of solutions for k=2 through k=5, reporting validation metrics that support the selection of k=3 as providing the clearest separation of household types. Bootstrap resampling will be added to assess cluster stability. The distance metric and linkage criterion will be stated explicitly. We will also clarify that the Welch t-tests serve a descriptive, post-hoc role to characterize differences on the input variables, which is standard for interpreting clusters, and will emphasize that these tests are not presented as confirmatory inference. revision: yes
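
One way the bootstrap stability check mentioned in this response could look is sketched below. It is an assumption-laden illustration rather than the authors' planned procedure: households are resampled with replacement, the tree is rebuilt, and the new partition is compared with the reference partition on the resampled rows using the adjusted Rand index. The helper names, the Ward linkage, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.special import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same points."""
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=int)
    np.add.at(table, (ai, bi), 1)  # contingency table of label pairs
    sum_cells = comb(table, 2).sum()
    sum_rows = comb(table.sum(axis=1), 2).sum()
    sum_cols = comb(table.sum(axis=0), 2).sum()
    expected = sum_rows * sum_cols / comb(len(a), 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0
    return float((sum_cells - expected) / (max_index - expected))


def bootstrap_stability(X, n_clusters=3, n_boot=30, seed=0):
    """Mean ARI between the reference partition and partitions of bootstrap resamples."""
    rng = np.random.default_rng(seed)
    reference = fcluster(linkage(X, method="ward"), t=n_clusters, criterion="maxclust")
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))  # resample rows with replacement
        boot = fcluster(linkage(X[idx], method="ward"), t=n_clusters, criterion="maxclust")
        scores.append(adjusted_rand_index(reference[idx], boot))
    return float(np.mean(scores))


# Synthetic stand-in for standardized survey features (three planted groups).
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0], [0.0, 5.0, 5.0]])
X = np.vstack([rng.normal(c, 0.5, size=(40, 3)) for c in centers])

stability = bootstrap_stability(X)
print(f"mean bootstrap ARI for k=3: {stability:.3f}")
```

A mean index near 1 suggests the partition survives resampling; values that drop sharply for a given k would indicate that the solution depends on which households happened to be sampled.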

Circularity Check

0 steps flagged

Empirical HCA on survey data yields no circular derivation chain

Full rationale

The paper performs hierarchical clustering followed by Welch t-tests on household survey observations from Amman. No equations, fitted parameters, or predictions are defined in terms of the reported clusters; the three groups and their adaptive strategies emerge directly from the data partitioning. No self-citations serve as load-bearing premises, and no ansatz or uniqueness theorem is invoked. The analysis is self-contained against external benchmarks and does not reduce any claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claim rests on the representativeness of the Amman survey sample and on standard statistical assumptions that the chosen variables and distance metric produce stable, interpretable groups.

free parameters (1)
  • Number of clusters
    Three clusters are reported; the criterion used to settle on this number is not stated in the abstract.
axioms (1)
  • domain assumption: Self-reported survey responses on income, supply duration, social networks, relocation, and water quality accurately reflect household reality.
    Clustering and subsequent t-tests treat these variables as reliable inputs.

pith-pipeline@v0.9.0 · 5782 in / 1361 out tokens · 66622 ms · 2026-05-19T00:38:36.026424+00:00 · methodology

