Discovering Sparse Counterfactual Factors via Latent Adjustment for Survey-based Community Intervention
Pith reviewed 2026-05-11 01:54 UTC · model grok-4.3
The pith
A fixed-basis nonnegative latent space plus Shapley selection and entropy-regularized optimal transport yields sparse, policy-feasible adjustments that shift survey respondent groups toward a reference distribution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors formulate sparse counterfactual community intervention as a policy-feasible distributional alignment task. They embed survey responses in a fixed-basis nonnegative latent representation that preserves pre/post comparability and supplies a stable map back to original variables. Target-relevant latent factors are identified by Shapley-guided attribution; feasible adjustments are then obtained by minimizing an entropy-regularized optimal-transport discrepancy between the post-intervention target distribution and the reference distribution, subject to a weighted ℓ_{2,1} penalty that enforces shared policy-lever sparsity. Experiments on real-world transportation survey datasets are reported to yield compact, interpretable, policy-feasible interventions with explicit adjustment magnitudes, improved population-level conversion, and preserved intervention sparsity.
What carries the argument
The fixed-basis nonnegative latent representation, which preserves pre/post comparability while providing a stable map from latent factors to original survey variables, combined with Shapley-guided factor selection and entropy-regularized optimal-transport minimization under a weighted ℓ_{2,1} sparsity penalty.
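A minimal sketch of what a fixed-basis nonnegative encoding could look like, assuming an NMF-style model x ≈ wH in which the basis H is learned once and then frozen. The function names, the multiplicative update rule, and the toy basis are illustrative assumptions and are not taken from the authors' code.

```python
def reconstruct(w, H):
    # Map a latent code w (length k) back to variable space: x_hat = w @ H.
    return [sum(w[a] * H[a][j] for a in range(len(H))) for j in range(len(H[0]))]


def encode_fixed_basis(x, H, n_iter=500, tiny=1e-12):
    """Nonnegative code w approximately minimizing ||x - w H||^2 with H held fixed.

    Uses the multiplicative update w <- w * (x H^T) / (w H H^T), which keeps
    every entry of w nonnegative when x and H are nonnegative.
    """
    k, d = len(H), len(x)
    w = [1.0] * k
    xHt = [sum(x[j] * H[a][j] for j in range(d)) for a in range(k)]
    HHt = [[sum(H[a][j] * H[b][j] for j in range(d)) for b in range(k)]
           for a in range(k)]
    for _ in range(n_iter):
        # Denominator uses the previous w for all factors (simultaneous update).
        denom = [sum(w[b] * HHt[b][a] for b in range(k)) for a in range(k)]
        w = [w[a] * xHt[a] / (denom[a] + tiny) for a in range(k)]
    return w


# Hypothetical basis: 2 latent factors over 3 survey variables.
H = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
w = encode_fixed_basis([2.0, 3.0, 5.0], H)
print([round(v, 4) for v in w])
```

Because pre- and post-intervention responses would be encoded against the same H, their codes live in one coordinate system, which is the comparability property the summary refers to.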
If this is right
- The framework produces compact and interpretable policy-feasible interventions with explicit adjustment magnitudes.
- Population-level conversion from the target group toward the reference group improves after the adjustments.
- Intervention sparsity is preserved through the weighted l_{2,1} penalty, focusing changes on shared policy levers.
- The same pipeline works on multiple real-world transportation survey datasets without requiring changes to the core formulation.
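The shared-sparsity effect of a weighted ℓ_{2,1} penalty can be sketched as follows. This assumes an adjustment matrix whose rows are respondent subgroups and whose columns are policy levers, with the norm taken column-wise so that a lever is switched off for all subgroups at once. The layout, the weights, and the proximal step are illustrative assumptions; the source does not specify them.

```python
import math


def weighted_l21(delta, weights):
    """Sum over levers j of weights[j] * ||column j of delta||_2."""
    n_cols = len(delta[0])
    return sum(
        weights[j] * math.sqrt(sum(row[j] ** 2 for row in delta))
        for j in range(n_cols)
    )


def group_soft_threshold(delta, weights, step):
    """Proximal step for step * weighted_l21.

    Each column is shrunk toward zero as a block; columns whose norm is at
    most step * weights[j] are zeroed entirely.
    """
    n_cols = len(delta[0])
    scales = []
    for j in range(n_cols):
        norm = math.sqrt(sum(row[j] ** 2 for row in delta))
        thresh = step * weights[j]
        scales.append(0.0 if norm <= thresh else 1.0 - thresh / norm)
    return [[row[j] * scales[j] for j in range(n_cols)] for row in delta]


# Hypothetical adjustments: lever 0 is strong, lever 1 is negligible.
delta = [[3.0, 0.1],
         [4.0, 0.1]]
print(weighted_l21(delta, [1.0, 1.0]))
print(group_soft_threshold(delta, [1.0, 1.0], step=1.0))
```

In this toy case the weak lever is removed for both subgroups while the strong lever is only shrunk, which is the sense in which the penalty concentrates changes on shared levers.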
Where Pith is reading between the lines
- The same latent-adjustment pipeline could be tested on non-transportation surveys such as health-behavior or energy-consumption questionnaires to check whether the sparsity and comparability properties transfer.
- Because adjustments are expressed as explicit magnitudes on original variables, policy makers could directly translate the output into pilot programs with measurable costs.
- If the reference group is itself time-varying, the method could be extended by recomputing the target alignment at successive time steps to track how intervention priorities evolve.
Load-bearing premise
The fixed-basis nonnegative latent representation preserves pre/post comparability and supplies a stable map from latent factors to original variables so that Shapley-selected factors translate directly into controllable survey adjustments.
What would settle it
If applying the learned group-level adjustments fails to reduce the entropy-regularized optimal-transport distance between the adjusted target distribution and the reference distribution below the distance obtained with no intervention, the central claim is falsified.
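The falsification test above can be sketched with a small Sinkhorn routine. This is a hedged illustration only: it assumes uniform weights on both point clouds, a squared Euclidean ground cost, and reports the transport cost under the entropic plan (the entropy term itself is not added). The point clouds, the shift, and the function names are hypothetical.

```python
import math


def sinkhorn_cost(xs, ys, eps=1.0, n_iter=200):
    """Transport cost <P, C> of the entropy-regularized plan between two uniform clouds."""
    n, m = len(xs), len(ys)
    C = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]
    K = [[math.exp(-c / eps) for c in row] for row in C]
    a, b = [1.0 / n] * n, [1.0 / m] * m
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternate scaling so that row and column marginals match a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(u[i] * K[i][j] * v[j] * C[i][j]
               for i in range(n) for j in range(m))


def apply_shift(points, shift):
    # Apply one group-level adjustment vector to every respondent in the group.
    return [[p + s for p, s in zip(pt, shift)] for pt in points]


target = [[0.0, 0.0], [1.0, 0.0]]
reference = [[3.0, 0.0], [4.0, 0.0]]
before = sinkhorn_cost(target, reference)
after = sinkhorn_cost(apply_shift(target, [3.0, 0.0]), reference)
print(before, after, after < before)
```

Note that with entropic regularization the cost between identical clouds is generally not zero, so the comparison that matters is adjusted versus unadjusted, not adjusted versus zero.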
Original abstract
Transportation surveys are widely used to understand travel preferences and adoption barriers, yet most survey-based analyses remain descriptive or predictive and rarely provide sparse, policy-feasible intervention strategies. We study sparse counterfactual community intervention from survey responses, where the goal is to shift a target respondent group toward a desired reference group through controllable survey-variable adjustments. We formulate this task as a policy-feasible distributional alignment problem using a fixed-basis nonnegative latent representation that preserves pre/post comparability and provides a stable map from latent factors to original variables. To make latent movement actionable, target-relevant latent factors are identified through Shapley-guided attribution and transferred to controllable variables as intervention priorities. Feasible group-level adjustments are then learned by minimizing an entropy-regularized optimal-transport discrepancy between the post-intervention target distribution and the reference distribution, together with a weighted $\ell_{2,1}$ penalty that promotes shared policy-lever sparsity. Experiments on real-world transportation survey datasets show that the proposed framework produces compact and interpretable policy-feasible interventions with explicit adjustment magnitudes, improves population-level conversion, and preserves intervention sparsity. Code and datasets are publicly available at: https://github.com/pangjunbiao/latent-group-alignment.git
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a framework for discovering sparse counterfactual factors to enable policy-feasible community interventions from transportation survey data. It employs a fixed-basis nonnegative latent representation to maintain pre/post comparability, applies Shapley-guided attribution to identify target-relevant latent factors, and learns group-level adjustments by minimizing an entropy-regularized optimal transport discrepancy to a reference distribution combined with a weighted ℓ_{2,1} penalty for shared sparsity. Experiments on real-world survey datasets are reported to produce compact, interpretable interventions with explicit magnitudes that improve population-level conversion while preserving sparsity.
Significance. If the central assumptions hold, particularly the stability of the latent representation under adjustment, the work could offer a practical bridge from descriptive survey analysis to actionable, sparse policy interventions in transportation and related domains. The public availability of code and datasets supports reproducibility, which is a clear strength. The combination of standard components (latent factor models, Shapley values, entropy-regularized OT) is applied in a domain-specific way, though the primary advance appears to be in the integrated pipeline rather than new theoretical machinery.
major comments (2)
- [Abstract] The central empirical claims ('improves population-level conversion' and 'preserves intervention sparsity') are stated without quantitative metrics, baselines, ablation results, error bars, or statistical tests. These claims are load-bearing for the paper's conclusions, yet the strength of the experimental support behind them is difficult to evaluate.
- [Method] Latent representation and adjustment steps: the fixed-basis nonnegative latent map is asserted to 'preserve pre/post comparability' and provide a 'stable map from latent factors to original variables' after the entropy-regularized OT step plus ℓ_{2,1} penalty, but no derivation, invariance proof, or post-adjustment empirical check (e.g., preserved nonnegativity or loading stability in original variable space) is supplied. This gap directly undermines the claim that Shapley-selected factors translate into controllable, policy-feasible survey adjustments.
minor comments (2)
- [Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., conversion improvement percentage or sparsity metric) to give readers an immediate sense of effect size.
- [Method] Notation for the entropy regularization parameter and ℓ_{2,1} penalty weight should be explicitly defined with equations, as these are the only free parameters listed.
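As an illustration of the kind of explicit definition this comment asks for, one plausible way to write the objective is given below. Every symbol here is an assumption introduced for illustration: Δ for the adjustment matrix, F for the feasible set, μ_T^Δ and μ_R for the adjusted target and reference distributions, C for the ground cost, ε for the entropy regularization parameter, λ for the penalty weight, and ω_j for per-lever weights. The paper's own notation may differ.

```latex
\min_{\Delta \in \mathcal{F}} \;
  \mathrm{OT}_{\varepsilon}\!\left(\mu_T^{\Delta}, \mu_R\right)
  + \lambda \sum_{j=1}^{d} \omega_j \,\lVert \Delta_{\cdot j} \rVert_2,
\qquad
\mathrm{OT}_{\varepsilon}(\mu, \nu)
  = \min_{P \in \Pi(\mu, \nu)} \;
    \langle P, C \rangle
    + \varepsilon \sum_{i,j} P_{ij}\left(\log P_{ij} - 1\right)
```

Written this way, ε controls how diffuse the transport plan is and λ controls how many levers survive, which makes the roles of the two free parameters explicit.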
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight opportunities to strengthen the presentation of empirical results and the justification of the latent representation's stability. We address each major comment below and will revise the manuscript to incorporate the suggested improvements.
- Referee: [Abstract] The central empirical claims ('improves population-level conversion' and 'preserves intervention sparsity') are stated without quantitative metrics, baselines, ablation results, error bars, or statistical tests. These claims are load-bearing for the paper's conclusions, yet the strength of the experimental support behind them is difficult to evaluate.
  Authors: We agree that the abstract would be strengthened by including key quantitative results that substantiate the central claims. The main text reports specific metrics (e.g., conversion improvements and sparsity levels relative to baselines), but these are not summarized in the abstract. In the revision, we will add concise quantitative statements, such as average percentage improvements in population-level conversion and achieved sparsity ratios, while keeping the abstract within length limits. This addresses the concern about how readily the empirical support can be evaluated.
  Revision: yes
- Referee: [Method] Latent representation and adjustment steps: the fixed-basis nonnegative latent map is asserted to 'preserve pre/post comparability' and provide a 'stable map from latent factors to original variables' after the entropy-regularized OT step plus ℓ_{2,1} penalty, but no derivation, invariance proof, or post-adjustment empirical check (e.g., preserved nonnegativity or loading stability in original variable space) is supplied. This gap directly undermines the claim that Shapley-selected factors translate into controllable, policy-feasible survey adjustments.
  Authors: The fixed-basis nonnegative latent representation uses the same basis matrix before and after adjustment, which by design preserves the linear mapping from latent factors to original variables and keeps the representation nonnegative under the adjustment constraints. However, we acknowledge that the current manuscript provides no explicit derivation of invariance properties and no post-adjustment empirical validation (such as checks on loading stability or nonnegativity preservation in variable space). We will add a short derivation sketch to the method section explaining why the fixed basis yields stability, and we will include an empirical verification subsection in the experiments demonstrating that post-adjustment loadings remain stable and nonnegative. These additions will better support the policy-feasibility claims.
  Revision: yes
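The derivation sketch promised in this response might run along the following lines, assuming a linear model x = wH with a shared nonnegative basis H, a nonnegative latent code w, and a latent adjustment δ. The symbols are illustrative assumptions rather than the paper's notation.

```latex
\begin{aligned}
x  &= wH, \qquad w \in \mathbb{R}_{\ge 0}^{1 \times k},\; H \in \mathbb{R}_{\ge 0}^{k \times d} \\
x' &= (w + \delta)H \\
x' - x &= \delta H \\
w + \delta \ge 0 \;\text{and}\; H \ge 0 &\;\Longrightarrow\; x' \ge 0
\end{aligned}
```

Under these assumptions the change in original variables is a fixed linear image of the latent adjustment, and nonnegativity carries over whenever the adjusted code stays nonnegative. The argument says nothing about whether the loadings remain empirically stable, which is the part that still needs the proposed verification.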
Circularity Check
No circularity: derivation applies standard OT, Shapley, and latent-factor tools without reduction to fitted inputs or self-citations
The paper formulates the intervention task using a fixed-basis nonnegative latent representation, Shapley-guided factor selection, entropy-regularized optimal transport, and an ℓ_{2,1} penalty. These components are presented as standard techniques applied to survey data; the abstract states that the representation 'preserves pre/post comparability' as a modeling choice rather than deriving it from the OT step itself. No equation or claim reduces the central output (sparse policy-feasible adjustments) to a quantity defined solely by the fitted parameters or by a self-citation chain. The derivation chain therefore remains independent of its own outputs.
Axiom & Free-Parameter Ledger
free parameters (2)
- entropy regularization parameter
- ℓ_{2,1} sparsity penalty weight
axioms (3)
- [standard math] Entropy-regularized optimal transport yields a feasible and differentiable alignment between two distributions.
- [domain assumption] Shapley values provide reliable attribution for identifying target-relevant latent factors.
- [domain assumption] Fixed-basis nonnegative latent representation preserves pre/post comparability and gives a stable linear map back to original variables.
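The Shapley attribution assumed in the ledger above can be sketched with an exact permutation-based computation, which is feasible when the number of latent factors is small. The value function below is a hypothetical stand-in (for example, how well a coalition of factors separates target from reference respondents); the source does not state which value function the paper uses.

```python
from itertools import permutations


def shapley_values(n_factors, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = [0.0] * n_factors
    orders = list(permutations(range(n_factors)))
    for order in orders:
        coalition = set()
        prev = value(frozenset(coalition))
        for f in order:
            coalition.add(f)
            cur = value(frozenset(coalition))
            phi[f] += cur - prev  # marginal contribution of factor f
            prev = cur
    return [p / len(orders) for p in phi]


def toy_value(coalition):
    # Hypothetical score: factor 0 adds 2, factor 1 adds 1, the pair adds a
    # bonus of 1 when both are present, and factor 2 contributes nothing.
    score = 0.0
    if 0 in coalition:
        score += 2.0
    if 1 in coalition:
        score += 1.0
    if 0 in coalition and 1 in coalition:
        score += 1.0
    return score


print(shapley_values(3, toy_value))
```

Factors with near-zero attribution, such as factor 2 in this toy example, would be natural candidates to exclude before any adjustment is optimized. Exact enumeration grows factorially, so a sampling approximation would be needed for larger latent dimensions.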
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Passage: fixed-basis nonnegative latent representation ... NMF ... fixed H
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.