Federated Imputation under Heterogeneous Feature Spaces
Pith reviewed 2026-05-20 20:03 UTC · model grok-4.3
The pith
A shared global feature graph enables indirect knowledge transfer for federated imputation even when clients observe disjoint feature sets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FedHF-Impute constructs a shared global feature graph from the union of client feature spaces and applies message passing over the graph to propagate imputation estimates between statistically related features, thereby permitting knowledge transfer across clients whose local feature observations never intersect while preserving standard federated averaging.
What carries the argument
shared global feature graph encoding statistical relations among features, with message passing to propagate imputation estimates across non-overlapping client views
If this is right
- Imputation can proceed collaboratively even when clients never share the same feature schema.
- Standard federated communication protocols remain sufficient for the transfer.
- Root-mean-square error improves by 26.9 percent on SECOM and 8.4 percent on AirQuality under partial feature overlap.
- Performance stays within 0.3 percent of the best baseline on fully aligned data such as PhysioNET.
Where Pith is reading between the lines
- The same graph-propagation idea could support other federated tasks such as classification when feature spaces are heterogeneous.
- Real deployments would still require a secure way to assemble the global graph without exposing raw client data.
- The method might reduce the preprocessing burden of forcing feature alignment before federated training begins.
Load-bearing premise
The relations captured in the global feature graph are sufficiently strong and accurate to carry useful imputation information between features that no single client ever observes together.
What would settle it
Constructing a feature graph on data where connected features show no statistical dependence and observing that the resulting imputation error equals or exceeds that of non-graph federated baselines.
Figures
read the original abstract
Federated Learning (FL) enables collaborative training across decentralized clients, but most methods assume aligned feature schemas, an assumption that rarely holds in tabular settings where clients observe only partially overlapping feature subsets. In these heterogeneous feature spaces, parameter-averaging methods (e.g., FedAvg) transfer little information across weakly overlapping or disjoint feature groups, limiting their effectiveness for federated imputation. To overcome this, we propose \textbf{FedHF-Impute}, a federated imputation framework that separates structural feature unavailability from conventional missingness and uses a shared global feature graph to propagate information across statistically related features through message passing. This enables indirect cross-client knowledge transfer, even when features are never jointly observed locally, while preserving standard federated communication. Under simulated partial schema overlap on the SECOM and AirQuality datasets, FedHF-Impute improves imputation accuracy (RMSE) over FL baselines by 26.9\%, and 8.4\% respectively, while achieving comparable performance on PhysioNET, with only a 0.3\% difference relative to the best baseline.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces FedHF-Impute, a federated imputation framework for heterogeneous feature spaces where clients observe only partially overlapping feature subsets. It separates structural unavailability from conventional missingness and employs a shared global feature graph with message passing to enable indirect cross-client knowledge transfer even when features are never jointly observed locally, while preserving standard federated communication. The paper reports RMSE improvements of 26.9% on SECOM and 8.4% on AirQuality over FL baselines under simulated partial schema overlap, with comparable performance on PhysioNET.
Significance. If the central claims hold after clarification, the work could meaningfully advance federated learning for tabular data in domains with schema heterogeneity, such as healthcare or manufacturing, by providing a mechanism for indirect imputation transfer without requiring feature alignment or data centralization.
major comments (2)
- [Framework description / global feature graph construction] The construction of the shared global feature graph (described in the framework overview) is insufficiently specified. It remains unclear whether edge weights or adjacency are derived from server-side aggregation of per-feature statistics or from an external ontology; either route appears to conflict with the premise of never jointly observed features and the guarantee of standard federated communication without centralization.
- [Section 4] Section 4 (Experiments): The reported RMSE improvements (26.9% on SECOM, 8.4% on AirQuality) lack supporting details on the exact protocol for simulating partial schema overlap, baseline implementations, statistical significance tests, variance across runs, and ablations isolating the message-passing component over the global graph. Without these, the empirical support for the central claim cannot be verified.
minor comments (2)
- [Method overview] Clarify the precise communication protocol (e.g., what is exchanged in each round) to make the claim of 'preserving standard federated communication' more concrete.
- [Framework description] Add formal notation or equations for the message-passing update rule on the global feature graph to improve technical precision.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and the opportunity to clarify and strengthen our manuscript. We address each major comment point by point below, indicating where revisions will be made.
read point-by-point responses
-
Referee: [Framework description / global feature graph construction] The construction of the shared global feature graph (described in the framework overview) is insufficiently specified. It remains unclear whether edge weights or adjacency are derived from server-side aggregation of per-feature statistics or from an external ontology; either route appears to conflict with the premise of never jointly observed features and the guarantee of standard federated communication without centralization.
Authors: We agree that the original description of global feature graph construction is insufficiently precise and could lead to the ambiguities noted. The graph is constructed from server-side aggregation of locally computed per-client feature correlation statistics (via federated averaging of pairwise correlation estimates), without any external ontology or requirement for joint feature observation across clients. This is fully consistent with standard federated communication. In the revised manuscript we have added a dedicated subsection in Section 3 with the exact algorithm, including how local correlations are estimated on each client's partial schema and aggregated at the server while preserving heterogeneity and privacy constraints. revision: yes
-
Referee: [Section 4] Section 4 (Experiments): The reported RMSE improvements (26.9% on SECOM, 8.4% on AirQuality) lack supporting details on the exact protocol for simulating partial schema overlap, baseline implementations, statistical significance tests, variance across runs, and ablations isolating the message-passing component over the global graph. Without these, the empirical support for the central claim cannot be verified.
Authors: We acknowledge that the experimental section would benefit from additional transparency to allow full verification. The revised manuscript will include: (i) the precise protocol for simulating partial schema overlap (overlap ratios, feature partitioning method, and random seeds); (ii) hyperparameter settings and implementation details for all baselines; (iii) statistical significance tests (paired t-tests with p-values); (iv) mean and standard deviation of RMSE across five independent runs; and (v) an ablation study isolating the message-passing component. These additions will appear in the main Section 4 and an expanded appendix. revision: yes
Circularity Check
No circularity detected; framework derivation is self-contained
full rationale
The paper proposes FedHF-Impute as a novel federated imputation method that introduces a shared global feature graph for message passing to enable indirect knowledge transfer under heterogeneous feature spaces. The abstract and described approach present this as an explicit architectural choice with empirical results on SECOM, AirQuality, and PhysioNET datasets showing RMSE improvements. No load-bearing steps reduce by construction to fitted inputs, self-definitions, or self-citation chains; the central mechanism (graph-based propagation) is presented as an independent modeling decision rather than a renaming or statistical tautology of prior results. The derivation chain remains externally falsifiable via the reported experiments and does not collapse to its own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Features across clients possess statistical relations that can be represented in a single global graph for effective message passing.
invented entities (1)
-
shared global feature graph
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Extracting and composing robust features with denoising autoencoders,
P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” inProc. ICML, 2008, pp. 1096–1103
work page 2008
-
[2]
van Buuren,Flexible Imputation of Missing Data
S. van Buuren,Flexible Imputation of Missing Data. Boca Raton, FL, USA: CRC Press, 2011
work page 2011
-
[3]
GAIN: Missing data imputation using generative adversarial nets,
J. Yoon, J. Jordon, and M. van der Schaar, “GAIN: Missing data imputation using generative adversarial nets,” inProc. 35th Int. Conf. Machine Learning (ICML), Stockholm, Sweden, 2018, pp. 5689–5698
work page 2018
-
[4]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017
work page 2017
-
[5]
Communication-efficient learning of deep networks from decentralized data,
B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y. Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProc. 20th Int. Conf. Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 2017, pp. 1273–1282
work page 2017
-
[6]
MIW AE: Deep generative modelling and imputation of incomplete data sets,
P.-A. Mattei and J. Frellsen, “MIW AE: Deep generative modelling and imputation of incomplete data sets,” inProc. ICML, 2019
work page 2019
-
[7]
BRITS: Bidirectional recurrent imputation for time series,
W. Caoet al., “BRITS: Bidirectional recurrent imputation for time series,” inAdvances in NeurIPS, 2018
work page 2018
-
[8]
SAITS: Self-attention-based imputation for time series,
W. Du, D. Cote, and Y . Liu, “SAITS: Self-attention-based imputation for time series,”Expert Systems with Applications, vol. 219, 2023
work page 2023
-
[9]
CSDI: Conditional score-based diffusion models for probabilistic time series imputation,
Y . Tashiroet al., “CSDI: Conditional score-based diffusion models for probabilistic time series imputation,” inAdvances in NeurIPS, 2021
work page 2021
-
[10]
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
X. Huanget al., “TabTransformer: Tabular data modeling using contex- tual embeddings,” arXiv:2012.06678, 2020
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[11]
Filling the G ap s: Multivariate time series imputation by graph neural networks,
A. Cini, I. Marisca, and C. Alippi, “Filling the G ap s: Multivariate time series imputation by graph neural networks,” inProc. ICLR, 2022
work page 2022
-
[12]
Data imputation with iterative graph reconstruction,
J. Zhong, N. Gui, and W. Ye, “Data imputation with iterative graph reconstruction,” inProc. AAAI, 2023
work page 2023
-
[13]
EGG-GAE: Scalable graph autoen- coders for tabular data imputation,
L. Telyatnikov and S. Scardapane, “EGG-GAE: Scalable graph autoen- coders for tabular data imputation,” inProc. AISTATS, 2023
work page 2023
-
[14]
Federated optimization in heterogeneous networks (Fed- Prox),
T. Liet al., “Federated optimization in heterogeneous networks (Fed- Prox),” inProc. MLSys, 2020
work page 2020
-
[15]
SCAFFOLD: Stochastic controlled averaging for federated learning,
S. Karimireddyet al., “SCAFFOLD: Stochastic controlled averaging for federated learning,” inProc. ICML, 2020
work page 2020
-
[16]
Model-contrastive federated learning,
Q. Li, B. He, and D. Song, “Model-contrastive federated learning,” in Proc. CVPR, 2021
work page 2021
-
[17]
FedBN: Federated learning on non-IID features via local batch normalization,
X. Liet al., “FedBN: Federated learning on non-IID features via local batch normalization,” inProc. ICLR, 2021
work page 2021
-
[18]
A survey on vertical federated learning: From a layered perspective,
L. Yanget al., “A survey on vertical federated learning: From a layered perspective,” arXiv:2304.01829, 2023
-
[19]
Vertical federated learning: A structured literature review,
A. Khan, M. ten Thij, and A. Wilbik, “Vertical federated learning: A structured literature review,”Knowl. Inf. Syst., 2025
work page 2025
-
[20]
Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,
D. Darais, J. Near, M. Durkee, and D. Buckley, “Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,” NIST Cybersecurity Insights Blog, 2024
work page 2024
-
[21]
Federated private set intersection example,
NVIDIA, “Federated private set intersection example,” NVFlare GitHub repository, 2024
work page 2024
-
[22]
SecureBoost: A lossless federated learning framework,
K. Chenget al., “SecureBoost: A lossless federated learning framework,” inProc. FML Workshop @ IJCAI, 2019
work page 2019
-
[23]
SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,
T. Fanet al., “SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,” inProc. PAKDD, 2024
work page 2024
-
[24]
Split learning for health: Distributed deep learning without sharing raw patient data
P. Vepakommaet al., “Split learning for health: Distributed deep learning without sharing raw patient data,” arXiv:1812.00564, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[25]
Fed-MIW AE: Federated imputation of incomplete data via deep generative models,
I. Balelliet al., “Fed-MIW AE: Federated imputation of incomplete data via deep generative models,” arXiv:2304.08054, 2023
-
[26]
Cafe: Improved federated data imputation by leveraging missing data heterogeneity,
S. Minet al., “Cafe: Improved federated data imputation by leveraging missing data heterogeneity,” preprint, 2024
work page 2024
-
[27]
Personalised federated learning on hetero- geneous feature spaces,
A. Rakotomamonjyet al., “Personalised federated learning on hetero- geneous feature spaces,” arXiv:2301.11447, 2023
-
[28]
Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,
Y . Suzuki and F. Banaei-Kashani, “Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,” OpenReview (FLSys), 2023
work page 2023
-
[29]
Towards heterogeneous federated learning: Analysis, solutions, and future directions,
Y . Linet al., “Towards heterogeneous federated learning: Analysis, solutions, and future directions,” inProc. AIS&P, Springer, 2024
work page 2024
-
[30]
M. Karami and A. Karami, “Harmony in federated learning: A compre- hensive review of techniques to tackle heterogeneity and non-IID data,” Cluster Computing, vol. 28, no. 570, 2025
work page 2025
-
[31]
Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,
H. Yuet al., “Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,” inProc. CVPR, 2025
work page 2025
-
[32]
Heterogeneous federated learning: State-of-the-art and research challenges,
M. Yeet al., “Heterogeneous federated learning: State-of-the-art and research challenges,”ACM Computing Surveys, 2023
work page 2023
-
[33]
Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,
Z. Chenet al., “Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,” inAdvances in Neural Information Processing Systems (NeurIPS), 2024
work page 2024
-
[34]
Privacy-preserving vertical federated KNN feature imputation method,
W. Du, Y . Wang, G. Meng, and Y . Guo, “Privacy-preserving vertical federated KNN feature imputation method,”Electronics, vol. 13, no. 2, 2024
work page 2024
-
[35]
Multimodal federated learning with missing modalities through feature imputation network,
P. Poudel, A. Chhetri, P. Gyawali, G. Leontidis, and B. Bhattarai, “Multimodal federated learning with missing modalities through feature imputation network,” inMedical Image Understanding and Analysis (MIUA), LNCS, 2026
work page 2026
-
[36]
Y . Chang, Y . Noh, S.-W. Lee, M. Lee, and W. Noh, “Decentralized noise handling in medical imaging: Encoder–decoder based federated imputation for robust training,” inProc. MICCAI, 2025
work page 2025
-
[37]
T. Zhou, J. Zhang, and D. H. K. Tsang, “FedFA: Federated learning with feature anchors to align features and classifiers for heterogeneous data,” IEEE Transactions on Mobile Computing, vol. 23, no. 6, pp. 6731–6742, 2024
work page 2024
-
[38]
W.-C. Chung and C.-H. Peng, “FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data,”Future Generation Computer Systems, vol. 166, 2025
work page 2025
-
[39]
X. Hu, W. Fan, H. Chen, and Y . Fu, “Reconstructing missing variables for multivariate time series forecasting via conditional generative flows,” inProc. IJCAI, 2024
work page 2024
-
[40]
TIformer: A transformer-based framework for time-series forecasting with missing data,
Z. Ding, Y . Chen, H. Wang, X. Wang, W. Zhang, and Y . Zhang, “TIformer: A transformer-based framework for time-series forecasting with missing data,” inDatabases Theory and Applications (ADC 2024), LNCS vol. 15449, Springer, 2024, pp. 71–84
work page 2024
-
[41]
Medjadji, Chaimaa, et al. ”FedSparQ: Adaptive Sparse Quantization with Error Feedback for Robust & Efficient Federated Learning.” 2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA). IEEE, 2025
work page 2025
-
[42]
M. D. Nguyen, T. T. Nguyen, H. H. Pham, T. N. Hoang, P. L. Nguyen, and T. T. Huynh, “FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Con- trastive Regularization,” inProc. 22nd Int. Symp. Network Com- puting and Applications (NCA), 2024, pp. 1–8. [Online]. Available: https://arxiv.org/abs/2410.03070
-
[43]
Learning from data with structured missingness,
R. Mitra, S. F. McGough, T. Chakraborti, C. Holmes, R. Copping, N. Hagenbuch, S. Biedermann, J. Noonan, B. Lehmann, A. Shenvi, X. V . Doan, D. Leslie, G. Bianconi, R. Sanchez-Garcia, A. Davies, M. Mackintosh, E.-R. Andrinopoulou, A. Basiri, C. Harbron, and B. D. MacArthur, “Learning from data with structured missingness,” Nature Machine Intelligence, vol....
work page 2023
-
[44]
I. Hocine, A. Abboura, A. Kella, M. Hanini, and G. Danoy, “Vision paper: Metadata-Guided Diffusion and LLM-Orchestrated Quality Gov- ernance for Time Series Imputation,” In Proceedings of the QuaLLM-KG Workshop, co-located with EDBT/ICDT 2026, Tampere, Finland, 2026
work page 2026
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.