Federated Imputation under Heterogeneous Feature Spaces

Chaimaa Medjadji; Gregoire Danoy; Imane Hocine; Sylvain Kubler; Yves Le Traon

arxiv: 2605.16099 · v1 · pith:ZTZFZCNQnew · submitted 2026-05-15 · 💻 cs.LG · cs.AI

Federated Imputation under Heterogeneous Feature Spaces

Imane Hocine , Chaimaa Medjadji , Sylvain Kubler , Gregoire Danoy , Yves Le Traon This is my paper

Pith reviewed 2026-05-20 20:03 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords Federated LearningMissing Data ImputationHeterogeneous Feature SpacesFeature GraphsMessage PassingTabular DataDecentralized Training

0 comments

The pith

A shared global feature graph enables indirect knowledge transfer for federated imputation even when clients observe disjoint feature sets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses federated imputation when clients hold only partially overlapping feature subsets, a situation that defeats standard parameter-averaging methods because they cannot transfer information across non-shared features. It separates the structural absence of features from ordinary missing values and builds a single global feature graph that records statistical relations among all features across clients. Message passing over this graph then moves imputation information from observed features to related but unobserved ones, allowing indirect transfer without ever requiring joint observation on any client. The approach keeps ordinary federated communication rounds unchanged. Tests under simulated partial overlap show clear gains in root-mean-square error on two of the three evaluated datasets.

Core claim

FedHF-Impute constructs a shared global feature graph from the union of client feature spaces and applies message passing over the graph to propagate imputation estimates between statistically related features, thereby permitting knowledge transfer across clients whose local feature observations never intersect while preserving standard federated averaging.

What carries the argument

shared global feature graph encoding statistical relations among features, with message passing to propagate imputation estimates across non-overlapping client views

If this is right

Imputation can proceed collaboratively even when clients never share the same feature schema.
Standard federated communication protocols remain sufficient for the transfer.
Root-mean-square error improves by 26.9 percent on SECOM and 8.4 percent on AirQuality under partial feature overlap.
Performance stays within 0.3 percent of the best baseline on fully aligned data such as PhysioNET.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph-propagation idea could support other federated tasks such as classification when feature spaces are heterogeneous.
Real deployments would still require a secure way to assemble the global graph without exposing raw client data.
The method might reduce the preprocessing burden of forcing feature alignment before federated training begins.

Load-bearing premise

The relations captured in the global feature graph are sufficiently strong and accurate to carry useful imputation information between features that no single client ever observes together.

What would settle it

Constructing a feature graph on data where connected features show no statistical dependence and observing that the resulting imputation error equals or exceeds that of non-graph federated baselines.

Figures

Figures reproduced from arXiv: 2605.16099 by Chaimaa Medjadji, Gregoire Danoy, Imane Hocine, Sylvain Kubler, Yves Le Traon.

**Figure 2.** Figure 2: Imputation performance (RMSE ↓) of FedHF-Impute vs the other baselines on three datasets: AirQuality, PhysioNET and SECOM. port variability across repeated runs whenever available. This distinction reflects the different evaluation units of centralized local imputers and federated global imputers, but all methods are scored on identical corrupted entries. For centralized baselines, we report the RMSE and s… view at source ↗

**Figure 3.** Figure 3: FedHF-Impute convergence across federated rounds on AirQuality, PhysioNET, and SECOM. Each curve shows the validation RMSE ( [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

read the original abstract

Federated Learning (FL) enables collaborative training across decentralized clients, but most methods assume aligned feature schemas, an assumption that rarely holds in tabular settings where clients observe only partially overlapping feature subsets. In these heterogeneous feature spaces, parameter-averaging methods (e.g., FedAvg) transfer little information across weakly overlapping or disjoint feature groups, limiting their effectiveness for federated imputation. To overcome this, we propose \textbf{FedHF-Impute}, a federated imputation framework that separates structural feature unavailability from conventional missingness and uses a shared global feature graph to propagate information across statistically related features through message passing. This enables indirect cross-client knowledge transfer, even when features are never jointly observed locally, while preserving standard federated communication. Under simulated partial schema overlap on the SECOM and AirQuality datasets, FedHF-Impute improves imputation accuracy (RMSE) over FL baselines by 26.9\%, and 8.4\% respectively, while achieving comparable performance on PhysioNET, with only a 0.3\% difference relative to the best baseline.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces FedHF-Impute with a global feature graph and message passing to handle imputation when client feature sets barely overlap, but the graph construction step looks like the part that needs the most checking.

read the letter

The paper's main contribution is a federated imputation approach called FedHF-Impute that relies on a shared global feature graph and message passing to move information across features that clients never observe together. This setup aims to get around the limits of FedAvg when feature schemas only partially overlap. It does a solid job identifying the issue in real tabular federated scenarios and proposing a way to enable indirect transfer while sticking to standard communication. The improvements in imputation accuracy on the SECOM and AirQuality datasets are concrete and suggest practical value. The soft spot is how that global graph gets its structure. The abstract does not detail whether feature relations come from pooled statistics at the server or from some external knowledge. If it is the former, it risks reintroducing centralization that the heterogeneous setup was meant to avoid. The paper would benefit from a clear description of this step and any privacy analysis around it. The experimental claims also sit on thin support right now. We see specific RMSE gains but no information on simulation of partial overlap, baseline code, or statistical testing. This paper is for applied machine learning researchers who work with federated tabular data in domains like healthcare or manufacturing. A reader who has dealt with mismatched columns across clients will find the graph idea relevant and worth exploring further. I recommend sending it for peer review. The problem it targets is common enough that even with the current gaps, referees can help strengthen the graph construction and evaluation sections.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces FedHF-Impute, a federated imputation framework for heterogeneous feature spaces where clients observe only partially overlapping feature subsets. It separates structural unavailability from conventional missingness and employs a shared global feature graph with message passing to enable indirect cross-client knowledge transfer even when features are never jointly observed locally, while preserving standard federated communication. The paper reports RMSE improvements of 26.9% on SECOM and 8.4% on AirQuality over FL baselines under simulated partial schema overlap, with comparable performance on PhysioNET.

Significance. If the central claims hold after clarification, the work could meaningfully advance federated learning for tabular data in domains with schema heterogeneity, such as healthcare or manufacturing, by providing a mechanism for indirect imputation transfer without requiring feature alignment or data centralization.

major comments (2)

[Framework description / global feature graph construction] The construction of the shared global feature graph (described in the framework overview) is insufficiently specified. It remains unclear whether edge weights or adjacency are derived from server-side aggregation of per-feature statistics or from an external ontology; either route appears to conflict with the premise of never jointly observed features and the guarantee of standard federated communication without centralization.
[Section 4] Section 4 (Experiments): The reported RMSE improvements (26.9% on SECOM, 8.4% on AirQuality) lack supporting details on the exact protocol for simulating partial schema overlap, baseline implementations, statistical significance tests, variance across runs, and ablations isolating the message-passing component over the global graph. Without these, the empirical support for the central claim cannot be verified.

minor comments (2)

[Method overview] Clarify the precise communication protocol (e.g., what is exchanged in each round) to make the claim of 'preserving standard federated communication' more concrete.
[Framework description] Add formal notation or equations for the message-passing update rule on the global feature graph to improve technical precision.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the opportunity to clarify and strengthen our manuscript. We address each major comment point by point below, indicating where revisions will be made.

read point-by-point responses

Referee: [Framework description / global feature graph construction] The construction of the shared global feature graph (described in the framework overview) is insufficiently specified. It remains unclear whether edge weights or adjacency are derived from server-side aggregation of per-feature statistics or from an external ontology; either route appears to conflict with the premise of never jointly observed features and the guarantee of standard federated communication without centralization.

Authors: We agree that the original description of global feature graph construction is insufficiently precise and could lead to the ambiguities noted. The graph is constructed from server-side aggregation of locally computed per-client feature correlation statistics (via federated averaging of pairwise correlation estimates), without any external ontology or requirement for joint feature observation across clients. This is fully consistent with standard federated communication. In the revised manuscript we have added a dedicated subsection in Section 3 with the exact algorithm, including how local correlations are estimated on each client's partial schema and aggregated at the server while preserving heterogeneity and privacy constraints. revision: yes
Referee: [Section 4] Section 4 (Experiments): The reported RMSE improvements (26.9% on SECOM, 8.4% on AirQuality) lack supporting details on the exact protocol for simulating partial schema overlap, baseline implementations, statistical significance tests, variance across runs, and ablations isolating the message-passing component over the global graph. Without these, the empirical support for the central claim cannot be verified.

Authors: We acknowledge that the experimental section would benefit from additional transparency to allow full verification. The revised manuscript will include: (i) the precise protocol for simulating partial schema overlap (overlap ratios, feature partitioning method, and random seeds); (ii) hyperparameter settings and implementation details for all baselines; (iii) statistical significance tests (paired t-tests with p-values); (iv) mean and standard deviation of RMSE across five independent runs; and (v) an ablation study isolating the message-passing component. These additions will appear in the main Section 4 and an expanded appendix. revision: yes

Circularity Check

0 steps flagged

No circularity detected; framework derivation is self-contained

full rationale

The paper proposes FedHF-Impute as a novel federated imputation method that introduces a shared global feature graph for message passing to enable indirect knowledge transfer under heterogeneous feature spaces. The abstract and described approach present this as an explicit architectural choice with empirical results on SECOM, AirQuality, and PhysioNET datasets showing RMSE improvements. No load-bearing steps reduce by construction to fitted inputs, self-definitions, or self-citation chains; the central mechanism (graph-based propagation) is presented as an independent modeling decision rather than a renaming or statistical tautology of prior results. The derivation chain remains externally falsifiable via the reported experiments and does not collapse to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim depends on the ability to construct a global feature graph whose edges reflect statistical relations usable for imputation via message passing, an assumption not independently verified in the abstract.

axioms (1)

domain assumption Features across clients possess statistical relations that can be represented in a single global graph for effective message passing.
Invoked to justify indirect knowledge transfer when features are never jointly observed locally.

invented entities (1)

shared global feature graph no independent evidence
purpose: To connect statistically related features so that message passing can propagate imputation information across clients with disjoint local observations.
New structural element introduced to overcome the limitation of parameter averaging under heterogeneous schemas.

pith-pipeline@v0.9.0 · 5724 in / 1319 out tokens · 83730 ms · 2026-05-20T20:03:59.180218+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 2 internal anchors

[1]

Extracting and composing robust features with denoising autoencoders,

P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” inProc. ICML, 2008, pp. 1096–1103

work page 2008
[2]

van Buuren,Flexible Imputation of Missing Data

S. van Buuren,Flexible Imputation of Missing Data. Boca Raton, FL, USA: CRC Press, 2011

work page 2011
[3]

GAIN: Missing data imputation using generative adversarial nets,

J. Yoon, J. Jordon, and M. van der Schaar, “GAIN: Missing data imputation using generative adversarial nets,” inProc. 35th Int. Conf. Machine Learning (ICML), Stockholm, Sweden, 2018, pp. 5689–5698

work page 2018
[4]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017

work page 2017
[5]

Communication-efficient learning of deep networks from decentralized data,

B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y. Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProc. 20th Int. Conf. Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 2017, pp. 1273–1282

work page 2017
[6]

MIW AE: Deep generative modelling and imputation of incomplete data sets,

P.-A. Mattei and J. Frellsen, “MIW AE: Deep generative modelling and imputation of incomplete data sets,” inProc. ICML, 2019

work page 2019
[7]

BRITS: Bidirectional recurrent imputation for time series,

W. Caoet al., “BRITS: Bidirectional recurrent imputation for time series,” inAdvances in NeurIPS, 2018

work page 2018
[8]

SAITS: Self-attention-based imputation for time series,

W. Du, D. Cote, and Y . Liu, “SAITS: Self-attention-based imputation for time series,”Expert Systems with Applications, vol. 219, 2023

work page 2023
[9]

CSDI: Conditional score-based diffusion models for probabilistic time series imputation,

Y . Tashiroet al., “CSDI: Conditional score-based diffusion models for probabilistic time series imputation,” inAdvances in NeurIPS, 2021

work page 2021
[10]

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

X. Huanget al., “TabTransformer: Tabular data modeling using contex- tual embeddings,” arXiv:2012.06678, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2012
[11]

Filling the G ap s: Multivariate time series imputation by graph neural networks,

A. Cini, I. Marisca, and C. Alippi, “Filling the G ap s: Multivariate time series imputation by graph neural networks,” inProc. ICLR, 2022

work page 2022
[12]

Data imputation with iterative graph reconstruction,

J. Zhong, N. Gui, and W. Ye, “Data imputation with iterative graph reconstruction,” inProc. AAAI, 2023

work page 2023
[13]

EGG-GAE: Scalable graph autoen- coders for tabular data imputation,

L. Telyatnikov and S. Scardapane, “EGG-GAE: Scalable graph autoen- coders for tabular data imputation,” inProc. AISTATS, 2023

work page 2023
[14]

Federated optimization in heterogeneous networks (Fed- Prox),

T. Liet al., “Federated optimization in heterogeneous networks (Fed- Prox),” inProc. MLSys, 2020

work page 2020
[15]

SCAFFOLD: Stochastic controlled averaging for federated learning,

S. Karimireddyet al., “SCAFFOLD: Stochastic controlled averaging for federated learning,” inProc. ICML, 2020

work page 2020
[16]

Model-contrastive federated learning,

Q. Li, B. He, and D. Song, “Model-contrastive federated learning,” in Proc. CVPR, 2021

work page 2021
[17]

FedBN: Federated learning on non-IID features via local batch normalization,

X. Liet al., “FedBN: Federated learning on non-IID features via local batch normalization,” inProc. ICLR, 2021

work page 2021
[18]

A survey on vertical federated learning: From a layered perspective,

L. Yanget al., “A survey on vertical federated learning: From a layered perspective,” arXiv:2304.01829, 2023

work page arXiv 2023
[19]

Vertical federated learning: A structured literature review,

A. Khan, M. ten Thij, and A. Wilbik, “Vertical federated learning: A structured literature review,”Knowl. Inf. Syst., 2025

work page 2025
[20]

Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,

D. Darais, J. Near, M. Durkee, and D. Buckley, “Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,” NIST Cybersecurity Insights Blog, 2024

work page 2024
[21]

Federated private set intersection example,

NVIDIA, “Federated private set intersection example,” NVFlare GitHub repository, 2024

work page 2024
[22]

SecureBoost: A lossless federated learning framework,

K. Chenget al., “SecureBoost: A lossless federated learning framework,” inProc. FML Workshop @ IJCAI, 2019

work page 2019
[23]

SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,

T. Fanet al., “SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,” inProc. PAKDD, 2024

work page 2024
[24]

Split learning for health: Distributed deep learning without sharing raw patient data

P. Vepakommaet al., “Split learning for health: Distributed deep learning without sharing raw patient data,” arXiv:1812.00564, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[25]

Fed-MIW AE: Federated imputation of incomplete data via deep generative models,

I. Balelliet al., “Fed-MIW AE: Federated imputation of incomplete data via deep generative models,” arXiv:2304.08054, 2023

work page arXiv 2023
[26]

Cafe: Improved federated data imputation by leveraging missing data heterogeneity,

S. Minet al., “Cafe: Improved federated data imputation by leveraging missing data heterogeneity,” preprint, 2024

work page 2024
[27]

Personalised federated learning on hetero- geneous feature spaces,

A. Rakotomamonjyet al., “Personalised federated learning on hetero- geneous feature spaces,” arXiv:2301.11447, 2023

work page arXiv 2023
[28]

Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,

Y . Suzuki and F. Banaei-Kashani, “Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,” OpenReview (FLSys), 2023

work page 2023
[29]

Towards heterogeneous federated learning: Analysis, solutions, and future directions,

Y . Linet al., “Towards heterogeneous federated learning: Analysis, solutions, and future directions,” inProc. AIS&P, Springer, 2024

work page 2024
[30]

Harmony in federated learning: A compre- hensive review of techniques to tackle heterogeneity and non-IID data,

M. Karami and A. Karami, “Harmony in federated learning: A compre- hensive review of techniques to tackle heterogeneity and non-IID data,” Cluster Computing, vol. 28, no. 570, 2025

work page 2025
[31]

Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,

H. Yuet al., “Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,” inProc. CVPR, 2025

work page 2025
[32]

Heterogeneous federated learning: State-of-the-art and research challenges,

M. Yeet al., “Heterogeneous federated learning: State-of-the-art and research challenges,”ACM Computing Surveys, 2023

work page 2023
[33]

Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,

Z. Chenet al., “Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,” inAdvances in Neural Information Processing Systems (NeurIPS), 2024

work page 2024
[34]

Privacy-preserving vertical federated KNN feature imputation method,

W. Du, Y . Wang, G. Meng, and Y . Guo, “Privacy-preserving vertical federated KNN feature imputation method,”Electronics, vol. 13, no. 2, 2024

work page 2024
[35]

Multimodal federated learning with missing modalities through feature imputation network,

P. Poudel, A. Chhetri, P. Gyawali, G. Leontidis, and B. Bhattarai, “Multimodal federated learning with missing modalities through feature imputation network,” inMedical Image Understanding and Analysis (MIUA), LNCS, 2026

work page 2026
[36]

Decentralized noise handling in medical imaging: Encoder–decoder based federated imputation for robust training,

Y . Chang, Y . Noh, S.-W. Lee, M. Lee, and W. Noh, “Decentralized noise handling in medical imaging: Encoder–decoder based federated imputation for robust training,” inProc. MICCAI, 2025

work page 2025
[37]

FedFA: Federated learning with feature anchors to align features and classifiers for heterogeneous data,

T. Zhou, J. Zhang, and D. H. K. Tsang, “FedFA: Federated learning with feature anchors to align features and classifiers for heterogeneous data,” IEEE Transactions on Mobile Computing, vol. 23, no. 6, pp. 6731–6742, 2024

work page 2024
[38]

FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data,

W.-C. Chung and C.-H. Peng, “FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data,”Future Generation Computer Systems, vol. 166, 2025

work page 2025
[39]

Reconstructing missing variables for multivariate time series forecasting via conditional generative flows,

X. Hu, W. Fan, H. Chen, and Y . Fu, “Reconstructing missing variables for multivariate time series forecasting via conditional generative flows,” inProc. IJCAI, 2024

work page 2024
[40]

TIformer: A transformer-based framework for time-series forecasting with missing data,

Z. Ding, Y . Chen, H. Wang, X. Wang, W. Zhang, and Y . Zhang, “TIformer: A transformer-based framework for time-series forecasting with missing data,” inDatabases Theory and Applications (ADC 2024), LNCS vol. 15449, Springer, 2024, pp. 71–84

work page 2024
[41]

Medjadji, Chaimaa, et al. ”FedSparQ: Adaptive Sparse Quantization with Error Feedback for Robust & Efficient Federated Learning.” 2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA). IEEE, 2025

work page 2025
[42]

FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Con- trastive Regularization,

M. D. Nguyen, T. T. Nguyen, H. H. Pham, T. N. Hoang, P. L. Nguyen, and T. T. Huynh, “FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Con- trastive Regularization,” inProc. 22nd Int. Symp. Network Com- puting and Applications (NCA), 2024, pp. 1–8. [Online]. Available: https://arxiv.org/abs/2410.03070

work page arXiv 2024
[43]

Learning from data with structured missingness,

R. Mitra, S. F. McGough, T. Chakraborti, C. Holmes, R. Copping, N. Hagenbuch, S. Biedermann, J. Noonan, B. Lehmann, A. Shenvi, X. V . Doan, D. Leslie, G. Bianconi, R. Sanchez-Garcia, A. Davies, M. Mackintosh, E.-R. Andrinopoulou, A. Basiri, C. Harbron, and B. D. MacArthur, “Learning from data with structured missingness,” Nature Machine Intelligence, vol....

work page 2023
[44]

Vision paper: Metadata-Guided Diffusion and LLM-Orchestrated Quality Gov- ernance for Time Series Imputation,

I. Hocine, A. Abboura, A. Kella, M. Hanini, and G. Danoy, “Vision paper: Metadata-Guided Diffusion and LLM-Orchestrated Quality Gov- ernance for Time Series Imputation,” In Proceedings of the QuaLLM-KG Workshop, co-located with EDBT/ICDT 2026, Tampere, Finland, 2026

work page 2026

[1] [1]

Extracting and composing robust features with denoising autoencoders,

P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” inProc. ICML, 2008, pp. 1096–1103

work page 2008

[2] [2]

van Buuren,Flexible Imputation of Missing Data

S. van Buuren,Flexible Imputation of Missing Data. Boca Raton, FL, USA: CRC Press, 2011

work page 2011

[3] [3]

GAIN: Missing data imputation using generative adversarial nets,

J. Yoon, J. Jordon, and M. van der Schaar, “GAIN: Missing data imputation using generative adversarial nets,” inProc. 35th Int. Conf. Machine Learning (ICML), Stockholm, Sweden, 2018, pp. 5689–5698

work page 2018

[4] [4]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017

work page 2017

[5] [5]

Communication-efficient learning of deep networks from decentralized data,

B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y. Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProc. 20th Int. Conf. Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 2017, pp. 1273–1282

work page 2017

[6] [6]

MIW AE: Deep generative modelling and imputation of incomplete data sets,

P.-A. Mattei and J. Frellsen, “MIW AE: Deep generative modelling and imputation of incomplete data sets,” inProc. ICML, 2019

work page 2019

[7] [7]

BRITS: Bidirectional recurrent imputation for time series,

W. Caoet al., “BRITS: Bidirectional recurrent imputation for time series,” inAdvances in NeurIPS, 2018

work page 2018

[8] [8]

SAITS: Self-attention-based imputation for time series,

W. Du, D. Cote, and Y . Liu, “SAITS: Self-attention-based imputation for time series,”Expert Systems with Applications, vol. 219, 2023

work page 2023

[9] [9]

CSDI: Conditional score-based diffusion models for probabilistic time series imputation,

Y . Tashiroet al., “CSDI: Conditional score-based diffusion models for probabilistic time series imputation,” inAdvances in NeurIPS, 2021

work page 2021

[10] [10]

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

X. Huanget al., “TabTransformer: Tabular data modeling using contex- tual embeddings,” arXiv:2012.06678, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2012

[11] [11]

Filling the G ap s: Multivariate time series imputation by graph neural networks,

A. Cini, I. Marisca, and C. Alippi, “Filling the G ap s: Multivariate time series imputation by graph neural networks,” inProc. ICLR, 2022

work page 2022

[12] [12]

Data imputation with iterative graph reconstruction,

J. Zhong, N. Gui, and W. Ye, “Data imputation with iterative graph reconstruction,” inProc. AAAI, 2023

work page 2023

[13] [13]

EGG-GAE: Scalable graph autoen- coders for tabular data imputation,

L. Telyatnikov and S. Scardapane, “EGG-GAE: Scalable graph autoen- coders for tabular data imputation,” inProc. AISTATS, 2023

work page 2023

[14] [14]

Federated optimization in heterogeneous networks (Fed- Prox),

T. Liet al., “Federated optimization in heterogeneous networks (Fed- Prox),” inProc. MLSys, 2020

work page 2020

[15] [15]

SCAFFOLD: Stochastic controlled averaging for federated learning,

S. Karimireddyet al., “SCAFFOLD: Stochastic controlled averaging for federated learning,” inProc. ICML, 2020

work page 2020

[16] [16]

Model-contrastive federated learning,

Q. Li, B. He, and D. Song, “Model-contrastive federated learning,” in Proc. CVPR, 2021

work page 2021

[17] [17]

FedBN: Federated learning on non-IID features via local batch normalization,

X. Liet al., “FedBN: Federated learning on non-IID features via local batch normalization,” inProc. ICLR, 2021

work page 2021

[18] [18]

A survey on vertical federated learning: From a layered perspective,

L. Yanget al., “A survey on vertical federated learning: From a layered perspective,” arXiv:2304.01829, 2023

work page arXiv 2023

[19] [19]

Vertical federated learning: A structured literature review,

A. Khan, M. ten Thij, and A. Wilbik, “Vertical federated learning: A structured literature review,”Knowl. Inf. Syst., 2025

work page 2025

[20] [20]

Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,

D. Darais, J. Near, M. Durkee, and D. Buckley, “Protecting model updates in privacy-preserving federated learning: Input privacy for vertical FL,” NIST Cybersecurity Insights Blog, 2024

work page 2024

[21] [21]

Federated private set intersection example,

NVIDIA, “Federated private set intersection example,” NVFlare GitHub repository, 2024

work page 2024

[22] [22]

SecureBoost: A lossless federated learning framework,

K. Chenget al., “SecureBoost: A lossless federated learning framework,” inProc. FML Workshop @ IJCAI, 2019

work page 2019

[23] [23]

SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,

T. Fanet al., “SecureBoost+: Large-scale and high-performance vertical federated gradient boosting decision trees,” inProc. PAKDD, 2024

work page 2024

[24] [24]

Split learning for health: Distributed deep learning without sharing raw patient data

P. Vepakommaet al., “Split learning for health: Distributed deep learning without sharing raw patient data,” arXiv:1812.00564, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[25] [25]

Fed-MIW AE: Federated imputation of incomplete data via deep generative models,

I. Balelliet al., “Fed-MIW AE: Federated imputation of incomplete data via deep generative models,” arXiv:2304.08054, 2023

work page arXiv 2023

[26] [26]

Cafe: Improved federated data imputation by leveraging missing data heterogeneity,

S. Minet al., “Cafe: Improved federated data imputation by leveraging missing data heterogeneity,” preprint, 2024

work page 2024

[27] [27]

Personalised federated learning on hetero- geneous feature spaces,

A. Rakotomamonjyet al., “Personalised federated learning on hetero- geneous feature spaces,” arXiv:2301.11447, 2023

work page arXiv 2023

[28] [28]

Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,

Y . Suzuki and F. Banaei-Kashani, “Clustered federated learning for heterogeneous feature spaces using Siamese GCN distance prediction,” OpenReview (FLSys), 2023

work page 2023

[29] [29]

Towards heterogeneous federated learning: Analysis, solutions, and future directions,

Y . Linet al., “Towards heterogeneous federated learning: Analysis, solutions, and future directions,” inProc. AIS&P, Springer, 2024

work page 2024

[30] [30]

Harmony in federated learning: A compre- hensive review of techniques to tackle heterogeneity and non-IID data,

M. Karami and A. Karami, “Harmony in federated learning: A compre- hensive review of techniques to tackle heterogeneity and non-IID data,” Cluster Computing, vol. 28, no. 570, 2025

work page 2025

[31] [31]

Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,

H. Yuet al., “Handling spatial–temporal data heterogeneity for federated continual learning via Tail Anchor,” inProc. CVPR, 2025

work page 2025

[32] [32]

Heterogeneous federated learning: State-of-the-art and research challenges,

M. Yeet al., “Heterogeneous federated learning: State-of-the-art and research challenges,”ACM Computing Surveys, 2023

work page 2023

[33] [33]

Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,

Z. Chenet al., “Rethinking the diffusion models for missing data impu- tation: A gradient flow perspective,” inAdvances in Neural Information Processing Systems (NeurIPS), 2024

work page 2024

[34] [34]

Privacy-preserving vertical federated KNN feature imputation method,

W. Du, Y . Wang, G. Meng, and Y . Guo, “Privacy-preserving vertical federated KNN feature imputation method,”Electronics, vol. 13, no. 2, 2024

work page 2024

[35] [35]

Multimodal federated learning with missing modalities through feature imputation network,

P. Poudel, A. Chhetri, P. Gyawali, G. Leontidis, and B. Bhattarai, “Multimodal federated learning with missing modalities through feature imputation network,” inMedical Image Understanding and Analysis (MIUA), LNCS, 2026

work page 2026

[36] [36]

Decentralized noise handling in medical imaging: Encoder–decoder based federated imputation for robust training,

Y . Chang, Y . Noh, S.-W. Lee, M. Lee, and W. Noh, “Decentralized noise handling in medical imaging: Encoder–decoder based federated imputation for robust training,” inProc. MICCAI, 2025

work page 2025

[37] [37]

FedFA: Federated learning with feature anchors to align features and classifiers for heterogeneous data,

T. Zhou, J. Zhang, and D. H. K. Tsang, “FedFA: Federated learning with feature anchors to align features and classifiers for heterogeneous data,” IEEE Transactions on Mobile Computing, vol. 23, no. 6, pp. 6731–6742, 2024

work page 2024

[38] [38]

FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data,

W.-C. Chung and C.-H. Peng, “FedCOLA: Federated learning with heterogeneous feature concatenation and local acceleration for non-IID data,”Future Generation Computer Systems, vol. 166, 2025

work page 2025

[39] [39]

Reconstructing missing variables for multivariate time series forecasting via conditional generative flows,

X. Hu, W. Fan, H. Chen, and Y . Fu, “Reconstructing missing variables for multivariate time series forecasting via conditional generative flows,” inProc. IJCAI, 2024

work page 2024

[40] [40]

TIformer: A transformer-based framework for time-series forecasting with missing data,

Z. Ding, Y . Chen, H. Wang, X. Wang, W. Zhang, and Y . Zhang, “TIformer: A transformer-based framework for time-series forecasting with missing data,” inDatabases Theory and Applications (ADC 2024), LNCS vol. 15449, Springer, 2024, pp. 71–84

work page 2024

[41] [41]

Medjadji, Chaimaa, et al. ”FedSparQ: Adaptive Sparse Quantization with Error Feedback for Robust & Efficient Federated Learning.” 2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA). IEEE, 2025

work page 2025

[42] [42]

FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Con- trastive Regularization,

M. D. Nguyen, T. T. Nguyen, H. H. Pham, T. N. Hoang, P. L. Nguyen, and T. T. Huynh, “FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Con- trastive Regularization,” inProc. 22nd Int. Symp. Network Com- puting and Applications (NCA), 2024, pp. 1–8. [Online]. Available: https://arxiv.org/abs/2410.03070

work page arXiv 2024

[43] [43]

Learning from data with structured missingness,

R. Mitra, S. F. McGough, T. Chakraborti, C. Holmes, R. Copping, N. Hagenbuch, S. Biedermann, J. Noonan, B. Lehmann, A. Shenvi, X. V . Doan, D. Leslie, G. Bianconi, R. Sanchez-Garcia, A. Davies, M. Mackintosh, E.-R. Andrinopoulou, A. Basiri, C. Harbron, and B. D. MacArthur, “Learning from data with structured missingness,” Nature Machine Intelligence, vol....

work page 2023

[44] [44]

Vision paper: Metadata-Guided Diffusion and LLM-Orchestrated Quality Gov- ernance for Time Series Imputation,

I. Hocine, A. Abboura, A. Kella, M. Hanini, and G. Danoy, “Vision paper: Metadata-Guided Diffusion and LLM-Orchestrated Quality Gov- ernance for Time Series Imputation,” In Proceedings of the QuaLLM-KG Workshop, co-located with EDBT/ICDT 2026, Tampere, Finland, 2026

work page 2026