Robust Personalized Recommendation under Hidden Confounding in MNAR

Tianyu Xia; Wanting Su; Zongyu Li

arxiv: 2605.21066 · v1 · pith:HQCIS4YCnew · submitted 2026-05-20 · 💻 cs.LG

Zongyu Li, Wanting Su, Tianyu Xia

Pith reviewed 2026-05-21 05:31 UTC · model grok-4.3

classification 💻 cs.LG

keywords recommender systems, hidden confounding, MNAR, sensitivity bounds, deconfounding, personalized bounds, adversarial optimization, observational data

The pith

Estimating user-item level sensitivity bounds relaxes the uniform assumption in deconfounding recommender systems with hidden confounders.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Recommender systems trained on observational interaction data suffer from selection bias when hidden factors influence which items users choose to engage with. Existing fixes either demand costly randomized trials or apply one global sensitivity bound to every user-item pair, assuming the hidden confounder affects all interactions the same way. This paper develops a method to estimate a separate sensitivity bound for each user-item pair directly from the data. An adversarial training procedure keeps the bounds tight enough to remove bias while preserving the model's ability to predict future interactions accurately. On three real datasets the personalized approach yields better performance than global-bound methods when hidden confounding is present.

Core claim

The paper claims that its framework, the Personalized Unobserved-Confounding-aware Interaction Deconfounder (PUID), can recover accurate user-item interaction probabilities by learning individualized sensitivity bounds on the effect of unobserved confounders, which relaxes the homogeneity assumption required by global sensitivity analysis. A benchmark-guided variant (BPUID) further stabilizes training by anchoring to pre-trained models. Both versions are reported to outperform global methods on real-world data without any randomized controlled trial observations.

What carries the argument

The Personalized Unobserved-Confounding-aware Interaction Deconfounder (PUID), a framework that uses adversarial optimization to estimate, for each user-item pair, a distinct sensitivity bound on how strongly hidden confounders can shift the interaction propensity.
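
For orientation, a per-pair sensitivity bound can be written in the style of the marginal sensitivity model that is common in sensitivity analysis. This is only a sketch under assumptions: the symbols \pi_{ui} (nominal propensity estimated from observed features), \tilde\pi_{ui} (true propensity, which also depends on the hidden confounder), and \Gamma_{ui} (per-pair bound) are illustrative notation, not taken from the paper, whose exact formulation may differ.

```latex
% Assumed notation; the paper's own definitions may differ.
% Odds-ratio constraint for user u and item i:
\frac{1}{\Gamma_{ui}}
  \;\le\;
  \frac{\tilde\pi_{ui}\,(1-\pi_{ui})}{\pi_{ui}\,(1-\tilde\pi_{ui})}
  \;\le\;
  \Gamma_{ui},
  \qquad \Gamma_{ui} \ge 1 .

% Equivalent interval for the inverse true propensity:
1 + \frac{1}{\Gamma_{ui}}\left(\frac{1}{\pi_{ui}} - 1\right)
  \;\le\;
  \frac{1}{\tilde\pi_{ui}}
  \;\le\;
  1 + \Gamma_{ui}\left(\frac{1}{\pi_{ui}} - 1\right).
```

In this notation a global bound is the special case \Gamma_{ui} = \Gamma for every pair, and \Gamma_{ui} = 1 recovers the no-hidden-confounding case in which the nominal propensity is correct.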

If this is right

Recommender models can achieve higher predictive accuracy under hidden confounding by using interaction-specific rather than uniform sensitivity bounds.
The homogeneity assumption of global sensitivity analysis is no longer required for practical deconfounding in missing-not-at-random settings.
Adversarial optimization combined with optional benchmark guidance balances robustness against hidden confounders with maintained recommendation quality.
Performance improvements hold across multiple real-world datasets without any need for randomized controlled trial data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same idea of learning interaction-specific bounds could be tested in other domains where confounding strength varies, such as personalized treatment effect estimation.
One could examine whether the estimated bounds remain stable when the underlying recommendation model is changed from matrix factorization to modern neural architectures.
Direct validation against small-scale randomized trials on the same users and items would test whether the data-driven bounds recover the effects observed in the randomized setting.

Load-bearing premise

User-item level sensitivity bounds can be reliably estimated from observational data alone via the proposed adversarial optimization strategy without introducing new biases or requiring external validation.

What would settle it

A controlled simulation in which the true magnitude of hidden confounding varies across user-item pairs according to a known generative process; if the method's estimated bounds fail to contain the true confounding effects or produce worse predictions than global bounds, the central claim is falsified.
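
A test of this kind could be set up roughly as follows. This is a minimal sketch, not the paper's protocol: the generative process (a hidden confounder in [-1, 1] that shifts the propensity log-odds by a pair-specific strength), the distributions, and every name in the code are assumptions chosen for illustration. It builds the ground truth and a `coverage` check that any estimated bounds could be scored against; the "personalized oracle" row uses the true strengths and stands in for a method's output.

```python
import numpy as np

rng = np.random.default_rng(42)
dims = (300, 200)  # (users, items); sizes are arbitrary


def logit(p):
    return np.log(p) - np.log1p(-p)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Nominal propensity: what a model could learn from observed features alone.
nominal = rng.uniform(0.05, 0.5, size=dims)

# Assumed generative process: confounding strength varies across pairs.
strength = rng.gamma(shape=2.0, scale=0.3, size=dims)
hidden = rng.uniform(-1.0, 1.0, size=dims)   # hidden confounder, never observed
true_log_or = strength * hidden              # true log odds-ratio shift per pair
true_propensity = sigmoid(logit(nominal) + true_log_or)

# The observation mask is all a method under test would get to see.
observed = rng.random(dims) < true_propensity


def coverage(log_gamma):
    """Fraction of pairs whose true log odds-ratio lies within +/- log_gamma."""
    return float(np.mean(np.abs(true_log_or) <= log_gamma))


def mean_width(log_gamma):
    """Average interval width on the log odds-ratio scale."""
    return float(np.mean(2.0 * np.broadcast_to(log_gamma, dims)))


candidates = {
    "global, median strength": np.median(strength),
    "global, max strength": strength.max(),
    "personalized oracle": strength,
}
for name, log_gamma in candidates.items():
    print(f"{name:26s} coverage={coverage(log_gamma):.3f} "
          f"mean width={mean_width(log_gamma):.3f}")
```

Under these assumptions a single global bound either misses part of the pairs or has to be as wide as the most confounded pair, while bounds that track the true per-pair strength cover everything with a smaller average width. Whether bounds learned from `observed` alone behave like the oracle is exactly what the simulation would have to show.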

Original abstract

Recommender systems often rely on observational user-item interaction data, which is prone to selection bias due to users' selective interactions with items. Inverse propensity weighting and doubly robust estimators effectively mitigate selection bias under observed confounding, but are unreliable in the presence of hidden confounders. Existing approaches relying on randomized controlled trials (RCTs) or global sensitivity bounds are constrained in practice: RCTs demand costly experimental data, while global sensitivity bounds presume a uniformly bounded effect of unmeasured confounders on propensities through sensitivity analysis, thereby neglecting heterogeneity across user-item interactions. To overcome this limitation, we propose a novel framework named Personalized Unobserved-Confounding-aware Interaction Deconfounder (PUID), which estimates user-item level sensitivity bounds, thereby substantially relaxing the homogeneity assumption inherent in global sensitivity bounds. To ensure both robustness and predictive accuracy, we further develop an adversarial optimization strategy and propose a benchmark-guided variant (BPUID) that incorporates pre-trained models as stabilizing references. Extensive experiments on three real-world datasets demonstrate that our approach significantly outperforms global methods under hidden confounding, without requiring RCT data.
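
The selection-bias problem and the inverse propensity weighting fix mentioned in the abstract can be illustrated with a small synthetic example. This is a sketch under stated assumptions, not the paper's experimental setup: the per-pair errors, the missingness mechanism (low-error pairs are observed more often), and all variable names are invented for illustration, and the propensities are assumed to be known exactly, which is precisely what hidden confounding rules out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 200, 100

# Hypothetical prediction error for every user-item pair (the full matrix
# would not be available in practice).
true_error = rng.uniform(0.0, 1.0, size=(n_users, n_items))

# Assumed MNAR mechanism: pairs with low error are observed more often.
propensity = np.clip(0.9 - 0.8 * true_error, 0.05, 0.95)
observed = rng.random((n_users, n_items)) < propensity

# Target quantity: average error over all pairs.
ideal_loss = true_error.mean()

# Naive estimate: average over observed pairs only (biased under MNAR).
naive_loss = true_error[observed].mean()

# IPS estimate: reweight each observed pair by its inverse propensity.
ips_loss = (observed * true_error / propensity).sum() / (n_users * n_items)

print(f"ideal={ideal_loss:.3f} naive={naive_loss:.3f} ips={ips_loss:.3f}")
```

With the true propensities the IPS estimate lands close to the ideal loss while the naive average is pulled toward the over-represented low-error pairs. If the propensities used for weighting omit a hidden confounder, that guarantee no longer holds, which is the gap sensitivity bounds are meant to address.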

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Personalized sensitivity bounds via adversarial optimization are a reasonable extension of global bounds for MNAR recommender systems, but identifiability from observational data alone remains the open question.

The main thing to know is that this paper shifts sensitivity analysis from a single global bound on unobserved confounding to user-item specific bounds in recommender systems. That directly targets the homogeneity assumption that has limited earlier work, and the adversarial optimization plus the BPUID variant with pre-trained references are the concrete mechanisms they offer to estimate those bounds without RCTs. On three real datasets they report gains over global methods, which is the empirical hook. If the gains are stable under proper controls, the approach could matter for production systems where randomized data is unavailable.

The setup builds on existing sensitivity analysis literature in a straightforward way and the motivation is practical rather than purely theoretical. The experiments are the part that gives the claim some weight, even if the abstract leaves the exact optimization details and significance tests implicit.

The soft spot is identifiability. Hidden confounding strength is not pinned down by observational interactions alone, so the min-max game can at best return feasible intervals rather than calibrated ones; nothing in the description shows an external anchor or testable parametric restriction that would prevent the bounds from drifting. The pre-trained models in BPUID also risk circularity if those models were fit on similarly biased data. Minor issues include the usual need for more ablation on how sensitive the results are to the choice of reference models and clearer reporting of variance across runs.

This is for researchers working on robust recommendation under selection bias and MNAR data. A reader already familiar with sensitivity analysis will see the incremental step clearly and may find the experiments worth checking. It is coherent enough on its own terms to deserve a serious referee, though the review would likely focus on tightening the identification argument and adding validation against known confounding strengths.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the Personalized Unobserved-Confounding-aware Interaction Deconfounder (PUID) framework to address hidden confounding in MNAR recommender systems. It estimates user-item level sensitivity bounds via adversarial optimization, relaxing the homogeneity assumption of global sensitivity bounds, and introduces a benchmark-guided variant (BPUID) that incorporates pre-trained models. The authors report that experiments on three real-world datasets show significant outperformance over global methods without requiring RCT data.

Significance. If the personalized bounds can be shown to be identifiable and non-circular, the framework would meaningfully advance robust recommendation by enabling heterogeneous sensitivity analysis without RCTs or uniform bounds, potentially improving practical deployment in observational settings with hidden confounders.

major comments (2)

[§3] §3 (Adversarial Optimization for Personalized Bounds): The claim that user-item sensitivity bounds are recoverable from observational MNAR data alone via the min-max game is load-bearing but unsupported. Sensitivity parameters remain fundamentally unidentifiable under hidden confounding; the adversarial objective can be satisfied by arbitrary feasible intervals without anchoring to the true (unknown) confounding strength, directly weakening the assertion that personalized bounds reliably relax global homogeneity.
[§5] §5 (Experiments): The reported outperformance on three datasets lacks any detail on bound estimation procedure, concrete form of the adversarial strategy, or statistical significance testing. Without these, it is impossible to verify whether the empirical gains substantiate the robustness claims or merely reflect optimization artifacts.
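
The identifiability concern in the first major comment can be made concrete with a toy worst-case loss. This is a hedged sketch, not the paper's objective: it assumes the interval for the inverse propensity is [1 + (1/p - 1)/Γ, 1 + Γ(1/p - 1)] as in the marginal sensitivity model, drops any normalization constraint on the weights, and uses the hypothetical names `worst_case_ips_loss`, `errors`, `nominal`, and `n_pairs`. Under those simplifications the adversary's inner maximization has a closed form: with non-negative errors it picks the upper end of every interval.

```python
import numpy as np


def worst_case_ips_loss(errors, nominal_propensity, gamma, n_pairs):
    """Upper bound on the IPS loss over observed pairs.

    errors: non-negative losses on the observed pairs.
    nominal_propensity: estimated propensities for those pairs.
    gamma: scalar (global bound) or array (per-pair bounds), all >= 1.
    n_pairs: total number of user-item pairs, observed or not.
    """
    errors = np.asarray(errors, dtype=float)
    p = np.asarray(nominal_propensity, dtype=float)
    gamma = np.broadcast_to(np.asarray(gamma, dtype=float), p.shape)
    if np.any(errors < 0) or np.any(gamma < 1):
        raise ValueError("errors must be >= 0 and gamma must be >= 1")
    # Largest inverse propensity allowed by the assumed sensitivity interval.
    upper_weight = 1.0 + gamma * (1.0 / p - 1.0)
    # Errors are non-negative, so the maximum sits at the upper weights.
    return float(np.sum(upper_weight * errors) / n_pairs)


# Toy numbers, purely illustrative.
errors = np.array([0.2, 0.5, 0.1])
nominal = np.array([0.5, 0.25, 0.8])
n_pairs = 6

for gamma in (1.0, 2.0, np.array([1.0, 3.0, 1.0])):
    loss = worst_case_ips_loss(errors, nominal, gamma, n_pairs)
    print(f"gamma={gamma} -> worst-case loss {loss:.4f}")
```

With Γ = 1 the function reduces to the ordinary IPS loss, and the value only grows as any Γ grows. The observed data therefore place no upper limit on the bounds in this simplified setting, so whatever keeps learned per-pair bounds tight has to come from an additional constraint or regularizer, which is the detail the referee asks the authors to state.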

minor comments (2)

[§3] The manuscript would benefit from an explicit statement of the precise optimization objective (e.g., the loss and constraint forms) in the main text rather than deferring all details to the appendix.
[§2] Notation for the sensitivity bounds (upper/lower per user-item pair) should be introduced consistently before the first use in the method description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our work. We provide point-by-point responses to the major comments and outline the revisions we plan to make to improve the clarity and rigor of the manuscript.

Referee: [§3] §3 (Adversarial Optimization for Personalized Bounds): The claim that user-item sensitivity bounds are recoverable from observational MNAR data alone via the min-max game is load-bearing but unsupported. Sensitivity parameters remain fundamentally unidentifiable under hidden confounding; the adversarial objective can be satisfied by arbitrary feasible intervals without anchoring to the true (unknown) confounding strength, directly weakening the assertion that personalized bounds reliably relax global homogeneity.

Authors: We concur that sensitivity parameters cannot be uniquely identified from observational MNAR data due to the presence of hidden confounding. Our framework does not purport to recover the ground-truth confounding strengths but rather employs an adversarial min-max optimization to compute personalized sensitivity bounds that are consistent with the observed data while allowing for heterogeneity across user-item pairs. This approach provides a practical relaxation of the global sensitivity bound assumption by deriving data-dependent intervals that ensure robustness. We will revise the manuscript in §3 to explicitly discuss the identifiability challenges and clarify that the bounds serve as conservative, feasible ranges for sensitivity analysis rather than precise estimates of the true effects. Additionally, we will provide more formal justification for the adversarial game's role in bounding the confounding impact.

revision: yes

Referee: [§5] §5 (Experiments): The reported outperformance on three datasets lacks any detail on bound estimation procedure, concrete form of the adversarial strategy, or statistical significance testing. Without these, it is impossible to verify whether the empirical gains substantiate the robustness claims or merely reflect optimization artifacts.

Authors: We appreciate this observation and agree that additional details are necessary for reproducibility and verification. In the revised version, we will augment §5 with a comprehensive description of the bound estimation procedure, including the specific implementation of the adversarial optimization strategy (e.g., the loss functions and training dynamics). We will also report the results of statistical significance tests to confirm that the performance improvements are statistically meaningful and not due to random optimization variations. These additions will strengthen the empirical validation of our claims.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in PUID derivation chain

The paper proposes estimating user-item sensitivity bounds from observational MNAR data via an adversarial optimization strategy within the PUID framework, then applies them for deconfounding. No load-bearing step reduces by construction to a self-definition, a fitted parameter renamed as a prediction, or a self-citation chain. The BPUID variant references pre-trained models as stabilizers, but this is an external reference rather than an internal tautology. The central claim rests on the proposed optimization and empirical outperformance on three datasets, which supplies independent content outside the inputs. No equations or sections exhibit the specific reductions required for circularity flags.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach depends on whether the personalized sensitivity bounds can be estimated and on the effectiveness of the adversarial optimization strategy, both of which are introduced in the paper.

free parameters (1)

personalized sensitivity bounds
These are estimated per user-item interaction, serving as key parameters in the deconfounding process.

axioms (1)

domain assumption: The effect of hidden confounders on propensities varies across different user-item pairs
This heterogeneity assumption allows relaxing the global bound.

invented entities (1)

PUID (no independent evidence)
purpose: Framework for personalized deconfounding in recommendations
Newly proposed method; no external evidence is cited for the bound estimation.

